Dialogue short text clustering method based on form and semantic similarity
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- EAST CHINA NORMAL UNIV
- Publication Date
- 2014-08-27
- Estimated Expiration
- Not applicable · inactive patent
Abstract
Description
Technical field
[0001] The invention belongs to the technical field of short text clustering and relates to a method for clustering short dialogue texts based on string edit-distance similarity and the semantic similarity of words.
Background technology
[0002] With the rapid development of mobile communication and the mobile Internet, various human-machine intelligent dialogue systems have emerged, such as Siri, Google Now, and Xiaoi Robot. Taking Xiaoi Robot as an example, its user base has exceeded 100 million, and its 10 billion dialogue visits every year generate a large amount of valuable dialogue text data. These data are important sources for mining user interests and improving the knowledge bases of intelligent dialogue systems. Cluster analysis of these dialogue texts can group similar texts together and form several important cluster centers, which can improve the efficiency of mining user interests and extracting knowledge t...