Unsupervised automatic extraction method of microblog new words based on repeated word strings
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- HEFEI UNIV OF TECH
- Publication Date
- 2014-03-26
- Estimated Expiration
- Not applicable · inactive patent
Abstract
Description
Technical field
[0001] The invention belongs to the technical field of new word retrieval methods and relates to an unsupervised method for automatically extracting new microblog words based on repeated word strings.
Background art
[0002] New word recognition is one of the main problems in Chinese word segmentation, and the growth of Weibo has accelerated the emergence of new words. Automatic segmentation of Chinese text is basic and important work in natural language processing, and the identification and handling of new words is one of the difficulties that limit further improvement in the accuracy of Chinese word segmentation systems. Unsupervised automatic recognition of new words is also crucial for other natural language processing tasks. At present, research on new word extraction focuses mainly on entity nouns, especially the extraction of names of people, places, and institut...