New word discovery method and device

A new word discovery technology, applied in the field of intelligent interaction, addresses the problem that the accuracy of new word discovery needs to be improved, and achieves improved update efficiency and a reduced amount of calculation.

Active Publication Date: 2018-06-05
SHANGHAI XIAOI ROBOT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The accuracy of new word discovery in existing technology needs to be improved.



Embodiment Construction

[0066] The inventor found that the received corpus contains a class of special nouns. If candidate data strings containing such nouns are judged in the same way as all other candidate data strings, they will be excluded. In practical applications, however, these candidate data strings need to be regarded as new words. Therefore, if all candidate data strings are judged in the same way, the accuracy of the discovered new words suffers.

[0067] In the embodiment of the present invention, the candidate data strings are judged and divided into specific candidate data strings and non-specific candidate data strings. A specific candidate data string includes one of the aforementioned special nouns, that is, a basic noun, and the word at a specific relative position to the basic noun is a noun or an adjective. ...
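The split described in [0067] can be sketched as follows. This is only an illustration under stated assumptions: the basic-noun list, the part-of-speech tags, the English example words, and the choice of "the word immediately before the basic noun" as the specific relative position are all hypothetical and do not come from the patent text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data; the patent does not list basic nouns or tags.
BASIC_NOUNS: Set[str] = {"card", "fee"}
POS_TAGS: Dict[str, str] = {
    "credit": "n", "annual": "a", "card": "n",
    "fee": "n", "apply": "v", "the": "u",
}

def is_specific_candidate(words: List[str], offset: int = -1) -> bool:
    """True if the string contains a basic noun and the word `offset`
    positions away from it is tagged as a noun ("n") or adjective ("a").

    offset=-1 (the preceding word) is an assumed relative position.
    """
    for i, word in enumerate(words):
        if word not in BASIC_NOUNS:
            continue
        j = i + offset
        if 0 <= j < len(words) and j != i and POS_TAGS.get(words[j]) in {"n", "a"}:
            return True
    return False

def split_candidates(
    candidates: List[List[str]],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Divide candidates into (specific, non-specific) lists."""
    specific = [c for c in candidates if is_specific_candidate(c)]
    non_specific = [c for c in candidates if not is_specific_candidate(c)]
    return specific, non_specific

specific, non_specific = split_candidates(
    [["credit", "card"], ["annual", "fee"], ["apply", "card"], ["the", "credit"]]
)
```

Keeping the two groups separate is what lets a later step apply a different judgment to the specific candidates instead of discarding them with the same rule used for everything else.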


Abstract

A method and device for discovering new words, the method comprising: preprocessing a received corpus to obtain text data; processing the text data line by line to obtain sentence data; performing word segmentation on the sentence data to obtain segmented word data; combining adjacent segmented word data to generate candidate data strings; judging whether each candidate data string is a specific candidate data string, where a specific candidate data string includes a basic noun and the word at a specific relative position to the basic noun is a noun or an adjective; and judging the candidate data strings to find new words. The method and device can improve the accuracy of new word discovery.
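The first four steps of the abstract (preprocessing, line-by-line sentence extraction, word segmentation, and combining adjacent words) can be sketched as a small pipeline. This is a minimal sketch, not the patented implementation: the function names are hypothetical, the preprocessing rules are assumed, and whitespace splitting stands in for a real Chinese word segmenter. The final judging steps are left out because the text shown here does not specify their criteria.

```python
import re
from typing import List, Tuple

def preprocess(corpus: str) -> str:
    """Assumed cleanup: drop markup-like tags, collapse spaces and tabs,
    and keep line breaks so the text can be processed line by line."""
    text = re.sub(r"<[^>]+>", " ", corpus)
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines)

def to_sentences(text: str) -> List[str]:
    """Walk the text line by line and split on sentence-ending punctuation."""
    sentences: List[str] = []
    for line in text.splitlines():
        for part in re.split(r"[.!?。！？]+", line):
            part = part.strip()
            if part:
                sentences.append(part)
    return sentences

def segment(sentence: str) -> List[str]:
    """Placeholder segmenter (assumption): lowercase and split on whitespace."""
    return sentence.lower().split()

def candidate_strings(words: List[str]) -> List[Tuple[str, str]]:
    """Combine each pair of adjacent words into a candidate data string."""
    return list(zip(words, words[1:]))

def discover_candidates(corpus: str) -> List[Tuple[str, str]]:
    """Run the four steps in order and collect all candidates."""
    candidates: List[Tuple[str, str]] = []
    for sentence in to_sentences(preprocess(corpus)):
        candidates.extend(candidate_strings(segment(sentence)))
    return candidates

result = discover_candidates("<p>Apply for a credit card.</p>\nThe annual fee is low!")
```

Candidates are built per sentence, so no candidate spans a sentence boundary. Only pairs of adjacent words are combined here; whether the method also combines longer runs is not stated in the abstract.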

Description

Technical field

[0001] The invention relates to the field of intelligent interaction, and in particular to a new word discovery method and device.

Background

[0002] In many fields of Chinese information processing, functions must be built on dictionaries. For example, in an intelligent retrieval system or an intelligent dialogue system, word segmentation, question retrieval, similarity matching, and the determination of retrieval results or dialogue answers all compute with words as the smallest unit, and the basis of these computations is the word dictionary. The word dictionary therefore has a great impact on the performance of the entire system.

[0003] The progress and changes of social culture and the rapid development of economy and commerce often drive language change, and the most rapid manifestation of language change is the emergence of new words. Especially in a specific field, whether the wor...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/3334, G06F16/3335, G06F16/335
Inventor: 张昊, 朱频频
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD