Method and device for discovering new word

A new word discovery technology, applied in the field of intelligent interaction, addresses the problem that the accuracy of new word discovery needs to be improved, and achieves the effects of improving update efficiency and reducing the amount of calculation.

Active Publication Date: 2016-01-06
SHANGHAI XIAOI ROBOT TECH CO LTD
Cites: 5, Cited by: 16
AI Technical Summary

Problems solved by technology

[0005] The accuracy of new word discovery in the existing technology needs to be improved

Examples

Embodiment Construction

[0066] The inventors found that a received corpus contains a class of special nouns. If candidate data strings containing such nouns are judged in the same way as other candidate data strings, they will be excluded. In practical applications, however, these candidate data strings should be regarded as new words. Therefore, if all candidate data strings are judged in the same way, the accuracy of the new words obtained suffers.

[0067] In the embodiment of the present invention, the candidate data strings are judged and divided into specific candidate data strings and non-specific candidate data strings. A specific candidate data string contains one of the aforementioned special nouns, that is, a basic noun, and the word at a specific relative position to the basic noun is a noun or an adjective. ...
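
The split described in [0067] can be illustrated with a minimal sketch, assuming each segmented word already carries a part-of-speech tag. The source does not list the basic nouns, the tag set, or which relative position is meant, so `BASE_NOUNS`, the tags `"n"` and `"a"`, `RELATIVE_OFFSET`, and the function names below are all hypothetical choices made only for illustration.

```python
from typing import List, Tuple

# Hypothetical inputs: the source does not specify any of these values.
BASE_NOUNS = {"卡", "费"}   # illustrative basic nouns
NOUN_OR_ADJ = {"n", "a"}    # assumed POS tags for noun / adjective
RELATIVE_OFFSET = -1        # assumed: the word immediately before the basic noun

Token = Tuple[str, str]     # (word, POS tag)


def is_specific_candidate(candidate: List[Token]) -> bool:
    """Return True if the candidate contains a basic noun whose neighbour
    at RELATIVE_OFFSET is tagged as a noun or an adjective."""
    for i, (word, _tag) in enumerate(candidate):
        if word not in BASE_NOUNS:
            continue
        j = i + RELATIVE_OFFSET
        if 0 <= j < len(candidate) and candidate[j][1] in NOUN_OR_ADJ:
            return True
    return False


def partition(candidates: List[List[Token]]):
    """Divide candidates into (specific, non_specific) lists."""
    specific = [c for c in candidates if is_specific_candidate(c)]
    non_specific = [c for c in candidates if not is_specific_candidate(c)]
    return specific, non_specific
```

Under these assumptions, a candidate such as `[("信用", "n"), ("卡", "n")]` would be routed to the specific group, while `[("刷", "v"), ("卡", "n")]` would not, so the two groups can then be judged by different rules.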

Abstract

The invention discloses a method and a device for discovering new words. The method comprises the following steps: preprocessing a received corpus to obtain text data; splitting the text data into lines to obtain sentence data; segmenting the sentence data according to the individual words included in a basic dictionary to obtain segmented word data; combining adjacent segmented word data to generate candidate data strings; judging whether a candidate data string is a specific candidate data string, wherein a specific candidate data string comprises a basic noun and the word at a specific relative position to the basic noun is a noun or an adjective; and judging the candidate data strings to discover new words. The method and the device can improve the accuracy of new word discovery.
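
The first four steps of the abstract (preprocessing, splitting, dictionary-based segmentation, and combining adjacent segments) might look roughly like the following sketch. It is not the patented implementation: the tag-stripping preprocessing, the punctuation used for splitting, forward maximum matching as the segmentation strategy, and every function and parameter name here are assumptions, and the final judging step is left out because the source does not describe its criteria.

```python
import re
from typing import List, Set


def preprocess(corpus: str) -> str:
    """Assumed preprocessing: drop markup-like tags, collapse spaces and tabs."""
    text = re.sub(r"<[^>]+>", "", corpus)
    return re.sub(r"[ \t]+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Split on line breaks and common sentence-ending punctuation."""
    parts = re.split(r"[\n。！？!?.]+", text)
    return [p.strip() for p in parts if p.strip()]


def segment(sentence: str, dictionary: Set[str]) -> List[str]:
    """Forward maximum matching against the basic dictionary (an assumed
    strategy); characters not covered become single-character segments."""
    max_len = max((len(w) for w in dictionary), default=1)
    out, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in dictionary:
                out.append(piece)
                i += size
                break
    return out


def candidates(words: List[str], max_words: int = 2) -> List[str]:
    """Join every run of 2..max_words adjacent segments into a candidate string."""
    result = []
    for n in range(2, max_words + 1):
        for i in range(len(words) - n + 1):
            result.append("".join(words[i:i + n]))
    return result


if __name__ == "__main__":
    basic_dictionary = {"办理", "信用"}  # hypothetical basic dictionary
    text = preprocess("<p>办理信用卡。</p>")
    for sentence in split_sentences(text):
        print(candidates(segment(sentence, basic_dictionary)))
```

With the hypothetical dictionary above, the sentence 办理信用卡 is segmented into 办理 / 信用 / 卡, and the adjacent pairs 办理信用 and 信用卡 become the candidate data strings passed on to the judging step.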

Description

Technical field

[0001] The invention relates to the field of intelligent interaction, in particular to a new word discovery method and device.

Background

[0002] In many fields of Chinese information processing, functions must be built on dictionaries. For example, in an intelligent retrieval system or an intelligent dialogue system, processes such as word segmentation, question retrieval, similarity matching, and the determination of retrieval results or dialogue answers are all computed with the word as the smallest unit. The basis of these computations is the word dictionary, so the word dictionary has a great impact on the performance of the entire system.

[0003] The progress and changes of social culture and the rapid development of economy and commerce often drive the change of language, and the most rapid manifestation of language change is the emergence of new words. Especially in a specific field, whether the wor...

Application Information

IPC(8): G06F17/30
CPC: G06F16/3334, G06F16/3335, G06F16/335
Inventors: 张昊, 朱频频
Owner SHANGHAI XIAOI ROBOT TECH CO LTD