Synonym discovery method and device as well as data processing method and device

A synonym discovery technology applied in the field of data processing. It addresses the prior art's poor recognition of abbreviations and the resulting low accuracy of semantic understanding, and achieves the effect of improving efficiency and accuracy.

Active Publication Date: 2016-11-16
SHANGHAI XIAOI ROBOT TECH CO LTD
Cites: 7; Cited by: 21
AI Technical Summary

Problems solved by technology

[0004] However, the synonym discovery methods in the prior art cannot recognize abbreviations well, so the accuracy of semantic understanding is low.


Examples

Embodiment Construction

[0044] Abbreviated forms of proper names often appear in written and everyday Chinese. Words in these abbreviated forms are called abbreviations of proper names. An abbreviation is made up of part of the original proper name, and abbreviations are also a type of synonym. For example, "China" is an abbreviation of "People's Republic of China"; likewise, the Chinese short forms of "National People's Congress" and "Real Madrid" are abbreviations of their full names. However, the synonym discovery methods in the prior art cannot recognize abbreviations well, so the accuracy of semantic understanding is low.

[0045] The embodiment of the present invention obtains a set of words to be processed; for any word to be processed in the set, when there are one or more target words in the set such that the minimum edit distance from the word to be processed to the target word is less than the preset threshold, the word to be processed and the corresponding target word are determined ...

Abstract

The invention discloses a synonym discovery method and device as well as a data processing method and device. The synonym discovery method comprises the following steps: obtaining a to-be-processed word group set, wherein the word group set comprises a plurality of words; and, for any to-be-processed word in the word group set, when one or more target words exist in the word group set such that the minimum edit distance from the to-be-processed word to the target word is smaller than a preset threshold value, determining the to-be-processed word and the corresponding target word as a synonym pair. The minimum edit distance is calculated through an edit distance method that includes a deletion operation; the edit distance corresponding to the deletion operation is smaller than the edit distances corresponding to the other operations and smaller than the preset threshold value, and the edit distance of each of the other operations is greater than or equal to the preset threshold value. With the above scheme, the accuracy of discovering abbreviations can be improved.
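The scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation. It assumes the reading under which a pair can fall below the threshold at all: deletion costs less than the threshold, while every other operation costs at least the threshold. The cost values, the threshold, the function names and the sample words are all hypothetical choices made for illustration.

```python
from itertools import permutations

# Hypothetical costs (assumptions, not values from the patent): deletion is
# cheaper than every other operation and below the threshold, while
# insertion and substitution each cost at least the threshold.
DELETE_COST = 1
INSERT_COST = 10
SUBSTITUTE_COST = 10
THRESHOLD = 10


def weighted_edit_distance(source, target):
    """Minimum total cost of turning `source` into `target`."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * DELETE_COST  # delete every character of source
    for j in range(1, cols):
        dist[0][j] = j * INSERT_COST  # insert every character of target
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + DELETE_COST,
                dist[i][j - 1] + INSERT_COST,
                dist[i - 1][j - 1] + (0 if same else SUBSTITUTE_COST),
            )
    return dist[-1][-1]


def find_synonym_pairs(words, threshold=THRESHOLD):
    """Return (word, target) pairs whose distance is below the threshold."""
    return [
        (word, target)
        for word, target in permutations(words, 2)
        if weighted_edit_distance(word, target) < threshold
    ]


# Illustrative word set (not taken from the patent text).
words = ["中华人民共和国", "中国", "皇家马德里", "皇马"]
print(find_synonym_pairs(words))
# [('中华人民共和国', '中国'), ('皇家马德里', '皇马')]
```

With these costs, a single insertion or substitution already reaches the threshold, so a target word is only accepted when it can be obtained from the word purely by deleting characters, which matches how an abbreviation is formed from part of the full name. The ratio between the threshold and the deletion cost bounds how many characters may be deleted (here, at most nine).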

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to a synonym discovery method and device, and a data processing method and device.

Background technique

[0002] Synonymy is an important semantic relationship that is often used in natural language processing tasks such as information retrieval and text classification. Specifically, before tasks such as information retrieval or text classification are performed, synonyms must be obtained and identified. For example, in information retrieval, multiple words that are synonyms of one another can be grouped into one category. When the input text contains a keyword that has synonyms, the synonyms can be used in place of the original keyword in the search, so that the retrieval system can provide the user with more text to review.

[0003] Abbreviated forms of proper names often appear in Chinese written and daily expressions. The wor...

Application Information

IPC(8): G06F17/27
CPC: G06F40/247
Inventors: 张昊, 朱频频
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD