Control method and device for calculating Chinese word semantic similarity

A control method and device for calculating the semantic similarity of Chinese words. The technique addresses the scarcity of research results and the unsatisfactory performance of existing approaches, and improves the accuracy of the computed similarity.

Inactive Publication Date: 2013-03-06
EAST CHINA NORMAL UNIV

AI Technical Summary

Problems solved by technology

Because of its particular characteristics, Chinese is more complicated than English in word segmentation, grammar, and semantics, so research results are relatively scarce and existing results are not very satisfactory.



Embodiment Construction

[0023] Other characteristics, objects and advantages of the present invention will become more apparent by reading the detailed description of non-limiting embodiments made with reference to the following drawings:

[0024] Figure 1 is a flowchart of the control method for calculating the semantic similarity of Chinese words according to the first embodiment of the present invention. Specifically, the figure shows six steps for calculating the semantic similarity of Chinese words. The first step is step S201: the control device obtains the Chinese word pair to be calculated, where the word pair includes a first word and a second word, and generates the corresponding feature vectors from the word pair, namely a first feature vector corresponding to the first word and a second feature vector corresponding to the second word, and the feature vector is passed through the text containing th...
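
As a rough illustration of step S201, the sketch below builds one feature vector per word from the text in which the word occurs. The source text is truncated before it specifies how the vectors are formed, so everything here is an assumption: the window-based co-occurrence counting, the function name `context_feature_vector`, the `window` parameter, and the pre-segmented toy corpus are all hypothetical and are not taken from the patent.

```python
from collections import Counter


def context_feature_vector(word, sentences, window=2):
    """Count the tokens that appear within `window` positions of `word`.

    `sentences` is a list of already-segmented token lists. The counting
    scheme is an assumed stand-in, not the patent's definition.
    """
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Hypothetical pre-segmented corpus, for illustration only.
corpus = [
    ["我", "喜欢", "吃", "苹果"],
    ["他", "喜欢", "吃", "香蕉"],
    ["苹果", "和", "香蕉", "都", "是", "水果"],
]

first_vector = context_feature_vector("苹果", corpus)   # first word
second_vector = context_feature_vector("香蕉", corpus)  # second word
```

With this toy corpus the two words share the context features 喜欢, 吃, and 和, which is the kind of overlap a statistics-based comparison would pick up.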


Abstract

The invention provides a control method for calculating the semantic similarity of Chinese words. The method combines an approach based on a Chinese thesaurus with an approach based on statistics. It comprises the following steps: a, obtaining a word pair and obtaining the corresponding feature vectors for the word pair; b, carrying out semantic expansion on each feature vector to generate an expanded feature vector; c, carrying out semantic mapping on each feature vector to generate a mapped feature vector; and d, calculating the word similarity from the expanded feature vectors and the mapped feature vectors.
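
Steps b through d can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the abstract does not say how expansion, mapping, or the final combination are computed. The toy thesaurus (`SYNONYMS`, `CATEGORY`), the half-weight synonym expansion, the use of cosine similarity, and the weighting parameter `alpha` are all hypothetical choices made for the example.

```python
import math

# Hypothetical toy thesaurus: synonym lists and category labels.
SYNONYMS = {"高兴": ["快乐"], "快乐": ["高兴"]}
CATEGORY = {"高兴": "C1", "快乐": "C1", "悲伤": "C2"}


def expand(vector):
    """Step b (assumed form): add each feature's synonyms at half weight."""
    expanded = dict(vector)
    for feature, weight in vector.items():
        for synonym in SYNONYMS.get(feature, []):
            expanded[synonym] = expanded.get(synonym, 0.0) + 0.5 * weight
    return expanded


def map_to_categories(vector):
    """Step c (assumed form): sum feature weights per thesaurus category."""
    mapped = {}
    for feature, weight in vector.items():
        category = CATEGORY.get(feature)
        if category is not None:
            mapped[category] = mapped.get(category, 0.0) + weight
    return mapped


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(vec_a, vec_b, alpha=0.5):
    """Step d (assumed form): weighted mix of the two cosine scores."""
    expanded_score = cosine(expand(vec_a), expand(vec_b))
    mapped_score = cosine(map_to_categories(vec_a), map_to_categories(vec_b))
    return alpha * expanded_score + (1.0 - alpha) * mapped_score


vec_a = {"高兴": 1.0}
vec_b = {"快乐": 1.0}
raw_score = cosine(vec_a, vec_b)          # no shared features
combined_score = similarity(vec_a, vec_b)  # thesaurus links the features
```

The two input vectors share no literal feature, so a purely statistical comparison scores them as unrelated, while expansion and mapping through the thesaurus recover the relationship. That contrast is the motivation for combining the two families of methods.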

Description

Technical field

[0001] The invention relates to the field of text mining, and in particular to a method for calculating the semantic similarity of Chinese words.

Background art

[0002] Word semantic similarity is an important topic in the field of information processing. It is widely used in word sense disambiguation, machine translation, automatic response, information retrieval, text clustering, and other applications. However, word similarity is a highly subjective concept, and obtaining a similarity score close to human judgment is very difficult.

[0003] Existing word similarity calculations can be roughly divided into two categories: one is based on some kind of world knowledge, and the other is based on statistical calculation over a large-scale corpus. The former relies on a semantic dictionary organized by the hierarchical relationships between concepts, and uses the hypernym-hyponym relationships and synonym relationships between concepts in this k...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 杨燕吴雯吴奔斌霍晓骏王伟杰洪磊张波崔永利贺樑宋树彬
Owner: EAST CHINA NORMAL UNIV