Method and device for measuring semantic similarity of Chinese words

A method and device for measuring the semantic similarity of Chinese words, applied in the field of natural language processing, with the effect of improving measurement accuracy, fault tolerance, and ease of operation.

Active Publication Date: 2020-06-16
INST OF AUTOMATION CHINESE ACAD OF SCI
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] To solve the above-mentioned problem in the prior art, namely the inaccuracy of word-vector-based calculation of the semantic similarity of Chinese words, the present invention provides a method and device for measuring the semantic similarity of Chinese words.



Examples


Embodiment Construction

[0050] Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments are only used to explain the technical principles of the present invention, and are not intended to limit the protection scope of the present invention.

[0051] The present invention designs a natural language model and a migration vector model, used respectively to extract the initial word vector and the migration vector of a Chinese word. The word vector is refined with the K-nearest-neighbor algorithm and the K-means algorithm so that it better captures the semantic information of Chinese words. Accuracy is improved by converting the calculation of the semantic similarity of Chinese words into the calculation of the similarity of their migration vectors.

[0052] The method for measu...


Abstract

The invention relates to the technical field of natural language processing, in particular to a method and device for measuring the semantic similarity of Chinese words, and addresses the problem that the measured semantic similarity of Chinese words is inaccurate. The measuring method comprises the following steps: a K-nearest-neighbor algorithm is used to find the K neighbor word vectors of the initial word vector corresponding to a Chinese word; a K-means algorithm is used to calculate the center vector of the initial word vector and its K neighbor word vectors; the migration vector of the Chinese word is calculated from the initial word vector, the center vector, and a preset migration vector model g, where g = alpha*m + beta*p, alpha and beta are preset parameters, m is the initial word vector, and p is the center vector; and the semantic similarity of different Chinese words is calculated from their corresponding migration vectors. The method improves the accuracy of the semantic similarity calculation, allows the word vector to carry more semantic information about the word, and improves the fault tolerance of the system.
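The steps in the abstract can be sketched in Python with NumPy. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names are hypothetical, neighbors are ranked by cosine similarity, the center vector is taken as the mean of the word vector and its K neighbors (what K-means yields for a single cluster), and migration vectors are compared by cosine similarity. The source fixes none of these choices, nor the values of alpha, beta, or K.

```python
import numpy as np


def k_nearest_neighbors(embeddings, index, k):
    """Indices of the k rows most similar (cosine) to embeddings[index]."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed[index]
    sims[index] = -np.inf  # never pick the word as its own neighbor
    return np.argsort(-sims)[:k]


def center_vector(embeddings, index, k):
    """Center p of the initial vector and its k neighbors.

    Assumption: a single cluster, so the K-means centroid is the mean.
    """
    neighbors = k_nearest_neighbors(embeddings, index, k)
    group = np.vstack([embeddings[index][None, :], embeddings[neighbors]])
    return group.mean(axis=0)


def migration_vector(embeddings, index, k, alpha, beta):
    """Migration vector g = alpha * m + beta * p from the abstract."""
    m = embeddings[index]                    # initial word vector
    p = center_vector(embeddings, index, k)  # center vector
    return alpha * m + beta * p


def cosine_similarity(u, v):
    """Assumed similarity measure between two migration vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def word_similarity(embeddings, i, j, k=1, alpha=0.5, beta=0.5):
    """Similarity of words i and j, computed on their migration vectors."""
    g_i = migration_vector(embeddings, i, k, alpha, beta)
    g_j = migration_vector(embeddings, j, k, alpha, beta)
    return cosine_similarity(g_i, g_j)
```

Here `embeddings` is assumed to be a 2-D array with one initial word vector per row, so a call such as `word_similarity(embeddings, 0, 1)` compares the words stored in rows 0 and 1. Blending each vector with its neighborhood center pulls a noisy initial vector toward nearby words, which is one plausible reading of the fault-tolerance claim.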

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a method and device for measuring the semantic similarity of Chinese words.

Background

[0002] Natural Language Processing (NLP) is an important research field of artificial intelligence. Its basic goal is to give computers human language abilities such as listening, speaking, reading, and writing, and word semantic similarity calculation is a key technology in the field. At present, word semantic similarity calculation techniques mainly include methods based on corpus statistics, methods based on dictionaries, and methods based on word vectors.

[0003] Specifically, the method based on corpus statistics calculates the semantic similarity between words by gathering statistics over a large-scale corpus and using the probab...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/30, G06F16/33, G06F16/35
CPC: G06F16/3344, G06F16/355, G06F40/30
Inventor: 李长亮, 马腾, 程健
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI