
Short text similarity calculation method based on semantics

The method applies semantic similarity calculation to short texts. It addresses the problem that short texts are brief and easily disturbed by noise words, and it aims to improve accuracy, keep recognition concise, and remain easy to extend.

Active Publication Date: 2017-02-01
UNIV OF ELECTRONIC SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to address a shortcoming of the existing technology: because short texts are so brief, individual noise words can seriously interfere with the semantic analysis of the entire short text. To solve this problem, the present invention proposes a semantic-based short text similarity calculation method.



Embodiment Construction

[0031] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0032] Figure 1 is a schematic flow chart of the semantic-based short text similarity calculation method of the present invention. The method comprises the following steps:

[0033] A. Preprocess the corpus data and build word embeddings according to the word2vec hyperparameters;

[0034] B. Use a hierarchical clustering method to construct the word semantic tree of the corpus;

[0035] C. Calculate the semantic similarity between words in the short text according to the inconsistency rate of each connection in...
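Steps A to C can be illustrated with a minimal sketch. It is not the patented procedure: the listing is truncated before it explains how the inconsistency rate is used, so the sketch substitutes a simpler measure, the normalized height of the first merge that joins two words. The toy vectors in `EMBEDDINGS` stand in for trained word2vec embeddings, and every function name is hypothetical.

```python
import math
from itertools import combinations

# Toy 2-D vectors standing in for trained word2vec embeddings (hypothetical values).
EMBEDDINGS = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "car": [0.1, 0.9],
    "bus": [0.2, 0.8],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(a, b, embeddings):
    """Mean pairwise cosine distance between two clusters of words."""
    total = sum(cosine_distance(embeddings[x], embeddings[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def build_semantic_tree(embeddings):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, height)."""
    clusters = [frozenset([w]) for w in embeddings]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters and record the height of the link.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], embeddings),
        )
        merges.append((a, b, average_linkage(a, b, embeddings)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def word_similarity(merges, w1, w2):
    """Assumed measure: 1 - (height of the first link joining w1 and w2) / root height."""
    if w1 == w2:
        return 1.0
    root_height = merges[-1][2]
    for a, b, height in merges:
        if (w1 in a and w2 in b) or (w1 in b and w2 in a):
            return 1.0 - height / root_height if root_height > 0 else 1.0
    raise KeyError(f"words not found in tree: {w1!r}, {w2!r}")


TREE = build_semantic_tree(EMBEDDINGS)
```

Under these assumptions, `word_similarity(TREE, "cat", "dog")` is close to 1 because the two words are joined by a low link, while `word_similarity(TREE, "cat", "car")` is 0 because they meet only at the root of the tree.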


Abstract

The invention discloses a semantic-based short text similarity calculation method. The method comprises the following steps: preprocessing corpus data, building word embeddings, constructing a word semantic tree, calculating the semantic similarity between words in a short text, and calculating the semantic similarity between short texts. Building on deep-learning word embeddings, the method uses hierarchical clustering to create the word semantic tree and to calculate the similarity between words in the short text. It then combines this word-level similarity with various characteristics of the short text to calculate the semantic similarity between short texts. This effectively overcomes the defect in the prior art that a word semantic tree cannot describe the semantic relationship between a new word and a known word.
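The final step, going from word-level similarity to text-level similarity, might be sketched as follows. The abstract does not say which characteristics of the short text are combined, so this sketch assumes a simple symmetric best-match average. The `toy_word_sim` table is a hypothetical stand-in for similarities read from the word semantic tree.

```python
# Hypothetical word-level similarities; a real system would read these
# from the word semantic tree rather than from a hand-written table.
TOY_TABLE = {
    ("cat", "dog"): 0.8,
    ("bus", "car"): 0.8,
}


def toy_word_sim(w1, w2):
    """Symmetric lookup: identical words score 1.0, unknown pairs score 0.0."""
    if w1 == w2:
        return 1.0
    key = tuple(sorted((w1, w2)))
    return TOY_TABLE.get(key, 0.0)


def short_text_similarity(words_a, words_b, word_sim):
    """Assumed aggregation: average best-match similarity, taken in both directions."""
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        # For each word in src, keep its most similar counterpart in dst.
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)

    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))


text_a = ["cat", "rides", "bus"]
text_b = ["dog", "in", "car"]
score = short_text_similarity(text_a, text_b, toy_word_sim)
```

Taking the best match per word limits the influence of a single unmatched noise word such as "rides" or "in": it lowers the average but cannot dominate it. Whether the patented method weights words this way is not stated in the text above.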

Description

Technical field

[0001] The invention belongs to the technical field of short text similarity calculation, in particular to a semantic-based short text similarity calculation method.

Background technique

[0002] The calculation of semantic similarity between short texts has theoretical research value and application background in the fields of artificial intelligence, natural language processing, cognition, semantics, psychology, bioinformatics and so on. Using short text similarity can well overcome the information redundancy in the corpus. At present, many studies have shown that short text similarity calculation can facilitate many natural language processing tasks, such as event detection, information retrieval, text normalization, automatic text summarization, text classification and clustering, etc. The application fields of short text similarity calculation are very extensive, and a good semantic similarity calculation method can greatly improve the performance of ma...


Application Information

IPC(8): G06F17/27, G06F17/22, G06F17/30
CPC: G06F16/35, G06F40/194, G06F40/30
Inventor: 费高雷, 胡馨月, 胡光岷
Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA