Computation method of short text similarity based on semantics

A technique for calculating semantic similarity, applied to semantics-based short text similarity calculation. It addresses the problem that short texts are brief and therefore easily disturbed by noise words, and is stated to improve accuracy, keep recognition concise, and reduce the proportion of discriminant imbalances.

Active Publication Date: 2020-11-24
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is as follows. Because short texts are so short, individual noise words seriously interfere with the semantic analysis of the whole text, and existing techniques cannot handle this effectively. To solve this problem, the present invention proposes a semantic-based short text similarity calculation method.

Examples

Embodiment Construction

[0031] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0032] FIG. 1 is a schematic flow chart of the semantic-based short text similarity calculation method of the present invention. The method comprises the following steps:

[0033] A. Preprocess the corpus data, and build word embeddings according to the word2vec hyperparameters;

[0034] B. Use hierarchical clustering to construct the word semantic tree of the corpus;

[0035] C. Calculate the semantic similarity between words in the short text according to the inconsistency rate of each connection...
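Steps A to C can be illustrated with a small, self-contained sketch. It is not the patented procedure: the toy vectors in `EMBEDDINGS` are hypothetical stand-ins for word2vec output, average linkage over cosine distance is an assumed choice of hierarchical clustering, and because the definition of the "inconsistency rate" is truncated in this text, the word similarity below substitutes the normalized height of the merge that first joins two words. All function names are illustrative.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings standing in for word2vec output (step A).
EMBEDDINGS = {
    "cat":   [0.9, 0.1, 0.0],
    "dog":   [0.8, 0.2, 0.0],
    "car":   [0.0, 0.9, 0.3],
    "truck": [0.1, 0.8, 0.4],
}


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def build_semantic_tree(embeddings):
    """Average-linkage agglomerative clustering (an assumed form of step B).

    Returns the merges in order as (members_a, members_b, height) tuples,
    where height is the average pairwise distance between the two clusters.
    """
    def avg_dist(a, b):
        total = sum(cosine_distance(embeddings[x], embeddings[y])
                    for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [frozenset([word]) for word in embeddings]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def word_similarity(w1, w2, merges):
    """Substitute for step C: 1 minus the normalized height of the merge
    that first places w1 and w2 in the same cluster."""
    if w1 == w2:
        return 1.0
    top = merges[-1][2]  # root height; the largest under average linkage
    if top == 0:
        return 1.0
    for a, b, height in merges:
        if (w1 in a and w2 in b) or (w1 in b and w2 in a):
            return 1.0 - height / top
    raise KeyError(f"word not in tree: {w1!r} or {w2!r}")


merges = build_semantic_tree(EMBEDDINGS)
print(word_similarity("cat", "dog", merges))
print(word_similarity("cat", "car", merges))
```

In this sketch, words that meet only at the root of the tree score 0, while words merged early score close to 1. A real pipeline would train the embeddings on the preprocessed corpus and would likely use a library implementation of hierarchical clustering.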

Abstract

The invention discloses a semantic-based short text similarity calculation method. It comprises preprocessing the corpus data and building word embeddings, constructing a word semantic tree, calculating the semantic similarity between words in short texts, and calculating the semantic similarity between short texts. The invention starts from deep-learning word embeddings and uses hierarchical clustering to create the word semantic tree, from which it calculates the similarity between words in a short text. On this basis it combines various features of the short text to calculate the semantic similarity between short texts. This effectively overcomes the shortcoming of prior-art word semantic trees, which cannot describe the semantic relationship between new words and known words.
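The last stage, going from word-level similarity to short-text similarity, can be sketched as follows. The abstract does not say which short-text features are combined or how, so this example assumes a simple symmetric best-match average; the `WORD_SIM` table is hypothetical and stands in for scores derived from the word semantic tree.

```python
# Hypothetical word-level similarities, standing in for tree-based scores.
WORD_SIM = {
    ("cheap", "inexpensive"): 0.9,
    ("phone", "handset"): 0.8,
    ("cheap", "handset"): 0.1,
    ("phone", "inexpensive"): 0.1,
}


def word_sim(w1, w2):
    """Symmetric lookup; identical words score 1.0, unknown pairs 0.0."""
    if w1 == w2:
        return 1.0
    return WORD_SIM.get((w1, w2), WORD_SIM.get((w2, w1), 0.0))


def short_text_similarity(tokens_a, tokens_b, sim=word_sim):
    """Assumed aggregation: for each word, take its best match in the other
    text, average those scores, and average the two directions."""
    if not tokens_a or not tokens_b:
        return 0.0

    def directed(src, dst):
        return sum(max(sim(s, d) for d in dst) for s in src) / len(src)

    return 0.5 * (directed(tokens_a, tokens_b) + directed(tokens_b, tokens_a))


score = short_text_similarity(["cheap", "phone"], ["inexpensive", "handset"])
print(score)
```

Averaging both directions keeps the score symmetric, so the result does not depend on which text is passed first. Other aggregations, such as weighting words by frequency or position, would fit the same interface.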

Description

Technical field

[0001] The invention belongs to the technical field of short text similarity calculation, and in particular relates to a semantic-based short text similarity calculation method.

Background

[0002] The calculation of semantic similarity between short texts has theoretical research value and practical applications in fields such as artificial intelligence, natural language processing, cognition, semantics, psychology, and bioinformatics. Short text similarity can be used to overcome information redundancy in a corpus. Many studies have shown that short text similarity calculation benefits natural language processing tasks such as event detection, information retrieval, text normalization, automatic text summarization, and text classification and clustering. The application fields of short text similarity calculation are very extensive, and a good semantic similarity calculation method can greatly improve the performance of ma...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/30, G06F40/194, G06F16/35
CPC: G06F16/35, G06F40/194, G06F40/30
Inventor: 费高雷, 胡馨月, 胡光岷
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA