Method for calculating statement similarity

A sentence-similarity technique based on cosine similarity, applied in computing, semantic analysis, and natural language data processing. It addresses slow matching by improving performance, avoiding plain text retrieval, and saving server resources.

Inactive Publication Date: 2020-06-05
江苏艾佳家居用品有限公司

AI Technical Summary

Problems solved by technology

So how can the most similar question be found in a massive knowledge base? Traditional methods take one of two forms. The first compares the query against every entry one by one, so performance degrades as the number of knowledge entries grows. The second retrieves a subset of the data through a search engine and then compares linearly within that subset; but a search engine can only match entries that contain identical words, so it often misses sentences that express the same meaning in different words.



Examples


Embodiment Construction

[0054] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings:

[0055] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0056] As shown in Figure 1, a method for calculating sentence similarity includes the following steps:

[0057] Step 1, prepare a data set Q, where $Q = \{Q_1, Q_2, Q_3, \ldots, Q_i\}$;

[0058] Step 2, preprocess each sentence of the data set;

[0059] Step 3, train on the preprocessed sentence set to obtain word...
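Steps 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the sample sentences, the stop-word list, and the `preprocess` helper are hypothetical, since the text does not say which preprocessing operations are applied. Step 3 is truncated in the source and is not reproduced here.

```python
import re

# Hypothetical stop-word list; the source does not say which words are dropped.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "do", "i", "how"}


def preprocess(sentence):
    """Lowercase the sentence, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Step 1: a toy data set Q (sentences invented for illustration).
Q = [
    "How do I reset my password?",
    "How to change the login password",
    "What is the delivery time of an order?",
]

# Step 2: preprocess every sentence of the data set.
tokenized = [preprocess(q) for q in Q]

for original, tokens in zip(Q, tokenized):
    print(original, "->", tokens)
```

The token lists produced this way would be the input to the word-vector training mentioned in Step 3.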


Abstract

The invention discloses a method for calculating sentence similarity and relates to the technical field of sentence analysis. The method includes: preparing a data set and preprocessing each sentence in it; training on the preprocessed sentence set to obtain word vectors; calculating the similarity between the word vectors of two sentences; performing hierarchical clustering on the sentence vectors to obtain a knowledge entry tree; and recommending knowledge for a target sentence. Clustering avoids linear similarity comparison and narrows the comparison range, which improves performance. The semantic features of the sentences are preserved at the same time, so the method does not fall back on simple character retrieval.
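The pipeline in the abstract can be illustrated with a small sketch. Several details are assumptions rather than the patent's method: the word vectors in `WORD_VECS` are hand-made instead of trained, a sentence vector is taken as the average of its word vectors, clusters are merged by weighted centroid, and `recommend` descends the tree greedily. The helper names `build_tree` and `recommend` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hand-made 2-D "word vectors"; a real system would train them on the data set.
WORD_VECS = {
    "reset": [0.9, 0.1], "change": [0.8, 0.2], "password": [1.0, 0.0],
    "login": [0.9, 0.2], "delivery": [0.1, 0.9], "order": [0.0, 1.0],
    "shipping": [0.2, 0.9], "time": [0.1, 0.8],
}


def sentence_vector(tokens):
    """Assumed sentence vector: the mean of the known word vectors."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def build_tree(entries):
    """Agglomerative clustering: repeatedly merge the two most similar nodes."""
    nodes = [{"vec": sentence_vector(toks), "size": 1, "entry": name, "children": []}
             for name, toks in entries]
    while len(nodes) > 1:
        i, j = max(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda p: cosine(nodes[p[0]]["vec"], nodes[p[1]]["vec"]),
        )
        left, right = nodes[i], nodes[j]
        size = left["size"] + right["size"]
        # Size-weighted centroid of the two merged clusters.
        vec = [(x * left["size"] + y * right["size"]) / size
               for x, y in zip(left["vec"], right["vec"])]
        merged = {"vec": vec, "size": size, "entry": None, "children": [left, right]}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def recommend(tree, tokens):
    """Descend from the root, always following the child closest to the query."""
    query = sentence_vector(tokens)
    node = tree
    while node["children"]:
        node = max(node["children"], key=lambda c: cosine(c["vec"], query))
    return node["entry"]


# Toy knowledge entries (invented), already tokenized.
ENTRIES = [
    ("reset password", ["reset", "password"]),
    ("change login password", ["change", "login", "password"]),
    ("delivery time", ["delivery", "time"]),
    ("shipping order", ["shipping", "order"]),
]
tree = build_tree(ENTRIES)
print(recommend(tree, ["password"]))
print(recommend(tree, ["order"]))
```

In this sketch a query is compared only against the children along one root-to-leaf path instead of against every entry, which is the sense in which clustering narrows the comparison range. The greedy descent is approximate: it can miss the globally closest entry if the query falls near a cluster boundary.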

Description

Technical field

[0001] The invention relates to the technical field of sentence analysis, in particular to a method for calculating sentence similarity.

Background technique

[0002] A question answering (QA) system is an advanced form of information retrieval system that answers questions posed by users in natural language with accurate and concise natural language. A relatively simple way to implement a question answering system is through a knowledge base: the system only needs to match the most similar question in order to recommend an answer. So how can the most similar question be found in a massive knowledge base? Traditional methods take one of two forms. The first compares the query against every entry one by one, so performance degrades as the number of knowledge entries grows. The second retrieves a subset of the data through a search engine and then compares linearly within that subset; but a search engine can only match sentences that contain the same words, and it often misses sentences that describe the same thing differently...
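The two weaknesses of the traditional approach can be shown with a minimal sketch. The word-overlap score (Jaccard similarity), the sample entries, and the function names are assumptions chosen for illustration; the source does not name a specific scoring function.

```python
def jaccard(a, b):
    """Word-overlap score: shared words divided by all distinct words."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def linear_best_match(query, knowledge, score):
    """Compare the query against every entry; cost grows with len(knowledge)."""
    best, best_score, comparisons = None, -1.0, 0
    for entry in knowledge:
        comparisons += 1
        s = score(query, entry)
        if s > best_score:
            best, best_score = entry, s
    return best, best_score, comparisons


# Toy knowledge base (invented), already tokenized.
KNOWLEDGE = [
    ("reset", "password"),
    ("parcel", "arrival", "date"),
]

# A paraphrase of the second entry that shares no words with it.
query = ("shipping", "time")
best, best_score, comparisons = linear_best_match(query, KNOWLEDGE, jaccard)
print(best, best_score, comparisons)
```

Every entry is visited once, so the number of comparisons equals the size of the knowledge base. Because the query shares no words with any entry, every overlap score is zero and the scan cannot tell the relevant entry from the irrelevant one.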


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F40/279, G06F40/30
CPC: G06F16/3329, G06F16/3344
Inventor: 陈旋, 王冲, 崇传兵
Owner: 江苏艾佳家居用品有限公司