Text similarity measuring method based on semantic analysis and semantic relation network

A text similarity measurement technology based on semantic analysis and a semantic relationship network. It addresses the low measurement accuracy of existing methods, which do not consider the role and status of individual words, and thereby improves accuracy.

Active Publication Date: 2013-05-08
SHANDONG ZHONGFU INFORMATION IND

AI Technical Summary

Problems solved by technology

At present, typical text similarity methods are based mainly on semantic understanding or mathematical statistics. Both approaches share one problem: they do not consider the role and status of individual words in text similarity measurement.
As a result, measurement accuracy is low.


Embodiment Construction

[0045] Embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0046] Referring to Figures 1, 2, and 3, the text similarity measurement method of this embodiment, which is based on semantic analysis and a semantic relationship network, is carried out as follows:

[0047] First, input two texts and preprocess them. Preprocessing consists of Chinese word segmentation and stop-word removal. The result of preprocessing each text is a set of vocabulary words.
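
The preprocessing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not name a segmentation algorithm, so forward maximum matching is swapped in here, and `DICTIONARY`, `STOP_WORDS`, `segment`, and `preprocess` are all hypothetical names with toy contents.

```python
# Toy resources: both the dictionary and the stop-word list are hypothetical.
DICTIONARY = {"文本", "相似度", "度量", "方法", "语义", "网络"}
STOP_WORDS = {"的", "和", "是"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def segment(text):
    """Forward maximum matching: take the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        longest = min(MAX_WORD_LEN, len(text) - i)
        for size in range(longest, 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or candidate in DICTIONARY:
                words.append(candidate)
                i += size
                break
    return words


def preprocess(text):
    """Segment the text, drop stop words, and return the vocabulary set."""
    return {word for word in segment(text) if word not in STOP_WORDS}
```

For example, `preprocess("文本的相似度度量方法")` segments the string into five tokens and discards the stop word `的`. A production system would more likely rely on an established segmenter and a curated stop-word list.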

[0048] Second, calculate the lexical semantic similarity between the words in the two preprocessing results, and construct a semantic relationship network from the calculated values. The semantic similarity between two words can be obtained from a specific semantic dictionary or semantic library, but the resulting values must be normalized.
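
One way to picture the network construction is the sketch below. It rests on several assumptions that are not in the source: `RAW_SCORES` is a toy lookup table standing in for a semantic dictionary, normalization is done by dividing by the largest score, and edges are kept only above a hypothetical `threshold`.

```python
from itertools import combinations

# Hypothetical raw scores, standing in for a semantic dictionary lookup.
RAW_SCORES = {
    frozenset(("car", "automobile")): 9.0,
    frozenset(("car", "road")): 4.5,
    frozenset(("automobile", "road")): 3.0,
    frozenset(("road", "banana")): 0.9,
}


def raw_similarity(a, b):
    """Look up an unnormalized score; unknown pairs score 0."""
    return RAW_SCORES.get(frozenset((a, b)), 0.0)


def build_network(words, threshold=0.3):
    """Return {word: {neighbor: weight}} with weights normalized to [0, 1]."""
    pairs = list(combinations(sorted(words), 2))
    scores = {pair: raw_similarity(*pair) for pair in pairs}
    top = max(scores.values(), default=0.0)
    network = {word: {} for word in words}
    for (a, b), score in scores.items():
        # Max-scaling is one possible normalization (an assumption).
        weight = score / top if top > 0 else 0.0
        if weight >= threshold:  # the threshold is also an assumption
            network[a][b] = weight
            network[b][a] = weight
    return network
```

The adjacency-dictionary representation keeps the sketch dependency-free; a graph library would be a reasonable alternative when node-level measures have to be computed on the network afterwards.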

[0049] In this step, as shown in Figure 2, the nodes of the semantic relationship network are composed of the prepr...

Abstract

The invention discloses a text similarity measurement method based on semantic analysis and a semantic relationship network. First, two texts are input and preprocessed, and the result of preprocessing is a set of vocabulary words. Second, the lexical semantic similarity between the preprocessing results of the two texts is calculated, and a semantic relationship network is constructed from the calculated values. Third, the flow betweenness values of the nodes in the semantic relationship network are calculated, yielding a feature set for each of the two texts. Fourth, a bipartite graph is constructed from the two feature sets, and weights are assigned to the paths between the two parts of the bipartite graph. Fifth, the similarity between the two texts is calculated using a bipartite graph optimal matching method. The method can be applied in data mining and information retrieval, for example in text clustering and information retrieval. Compared with existing text similarity calculation methods, it greatly improves the accuracy of text similarity calculation.
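
The fourth and fifth steps can be illustrated with the sketch below. It assumes the feature sets are already available and leaves the third step out. The abstract does not say how the path weights are set or how the matching score is scaled, so two assumptions are made: the weights come from a caller-supplied `word_sim` function, and the total is divided by the size of the larger feature set. Optimal matching is done by brute force, which only suits small sets; all function names are hypothetical.

```python
from itertools import permutations


def best_matching_weight(weights):
    """Brute-force maximum-weight matching; weights[i][j] links left i to right j."""
    rows = len(weights)
    cols = len(weights[0]) if rows else 0
    if rows == 0 or cols == 0:
        return 0.0
    if rows > cols:
        # Transpose so that every node on the smaller side gets a partner.
        weights = [list(column) for column in zip(*weights)]
        rows, cols = cols, rows
    return max(
        sum(weights[i][j] for i, j in enumerate(assignment))
        for assignment in permutations(range(cols), rows)
    )


def text_similarity(features_a, features_b, word_sim):
    """Score two feature sets by their optimal matching, scaled to [0, 1]."""
    a, b = sorted(features_a), sorted(features_b)
    if not a or not b:
        return 0.0
    weights = [[word_sim(x, y) for y in b] for x in a]
    # Dividing by the larger set size is an assumed normalization.
    return best_matching_weight(weights) / max(len(a), len(b))
```

With the weight matrix `[[0.9, 0.8], [0.8, 0.1]]`, a greedy choice of the 0.9 edge forces the 0.1 edge for a total of 1.0, whereas the optimal matching takes both 0.8 edges. For larger feature sets, a polynomial-time assignment algorithm such as the Hungarian method would replace the brute-force search.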

Description

Technical field

[0001] The invention belongs to the technical field of text similarity measurement, and in particular relates to a text similarity measurement method based on semantic analysis and a semantic relationship network.

Background

[0002] With the rapid growth of network information, how to obtain useful information quickly and accurately from massive text resources has become an urgent problem in data mining and information retrieval.

[0003] Text similarity measurement can be used in data mining and information retrieval. For example, calculating text similarity is a key step in text clustering, information retrieval, and automatic question answering. At present, typical text similarity methods are based mainly on semantic understanding or mathematical statistics, but the problem with both methods is that they do not consider the role and status of words on...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 吴国华, 尤金朋, 张祯, 王玉娟, 邵根富
Owner: SHANDONG ZHONGFU INFORMATION IND