Semantic-based text retrieval method

A semantic text retrieval technique, applied in the field of information retrieval, that addresses the inability to distinguish polysemy (one word with multiple meanings) from synonymy (multiple words with one meaning) and the omitted or wrongly selected retrieval results this causes, while preserving search efficiency and practicality.

Inactive Publication Date: 2014-12-03
ANHUI HUAZHEN INFORMATION SCI & TECH
4 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

[0003] To address the problems described in the background, the present invention proposes a semantic-based text retrieval method. It solves the inability to distinguish polysemy (one word with multiple meanings) from synonymy (multiple words with one meaning) during retrieval, which otherwise causes retrieval results to be wrongly selected or omitted.

Embodiment Construction

[0024] Referring to Figure 1, the semantic-based text retrieval method proposed by the present invention converts words into concepts, uses those concepts in place of the words for indexing and retrieval, and sorts the retrieved documents, so that polysemy and synonymy do not produce misleading search results.
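
The idea of indexing by concept rather than by surface word can be illustrated with a minimal Python sketch. The `WORD_TO_CONCEPT` table, the concept IDs, and the helper names are hypothetical; the listing does not publish the ontology or the index format. The sketch covers only synonymy; resolving polysemy would need a disambiguation step that is not shown.

```python
from collections import defaultdict

# Hypothetical word-to-concept table (an assumption, not the patent's ontology).
WORD_TO_CONCEPT = {
    "car": "C_VEHICLE",
    "automobile": "C_VEHICLE",
    "doctor": "C_PHYSICIAN",
    "physician": "C_PHYSICIAN",
}

def to_concepts(words):
    """Replace each word with its concept ID; unknown words stand for themselves."""
    return [WORD_TO_CONCEPT.get(w.lower(), w.lower()) for w in words]

def build_concept_index(docs):
    """Build an inverted index keyed by concept ID instead of surface word."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in to_concepts(text.split()):
            index[concept].add(doc_id)
    return index

def search(index, query_words):
    """Return the IDs of documents that share at least one concept with the query."""
    hits = set()
    for concept in to_concepts(query_words):
        hits |= index.get(concept, set())
    return hits

docs = {
    "d1": "the car needs repair",
    "d2": "an automobile dealer",
    "d3": "see a doctor",
}
index = build_concept_index(docs)
print(search(index, ["automobile"]))  # matches both d1 and d2 via C_VEHICLE
```

Because "car" and "automobile" map to the same concept ID, a query for either word reaches both documents, which is the synonymy case the paragraph describes.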

[0025] The retrieval method of the present invention specifically comprises the following steps:

[0026] S1. Build a concept tree from the concepts of words, and calculate a word similarity matrix;

[0027] S2. With reference to a preset ontology, extract the concepts of the target document, index the target document according to those concepts, and generate an index file;

[0028] S3. Segment the user's initial query into words, find the similar items in the word similarity matrix whose similarity to a query word is greater than the threshold M, and add those similar items to the user query in an "or" relationship;

[0029] S4. The...
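
Steps S1 and S3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `PARENT` tree and its node names are made up, the similarity formula is a Wu-Palmer-style measure chosen for the example (the visible text does not say how similarity is computed), and whitespace splitting stands in for a real word segmenter.

```python
# Hypothetical concept tree stored as child -> parent; "entity" is the root.
PARENT = {
    "vehicle": "entity",
    "motor_vehicle": "vehicle",
    "bicycle": "vehicle",
    "car": "motor_vehicle",
    "automobile": "motor_vehicle",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def similarity(a, b):
    """Assumed measure: 2 * depth(lowest common ancestor) / (depth(a) + depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lca = next(n for n in path_a if n in ancestors_b)
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))

def build_matrix(words):
    """S1: word similarity matrix as a dict of dicts."""
    return {a: {b: similarity(a, b) for b in words} for a in words}

def expand_query(query, matrix, threshold):
    """S3: split the query, then append items whose similarity exceeds the threshold.

    The returned terms are meant to be combined with OR by the search engine.
    """
    words = query.lower().split()
    expanded = list(words)
    for word in words:
        for other, score in sorted(matrix.get(word, {}).items()):
            if other != word and score > threshold and other not in expanded:
                expanded.append(other)
    return expanded

matrix = build_matrix(["car", "automobile", "bicycle"])
print(expand_query("car repair", matrix, threshold=0.7))
```

With the threshold M set to 0.7 in this toy tree, "automobile" (similarity 0.75 to "car") is added to the query, while "bicycle" (about 0.57) is not.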

Abstract

The invention provides a semantic-based text retrieval method that solves the problem that polysemy and synonymy cannot be distinguished during retrieval, which causes retrieval results to be wrongly selected or missed. In the method, the concepts of words replace the words themselves for indexing and retrieval, and the retrieved documents are sorted. The method comprises the following steps: S1, a concept tree is established according to the concepts of words, and a word similarity matrix is calculated; S2, the concepts of a target document are extracted with reference to a preset ontology, the target document is indexed according to those concepts, and an index file is generated; S3, the user's initial query is segmented into words, similar items whose similarity to a query word is greater than the threshold M are found in the word similarity matrix, and the similar items are added to the user query in an OR relationship; S4, a search engine searches the target documents according to the user query; S5, the similarity of the documents is evaluated according to the similarity of words, and the documents are sorted; and S6, the document data are read, and the sorting result is output.
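
Steps S4 to S6 can be sketched in Python as below. Everything specific here is an assumption made for illustration: the `SIM` table holds made-up similarity values, the linear scan stands in for a real search engine, and the scoring rule (average, over the original query words, of the best word similarity found in the document) is one plausible reading of "the similarity of the documents is evaluated according to the similarity of words", not the formula from the patent.

```python
# Hypothetical word similarities (symmetric); values are illustrative only.
SIM = {
    ("car", "automobile"): 0.75,
    ("car", "bicycle"): 0.57,
}

def word_sim(a, b):
    """Look up the similarity of two words; identical words score 1.0."""
    if a == b:
        return 1.0
    return SIM.get((a, b), SIM.get((b, a), 0.0))

def search(docs, terms):
    """S4: return IDs of documents containing at least one term (OR semantics)."""
    wanted = set(terms)
    return [doc_id for doc_id, text in docs.items() if wanted & set(text.split())]

def score(query_words, text):
    """S5 (assumed rule): mean over query words of the best match in the document."""
    doc_words = text.split()
    if not query_words or not doc_words:
        return 0.0
    best = [max(word_sim(q, w) for w in doc_words) for q in query_words]
    return sum(best) / len(query_words)

def retrieve(docs, query_words, expanded_terms):
    """S4 to S6: search with the expanded query, rank, and return (id, text) pairs."""
    hits = search(docs, expanded_terms)
    ranked = sorted(hits, key=lambda d: score(query_words, docs[d]), reverse=True)
    return [(doc_id, docs[doc_id]) for doc_id in ranked]

docs = {
    "d1": "automobile dealer",
    "d2": "car dealer",
    "d3": "bicycle shop",
}
results = retrieve(docs, query_words=["car"], expanded_terms=["car", "automobile"])
for doc_id, text in results:
    print(doc_id, text)
```

In this toy run the exact match "car dealer" ranks ahead of "automobile dealer", and "bicycle shop" is not returned because "bicycle" was not part of the expanded query.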

Description

Technical field

[0001] The invention relates to the technical field of information retrieval, and in particular to a semantic-based text retrieval method.

Background

[0002] Modern society has entered the information age, and the information resources on the Internet continue to grow, making it an important source of information. Many technologies currently provide customized information search. Some of them place high demands on basic information infrastructure, take a long time to implement, and are costly to build and maintain; their main customers are very large enterprises and governments, and ordinary enterprises and individuals cannot afford them. Others support only the most basic retrieval functions, cover a narrow retrieval scope, and return incomplete results. In particular, it is very common for one word to have multiple meanings and multiple words to have one m...

Application Information

IPC(8): G06F17/30
CPC: G06F16/36, G06F16/30, G06F16/322
Inventor: 贾岩
Owner: ANHUI HUAZHEN INFORMATION SCI & TECH