Text retrieval method based on word vector and association pattern intersection expansion

A technology of association patterns and word vectors, applied in the field of information retrieval

Inactive Publication Date: 2020-11-06
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to propose a text retrieval method based on the intersection expansion of word vectors and association patterns. The method can be used in information retrieval applications, such as Chinese web information retrieval systems or search engines, to improve query performance and to reduce query topic drift and word mismatch.



Examples


Embodiment Construction

[0063] To better illustrate the technical solution of the present invention, the related concepts involved are introduced as follows:

[0064] 1. Itemset

[0065] In text mining, a text document is regarded as a transaction, and each feature word in the document is called an item. A collection of feature word items is called an itemset, and the number of items in an itemset is called the itemset length. A k_itemset is an itemset containing k items, where k is the length of the itemset.
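The itemset terminology can be illustrated with a minimal Python sketch. The toy corpus and the helper name `k_itemsets` are assumptions made for illustration only; they do not come from the patent text.

```python
from itertools import combinations

# Hypothetical toy corpus: each document is one transaction,
# and each distinct feature word in it is one item.
documents = [
    ["query", "expansion", "retrieval"],
    ["word", "vector", "retrieval"],
    ["query", "retrieval", "vector"],
]


def k_itemsets(document, k):
    """Return every k-itemset (as a frozenset) that can be formed from one document."""
    items = sorted(set(document))  # deduplicate feature words
    return [frozenset(combo) for combo in combinations(items, k)]


# All itemsets of length 2 contained in the first document.
two_itemsets = k_itemsets(documents[0], 2)
```

Here `two_itemsets` holds the three 2-itemsets of the first document, each of itemset length 2.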

[0066] 2. Antecedents and Consequents of Association Rules

[0067] Let x and y be arbitrary sets of feature words. An implication of the form x→y is called an association rule, where x is called the antecedent of the rule and y is called the consequent of the rule.
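As a rough illustration of a rule x→y over document transactions, the sketch below scores a rule with the textbook frequency-based support and confidence. This is an assumption made for illustration: the patent defines its own support and confidence for feature word association patterns, and those definitions are truncated in this excerpt. The function names and the toy transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Classical confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


# Hypothetical transactions (one set of feature words per document).
transactions = [
    {"query", "expansion", "retrieval"},
    {"word", "vector", "retrieval"},
    {"query", "retrieval", "vector"},
    {"query", "expansion"},
]

# Rule {"query"} -> {"expansion"}: antecedent x = {"query"}, consequent y = {"expansion"}.
rule_support = support({"query", "expansion"}, transactions)
rule_confidence = confidence({"query"}, {"expansion"}, transactions)
```

With these four transactions, the rule is supported by two of the four documents, and two of the three documents containing "query" also contain "expansion".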

[0068] 3. Copula functions and the support and confidence of feature word association patterns

[0069] Copula function theory (see: Sklar A. Fonctions de repartition à n di...


Abstract

The invention provides a text retrieval method based on word vector and association pattern intersection expansion. The method comprises the following steps. First, the user's query is run against a Chinese document set to obtain an initial retrieval document set. Then, rule expansion word mining and word vector semantic learning training are carried out on the initial retrieval document set to obtain a rule expansion word set and a word vector expansion word set, respectively. The rule expansion words carry feature word association information based on statistical analysis, and the word vector expansion words carry rich contextual semantic information. Finally, the rule expansion word set and the word vector expansion word set are intersected to obtain the final expansion word set, which improves expansion word quality and realizes query expansion. Experimental results show that the method can reduce query topic drift and word mismatch in information retrieval and improve retrieval performance, that its retrieval performance is higher than that of similar comparison methods from recent years, and that it has good application value and popularization prospects.
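The final fusion step described in the abstract can be sketched as a plain set intersection. This is a minimal sketch under stated assumptions: the two candidate sets are taken as already computed, the function names and example words are hypothetical, and dropping words already present in the original query is an added assumption rather than something the abstract states.

```python
def intersect_expansion(rule_terms, vector_terms, original_query):
    """Keep only candidates proposed by BOTH the rule miner and the word vector model."""
    common = set(rule_terms) & set(vector_terms)
    # Assumption: words already in the query are not re-added as expansion words.
    return sorted(common - set(original_query))


def expand_query(original_query, rule_terms, vector_terms):
    """Append the final expansion words to the original query terms."""
    return list(original_query) + intersect_expansion(
        rule_terms, vector_terms, original_query
    )


# Hypothetical candidate sets from the two upstream steps.
rule_terms = {"search", "index", "ranking", "engine"}
vector_terms = {"engine", "ranking", "crawler"}
query = ["web", "search"]

final_terms = intersect_expansion(rule_terms, vector_terms, query)
expanded_query = expand_query(query, rule_terms, vector_terms)
```

Intersecting trades recall for precision: a candidate survives only when the statistical association evidence and the contextual semantic evidence agree, which matches the abstract's stated goal of improving expansion word quality.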

Description

Technical field

[0001] The invention relates to a text retrieval method based on the intersection expansion of word vectors and association patterns, and belongs to the technical field of information retrieval.

Background

[0002] Given the massive information resources of the Internet in the era of big data, how to accurately and efficiently retrieve the required information from network big data has long been a concern of information retrieval researchers in academia and industry. Query expansion is one of the core technologies for addressing word mismatch and query topic drift in information retrieval. Query expansion adds feature words that are semantically related to the original query, compensating for the lack of semantic information caused by an overly simple original query and thereby improving information retrieval performance. The information retrieval method based on query expansion h...


Application Information

IPC(8): G06F16/33, G06F16/332
CPC: G06F16/3325, G06F16/3334, G06F16/3335, G06F16/3338, G06F16/334
Inventor: 黄名选
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS