Information retrieval method based on interrogative extension

An information retrieval technique based on interrogative words, applied in the field of information retrieval, that addresses problems such as low retrieval efficiency and retrieval results that do not match the retrieval intent.

Inactive Publication Date: 2014-07-02
PEKING UNIV
5 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0005] The present invention mainly addresses two problems in the prior art, low retrieval efficiency and retrieval results that do not match the retrieval intent, and provides a software information retrieval method based on interrogative word expansion.

Examples

Embodiment 1

[0028] Before the statistical step, all question-answer pairs were crawled from Stack Overflow, a question-and-answer website dedicated to programming, and a random subset of these pairs was selected as the statistical sample.

[0029] First, the questions are part-of-speech tagged and classified. According to their interrogative words, they are divided into categories such as how, where, why, what, and which. Analysis showed that who and when questions do not occur in this domain, so the who and when categories are not considered.
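
The classification by interrogative word described in [0029] can be sketched as follows. This is a minimal illustration, not the patented procedure: the function names `question_category` and `group_by_category` are hypothetical, and the sketch simply picks the first interrogative word it finds instead of using a part-of-speech tagger.

```python
import re
from typing import Dict, Iterable, List, Optional

# Categories named in the text; "who" and "when" are left out on purpose
# because such questions were found not to occur in this domain.
CATEGORIES = ("how", "where", "why", "what", "which")


def question_category(question: str) -> Optional[str]:
    """Return the first interrogative word found in the question, or None."""
    for token in re.findall(r"[a-z]+", question.lower()):
        if token in CATEGORIES:
            return token
    return None


def group_by_category(questions: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket questions by interrogative; uncategorized questions are skipped."""
    groups: Dict[str, List[str]] = {c: [] for c in CATEGORIES}
    for q in questions:
        category = question_category(q)
        if category is not None:
            groups[category].append(q)
    return groups
```

For example, `question_category("How do I reverse a list in Python?")` yields `"how"`, while a question that contains none of the five interrogatives yields `None`.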

[0030] Next, the text of each answer is treated as a sequence of individual words; these words are stemmed, and the part-of-speech features of the text are extracted. The code in an answer is treated as a fragment made up of individual code statements. Each answer is checked for the presence of code and, if code is present, for features such as conditional statements, loop statements, a...
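
A rough sketch of the kind of answer features described in [0030] is given below. Several things here are assumptions made for illustration: that code in an answer is delimited by `<code>` tags, that a crude suffix-stripping function can stand in for a real stemmer, and that keyword matching is enough to spot conditional and loop statements. Part-of-speech features are omitted because they would need an NLP library. The names `crude_stem` and `answer_features` are hypothetical.

```python
import re

CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z]+")


def crude_stem(word):
    """Very rough stand-in for a real stemmer: strip one common suffix."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def answer_features(answer_html):
    """Split an answer into prose and code, then derive simple features."""
    code_fragments = CODE_RE.findall(answer_html)
    # Drop the code, then the remaining markup, to leave only the prose.
    prose = TAG_RE.sub(" ", CODE_RE.sub(" ", answer_html))
    code = "\n".join(code_fragments)
    return {
        "stems": [crude_stem(w) for w in WORD_RE.findall(prose.lower())],
        "has_code": bool(code_fragments),
        "has_conditional": bool(re.search(r"\b(if|else|switch)\b", code)),
        "has_loop": bool(re.search(r"\b(for|while)\b", code)),
    }
```

Keyword matching will misfire on identifiers or comments that happen to contain `if` or `for` as whole words; a parser for the language in question would be more reliable.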

Abstract

The invention relates to an information retrieval method, in particular to a software information retrieval method based on interrogative word expansion. The method comprises a statistical step, an analytical step, and a retrieval step. In the statistical step, existing question-answer pairs from a question-and-answer website are classified, the features of each type of question-answer pair are extracted, and the features that discriminate between the types are obtained through machine learning. In the analytical step, the retrieval question is processed with natural language processing to obtain its interrogative word, and the retrieval vector is combined with the discrimination features to form a new retrieval vector. In the retrieval step, retrieval is conducted in a software knowledge base using the new retrieval vector. The method has two advantages: the relation between interrogative words and answers in question-answer pairs can be used to improve the accuracy of software information retrieval, and the same relation can be used to filter and reorder retrieval results, which speeds up the user's screening of those results.
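
The analytical and retrieval steps can be illustrated with a small sketch. It is not the patented implementation: the table `DISCRIMINATIVE_TERMS`, the weight `EXPANSION_WEIGHT`, and all function names are hypothetical, the discrimination features are hand-written here instead of learned, and plain term counts with cosine similarity stand in for whatever vector model the method actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical discrimination features per interrogative. In the method these
# would be learned from classified question-answer pairs, not hand-written.
DISCRIMINATIVE_TERMS = {
    "how": ["step", "example"],
    "why": ["because", "reason"],
}
EXPANSION_WEIGHT = 0.5  # assumed weight for the added terms


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def expand_query(query):
    """Build a term vector and add the terms tied to the query's interrogative."""
    tokens = tokenize(query)
    vector = Counter(tokens)
    interrogative = next((t for t in tokens if t in DISCRIMINATIVE_TERMS), None)
    for term in DISCRIMINATIVE_TERMS.get(interrogative, []):
        vector[term] += EXPANSION_WEIGHT
    return vector


def cosine(a, b):
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, documents):
    """Rank documents (id -> text) by similarity to the expanded query."""
    q = expand_query(query)
    scored = [(doc_id, cosine(q, Counter(tokenize(text))))
              for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the two toy documents `{"a": "sort a list because the order matters", "b": "sort a list step by step example"}` and the query `"how to sort a list"`, the unexpanded query vector scores document `a` higher, whereas the expanded vector ranks document `b` first, which is the reordering effect the abstract describes.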

Description

Technical field

[0001] The invention relates to an information retrieval method, in particular to a software information retrieval method based on interrogative word expansion.

Background art

[0002] A software knowledge base is a special-purpose database for software knowledge management. It stores software-related code, documents, and questions and answers, so as to facilitate the collection, organization, and extraction of software knowledge.

[0003] Retrieval is an important function of a software knowledge base. For a query sentence entered by the user, the retrieval system extracts the query words, performs similarity matching, sorts the retrieval results, and returns them to the user.

[0004] Current software information retrieval tools mainly rely on keyword matching, term-frequency statistics (TF-IDF), and similar techniques. These simple keyword combinations ignore the latent semantic information in the way people phrase questions, and it is difficult ...
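
For context, the keyword-based baseline that the background section refers to can be sketched as a plain TF-IDF ranking. This is one common textbook variant (raw term frequency multiplied by log(N/df)), chosen for illustration; the function names are hypothetical and nothing here is taken from any particular retrieval tool.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_rank(query, documents):
    """Rank documents (a list of strings) by the summed TF-IDF of the query terms."""
    tokenized = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    scores = []
    for index, counts in enumerate(tokenized):
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for other in tokenized if term in other)
            if df:
                # Raw term frequency times inverse document frequency.
                score += counts[term] * math.log(n_docs / df)
        scores.append((index, score))
    # sorted() is stable, so ties keep their original document order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In this scheme an interrogative such as "how" or "why" is just another keyword, weighted only by how often it appears, so the intent it signals plays no special role in the ranking.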

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/951
Inventors: 邹艳珍, 张灵箫
Owner: PEKING UNIV