Synonym mining method and device for question and answer retrieval system

The synonym mining technology for retrieval systems, applied in the field of information retrieval, addresses problems such as unsatisfactory retrieval results. It improves efficiency and accuracy, is highly portable, and mitigates the semantic deformation problem.

Active Publication Date: 2019-11-12
ENJOYOR COMPANY LIMITED
Cites: 9; Cited by: 23

AI Technical Summary

Problems solved by technology

[0010] To overcome the above deficiencies, the present invention provides a synonym mining method and device for a question-and-answer retrieval system. The invention classifies the question-and-answer corpus and extracts keywords by category to obtain the keyword set to be processed. Word vectors are trained on a large corpus from the field, and the cosine similarity between word vectors is calculated to obtain the generalized related-word set for the keywords of the current category. Part-of-speech screening then yields the reduced related-word set, and then ...
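The cosine-similarity step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the threshold of 0.8, and the toy vectors are all assumptions, and real word vectors would come from a model trained on the field corpus.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)

def related_words(keyword, vectors, threshold=0.8):
    """Words whose similarity to `keyword` meets the threshold, best first.

    `threshold` is a hypothetical value; the source does not state one.
    """
    target = vectors[keyword]
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word != keyword
    ]
    kept = [(word, score) for word, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy 2-dimensional vectors, invented for illustration only.
toy_vectors = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
```

Calling `related_words("car", toy_vectors)` keeps "automobile" and drops "banana", whose vector is orthogonal to that of "car".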



Examples


Embodiment

[0055] Example: As shown in Figure 1, a synonym mining device for a question-and-answer retrieval system includes a data acquisition module, a data preprocessing module, a synonym set acquisition module, and a feedback correction module;

[0056] The data acquisition module crawls and collects question-and-answer corpora of different categories in the vertical field, together with a large corpus that serves as the training corpus;

[0057] The data preprocessing module preprocesses the corpus data, including data cleaning, text classification, word segmentation, and keyword extraction. In keyword extraction, the Query of each question-and-answer pair is segmented at fine granularity to form keyword set I, and keywords are extracted from the Answer to form keyword set II. Keyword set I and keyword set II are merged into the initial keyword set to be mined, and part-of-speech screening is then applied, mainly retaining nouns, verbs, adjectives, etc. The final keyword...
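The merge-and-screen step in paragraph [0057] can be sketched as below. This is a sketch under stated assumptions: the inputs are taken to be already segmented and tagged as `(word, pos_tag)` pairs, and the tag names `"n"`, `"v"`, `"a"` and the function name `build_keyword_set` are hypothetical, since the source names no segmentation tool or tag set.

```python
# Assumed tag names for noun, verb, and adjective; the real tag set
# depends on whichever segmentation tool is used.
KEPT_POS = {"n", "v", "a"}

def build_keyword_set(query_tokens, answer_keywords, kept_pos=KEPT_POS):
    """Merge keyword set I and keyword set II, then screen by part of speech.

    query_tokens:    (word, pos_tag) pairs from fine-grained Query segmentation.
    answer_keywords: (word, pos_tag) pairs extracted from the Answer.
    Returns the set of words whose tag is in `kept_pos`.
    """
    merged = set(query_tokens) | set(answer_keywords)  # union removes duplicates
    return {word for word, pos in merged if pos in kept_pos}

# Toy tagged tokens, invented for illustration only.
keyword_set_1 = [("renew", "v"), ("the", "u"), ("license", "n")]
keyword_set_2 = [("license", "n"), ("quickly", "d"), ("valid", "a")]
final_keywords = build_keyword_set(keyword_set_1, keyword_set_2)
```

Here `final_keywords` contains "renew", "license", and "valid"; the function word and the adverb are screened out, and the duplicate "license" appears once.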



Abstract

The invention relates to a synonym mining method and device for a question-and-answer retrieval system. The question-and-answer corpora are classified, and keywords are extracted by category to obtain the set of keywords to be processed. Meanwhile, word vectors are trained on a large corpus from the vertical field, and the cosine similarity between word vectors is calculated to obtain the generalized related-word set for the keywords of the current category. Part-of-speech screening then yields a reduced related-word set, within which the Euclidean distance is calculated to obtain synonym pairs. The co-occurrence frequency of each synonym pair is counted and the replacement probability of the synonyms is calculated. Finally, based on the retrieval recall results obtained after synonym replacement, feedback correction is applied to the synonym pairs that do not meet the retrieval recall threshold. This mitigates the semantic deformation caused by synonym replacement and improves both the accuracy of synonym mining and the accuracy of question-and-answer pair retrieval results.
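The last stages of the abstract, Euclidean-distance pairing within the related-word set that remains after part-of-speech screening and feedback correction against a recall threshold, might look like the sketch below. Several points are assumptions: the distance cutoff `max_distance`, the `recall_threshold` value, the caller-supplied `recall_of` function, and the choice to model "feedback correction" as simply dropping pairs below the threshold. The source does not give the replacement-probability formula, so it is not reproduced here.

```python
import math
from itertools import combinations

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def synonym_pairs(words, vectors, max_distance=0.5):
    """Candidate synonym pairs: words whose vectors lie within `max_distance`.

    `max_distance` is a hypothetical cutoff, not a value from the source.
    """
    pairs = []
    for w1, w2 in combinations(sorted(words), 2):
        if euclidean_distance(vectors[w1], vectors[w2]) <= max_distance:
            pairs.append((w1, w2))
    return pairs

def feedback_correct(pairs, recall_of, recall_threshold=0.6):
    """Keep only pairs whose retrieval recall after replacement meets the threshold.

    `recall_of` is a caller-supplied function (assumed here) that runs the
    retrieval system with the pair substituted and returns a recall in [0, 1].
    Dropping failing pairs is one possible reading of "feedback correction".
    """
    return [pair for pair in pairs if recall_of(pair) >= recall_threshold]

# Toy vectors, invented for illustration only.
toy_vectors = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
candidate_pairs = synonym_pairs(toy_vectors.keys(), toy_vectors)
```

With these toy vectors only ("automobile", "car") falls within the cutoff. In practice `recall_of` would wrap a real retrieval evaluation instead of a stub.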

Description

Technical field

[0001] The invention relates to the field of information retrieval, in particular to a synonym mining method and device for a question-and-answer retrieval system.

Background

[0002] With the rapid development of Internet technology and the resulting mass of information and resources, traditional search engines cannot adequately meet people's need for comprehensive, fast, and accurate knowledge acquisition. Intelligent question-and-answer retrieval systems built on accurate knowledge acquisition will become the future direction of development. In recent years, with the rise of artificial intelligence and continued technical progress, question-and-answer retrieval systems have penetrated all walks of life and have gradually become a practical and popular way to acquire knowledge.

[0003] Synonym replacement, as an important technology of question-and-answer retrieval systems, is basic and necessary work in ...


Application Information

IPC(8): G06F16/9032, G06F16/906, G06F16/951, G06F17/27
CPC: G06F16/90332, G06F16/906, G06F16/951
Inventor: 郑申文, 丁锴, 陈涛, 王开红, 李建元
Owner: ENJOYOR COMPANY LIMITED