
Anti-interference knowledge base question-answering method and system fusing retrieval and machine reading comprehension

A machine reading comprehension technology, applied in the field of artificial intelligence, that addresses the lack of a complete document search pipeline, mainstream pipelines that do not use full-text search, and the low robustness of reading comprehension models

Active Publication Date: 2020-12-18
广州探迹科技有限公司
6 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

Mainstream pipelines do not use full-text search; few implementations combine knowledge graphs, document search, and machine reading comprehension; reading comprehension models are not robust; and there is no comprehensive, mature document search pipeline.



Examples


Specific Embodiment

[0088] A specific embodiment of the present invention is provided below and comprises the following steps:

[0089] Step 1. Build an inverted index of the documents and store it in the database, using the open-source framework ElasticSearch. In addition, train word vectors on public corpora such as Baidu Encyclopedia, Chinese Wikipedia, Sogou News, People's Daily, Zhihu Q&A, Weibo, literary works, and Siku Quanshu, and use them as synonym expansion data.
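As a rough illustration of what Step 1 builds, the sketch below constructs a toy in-memory inverted index. The embodiment relies on ElasticSearch for this; the function name `build_inverted_index`, the whitespace tokenizer, and the sample documents are assumptions made only for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a {doc_id: term_frequency} posting dictionary.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    which is far simpler than the analyzers a real search engine would apply.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


# Made-up sample documents, not taken from the patent.
docs = {
    "d1": "leave request steps for employees",
    "d2": "expense claim process",
    "d3": "annual leave policy and leave balance",
}
index = build_inverted_index(docs)
```

Looking up a term such as `index["leave"]` then yields the documents containing it together with their term frequencies, which is the information a BM25-style scorer needs at query time.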

[0090] Step 2. Capture the user's query text Q.

[0091] Step 3. Full-text search plus synonym expansion to recall the TOP-K1 (1000) candidate documents: based on the user query, use the BM25 algorithm and RM3 query expansion to return the TOP-K1 candidate knowledge documents from the distributed ElasticSearch database, taking the BM25 score as the confidence level.
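The recall step can be sketched in a few lines of Python. This is a simplified stand-in, not the patented implementation: it scores documents with one common BM25 formulation (using k1 = 1.2 and b = 0.75, the usual defaults) and replaces RM3 with a plain dictionary lookup for synonym expansion. The names `bm25_search` and `expand_query`, the synonym table, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def bm25_search(query_terms, docs, top_k, k1=1.2, b=0.75):
    """Return up to top_k (doc_id, score) pairs ranked by BM25 score."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in set(query_terms):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


def expand_query(query, synonyms):
    """Append dictionary synonyms to the query terms (a stand-in for RM3)."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        expanded.extend(synonyms.get(term, []))
    return expanded


# Made-up documents and synonym table; the embodiment derives synonyms from word vectors.
docs = {
    "d1": "leave application steps",
    "d2": "expense claim process",
    "d3": "office seating plan",
}
synonyms = {"process": ["steps"]}

plain = bm25_search("leave application process".split(), docs, top_k=2)
expanded = bm25_search(expand_query("leave application process", synonyms), docs, top_k=2)
```

With the expanded query, the document about "leave application steps" also matches on "steps", so its score rises relative to the unexpanded query. The returned scores play the role of the confidence level mentioned in the step.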

[0092] Step 4. Train the rough sorting model and use its predictions to select the TOP-K2 (50) related documents:

[0093] Step 4.1, self-produce 10,000 query texts,...


Abstract

The embodiment of the invention provides a text query method and device integrating retrieval and machine reading comprehension, a readable storage medium, and computing equipment, aiming to achieve high-precision search and to extract answers directly from the search results and return them to the user. The method comprises: receiving a query request from a user, wherein the query request comprises a query text; searching according to the query text to obtain a preset first number of candidate documents; inputting the preset first number of candidate documents and the query text into a preset binary classification model, and selecting a preset second number of candidate documents from the preset first number of candidate documents; inputting the preset second number of candidate documents and the query text into a preset paragraph extraction reading comprehension model, and selecting a preset third number of paragraphs or sentences from the preset second number of candidate documents; and returning the preset third number of paragraphs or sentences to the user.
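The three-stage flow described in the abstract can be sketched as a small cascade in Python. Everything here is an assumption for illustration: `cascade_query` and the three `toy_*` functions are hypothetical names, and simple word-overlap scorers stand in for the search engine, the binary classification model, and the paragraph extraction reading comprehension model.

```python
import re
from typing import Callable, List, Tuple


def cascade_query(
    query: str,
    search: Callable[[str, int], List[str]],
    relevance: Callable[[str, str], float],
    extract: Callable[[str, str], List[Tuple[str, float]]],
    k1: int,
    k2: int,
    k3: int,
) -> List[str]:
    """Search for k1 documents, keep the k2 most relevant, return the k3 best spans."""
    candidates = search(query, k1)
    ranked = sorted(candidates, key=lambda doc: relevance(query, doc), reverse=True)[:k2]
    spans: List[Tuple[str, float]] = []
    for doc in ranked:
        spans.extend(extract(query, doc))
    spans.sort(key=lambda span: span[1], reverse=True)
    return [text for text, _ in spans[:k3]]


def _tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


# Made-up corpus; the toy functions below are placeholders for real models.
corpus = [
    "Leave request steps. Fill in the form. Submit it to your manager.",
    "Expense claim process. Keep all receipts.",
    "Office seating plan. Desks are assigned by team.",
]


def toy_search(query: str, k: int) -> List[str]:
    return sorted(corpus, key=lambda doc: len(_tokens(query) & _tokens(doc)), reverse=True)[:k]


def toy_relevance(query: str, doc: str) -> float:
    return float(len(_tokens(query) & _tokens(doc)))


def toy_extract(query: str, doc: str) -> List[Tuple[str, float]]:
    sentences = [s.strip() for s in doc.split(".") if s.strip()]
    return [(s, float(len(_tokens(query) & _tokens(s)))) for s in sentences]


answers = cascade_query(
    "leave request steps", toy_search, toy_relevance, toy_extract, k1=3, k2=2, k3=1
)
```

Passing the three stages in as callables keeps the cascade itself independent of any particular search backend or model, so each stage can be swapped without touching the control flow.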

Description

Technical field

[0001] The present invention relates to the technical field of artificial intelligence, in particular to a text query method, device, readable storage medium and computing equipment that integrate retrieval and machine reading comprehension.

Background technique

[0002] ElasticSearch's BM25 algorithm is an upgraded and improved version of the TF-IDF algorithm, but it still matches on term frequency, inverse document frequency and other features related to the number of word occurrences. It is a search algorithm based on keyword matching. In practical applications, however, the keywords entered by the user may be semantically related to the search content without exactly matching the keywords it contains. For example, "leave application process" and "leave application steps" are only semantically related. A certain knowledge in the knowledge base is the "leave request step", but the user input is the "leave request pr...


Application Information

IPC (8): G06F16/33, G06F16/332, G06F16/35, G06F40/211, G06F40/284, G06K9/62, G06N3/04, G06N3/08, G06N5/02
CPC: G06F16/3329, G06F16/3344, G06F16/35, G06F40/211, G06F40/284, G06N3/08, G06N5/025, G06N3/045, G06F18/241, Y02D10/00
Inventor 陈开冉, 黎展, 谢智权
Owner 广州探迹科技有限公司