Question classification method and application thereof

A question classification method and related technology, applied in the computer field, that can solve problems such as the difficulty of training recurrent neural networks and the ambiguity of word vectors.

Pending Publication Date: 2021-04-02
UNIV OF ELECTRONIC SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects, the present invention provides a question classification method that can be used in Chinese medical question answering systems. The method uses a BERT (Bidirectional Encoder Representations from Transformers) model, which is built on a multi-layer attention mechanism and trained on a large-scale corpus, to obtain word vector representations of questions, which resolves the ambiguity problem of existing word vectors. It further proposes incorporating the idea of the residual network to overcome the difficulty of training recurrent neural networks, which speeds up training, makes the model converge faster, and improves the classification accuracy of questions.

Examples

Embodiment 1

[0050] This embodiment provides a method for classifying questions in a Chinese medical question answering system, comprising the following steps:

[0051] S1. Collection of medical question data:

[0052] A total of 1250 questions in the medical field are obtained from the doctor-patient dialogue data publicly available on the website (http://jib.xywy.com/) and are manually labeled, with each question assigned to one category. The questions are divided into 16 categories: common medicines for a disease, foods suitable for a disease, inspection items required for a disease, foods to avoid for a disease, recommended drugs for a disease, symptoms of a disease, concurrent diseases, treatment department, treatment method, treatment time, cure probability, treatment cost, preventive measures, etiology, disease description, and contagiousness.
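The 16 categories above can be sketched as a label map of the kind a classifier's output layer would index into. This is only an illustration: the identifier names (`QUESTION_CATEGORIES`, `encode`, `decode`), the ordering of the categories, and the label attached to the sample question are assumptions and do not come from the patent text.

```python
# The 16 question categories listed in paragraph [0052].
# The ordering (and therefore each integer id) is an arbitrary assumption.
QUESTION_CATEGORIES = [
    "common medicines for a disease",
    "foods suitable for a disease",
    "inspection items required for a disease",
    "foods to avoid for a disease",
    "recommended drugs for a disease",
    "symptoms of a disease",
    "concurrent diseases",
    "treatment department",
    "treatment method",
    "treatment time",
    "cure probability",
    "treatment cost",
    "preventive measures",
    "etiology",
    "disease description",
    "contagiousness",
]

LABEL_TO_ID = {name: i for i, name in enumerate(QUESTION_CATEGORIES)}


def encode(label: str) -> int:
    """Map a category name to the integer id used as a training target."""
    return LABEL_TO_ID[label]


def decode(label_id: int) -> str:
    """Map a predicted integer id back to its category name."""
    return QUESTION_CATEGORIES[label_id]


# Hypothetical labeled sample; the category chosen here is a guess,
# since the source does not show how this question was labeled.
sample = ("Can leukemia be cured?", "cure probability")
sample_target = encode(sample[1])
```

Keeping the mapping in one place means that each manually labeled question reduces to a `(text, id)` pair, and a predicted id can be turned back into a readable category with `decode`.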

[0053] S2. Question preprocessing: convert the tradit...

Embodiment 2

[0082] Based on the scheme given in Embodiment 1, the present invention is described in further detail below through a specific example and in conjunction with attached Figure 2. In this embodiment, the question "Can leukemia be cured?" is taken as an example. The text is first preprocessed, and the BERT model produces the word vector x_i corresponding to each word of the sentence, so that the sentence can be expressed as v_q, which is then input into a two-layer bidirectional GRU network. The input of the first GRU layer is the word vectors obtained from the BERT model, and this layer extracts the hidden state of each word. The original word vector x_i is concatenated with the first-layer hidden state vector and input to the second GRU layer, which extracts the higher-level hidden state of each word. An attention layer then learns a weight α_i for each hidden state vector of the second GRU layer, using the l...
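The data flow described in this embodiment can be sketched as a small forward pass in plain Python. This is a hedged illustration, not the patented implementation: the weights are random and untrained, random vectors stand in for the BERT output, biases are omitted, and the dimensions, the dot-product attention score, the final softmax layer, and all names (`GRUCell`, `BiGRU`, `classify`, `attn_vec`, `W_out`) are assumptions made for the example.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class GRUCell:
    """Minimal GRU cell; each gate acts on the concatenation [x; h]."""

    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.Wz, x + h)]
        r = [sigmoid(a) for a in matvec(self.Wr, x + h)]
        gated = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


class BiGRU:
    """Runs one GRU left-to-right and one right-to-left, then joins them."""

    def __init__(self, in_dim, hid_dim, rng):
        self.fwd = GRUCell(in_dim, hid_dim, rng)
        self.bwd = GRUCell(in_dim, hid_dim, rng)

    @staticmethod
    def _scan(cell, seq):
        h, out = [0.0] * cell.hid_dim, []
        for x in seq:
            h = cell.step(x, h)
            out.append(h)
        return out

    def run(self, xs):
        f = self._scan(self.fwd, xs)
        b = self._scan(self.bwd, xs[::-1])[::-1]
        return [fi + bi for fi, bi in zip(f, b)]  # list concatenation per token


def classify(xs, layer1, layer2, attn_vec, W_out):
    h1 = layer1.run(xs)                       # first-layer hidden states
    z = [x + h for x, h in zip(xs, h1)]       # residual-style join: [x_i; h_i]
    h2 = layer2.run(z)                        # second-layer hidden states
    scores = [sum(a * b for a, b in zip(attn_vec, h)) for h in h2]
    alpha = softmax(scores)                   # attention weights over tokens
    sent = [sum(alpha[i] * h2[i][k] for i in range(len(h2)))
            for k in range(len(h2[0]))]       # weighted sentence vector
    return alpha, softmax(matvec(W_out, sent))


rng = random.Random(0)
EMB, HID, CLASSES, TOKENS = 8, 4, 16, 6
xs = rand_matrix(TOKENS, EMB, rng)            # stand-in for BERT word vectors
layer1 = BiGRU(EMB, HID, rng)
layer2 = BiGRU(EMB + 2 * HID, HID, rng)       # input is [x_i; h_i], hence wider
attn_vec = [rng.uniform(-0.1, 0.1) for _ in range(2 * HID)]
W_out = rand_matrix(CLASSES, 2 * HID, rng)
alpha, probs = classify(xs, layer1, layer2, attn_vec, W_out)
```

The point of the sketch is the shape bookkeeping: because the second layer receives the original word vector joined with the first layer's bidirectional output, its input width is `EMB + 2 * HID` rather than `2 * HID`, which is how the original word vectors reach the deeper layer directly.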

Abstract

The invention discloses a question classification method comprising the following steps: obtaining word vector representations of a question from a BERT model pre-trained on a large-scale corpus, so that each word has a different vector representation in different questions; and processing the text sequence with a two-layer bidirectional GRU network in which, following the idea of the residual network, the original word vectors are combined with the first-layer output of the GRU network to serve as the input of the second GRU layer, which increases the convergence rate of the model. An attention mechanism focuses on the important features of the question, improving classification precision. The GRU model further captures dependencies in the question text, while the attention mechanism assigns higher weights to the features that have greater influence on question classification, thereby further improving question classification precision in the medical question answering system.
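One way to write down the pipeline summarized in the abstract is sketched below. Only x_i and α_i appear in the source; the symbols h, z, e, s, W, b and the unspecified function "score" are notation introduced here for illustration, and reading "combined" as concatenation and ending with a softmax classifier are assumptions.

```latex
\begin{aligned}
h_i^{(1)} &= \mathrm{BiGRU}_1(x_1,\dots,x_n)_i
  && \text{first-layer hidden state of word } i \\
z_i &= \bigl[\,x_i \,;\, h_i^{(1)}\bigr]
  && \text{original word vector joined with first-layer output} \\
h_i^{(2)} &= \mathrm{BiGRU}_2(z_1,\dots,z_n)_i
  && \text{second-layer hidden state} \\
\alpha_i &= \frac{\exp(e_i)}{\sum_{j=1}^{n}\exp(e_j)},
  \quad e_i = \mathrm{score}\bigl(h_i^{(2)}\bigr)
  && \text{attention weight of word } i \\
s &= \sum_{i=1}^{n} \alpha_i\, h_i^{(2)}
  && \text{attention-weighted question vector} \\
\hat{y} &= \mathrm{softmax}(W s + b)
  && \text{distribution over question categories}
\end{aligned}
```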

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a question classification method and its application, more particularly to a Chinese question classification method, and specifically to a question classification method for Chinese medical question answering systems.

Background

[0002] With the development and popularization of network technology, the application prospects of medical question answering systems are very broad. Users can learn relevant medical knowledge or look up treatment methods for particular diseases through a medical question answering system. At present, the mainstream way of building a knowledge-base medical question answering system relies on text matching: first identify the relevant entities in the question and the category the question belongs to, and then find the answer from the knowledge base according to the entity and the category of the q...

Application Information

IPC(8): G06F16/35, G06F40/216, G06F40/284, G06K9/62, G06N3/04
CPC: G06F16/35, G06F40/216, G06F40/284, G06N3/045, G06F18/2415, G06F18/214
Inventors: 杨世刚, 刘勇国, 杨尚明, 李巧勤, 朱嘉静, 张云
Owner UNIV OF ELECTRONIC SCI & TECH OF CHINA