A method and apparatus for evaluating the quality of system replies

An evaluation method applied in the field of system reply quality evaluation. It addresses the problems that generated replies may have low relevance to the question and that reply quality is difficult to evaluate accurately.

Pending Publication Date: 2019-03-15
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

Of the retrieval and generative approaches, the retrieval method finds the most suitable reply in an existing dialogue database. Replies produced this way are manually annotated in advance, and no reply outside the dialogue database is ever generated. The reply quality of a retrieval system can therefore be judged by retrieval accuracy, for example by checking whether the first K sentences retrieved contain the best answer to the question, and this evaluates the quality of the retrieval system's replies well.

The generative method, by contrast, generally constructs a probability distribution model and generates the reply with the highest probability given the question sentence. Compared with the retrieval method, it is more flexible because it can generate replies that have not appeared in the existing corpus, but it may also produce replies that do not conform to grammatical rules, or replies that have little relevance to the question. For generative methods it is therefore difficult to evaluate reply quality accurately.
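The top-K check mentioned for retrieval systems can be sketched as follows. This is a minimal illustration only: the function names `hit_at_k` and `top_k_accuracy` and the sample data are assumptions made for this example and do not come from the application.

```python
from typing import List, Sequence, Tuple


def hit_at_k(ranked_replies: Sequence[str], best_reply: str, k: int) -> bool:
    """Return True if the best reply is among the first k retrieved replies."""
    return best_reply in ranked_replies[:k]


def top_k_accuracy(results: List[Tuple[Sequence[str], str]], k: int) -> float:
    """Fraction of questions whose best reply appears in the top k results.

    Each item in `results` is (ranked replies for one question, best reply).
    """
    if not results:
        return 0.0
    hits = sum(hit_at_k(ranked, best, k) for ranked, best in results)
    return hits / len(results)


# Hypothetical retrieval output for two questions.
results = [
    (["fine, thanks", "hello", "goodbye"], "hello"),
    (["it is sunny", "no idea", "maybe"], "it is rainy"),
]
print(top_k_accuracy(results, k=2))  # 0.5
```

A metric like this works for retrieval systems because every possible reply already exists in the database; it has no direct counterpart for generated replies, which is the gap the application targets.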



Examples


Example 1

[0096] See Figure 1, a schematic flowchart of the method for evaluating system reply quality provided in this embodiment. The method includes the following steps:

[0097] S101: Generate a system evaluation index of the target dialogue system.

[0098] In this embodiment, any dialogue system whose reply quality is evaluated is defined as the target dialogue system. The target dialogue system may be a non-task chat system (such as a non-task chatbot) that uses a generative method to construct a probability distribution model. Such a system can generate replies that do not appear in the existing corpus, but it may also generate replies that do not conform to grammatical rules, or replies that have little relevance to the question.

[0099] Therefore, in this embodiment, in order to accurately evaluate the reply quality of the target dialogue system, it is first necessary to generate a...

Example 2

[0136] This embodiment introduces the working process and construction process of the topic correlation model. The first evaluation index P1 can be generated based on the output of the topic correlation model.

[0137] See Figure 2, a schematic flowchart of generating the first evaluation index of the target dialogue system provided by this embodiment. The process includes the following steps:

[0138] S201: Using a pre-built topic correlation model, determine the topic correlation between each selected reply of the target dialogue system and its corresponding question.

[0139] In this embodiment, Figure 3 is a schematic diagram of the structure of the topic correlation model. The model has a layered structure, which can be divided into a sentence representation layer, an interaction layer, a convergence layer, and a correlation calculation layer.
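To make the four-layer division concrete, the sketch below passes a question and a reply through one toy stage per layer. It is not the model of this embodiment: the text available here does not say what each layer computes, so every choice is an assumption made for illustration, including the hand-written `TOY_VECTORS` table, cosine similarity as the interaction, max-then-mean pooling as the convergence step, and clamping as the correlation calculation.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]

# Assumption: a tiny hand-written embedding table stands in for a trained
# sentence representation layer.
TOY_VECTORS = {
    "weather": (1.0, 0.0),
    "rain": (0.9, 0.1),
    "sunny": (0.8, 0.2),
    "pizza": (0.0, 1.0),
    "food": (0.1, 0.9),
}
UNKNOWN: Vector = (0.5, 0.5)


def represent(sentence: str) -> List[Vector]:
    """Sentence representation layer: one vector per word."""
    return [TOY_VECTORS.get(word, UNKNOWN) for word in sentence.lower().split()]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def interact(q_vecs: List[Vector], r_vecs: List[Vector]) -> List[List[float]]:
    """Interaction layer: similarity of every question word to every reply word."""
    return [[cosine(q, r) for r in r_vecs] for q in q_vecs]


def converge(matrix: List[List[float]]) -> float:
    """Convergence layer: best match per question word, averaged."""
    if not matrix or not matrix[0]:
        return 0.0
    return sum(max(row) for row in matrix) / len(matrix)


def topic_relevance(question: str, reply: str) -> float:
    """Correlation calculation layer: clamp the pooled value into [0, 1]."""
    pooled = converge(interact(represent(question), represent(reply)))
    return max(0.0, min(1.0, pooled))


on_topic = topic_relevance("weather rain", "sunny")
off_topic = topic_relevance("weather rain", "pizza")
print(on_topic > off_topic)  # True
```

In a trained model each stage would be learned rather than fixed, but the data flow is the same: two sentences in, one correlation score out.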

[0140] Each selected reply in t...

Example 3

[0184] This embodiment introduces the working process and construction process of the semantic similarity model. The second evaluation index P2 can be generated based on the output of the semantic similarity model.

[0185] See Figure 4, a schematic flowchart of generating the second evaluation index of the target dialogue system provided by this embodiment. The process includes the following steps:

[0186] S401: Using a pre-built semantic similarity model, determine the semantic similarity between each selected reply of the target dialogue system and the corresponding manual reply.
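Step S401 compares each system reply with a human-written reply to the same question. The sketch below illustrates that comparison with a bag-of-words cosine similarity. This is a deliberately simple stand-in and not the pre-built semantic similarity model of the embodiment. Averaging the per-pair scores in `second_index` is likewise an assumption, since the text available here does not state how P2 is derived from the model output.

```python
import math
from collections import Counter
from typing import List, Tuple


def bow_cosine(reply_a: str, reply_a_prime: str) -> float:
    """Cosine similarity between word-count vectors of two replies."""
    a = Counter(reply_a.lower().split())
    b = Counter(reply_a_prime.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def second_index(pairs: List[Tuple[str, str]]) -> float:
    """Assumed aggregate: mean similarity over (system reply, manual reply) pairs."""
    if not pairs:
        return 0.0
    return sum(bow_cosine(a, a_prime) for a, a_prime in pairs) / len(pairs)


# Hypothetical selected replies paired with manual replies.
pairs = [
    ("it is sunny", "it is sunny"),
    ("hello", "goodbye"),
]
print(round(second_index(pairs), 3))  # 0.5
```

Word overlap misses paraphrases such as "hello" and "hi there", which is one reason a learned similarity model would be preferred over a lexical measure like this one.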

[0187] In this embodiment, Figure 5 is a schematic structural diagram of the semantic similarity model.

[0188] Define each selected reply in the selected reply set used to generate the second evaluation index as reply A, and define the manual reply to its corresponding question as reply A'. As shown in Figure 5, input t...


Abstract

A method and apparatus for evaluating the quality of system replies are disclosed. The method comprises: first, generating a system evaluation index of a target dialogue system, the system evaluation index including at least one of a first evaluation index generated based on the topic correlation between each selected reply of the target dialogue system and the corresponding question, a second evaluation index generated based on the semantic similarity between each selected reply of the target dialogue system and the corresponding manual reply, and a third evaluation index generated based on the likelihood that each selected reply of the target dialogue system is a generic reply; and then determining the reply quality of the target dialogue system according to the generated system evaluation index. When evaluating the reply quality of the target dialogue system, the present application thus takes into account the topic correlation between the system reply and the question, the semantic similarity between the system reply and the manual reply, and the possibility that the system reply is a generic reply, so that the reply quality of the target dialogue system can be evaluated more accurately.
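One way to picture the final step, determining reply quality from whichever of the three indices were generated, is a weighted average over the available indices. The abstract does not specify the combination rule, so the averaging, the default weights, and the helper `third_index` (which turns generic-reply likelihoods into a score where higher is better) are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def third_index(generic_probs: Sequence[float]) -> float:
    """Assumed form of the third index: 1 minus the mean generic-reply likelihood."""
    if not generic_probs:
        return 0.0
    return 1.0 - sum(generic_probs) / len(generic_probs)


def reply_quality(
    p1: Optional[float] = None,
    p2: Optional[float] = None,
    p3: Optional[float] = None,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted average of the indices that are present (at least one is required)."""
    available = [(p, w) for p, w in zip((p1, p2, p3), weights) if p is not None]
    if not available:
        raise ValueError("at least one evaluation index is required")
    total_weight = sum(w for _, w in available)
    if total_weight <= 0:
        raise ValueError("weights of the available indices must sum to a positive value")
    return sum(p * w for p, w in available) / total_weight


# Hypothetical values: topic correlation 0.8, semantic similarity 0.6,
# and two replies with generic-reply likelihoods 0.9 and 0.1.
p3 = third_index([0.9, 0.1])
print(round(reply_quality(p1=0.8, p2=0.6, p3=p3), 3))  # 0.633
```

Accepting missing indices mirrors the "at least one of" wording: a system can be scored on topic correlation alone, or on any combination of the three.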

Description

Technical field

[0001] The present application relates to the technical field of natural language understanding, in particular to a method and device for evaluating the quality of system replies.

Background

[0002] As an important application of artificial intelligence technology, chatbots have been widely used in various intelligent terminal devices, such as mobile phones and wearable devices. At present, chatbots can be divided by use into task-type chatbots and non-task-type chatbots, and chatbots with different uses are implemented with different technologies.

[0003] Task-type chatbots interact with users through a task-based dialogue system to complete tasks such as ordering food or querying the weather. The reply quality of a task-based dialogue system is generally evaluated based on whether the task is completed and how well it is completed. It is judged by the number of dialogue rounds performed by ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/332, G06F16/35
CPC: G06F40/289, G06F40/30
Inventor: 陈泽, 陈志刚, 刘权
Owner IFLYTEK CO LTD