Model training method, and task type visual dialogue problem generation method and device

A technology for model training and task-type visual dialogue, applied in the field of mobile communication. It addresses the failure to integrate different text features effectively, inaccurate parameter adjustment, and the small amount of effective information in generated questions, so as to improve the task success rate, reduce the number of interaction rounds, and avoid asking repeated questions.

Active Publication Date: 2021-03-30
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Training on the above supervision signals therefore guides the generated questions toward the questions found in real human dialogues.
As a result, simple questions are the easiest to generate.
However, simple questions contain few words and little information that helps identify the target object, so more rounds of interaction are needed to guess it.
In practical applications the maximum number of interaction rounds is limited. If that maximum is smaller than the number of rounds actually required, the target object cannot be guessed accurately within the allowed rounds and the task fails.
Moreover, natural language is diverse, and different expressions can have similar effects. Model training in the above method, however, is driven only by differences in the surface form of the generated sentences; that is, feedback adjustment relies solely on a supervision signal derived from the difference between the real dialogue and the predicted dialogue. This makes the adjustment inaccurate, which lowers the accuracy of the trained model and, in turn, the success rate of visual dialogue tasks based on that model.
[0005] In addition, in the course of implementing the present invention, the inventors found that the existing end-to-end supervised learning method may ask the same question repeatedly.
In that method the encoder is trained by fitting to a single supervision signal, which makes the training process a black box. The above problem is therefore hard to interpret: its cause cannot be analyzed from the training process.
Through research, analysis, and simulation, the inventors found that one cause of repeated questions is that the existing end-to-end supervised learning method does not encode the dialogue text hierarchically. The answer information that matters most therefore does not receive enough attention, the model may fail to integrate the different text features effectively, and information the user has already provided is forgotten, so the same question is asked again.
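The idea of hierarchical encoding with attention can be illustrated with a minimal sketch. It is not the patented encoder: the function names, the fixed `answer_weight`, and the plain dot-product attention are all assumptions chosen for illustration. Each round is first encoded from its question and answer vectors (round level), and the rounds are then combined with attention weights (dialogue level), so that no round's answer is silently dropped.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode_round(question_vec, answer_vec, answer_weight=2.0):
    """Round-level encoding (hypothetical): a weighted sum that keeps
    the answer prominent instead of letting the question dominate."""
    return [q + answer_weight * a for q, a in zip(question_vec, answer_vec)]

def encode_dialogue(rounds, query):
    """Dialogue-level encoding: attend over the round vectors.

    rounds: list of (question_vec, answer_vec) pairs
    query:  vector describing what the next question needs to know
    Returns the context vector and the attention weights per round.
    """
    round_vecs = [encode_round(q, a) for q, a in rounds]
    weights = softmax([dot(query, r) for r in round_vecs])
    dim = len(query)
    context = [sum(w * r[i] for w, r in zip(weights, round_vecs))
               for i in range(dim)]
    return context, weights

# Toy two-round dialogue with 2-dimensional vectors.
rounds = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [1.0, 0.0])]
context, weights = encode_dialogue(rounds, query=[0.0, 1.0])
```

In this toy input the first round receives the larger weight, because its answer vector aligns with the query.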


Embodiment Construction

[0039] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0040] Figure 1 is a schematic flow chart of the model training method in Embodiment 1 of the present invention. As shown in Figure 1, the model training method of this embodiment mainly includes:

[0041] Step 101. Obtain human dialogue data and feature data of corresponding images.

[0042] This step is used to obtain data for training the model, that is, real data of conversations between real people, and feature data of images associated with the conversations.

[0043] In practice, the above data can be obtained from the GuessWhat?! dataset, but is not limited thereto.

[0044] Step 102. For each round of question-answer data in the human dialogue data, determine the category of the question and generat...
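Since the text of Step 102 is cut off here, the following is only a rough sketch of what attaching a category label to each round could look like. The category names, the keyword lists, and the functions `label_question` and `label_dialogue` are hypothetical and are not taken from the patent, which may determine categories in a different way.

```python
import re

# Hypothetical categories and keywords, chosen only for illustration.
CATEGORY_KEYWORDS = {
    "color": {"color", "red", "blue", "green", "white", "black"},
    "location": {"left", "right", "front", "behind", "top", "bottom", "middle"},
    "object": {"person", "car", "dog", "chair", "table"},
}

def label_question(question):
    """Return the first category whose keywords appear in the question."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & keywords:
            return category
    return "other"

def label_dialogue(rounds):
    """Attach a category label to every (question, answer) round."""
    return [(q, a, label_question(q)) for q, a in rounds]

labeled = label_dialogue([
    ("Is it a person?", "no"),
    ("Is it on the left?", "yes"),
])
```

A keyword lookup is the simplest possible labeler; a trained classifier could take its place without changing the shape of the labeled data.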

Abstract

The invention discloses a model training method, and a task-type visual dialogue question generation method and device. The model training method comprises the following steps: obtaining human dialogue data and the feature data of a corresponding image; determining the question category of each round of question-and-answer data and generating a question category label; and traversing the rounds of question-and-answer data in dialogue order, training a preset task-type visual dialogue question generation model with each round of question-and-answer data, the corresponding question category label, and the feature data of the image. The training comprises: generating a context vector and text-guided image features based on the round of question-and-answer data currently input to the model and the feature data of the image; predicting the question category of the next round of question-and-answer data based on the context vector and the image features; predicting the question of the next round within the range of that category; and adjusting the network parameters of the model based on the prediction results. The number of dialogue interaction rounds can be reduced, and the task success rate is improved.
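The two-stage prediction described in the abstract (category first, then a question within that category) can be sketched as follows. This is a simplified illustration under stated assumptions: the model is assumed to score a fixed list of candidate questions, whereas the actual model may generate questions token by token, and the names `predict`, `joint_loss`, and `question_category`, as well as the summed cross-entropy loss, are hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def predict(category_logits, question_logits, question_category):
    """Pick the most likely category, then the best question inside it.

    question_category[i] is the category index of candidate question i.
    """
    pred_cat = max(range(len(category_logits)),
                   key=category_logits.__getitem__)
    candidates = [i for i, c in enumerate(question_category) if c == pred_cat]
    if not candidates:
        raise ValueError("no candidate question in the predicted category")
    pred_q = max(candidates, key=question_logits.__getitem__)
    return pred_cat, pred_q

def joint_loss(category_logits, question_logits, question_category, gold_q):
    """Cross-entropy on the category plus cross-entropy on the question,
    with the question distribution restricted to the gold category."""
    gold_cat = question_category[gold_q]
    cat_loss = -log_softmax(category_logits)[gold_cat]
    in_cat = [i for i, c in enumerate(question_category) if c == gold_cat]
    logp = log_softmax([question_logits[i] for i in in_cat])
    q_loss = -logp[in_cat.index(gold_q)]
    return cat_loss + q_loss

# Toy scores: 2 categories, 3 candidate questions.
category_logits = [0.1, 2.0]
question_logits = [5.0, 1.0, 3.0]
question_category = [0, 1, 1]
pred_cat, pred_q = predict(category_logits, question_logits, question_category)
loss = joint_loss(category_logits, question_logits, question_category, gold_q=2)
```

In this toy input, candidate 0 has the highest raw score but belongs to the category that was not predicted, so it is excluded. In a training loop, a loss of this kind would be the quantity used to adjust the network parameters.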

Description

Technical field

[0001] The invention relates to mobile communication technology, and in particular to a model training method and to a method and device for generating task-type visual dialogue questions.

Background

[0002] At present, the visual dialogue task system is one of the most actively researched technologies in artificial intelligence. Such a system determines the user's target object in the relevant scene from the question-and-answer exchange between the agent and the user, so as to help the user complete certain tasks concerning that object. Visual dialogue is an interdisciplinary field that draws on knowledge and methods from natural language processing, computer vision, knowledge representation learning, and reasoning. Existing visual dialogue systems can be roughly divided into two types: task-based visual dialogue and chatting visual dialogue. Among them, how to generate effective questions is one of the difficult proble...

Application Information

IPC(8): G06F16/332, G06F16/35, G06F40/35, G06N3/04
CPC: G06F16/3329, G06F16/35, G06F40/35, G06N3/044, G06N3/045
Inventor: 史亚楠, 王小捷, 袁彩霞, 谭言信
Owner: BEIJING UNIV OF POSTS & TELECOMM