Generating method, device and equipment for labeled data, and storage medium

A technology for generating labeled data, applied in the field of data processing. It addresses the time-consuming, inefficient, and cumbersome process of manually constructing training interaction data, and thereby reduces labor costs, simplifies the data acquisition process, and makes data acquisition efficient.

Active Publication Date: 2018-12-07
MOBVOI INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] In the process of implementing the present invention, the inventor found that the prior art has the following defects: acquiring training interaction data from log files depends heavily on the performance of the interaction system, and only interaction data for scenarios the system already supports can be obtained; manually constructing training interaction data is cumbersome and inefficient, and requires a great deal of human time.

Examples

Embodiment 1

[0034] Figure 1 is a flow chart of a method for generating labeled data provided by Embodiment 1 of the present invention. This embodiment is applicable to efficiently obtaining labeled data for a multi-round interactive system. The method can be executed by a labeled data generation device, which can be implemented in software and/or hardware and can generally be integrated into various computer equipment. The method specifically includes the following steps:

[0035] S110. Obtain sample condition information, provided by the data demander, that matches the demand sample.

[0036] The sample condition information includes: the current semantic understanding protocol of the demand sample, the historical semantic understanding protocol of the historical sample associated with the demand sample, the sample type of the demand sample, and the grammar rules of the demand sample.
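
As a rough illustration, the four items of sample condition information listed in [0036] could be grouped into a single record. The sketch below is not taken from the patent: the class name, field names, field types, and example values are all assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SampleConditionInfo:
    """Hypothetical container for the four items listed in [0036]."""

    # Current semantic understanding protocol of the demand sample
    current_protocol: Dict[str, Any]
    # Semantic understanding protocols of associated historical samples
    history_protocols: List[Dict[str, Any]]
    # Sample type of the demand sample (the value set is an assumption)
    sample_type: str
    # Grammar rules of the demand sample, kept as free-form strings here
    grammar_rules: List[str] = field(default_factory=list)


# Made-up example values, for illustration only
condition = SampleConditionInfo(
    current_protocol={"domain": "weather", "intent": "query_weather"},
    history_protocols=[{"domain": "weather", "city": "Beijing"}],
    sample_type="positive",
    grammar_rules=["the query must mention a date"],
)
print(condition.sample_type)
```

Keeping the history protocols as a list reflects only that a multi-round dialogue may have more than one earlier turn; the source excerpt does not state how many historical samples are associated with a demand sample.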

[0037] The data demander generally refers to ...

Embodiment 2

[0067] Figure 2 is a flowchart of a method for generating labeled data provided by Embodiment 2 of the present invention. This embodiment is a refinement of the above-mentioned embodiments. In this embodiment, performing a rationality check on the candidate labeled samples according to the sample condition information to obtain target labeled samples is specified as: obtaining, from the sample condition information, the current semantic understanding protocol of the demand sample; obtaining, from the candidate labeled sample, the field value to be verified that corresponds to the first target field included in the current semantic understanding protocol; and, if the field value to be verified matches the field value corresponding to the first target field in the current semantic understanding protocol, determining the candidate labeled sample as the target labeled sample. Correspondingly, as shown in Figure 2, th...
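
The protocol-based check described in [0067] can be sketched as a single comparison. This is a minimal illustration under stated assumptions: the function name, the use of plain dictionaries for samples and protocols, and the choice of exact equality as the meaning of "matches" are all hypothetical and not specified in the source excerpt.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel so that a stored None is still a real value


def passes_protocol_check(
    candidate: Dict[str, Any],
    current_protocol: Dict[str, Any],
    first_target_field: str,
) -> bool:
    """Return True if the candidate's value for the target field equals
    the value the current protocol holds for that field.

    Assumption: "matches" is modeled as exact equality.
    """
    expected = current_protocol.get(first_target_field, _MISSING)
    if expected is _MISSING:
        # The protocol does not define this field, so nothing can match.
        return False
    return candidate.get(first_target_field, _MISSING) == expected


# Made-up example data
protocol = {"domain": "weather", "intent": "query_weather"}
good = {"text": "will it rain tomorrow", "domain": "weather"}
bad = {"text": "play some jazz", "domain": "music"}

print(passes_protocol_check(good, protocol, "domain"))
print(passes_protocol_check(bad, protocol, "domain"))
```

A candidate that omits the target field entirely is rejected, which seems the safer default for a check whose purpose is to filter out unreasonable samples.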

Embodiment 3

[0088] Figure 3 is a flow chart of a method for generating labeled data provided by Embodiment 3 of the present invention. This embodiment is a refinement of the above-mentioned embodiments. In this embodiment, performing a rationality check on the candidate labeled samples according to the sample condition information to obtain target labeled samples is specified as: obtaining, from the sample condition information, the grammar rules of the demand sample; looking up, in the candidate labeled sample, the third target field and the fourth target field corresponding to the grammar rules; and, if the lookup result matches the grammar rules, determining the candidate labeled sample as the target labeled sample. Correspondingly, as shown in Figure 3, the method of this embodiment may include:

[0089] S310. Obtain sample condition information, provided by the data demander, that matches the demand sample.

[0090] S320. Provide the s...
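
The grammar-rule check summarized in [0088] can be sketched in a similar way. The source excerpt does not say what a grammar rule looks like, so everything below is an assumption: a rule is modeled as two field names (standing in for the third and fourth target fields) plus a predicate over their values, and the example rule is invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class GrammarRule:
    """Hypothetical rule: names two fields and a constraint between them."""

    third_target_field: str
    fourth_target_field: str
    predicate: Callable[[Any, Any], bool]


def passes_grammar_check(candidate: Dict[str, Any], rule: GrammarRule) -> bool:
    """Look up both target fields in the candidate and test the rule.

    A candidate that lacks either field fails the check.
    """
    if rule.third_target_field not in candidate:
        return False
    if rule.fourth_target_field not in candidate:
        return False
    return rule.predicate(
        candidate[rule.third_target_field],
        candidate[rule.fourth_target_field],
    )


# Invented example rule: the labeled city must literally appear in the text
city_in_text = GrammarRule("text", "city", lambda text, city: city in text)

ok = {"text": "weather in Beijing tomorrow", "city": "Beijing"}
wrong = {"text": "weather in Beijing tomorrow", "city": "Shanghai"}

print(passes_grammar_check(ok, city_in_text))
print(passes_grammar_check(wrong, city_in_text))
```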

Abstract

Embodiments of the invention disclose a method, device and equipment for generating labeled data, and a storage medium. The method comprises: acquiring sample condition information, provided by a data demander, that matches a demand sample, wherein the sample condition information comprises the current semantic understanding protocol of the demand sample, the historical semantic understanding protocol of the historical sample associated with the demand sample, the sample type of the demand sample, and the grammar rules of the demand sample; providing the sample condition information to at least one data labeling party, and obtaining candidate labeled samples generated by the data labeling party according to the sample condition information; performing a rationality check on the candidate labeled samples according to the sample condition information to obtain target labeled samples; and constructing structured labeled data according to the target labeled samples and the sample condition information. In this way, the data required by a multi-round interaction system can be acquired efficiently, the data acquisition process is simplified, and labor cost is reduced.
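
The four steps in the abstract can be strung together as a small pipeline. The sketch below is illustrative only: the function name, the representation of labeling parties as callables, the pluggable check function, and the layout of the output record are assumptions, since the source excerpt does not define the structure of the labeled data.

```python
from typing import Any, Callable, Dict, Iterable, List

Sample = Dict[str, Any]
Condition = Dict[str, Any]


def generate_labeled_data(
    condition: Condition,
    labelers: Iterable[Callable[[Condition], List[Sample]]],
    is_reasonable: Callable[[Sample, Condition], bool],
) -> List[Dict[str, Any]]:
    """Collect candidates from each labeling party, keep those that pass
    the rationality check, and wrap them in a structured record."""
    labeled = []
    for labeler in labelers:
        # Each labeling party produces candidates from the condition info
        for candidate in labeler(condition):
            # Rationality check against the same condition info
            if is_reasonable(candidate, condition):
                # Hypothetical record layout combining sample and condition
                labeled.append(
                    {
                        "sample": candidate,
                        "sample_type": condition["sample_type"],
                        "protocol": condition["current_protocol"],
                    }
                )
    return labeled


# Made-up example: one labeling party and a domain-equality check
condition = {"current_protocol": {"domain": "weather"}, "sample_type": "positive"}


def labeler_a(cond: Condition) -> List[Sample]:
    return [
        {"text": "will it rain", "domain": "weather"},
        {"text": "play jazz", "domain": "music"},
    ]


def domain_check(cand: Sample, cond: Condition) -> bool:
    return cand.get("domain") == cond["current_protocol"]["domain"]


result = generate_labeled_data(condition, [labeler_a], domain_check)
print(len(result))
```

Passing the check in as a function keeps the pipeline independent of which rationality check is used, so a protocol-based or grammar-based check could be swapped in without changing the loop.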

Description

Technical Field

[0001] Embodiments of the present invention relate to the technical field of data processing, and in particular to a method, device and equipment for generating labeled data, and a storage medium.

Background

[0002] Multi-round interaction systems are increasingly widely used in intelligent electronic products. For example, multi-round interaction based on contextual dialogue scenarios plays a pivotal role in intelligent question answering, where it is both an important function and a major challenge. In practical applications, the problem an intelligent question answering system needs to solve is likely to involve complex procedural knowledge rather than a simple question-and-answer exchange.

[0003] Currently, rule-based models are more commonly used in multi-round interactive systems. As the application scenarios of multi-round interactive systems become more and more complex, it is difficult for pure rule-based models to meet the needs o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F40/253, G06F40/30
Inventor: 王晓雪, 吴世伟
Owner: MOBVOI INFORMATION TECH CO LTD