Training sample generation method and device, electronic equipment and storage medium

A technique involving training samples and initial texts, applied in the field of machine learning, that addresses problems such as poor robustness, low accuracy, and a lack of training data resources. It achieves improved recognition accuracy, better-targeted training samples, and enhanced robustness.

Pending Publication Date: 2020-02-14
TENCENT TECH (SHENZHEN) CO LTD
0 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

However, this process has two defects: a model trained for open domains performs poorly in specific domains (low accuracy, poor robustness), and training data resources for machine reading comprehension tasks in specific domains are lacking.

Embodiment Construction

[0071] To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present invention, and all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0072] In the following description, references to "some embodiments" describe a subset of all possible embodiments. It is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they can be combined with each other where no conflict arises.

[0073] Before further describing the embodiments of the present invention in detail, the nouns and terms involved in the embodiments of the present invention are described, and the nouns and terms involved in the...

Abstract

The invention provides a training sample generation method. The method comprises the steps of: obtaining initial text data to be processed; performing word segmentation on the initial text data to form keywords matched with the initial text data; screening the initial text according to the keywords matched with the initial text data to form a target text for a specified service; training a corresponding text processing model with the target text; and performing domain data augmentation on the target text according to the training result of the text processing model to form training samples for the specified service. The invention further provides a training sample generation device, electronic equipment, and a storage medium. The method makes the training samples better targeted, so that they are more suitable for machine reading comprehension tasks. It also improves the recognition accuracy of the neural network model in the specific service domain and enhances the robustness of the neural network model.
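The five steps in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the abstract does not say how segmentation, screening, training, or augmentation are carried out, so every function name, the `STOPWORDS` list, the regex tokenizer, the frequency-count stand-in for a trained model, and the cloze-style augmentation are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a proper segmenter and lexicon.
STOPWORDS = {"the", "a", "an", "of", "is", "in", "to", "and", "for"}


def extract_keywords(text):
    """Step 2 (assumed form): segment text into lowercase tokens and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def screen_texts(texts, domain_terms, min_hits=1):
    """Step 3 (assumed form): keep texts whose keywords include enough domain terms."""
    kept = []
    for text in texts:
        hits = set(extract_keywords(text)) & domain_terms
        if len(hits) >= min_hits:
            kept.append(text)
    return kept


def train_model(target_texts):
    """Step 4 stand-in: keyword frequencies play the role of a 'training result'."""
    counts = Counter()
    for text in target_texts:
        counts.update(extract_keywords(text))
    return counts


def augment(target_texts, model, top_k=3):
    """Step 5 (assumed form): mask the model's most frequent keywords to build
    cloze-style (context, question, answer) samples."""
    salient = [term for term, _ in model.most_common(top_k)]
    samples = []
    for text in target_texts:
        for term in salient:
            pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
            if pattern.search(text):
                samples.append({
                    "context": text,
                    "question": pattern.sub("[MASK]", text, count=1),
                    "answer": term,
                })
    return samples


def generate_training_samples(initial_texts, domain_terms):
    """Run screening, training, and augmentation on the initial texts (step 1 input)."""
    target = screen_texts(initial_texts, domain_terms)
    model = train_model(target)
    return target, augment(target, model)
```

Calling `generate_training_samples(texts, {"loan", "credit"})` on a small list of sentences returns the sentences that mention a domain term, plus masked question/answer pairs derived from them. In this sketch the augmentation depends on the "training result" only through the keyword counts; a real text processing model would supply a richer signal.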

Description

Technical field

[0001] The present invention relates to machine learning technology, and in particular to a training sample generation method and device, electronic equipment, and a storage medium.

Background

[0002] In the prior art, with the development of machine learning technology, machine reading can be realized through the BERT (Bidirectional Encoder Representations from Transformers) mechanism. When text data is processed with a BERT-based model, the text data can be split into individual characters, and each character is then input in turn into the model to obtain the corresponding output. However, this process has two defects: a model trained for open domains performs poorly in specific domains (low accuracy, poor robustness), and training data resources for machine reading comprehension tasks in specific domains are lacking. ...
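The character-by-character input described in the background can be illustrated with a small sketch. It assumes a hypothetical character-to-id vocabulary (`vocab`) and an unknown-character id (`unk_id`); it does not reproduce any real BERT tokenizer, which would also add special tokens and use its own vocabulary.

```python
def to_char_ids(text, vocab, unk_id=0):
    """Split text into single characters and map each to an id.

    `vocab` is a hypothetical dict from character to integer id;
    characters missing from it are mapped to `unk_id`.
    """
    chars = list(text)
    ids = [vocab.get(ch, unk_id) for ch in chars]
    return chars, ids


# Example with a toy vocabulary (assumed for illustration only).
chars, ids = to_char_ids("机器阅读", {"机": 1, "器": 2, "阅": 3})
```

The resulting id sequence is what would be fed, in order, to the model's embedding layer.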

Application Information

IPC(8): G06F16/335, G06N3/04, G06N3/08
CPC: G06F16/335, G06N3/04, G06N3/08, Y02D10/00
Inventor: 闫昭, 张士卫, 张倩汶, 饶孟良, 曹云波
Owner: TENCENT TECH (SHENZHEN) CO LTD