Method and device for generating pre-labeled sample, server and medium

A technique for generating pre-labeled samples with balanced positive and negative samples, applied in the computer field. It addresses the high cost of manual labeling and the difficulty of keeping positive and negative samples balanced, improving the efficiency and quality of sample generation and the resulting training effect.

Pending Publication Date: 2022-06-07
JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

Relying entirely on manual labeling of a large number of samples leads to high costs and makes it difficult to keep the numbers of positive and negative samples balanced.

Examples

Embodiment Construction

[0028] The present disclosure is described in further detail below in conjunction with the accompanying drawings and embodiments. The specific embodiments described herein serve only to explain the invention in question and do not limit it. Note also that, for ease of description, the accompanying drawings show only the parts relevant to the invention.

[0029] Provided there is no conflict, the embodiments in the present disclosure and the features in those embodiments may be combined with each other. The present disclosure is described below with reference to the accompanying drawings and in conjunction with an embodiment.

[0030] Figure 1 illustrates an exemplary architecture 100 to which the method for generating pre-labeled samples and for pre-training models, or the apparatus for generating pre-labeled samples and for pre-training models, of the present disclosure may be applied.

[0031] as Figur...

Abstract

An embodiment of the invention discloses a method and device for generating pre-labeled samples, a server, and a medium. In one specific embodiment, the method comprises: obtaining a preset annotation database that records the correspondence between question texts and scenes; obtaining multi-round scene positioning data to be pre-labeled, which comprises at least one question and its corresponding scene; matching the at least one question in the multi-round scene positioning data against the question texts in the preset annotation database, and taking the scene corresponding to each matched question text as the matched scene; and, according to the matched scene and the scene corresponding to the at least one question, generating pre-labeled samples with balanced positive and negative samples from the multi-round scene positioning data. The embodiment enables large-scale automatic generation of pre-labeled samples with balanced positive and negative samples.
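The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how questions are matched, how labels are assigned, or how balance is achieved, so the sketch makes three assumptions: matching is an exact match after lowercasing and collapsing whitespace; a sample is positive when the matched scene equals the scene recorded in the positioning data and negative otherwise; and balance is reached by downsampling the larger class. All names (Sample, normalize, generate_prelabeled_samples) and the example data are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    question: str
    scene: str
    label: int  # 1 = positive, 0 = negative


def normalize(text: str) -> str:
    """Assumed matching rule: lowercase and collapse whitespace, then compare exactly."""
    return " ".join(text.lower().split())


def generate_prelabeled_samples(annotation_db, positioning_data, seed=0):
    """Sketch of the abstract's steps under the assumptions stated above.

    annotation_db: dict mapping question text -> scene (the preset annotation database).
    positioning_data: list of (question, scene) pairs from multi-round dialogues.
    """
    # Index the database by normalized question text for matching.
    index = {normalize(q): scene for q, scene in annotation_db.items()}

    positives, negatives = [], []
    for question, located_scene in positioning_data:
        matched_scene = index.get(normalize(question))
        if matched_scene is None:
            continue  # no matching question text: leave this question unlabeled
        # Assumed labeling rule: agreement between the two scenes is a positive.
        if matched_scene == located_scene:
            positives.append(Sample(question, located_scene, 1))
        else:
            negatives.append(Sample(question, located_scene, 0))

    # Assumed balancing rule: downsample both classes to the smaller class size.
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced


# Hypothetical example data.
annotation_db = {
    "Where is my order?": "logistics",
    "How do I get a refund?": "refund",
    "Can I change my address?": "order_update",
}
positioning_data = [
    ("where is my order?", "logistics"),
    ("How do I get a  refund?", "logistics"),
    ("Can I change my address?", "order_update"),
    ("What is the weather?", "chitchat"),
]
samples = generate_prelabeled_samples(annotation_db, positioning_data)
```

Downsampling is only one way to reach balance. It discards data from the larger class, so a real system might instead construct additional negatives, for example by pairing a question with a scene it is known not to belong to.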

Description

Technical field

[0001] Embodiments of the present disclosure relate to the field of computer technology, and specifically to methods for generating pre-labeled samples and for pre-training models, and to corresponding apparatus, servers, and media.

Background

[0002] With the development of machine learning technology, pre-trained models have been successfully applied by academia and industry to a wide range of natural language processing tasks, such as text classification, text matching, text generation, and machine translation.

[0003] In the prior art, tuning the parameters of a pre-trained model usually requires a large amount of manually labeled training data to achieve good results. However, relying entirely on manual labeling of a large number of samples easily leads to excessive costs and makes it difficult to keep the numbers of positive and negative samples balanced.

Contents of the Invention

[0004] Embodiments of the present disclosure propose methods for ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F16/33, G06F16/31
CPC: G06F40/30, G06F16/334, G06F16/31
Inventor: 宋双永, 吴良庆, 何晓冬
Owner: JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD