Training sample construction method, device, electronic equipment and storage medium

A training sample construction technology, applied in text database querying, electronic digital data processing, unstructured text data retrieval, etc. It addresses the high labor cost, low efficiency, and difficulty of constructing sparse samples, increasing the number of sparse samples obtained and improving construction efficiency.

Active Publication Date: 2021-12-21
ZHIZHESIHAIBEIJINGTECH CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a training sample construction method, device, electronic equipment and storage medium to address the shortcomings of the prior art: difficulty in constructing sparse samples, low efficiency, and high labor cost.

Examples

Embodiment Construction

[0041] The term "and/or" in the embodiments of this application describes an association between associated objects and covers three cases. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

[0042] The term "plurality" in the embodiments of the present application refers to two or more, and other quantifiers are similar.

[0043] The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without...

Abstract

The present invention provides a training sample construction method, device, electronic equipment and storage medium. The method includes: classifying unlabeled texts with a trained initial classification model to obtain a classification result for each unlabeled text; screening difficult samples and/or first candidate sparse samples from the unlabeled texts based on those classification results; and labeling the difficult samples and/or first candidate sparse samples to obtain training samples. Because screening is driven by the classification results of the initial model and labeling is performed only on the screened texts, the method can greatly improve the construction efficiency of training samples and effectively increase the number of sparse samples obtained.
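The classify, screen, and label steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented procedure: the text available here does not say how difficult samples are identified, so the sketch assumes a difficult sample is one whose top predicted probability falls inside an uncertainty band, and a candidate sparse sample is one whose predicted category belongs to a given set of sparse categories. The names `screen_samples`, `build_training_samples`, `toy_classifier`, `annotate`, and the thresholds `low` and `high` are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical interface: a classifier maps a text to {category: probability}.
Classifier = Callable[[str], Dict[str, float]]


def screen_samples(
    texts: Sequence[str],
    classify: Classifier,
    sparse_categories: Set[str],
    low: float = 0.4,
    high: float = 0.6,
) -> Tuple[List[str], List[str]]:
    """Split unlabeled texts into difficult and candidate sparse samples.

    Assumption: "difficult" means the top predicted probability lies in
    [low, high]; "candidate sparse" means the predicted category is one
    of sparse_categories. The source does not define either criterion.
    """
    difficult: List[str] = []
    candidate_sparse: List[str] = []
    for text in texts:
        probs = classify(text)
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if low <= confidence <= high:
            difficult.append(text)
        if label in sparse_categories:
            candidate_sparse.append(text)
    return difficult, candidate_sparse


def build_training_samples(
    texts: Sequence[str],
    classify: Classifier,
    sparse_categories: Set[str],
    annotate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Label only the screened texts; annotate stands in for manual labeling."""
    difficult, candidate_sparse = screen_samples(texts, classify, sparse_categories)
    # Deduplicate while keeping order, since a text can be in both groups.
    selected = list(dict.fromkeys(difficult + candidate_sparse))
    return [(text, annotate(text)) for text in selected]


def toy_classifier(text: str) -> Dict[str, float]:
    """Stand-in for the trained initial classification model (hypothetical)."""
    if "refund" in text:
        return {"complaint": 0.9, "other": 0.1}
    if "money" in text:
        return {"complaint": 0.45, "other": 0.55}
    return {"complaint": 0.1, "other": 0.9}


unlabeled = ["I want a refund now", "Where is my money", "Nice weather today"]
difficult, candidate_sparse = screen_samples(unlabeled, toy_classifier, {"complaint"})
training = build_training_samples(
    unlabeled,
    toy_classifier,
    {"complaint"},
    annotate=lambda t: "complaint" if "refund" in t else "other",
)
```

In this toy run, only two of the three texts reach the annotator, which is the efficiency argument the abstract makes: labeling effort is spent on texts the initial model is unsure about or predicts to be in a sparse category.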

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, in particular to a training sample construction method, device, electronic equipment and storage medium.

Background technique

[0002] In deep learning projects, the quality of the training set often directly affects the training effect of the model, so building a good training set plays an important role in optimizing the model effect.

[0003] However, in practical application scenarios, constructing an effective training set needs to be slowly accumulated in specific tasks, which takes a long time and requires manual labeling of each sample, resulting in high labor costs. Especially when the training samples of some categories are sparsely distributed in the whole data set, the labor cost and time cost of obtaining such sparse samples will be very high, and the number of sparse samples collected is also very limited. Among them, a sparse sample is a sample in which...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F16/335, G06F16/35, G06F40/117
CPC: G06F16/3346, G06F16/335, G06F16/355, G06F40/117
Inventor: 吴杨龙, 刘兆来, 李大海
Owner: ZHIZHESIHAIBEIJINGTECH CO LTD