Sample expansion method, terminal, device and readable storage medium

A sample expansion technology, applied in the field of machine learning, that addresses the high cost and long time needed to label samples and the low efficiency of expanding labeled samples. It aims to reduce cost, improve expansion efficiency, and yield models with high robustness and accuracy.

Active Publication Date: 2020-06-16
WEBANK (CHINA)

AI Technical Summary

Problems solved by technology

[0004] The main purpose of the present invention is to provide a sample expansion method, terminal, device, and readable storage medium, aiming to solve the problem of high



Examples


Detailed Description of the Embodiments

[0017] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0018] Figure 1 is a schematic structural diagram of a terminal in the hardware operating environment involved in the embodiments of the present invention.

[0019] As shown in Figure 1, the terminal may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. Optionally, the network interface 1004 may include a standard wired interface and a wireless interface (...



Abstract

The invention discloses a sample expansion method and device, a terminal, and a readable storage medium. The method comprises the following steps: selecting sample data from a preset labeled sample data set as seed data; selecting word data from the seed data; obtaining the word type of the word data; determining an expansion mode for the labeled sample data set based on the word type; updating the word data in the seed data according to the expansion mode; and adding the updated seed data to the labeled sample data set as expansion sample data. Because the labeled sample data is expanded through different expansion modes, the method reduces the cost of obtaining labeled samples and improves sample expansion efficiency. The generated expansion sample data obeys the same data distribution as the labeled sample data, which is intended to ensure that a sequence labeling model trained on the expanded samples has high robustness and accuracy.
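The steps in the abstract can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the source does not say which word types or expansion modes are used, so the two word types ("entity" and "ordinary"), the two expansion modes (same-label entity substitution and synonym substitution), and the lookup tables `ENTITIES` and `SYNONYMS` are all assumptions made for this example.

```python
import random

# Hypothetical lookup tables; the source does not define them.
SYNONYMS = {"want": ["wish"], "buy": ["purchase"]}
ENTITIES = {"PRODUCT": ["fund", "loan"]}


def word_type(label):
    """Assumed rule: label 'O' marks an ordinary word, anything else an entity."""
    return "ordinary" if label == "O" else "entity"


def expand_once(labeled_set, rng):
    """Build one expansion sample, or return None if no replacement exists."""
    seed = rng.choice(labeled_set)        # seed data from the labeled set
    idx = rng.randrange(len(seed))        # word data selected from the seed
    token, label = seed[idx]
    # The expansion mode depends on the word type (assumed modes).
    if word_type(label) == "entity":
        candidates = [w for w in ENTITIES.get(label, []) if w != token]
    else:
        candidates = [w for w in SYNONYMS.get(token, []) if w != token]
    if not candidates:
        return None
    updated = list(seed)
    updated[idx] = (rng.choice(candidates), label)  # the label is kept unchanged
    return updated


def expand(labeled_set, n_attempts, seed=0):
    """Return the labeled set plus any new, distinct expansion samples."""
    rng = random.Random(seed)
    expanded = list(labeled_set)
    for _ in range(n_attempts):
        sample = expand_once(labeled_set, rng)
        if sample is not None and sample not in expanded:
            expanded.append(sample)
    return expanded


labeled = [[("I", "O"), ("want", "O"), ("a", "O"), ("loan", "PRODUCT")]]
result = expand(labeled, n_attempts=50)
for sample in result:
    print(sample)
```

Because only the token is replaced and its label is carried over, every generated sample keeps the label sequence of its seed, which is one simple way to keep expanded samples close to the original data distribution.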

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, in particular to a sample expansion method, a terminal, a device and a readable storage medium.

Background

[0002] In the field of machine learning, data annotation is the starting point for machines to perceive the real world. To some extent, unlabeled data is useless data. In particular, training a sequence labeling model requires a large amount of labeled data. Labeled sample data can be purchased from a third party, but the cost is very high. Manual labeling is not only relatively complex but also takes a long time when there are many word labels, resulting in low efficiency in generating training sample data for the sequence labeling model.

[0003] The above content is only used to assist in understanding the technical solution of the present invention, and does not mean that the above con...


Application Information

IPC(8): G06F40/284, G06F40/247, G06F16/36, G06F16/335
CPC: G06F16/374, G06F16/335
Inventor: 周楠楠, 杨海军, 徐倩
Owner: WEBANK (CHINA)