Model training method and device, short message auditing method and device, equipment and storage medium

A model training and short message (SMS) auditing technology, applied in computing models, character and pattern recognition, instruments, etc. It addresses the problem that newly labeled samples may carry little new information and therefore do not significantly improve model performance, and it achieves the effect of reducing labeling costs.

Pending Publication Date: 2020-12-15
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

In related technologies, newly added labeled samples may not provide new information, so they do not significantly improve the performance of the model.
Take the SMS review business as an example: a large number of SMS logs are generated every day. If the samples to be labeled are selected from these logs at random, the labeling cost may be spent without significantly improving the performance of the model.



Examples


Detailed Description of Embodiments

[0044] Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0045] FIG. 1 is a flowchart of a model training method according to an embodiment of the present application. Referring to FIG. 1, the model training method includes:

[0046] Step S110, performing sample reduction on the first unlabeled sample to obtain a second unlabeled sample;

[0047] Step S120, inputting the second unlabeled sample into the machine learning model ...
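
The excerpt above does not say how the sample reduction of step S110 is carried out. As a minimal sketch, assuming reduction means dropping near-duplicate messages, the following Python compares character bigrams with Jaccard similarity and keeps only messages that differ enough from those already kept. The function names, the bigram representation, and the 0.8 threshold are illustrative assumptions, not details taken from the application.

```python
from __future__ import annotations


def char_ngrams(text: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a lowercased, stripped message."""
    text = text.strip().lower()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def reduce_samples(samples: list[str], threshold: float = 0.8) -> list[str]:
    """Greedy near-duplicate removal (assumed form of step S110).

    A message is kept only if its similarity to every message kept so far
    is below the threshold. The threshold value is an assumption.
    """
    kept: list[str] = []
    kept_grams: list[set[str]] = []
    for sample in samples:
        grams = char_ngrams(sample)
        if all(jaccard(grams, seen) < threshold for seen in kept_grams):
            kept.append(sample)
            kept_grams.append(grams)
    return kept


# Hypothetical SMS log lines standing in for the first unlabeled sample.
first_unlabeled = [
    "Your code is 1234",
    "Your code is 1235",
    "Win a free prize now",
    "win a free prize now",
]
second_unlabeled = reduce_samples(first_unlabeled, threshold=0.8)
```

The greedy pass compares each message with every kept message, which is quadratic in the worst case; for the daily log volumes the text mentions, a hashing or clustering scheme would likely be needed instead.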


Abstract

The invention discloses a model training method and device, a short message auditing method and device, equipment, and a storage medium, and relates to the field of artificial intelligence. In the specific implementation scheme, model training proceeds as follows: performing sample reduction on a first unlabeled sample to obtain a second unlabeled sample; inputting the second unlabeled sample into a machine learning model for prediction to obtain a probability corresponding to the prediction result of the second unlabeled sample; selecting a third unlabeled sample from the second unlabeled sample according to the probability; and training the machine learning model with the third unlabeled sample after it has been labeled. According to the embodiments of the application, sample reduction removes redundant samples, so that the selected samples are representative. Active learning is then applied: the machine learning model itself is used to further select the samples that carry the most information and are most worth labeling for the current model, which reduces the labeling cost.
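
The abstract states that the third unlabeled sample is selected "according to the probability" but does not give the selection rule. One common active learning rule is uncertainty sampling: send to annotators the messages whose predicted probability is closest to the decision boundary. The sketch below assumes a binary classifier and a boundary of 0.5; the function name, the example messages, and the probabilities are hypothetical.

```python
from __future__ import annotations


def select_most_uncertain(
    samples: list[str], probs: list[float], k: int
) -> list[str]:
    """Return the k samples whose predicted probability is closest to 0.5.

    This is uncertainty sampling for a binary classifier, used here as an
    assumed stand-in for the selection rule, which the abstract leaves open.
    """
    if len(samples) != len(probs):
        raise ValueError("samples and probs must have the same length")
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(zip(samples, probs), key=lambda pair: abs(pair[1] - 0.5))
    return [text for text, _ in ranked[:k]]


# Hypothetical model outputs for four messages that survived sample reduction.
second_unlabeled = ["msg A", "msg B", "msg C", "msg D"]
probs = [0.97, 0.52, 0.10, 0.61]
third_unlabeled = select_most_uncertain(second_unlabeled, probs, k=2)
```

Messages the model already scores near 0 or 1 are skipped, so the labeling budget goes to the cases where a label is most likely to change the model.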

Description

Technical Field

[0001] The present application relates to the field of computer technology, in particular to the field of artificial intelligence.

Background

[0002] Model training requires a large amount of manually labeled data. As the business continues to develop, the latest labeled data must be continuously supplemented so that the model can be iteratively optimized along with the business. However, in related technologies, the newly added labeled samples may not provide new information, so they do not contribute significantly to improving the performance of the model. Taking the SMS review business as an example, a large number of SMS logs are generated every day. If samples to be labeled are randomly selected from them, a certain labeling cost is spent, yet the result may not significantly improve the performance of the model.

Summary of the Invention

[0003] The present application provides a model training method, a s...


Application Information

IPC(8) G06N20/00, G06K9/62
CPC G06N20/00, G06F18/24, G06N3/08, G06N3/045, G06F18/22, G06N7/01
Inventor 何烩烩, 王乐义, 刘明浩, 郭江亮
Owner BEIJING BAIDU NETCOM SCI & TECH CO LTD