
Processing method for automatically labeling, training and predicting mass data

A technology for automatic labeling and massive data

Active Publication Date: 2020-04-14
CHANGCHUN JIACHENG NETWORK ENG
11 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

When machine learning is performed on large-scale data, a large amount of labor must be invested in data labeling before model training can begin. Large volumes of Internet data demand substantial manual work in the early stage, which takes a long time and lengthens the model update cycle. The technology addresses this heavy workload and the slow delivery of results.

Method used



Examples


Embodiment Construction

[0020] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0021] As shown in Figure 1, a processing method for automatically labeling, training, and predicting massive data includes the following steps:

[0022] Step 1. Collect data:

[0023] I. Use the Python framework scrapyd to write crawlers: set the keywords for collection, specify the combination relationships between keywords, capture data that matches the keywords from news sites, post bars, forums and other websites, and collect fields such as news titles, body texts, and replies. Store the collected data in structured form and save it to the data management platform;
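The keyword matching and structured storage described in this step can be sketched in plain Python. This is only an illustration under stated assumptions: the source does not say how keyword combinations are expressed, so the rule format here (AND across groups, OR within a group), the `Record` fields, and the sample pages are all hypothetical, and the crawler itself is replaced by an in-memory list.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Record:
    """Structured form of one collected item (field names are assumed)."""
    source: str
    title: str
    text: str
    replies: list = field(default_factory=list)


def matches(text, rule):
    """Check a keyword rule given as a list of keyword groups.

    Assumed semantics: every group must contribute at least one
    keyword (AND across groups, OR within a group).
    """
    lowered = text.lower()
    return all(any(kw.lower() in lowered for kw in group) for group in rule)


def collect(pages, rule):
    """Keep only pages whose title or text satisfies the keyword rule."""
    kept = []
    for page in pages:
        if matches(page["title"] + " " + page["text"], rule):
            kept.append(Record(page["source"], page["title"],
                               page["text"], page.get("replies", [])))
    return kept


# Hypothetical rule: (flood OR storm) AND (city OR town)
RULE = [["flood", "storm"], ["city", "town"]]

# Stand-in for crawled pages; a real crawler would produce these.
PAGES = [
    {"source": "news", "title": "Storm hits city",
     "text": "Heavy rain overnight.", "replies": ["Stay safe"]},
    {"source": "forum", "title": "Weekend recipes",
     "text": "A storm of flavours.", "replies": []},
]

records = collect(PAGES, RULE)
rows = [asdict(r) for r in records]  # plain dicts, ready for storage
```

Converting each record to a plain dictionary keeps the storage step independent of whichever data management platform is used.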

[0024] Python is a cross-platform, object-oriented, dynamically typed programming language. With continuous version updates and the addition of new language features, it is increasingly used for the development of independent, large-scale pro...



Abstract

The invention discloses a processing method for automatically labeling, training and predicting mass data. The method comprises four steps: (1) collecting data; (2) training a model; (3) updating the prediction model; and (4) iterating the update. Repeating the method continuously for machine learning reduces the cost of manually labeling data and improves the accuracy of data recognition. Because the manually labeled data and the model training set grow in alternation, the workload is reduced, the model update cycle is short, less time is consumed, and results arrive quickly.
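A minimal sketch of how such an alternating label-and-train loop might look is given here. It is not the patented implementation: the word-count classifier, the confidence threshold, the batch size, and the `review` callback that stands in for a human annotator are all assumptions introduced for illustration.

```python
from collections import Counter


def train(labeled):
    """Build per-class word counts from (text, label) pairs."""
    model = {}
    for text, label in labeled:
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def predict(model, text):
    """Return (label, confidence) from word overlap with each class."""
    words = text.lower().split()
    scores = {label: sum(counts[w] for w in words)
              for label, counts in model.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def run_rounds(labeled, pool, review, threshold=0.8, batch=2, rounds=10):
    """Alternate training and labeling until the pool is empty."""
    labeled, pool = list(labeled), list(pool)
    manual = 0
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)            # retrain on everything labeled so far
        current, pool = pool[:batch], pool[batch:]
        for text in current:              # predict, correct, feed back
            label, conf = predict(model, text)
            if conf < threshold:
                label = review(text)      # a person corrects uncertain cases
                manual += 1
            labeled.append((text, label))
    return train(labeled), labeled, manual


# Hypothetical seed labels, unlabeled pool, and annotator answers.
SEED = [("great phone love it", "pos"), ("terrible battery hate it", "neg")]
POOL = ["love this great screen", "hate the terrible screen",
        "screen is great", "battery is terrible"]
TRUTH = {"love this great screen": "pos", "hate the terrible screen": "neg",
         "screen is great": "pos", "battery is terrible": "neg"}

model, labeled, manual = run_rounds(SEED, POOL, review=TRUTH.get)
```

Routing only low-confidence predictions to the reviewer is one way to keep manual effort small while the labeled set keeps growing; other selection rules would fit the same loop.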

Description

Technical field

[0001] The invention relates to a processing method, in particular to a processing method for automatically labeling, training and predicting massive data.

Background

[0002] Solving large-scale machine learning problems requires investing in data labeling at an early stage. A small amount of data is labeled first; machine learning is then used for assisted supervised learning in the subsequent process; the machine-generated labels are corrected; and the corrections are fed back into the next round of learning. Repeating this process continuously improves the accuracy of machine learning. Therefore, when machine learning is performed on large-scale data, a large amount of labor must be invested in data labeling before model training. Large volumes of Internet data demand substantial manual work in the early stage of machine learning, which tak...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 李波, 张少卓, 李旭, 孙洪鑫, 安天博
Owner: CHANGCHUN JIACHENG NETWORK ENG