Data preprocessing method for machine learning algorithm and related equipment

A data preprocessing and machine learning technology applied in the field of data processing. It addresses problems such as data loss, redundancy in the CNN's fully connected structure, and increased training time, with the aim of improving accuracy and efficiency, saving manpower and material resources, and improving data usability.

Pending Publication Date: 2021-07-23
航天网安技术(深圳)有限公司

AI Technical Summary

Problems solved by technology

However, when machine learning algorithms are developed on structured data, for example for data mining and user profiling, some data is lost during collection. If the algorithm is developed directly on unscreened data, invalid data increases training time or degrades performance, and training may not be possible at all when missing values are present. Missing values are currently often filled in through brainstorming or by relevant experts, which wastes considerable manpower and material resources.
In the existing technology, principal component analysis (PCA) dimensionality reduction and convolutional neural networks (CNN) are used to reduce the influence of invalid features on the results. The new features generated by the former are difficult to relate to actual application scenarios, and the threshold of cumulative explained variance must be set and adjusted manually. The latter has insufficient biological support and no memory function, and the CNN's fully connected structure is redundant and inefficient. In addition, both methods share a problem: when the data set contains missing values, the dimensionality reduction method fails.

Examples

Detailed Description of Embodiments

[0051] In order to make the purpose, technical solutions and advantages of the present disclosure clearer, the present disclosure will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0052] It should be noted that, unless otherwise defined, the technical or scientific terms used in the embodiments of the present disclosure have the ordinary meanings understood by those skilled in the art to which the present disclosure belongs. "First", "second" and similar words used in the embodiments of the present disclosure do not indicate any sequence, quantity or importance, but are only used to distinguish different components. "Comprising", "including" and similar words mean that the elements or items appearing before the word cover the elements or items listed after the word and their equivalents, without excluding other elements or items.

[0053] As mentioned in the background technology s...

Abstract

The invention provides a data preprocessing method for a machine learning algorithm and related equipment. The method comprises the following steps: acquiring original data to be processed; screening the original data based on the missing values of each feature to obtain first screening data; screening the first screening data based on the identical values of each feature to obtain second screening data; filling the missing values of each feature in the second screening data to obtain complete data; and standardizing the complete data according to a preset standardization rule to obtain standardized complete data. According to the embodiment of the invention, structured data can be preprocessed, data availability and data quality are improved by handling abnormal values in the data, and a large amount of manpower and material resources is saved during the development of a machine learning algorithm.
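The five steps in the abstract can be illustrated with a minimal standard-library sketch. This is not the patented implementation: the excerpt does not state the screening thresholds, the filling rule, or the standardization rule, so the values `max_missing_ratio=0.5` and `max_same_ratio=0.95`, mean imputation, and z-score standardization are all assumptions chosen for illustration, as are the function names and the use of `None` to mark a missing value.

```python
from statistics import mean, pstdev

MISSING = None  # assumption: a missing value is represented as None


def drop_sparse_features(rows, max_missing_ratio=0.5):
    """Screen by missing values: keep features whose missing share is within the threshold."""
    n = len(rows)
    keep = [
        name for name in rows[0]
        if sum(r[name] is MISSING for r in rows) / n <= max_missing_ratio
    ]
    return [{name: r[name] for name in keep} for r in rows]


def drop_constant_features(rows, max_same_ratio=0.95):
    """Screen by identical values: drop features dominated by one repeated value."""
    keep = []
    for name in rows[0]:
        observed = [r[name] for r in rows if r[name] is not MISSING]
        if not observed:
            continue  # nothing observed, so the feature carries no information
        top_count = max(observed.count(v) for v in set(observed))
        if top_count / len(observed) <= max_same_ratio:
            keep.append(name)
    return [{name: r[name] for name in keep} for r in rows]


def fill_missing(rows):
    """Fill missing values with the feature mean (assumed rule, numeric features only)."""
    means = {
        name: mean(r[name] for r in rows if r[name] is not MISSING)
        for name in rows[0]
    }
    return [
        {name: means[name] if value is MISSING else value for name, value in r.items()}
        for r in rows
    ]


def standardize(rows):
    """Standardize each feature to zero mean and unit variance (assumed z-score rule)."""
    out = [dict(r) for r in rows]
    for name in rows[0]:
        column = [r[name] for r in rows]
        mu, sigma = mean(column), pstdev(column)
        for r in out:
            r[name] = (r[name] - mu) / sigma if sigma else 0.0
    return out


def preprocess(rows):
    """Run the four processing steps in the order given in the abstract."""
    first = drop_sparse_features(rows)
    second = drop_constant_features(first)
    complete = fill_missing(second)
    return standardize(complete)


# Hypothetical sample: "sparse" is mostly missing and "flag" never varies.
rows = [
    {"age": 20, "income": 10.0, "sparse": None, "flag": 1},
    {"age": 30, "income": 20.0, "sparse": None, "flag": 1},
    {"age": None, "income": 30.0, "sparse": None, "flag": 1},
    {"age": 40, "income": 40.0, "sparse": 1, "flag": 1},
]
result = preprocess(rows)
```

In this sample, `sparse` is removed by the missing-value screen and `flag` by the identical-value screen, so only `age` and `income` reach the filling and standardization steps. Screening before filling means no effort is spent imputing features that are discarded anyway.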

Description

Technical field

[0001] The present disclosure relates to the technical field of data processing, in particular to a data preprocessing method for machine learning algorithms and related equipment.

Background

[0002] With the development of machine learning technology, demand for it is becoming increasingly urgent in more and more industries. However, when machine learning algorithms are developed on structured data, for example for data mining and user profiling, some data is lost during collection. If the algorithm is developed directly on unscreened data, invalid data increases training time or degrades performance, and training may not be possible at all when missing values are present. Missing values are often filled in through brainstorming or by consulting relevant experts, which wastes considerable manpower and material resources. In the...

Application Information

IPC(8): G06F16/9035, G06K9/62, G06N20/00
CPC: G06F16/9035, G06N20/00, G06F18/2411, G06F18/214
Inventor: 郑凤
Owner: 航天网安技术(深圳)有限公司