
Method and device for data processing

A data processing technology, applied in the field of information processing, that addresses the problem that existing data processing methods cannot achieve high efficiency and high precision in missing value processing at the same time. Its effects are to reduce the time required for processing missing values, improve the correctness of the filling values, and improve processing speed.

Active Publication Date: 2020-04-21
GUANGZHOU SHIYUAN ELECTRONICS CO LTD

AI Technical Summary

Problems solved by technology

[0005] In view of this, the embodiments of the present invention provide a data processing method and device to solve the technical problem that data processing methods in the prior art cannot meet the high efficiency and high precision requirements of missing value processing at the same time.



Examples


Embodiment 1

[0024] Embodiment 1 of the present invention provides a data processing method. The method can be executed by a data processing device, which can be implemented in hardware and/or software and can generally be integrated into a data processing platform. Figure 1 is a schematic flowchart of the data processing method provided in Embodiment 1 of the present invention. As shown in Figure 1, the method includes:

[0025] S101. Obtain a data sample.

[0026] In this embodiment, the data sample may be an entity class data sample. The data samples include a first data sample and a second data sample, where the first data sample is a data sample that includes missing values and the second data sample is a data sample that does not include missing values.

[0027] In a specific application, the data sample can be pre-stored in the database corresponding to the data processing platform. When obtaining the data sample, the data sample can be directly called from the stora...

Embodiment 2

[0040] Figure 2 is a schematic flowchart of a data processing method provided in Embodiment 2 of the present invention. This embodiment is optimized on the basis of the above embodiments. Further, before calculating the similarity between the attribute values of the data samples that include missing values and the attribute values of the data samples that do not include missing values, the method further includes: obtaining the initial contribution of each attribute of the data sample according to the attribute corresponding to the missing value, where each such attribute is a related attribute of the attribute corresponding to the missing value.

[0041] Further, the attribute values of the related attributes and of the attribute corresponding to the missing value are all continuous values. Calculating the similarity is then specifically: calculating the similarity between the related attribute values of the data samples that include missing values and the related attribute value...
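The similarity measure itself is not visible in the truncated text above. As a minimal sketch only, assume the similarity is an inverse Euclidean distance over the related attributes, with each squared difference weighted by that attribute's contribution. The function name `weighted_similarity`, the attribute names, and the contribution values are all hypothetical and are not taken from the patent.

```python
import math
from typing import Dict, Optional, Sequence

Sample = Dict[str, Optional[float]]  # None marks a missing value (assumption)


def weighted_similarity(
    incomplete: Sample,
    complete: Sample,
    related: Sequence[str],
    contribution: Dict[str, float],
) -> float:
    """Similarity in (0, 1] computed over the related attributes only.

    Assumption: inverse of a contribution-weighted Euclidean distance.
    The patent text shown here does not specify the actual measure.
    """
    dist_sq = sum(
        contribution[name] * (incomplete[name] - complete[name]) ** 2
        for name in related
    )
    return 1.0 / (1.0 + math.sqrt(dist_sq))


# Hypothetical data: "weight" is missing; "height" and "age" are treated
# as its related attributes, with made-up contribution values.
related = ["height", "age"]
contribution = {"height": 0.7, "age": 0.3}

incomplete = {"height": 182.0, "age": 30.0, "weight": None}
near = {"height": 180.0, "age": 31.0, "weight": 80.0}
far = {"height": 160.0, "age": 50.0, "weight": 55.0}

sim_near = weighted_similarity(incomplete, near, related, contribution)
sim_far = weighted_similarity(incomplete, far, related, contribution)
```

Under these assumptions the complete sample whose related attribute values lie closer to those of the incomplete sample receives the higher similarity, which is the property the filling step relies on.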

Embodiment 3

[0072] Figure 3 is a schematic flowchart of a data processing method provided by Embodiment 3 of the present invention. This embodiment is optimized on the basis of the above embodiments. Further, determining the number of filling samples required to fill the missing value according to the sample number determination rule includes: determining a first number of samples required to fill the missing value according to the non-missing rate of the attribute corresponding to the missing value and the number of data samples that do not include missing values; determining a second number of samples required to fill the missing value according to the contribution rate of the related attributes of the attribute corresponding to the missing value and the number of data samples that do not include missing values; and determining the number of filling samples required to fill the missing value according to the first number of samples and the second number of samples.
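The concrete sample number determination rule is not visible in the text shown here, so the arithmetic below is only a placeholder that illustrates the three-step data flow described above: a first count, a second count, and a final count derived from both. Multiplying each rate by the number of complete samples and taking the smaller result are assumptions made for illustration, and all function names are hypothetical.

```python
def first_sample_count(non_missing_rate: float, n_complete: int) -> int:
    # Placeholder rule (assumption): scale the number of complete samples
    # by the non-missing rate of the attribute that has the missing value.
    return max(1, round(non_missing_rate * n_complete))


def second_sample_count(contribution_rate: float, n_complete: int) -> int:
    # Placeholder rule (assumption): scale the number of complete samples
    # by the contribution rate of the related attributes.
    return max(1, round(contribution_rate * n_complete))


def filling_sample_count(first: int, second: int) -> int:
    # Placeholder combination (assumption): use the smaller of the two counts.
    return min(first, second)


n_complete = 20  # number of data samples without missing values (made up)
first = first_sample_count(0.8, n_complete)
second = second_sample_count(0.25, n_complete)
k = filling_sample_count(first, second)
```

With these made-up inputs the first count is 16, the second is 5, and the resulting number of filling samples is 5. A real implementation would substitute the rule given in the full specification.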

[0073] Co...


Abstract

An embodiment of the invention discloses a data processing method and device. The method comprises the following steps: data samples are acquired; the similarity between the attribute values of the data samples containing missing values and the attribute values of the data samples containing no missing values is calculated; filling samples are determined from the data samples containing no missing values according to the similarity; filling values are determined according to the attributes in the filling samples that correspond to the missing values; and the data samples containing the missing values are updated according to the filling values. With this technical scheme, missing values are filled according to the attribute values of complete data samples that are highly similar to the data samples containing the missing values. The attribute characteristics and distribution features of the missing values are both taken into account, and the data samples containing missing values need not be deleted. As a result, the correctness of the filling values and the effectiveness of the data information are improved, the processing speed of missing values is increased, and the time required for processing missing values is shortened.
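The steps in the abstract can be sketched end to end as follows. This is a minimal illustration and not the patented implementation: it assumes all samples share the same numeric attributes, that `None` marks a missing value, that similarity is an unweighted inverse Euclidean distance over the known attributes, and that the filling value is the mean over the `k` most similar complete samples. The name `fill_missing` and the parameter `k` are hypothetical.

```python
import math
from typing import Dict, List, Optional

Sample = Dict[str, Optional[float]]  # None marks a missing value (assumption)


def fill_missing(samples: List[Sample], k: int = 2) -> List[Sample]:
    """Return copies of the samples with missing values filled in."""
    # Samples without missing values are the candidates for filling samples.
    complete = [s for s in samples if all(v is not None for v in s.values())]
    filled: List[Sample] = []
    for s in samples:
        missing = [name for name, v in s.items() if v is None]
        if not missing or not complete:
            filled.append(dict(s))  # nothing to fill, or nothing to fill from
            continue
        known = [name for name, v in s.items() if v is not None]

        def sim(c: Sample) -> float:
            # Assumed measure: inverse Euclidean distance on known attributes.
            d = math.sqrt(sum((s[n] - c[n]) ** 2 for n in known))
            return 1.0 / (1.0 + d)

        # Filling samples: the k complete samples most similar to s.
        fillers = sorted(complete, key=sim, reverse=True)[:k]
        updated = dict(s)
        for name in missing:
            # Assumed filling value: mean of the attribute over the fillers.
            updated[name] = sum(c[name] for c in fillers) / len(fillers)
        filled.append(updated)
    return filled


samples = [
    {"height": 170.0, "weight": 65.0},
    {"height": 172.0, "weight": 67.0},
    {"height": 150.0, "weight": 45.0},
    {"height": 171.0, "weight": None},
]
result = fill_missing(samples, k=2)
```

In this made-up data set the two complete samples closest in height to the incomplete one are used as filling samples, so the missing weight is filled without deleting the incomplete sample, which matches the intent described in the abstract.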

Description

Technical field

[0001] The present invention relates to the technical field of information processing, in particular to a data processing method and device.

Background

[0002] In recent years, with the development of information processing technology, big data has been increasingly applied in fields such as navigation systems and urban planning.

[0003] Current big data architectures are usually data-flow-oriented: data is first obtained from the data source and stored, then preprocessed; data modeling, data analysis and data mining are then performed on the preprocessed data; and finally data monetization is realized. Data preprocessing is therefore the basis of the entire data processing flow in a big data architecture, and its quality and accuracy may directly affect the definition of indicators for data dimension modeling in subsequent links, the selection of data mining algor...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/245, G06F16/21
CPC: G06F16/217, G06F16/245
Inventor: 徐骄
Owner: GUANGZHOU SHIYUAN ELECTRONICS CO LTD