Data filling method and device based on clustering algorithm and computer equipment

A data filling technique based on a clustering algorithm, applied in the field of big data, which addresses missing data and the analysis errors it causes

Pending Publication Date: 2020-01-07
CHINA PING AN PROPERTY INSURANCE CO LTD

AI Technical Summary

Problems solved by technology

Data may go missing during data acquisition or data processing.
The traditional approach is to ignore the missing data, but this introduces errors when the incomplete data is later used for data mining and analysis.



Examples

Embodiment 1

[0039] Referring to Figure 1, the data filling method based on a clustering algorithm of this embodiment may include the following steps:

[0040] Step 01, determine the attributes of the missing data.

[0041] During data collection or transmission, human error or mechanical faults may produce null values, resulting in missing data. In this embodiment, the missing data can be located by using a null value positioning method.
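The null value positioning described above can be sketched roughly as follows. This is only an illustration, not the method claimed in the source: the function name `locate_missing`, the record layout, and the choice to treat `None` and empty strings as null values are all assumptions.

```python
from typing import Any, Dict, List, Tuple


def locate_missing(records: List[Dict[str, Any]]) -> List[Tuple[int, str]]:
    """Return (row index, attribute name) for every null value found."""
    missing = []
    for row_index, record in enumerate(records):
        for attribute, value in record.items():
            # Assumption: None and the empty string both count as null values.
            if value is None or value == "":
                missing.append((row_index, attribute))
    return missing


# Hypothetical sample data modelled on the basketball example in the text.
records = [
    {"gender": "boy", "likes_basketball": 4},
    {"gender": "boy", "likes_basketball": None},
    {"gender": "girl", "likes_basketball": 2},
]

print(locate_missing(records))  # [(1, 'likes_basketball')]
```

Each returned pair also names the attribute of the missing value, which is the input that Step 01 needs.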

[0042] In this embodiment of the invention, once the missing data has been located, its attribute can be determined from the data content. For example, if a boy's degree of liking for basketball is missing, then the degree of liking for basketball is taken as the attribute of the missing data. Similarly, if the probability that a user renews a purchased target insurance policy after it expires is missing, then the probability of renewal of the target insurance after expiration is ...

Embodiment 2

[0096] Referring to Figure 5, the data filling method based on a clustering algorithm of this embodiment builds on the first embodiment and includes the following steps:

[0097] Step 501, determining attributes of missing data.

[0098] Step 502, performing binary group integration on the data according to the attributes of the missing data.

[0099] Step 503, clustering the data after the binary group integration to form clusters.

[0100] In this embodiment of the invention, to fill the missing data, the data that shares the attribute of the missing data can be clustered, using the data after the binary group integration and taking the reference data as the benchmark. For example, taking boys as the benchmark, clustering the degree of liking for basketball can form multiple clusters. Every cluster describes boys' liking for basketball, but the degree of liking differs from cluster to cluster. For example, five clusters are formed, respectively: Like it very mu...
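Steps 502 and 503 can be illustrated with a minimal sketch. The visible text does not name a specific clustering algorithm or define the binary group format, so several things here are assumptions: a binary group is taken to be a (reference value, attribute value) pair, the degree of liking is encoded as a numeric score, and a plain one-dimensional k-means with a deterministic start stands in for the clustering step. The names `integrate_two_tuples` and `cluster_values` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def integrate_two_tuples(records: List[Record], reference_attr: str,
                         target_attr: str) -> List[Tuple[Any, float]]:
    """Pair each reference value with its attribute value, skipping nulls."""
    return [(r[reference_attr], r[target_attr])
            for r in records if r.get(target_attr) is not None]


def cluster_values(values: List[float], k: int,
                   iterations: int = 20) -> List[List[float]]:
    """Plain 1-D k-means (an assumed stand-in for the clustering step)."""
    ordered = sorted(values)
    step = max(k - 1, 1)
    # Deterministic start: spread initial centroids across the sorted values.
    centroids = [float(ordered[i * (len(ordered) - 1) // step]) for i in range(k)]
    groups: List[List[float]] = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return groups


# Hypothetical data: liking for basketball on an assumed numeric scale.
records = [{"gender": "boy", "likes_basketball": s} for s in (1, 1, 2, 5, 5, 4, 9, 10)]
records += [{"gender": "girl", "likes_basketball": 3},
            {"gender": "boy", "likes_basketball": None}]

pairs = integrate_two_tuples(records, "gender", "likes_basketball")
boys = [value for reference, value in pairs if reference == "boy"]  # benchmark: boys
clusters = cluster_values(boys, k=3)
print(clusters)  # [[1, 1, 2], [5, 5, 4], [9, 10]]
```

Filtering on the reference value before clustering mirrors the "boys as the benchmark" example: every resulting cluster contains only boys' scores, and the clusters differ only in degree.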


Abstract

The invention provides a clustering algorithm-based data filling method and device, and a computer device. The method comprises the steps of determining attributes of missing data; performing two-tuple integration on the data according to the attributes of the missing data; clustering the data after the two-tuple integration to form a class cluster; determining a class cluster where the missing data is located; determining a reference data set for filling the missing data according to the class cluster where the missing data is located; and filling the missing data according to the reference data set. According to the method, the missing data can be filled, the accuracy of the filled missing data is ensured, and a basis is provided for the accuracy of data mining and analysis.
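The last three steps of the abstract (locating the cluster of the missing data, deriving a reference data set, and filling) might look like the sketch below. The source text available here does not say how the cluster is located or which statistic is used for filling, so this sketch assumes that the incomplete record is assigned to the cluster with the nearest centroid on another observed attribute, and that the fill value is the mean of the reference set. The attribute `weekly_hours` and the function `fill_missing` are hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List

Record = Dict[str, Any]


def fill_missing(record: Record, clusters: List[List[Record]],
                 feature: str, target: str) -> Record:
    """Fill record[target] from the cluster nearest to record[feature]."""

    def distance(cluster: List[Record]) -> float:
        # Assumption: clusters are compared by their centroid on `feature`.
        return abs(mean(r[feature] for r in cluster) - record[feature])

    # Determine the cluster where the missing data is located.
    home = min(clusters, key=distance)
    # Reference data set: known target values inside that cluster.
    reference = [r[target] for r in home if r[target] is not None]
    # Assumption: fill with the mean of the reference data set.
    filled = dict(record)
    filled[target] = mean(reference)
    return filled


# Hypothetical clusters, as a prior clustering step might produce them.
clusters = [
    [{"weekly_hours": 1, "likes_basketball": 1},
     {"weekly_hours": 2, "likes_basketball": 2}],
    [{"weekly_hours": 8, "likes_basketball": 4},
     {"weekly_hours": 10, "likes_basketball": 5}],
]
incomplete = {"weekly_hours": 9, "likes_basketball": None}

print(fill_missing(incomplete, clusters, "weekly_hours", "likes_basketball"))
# {'weekly_hours': 9, 'likes_basketball': 4.5}
```

For categorical attributes, the most frequent value in the reference set would be a natural alternative to the mean; that choice is likewise not taken from the source.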

Description

Technical field

[0001] The present invention relates to the field of big data technology, and in particular to a data filling method, device and computer equipment based on a clustering algorithm.

Background

[0002] With the rise of big data, the demand for data processing has grown in both volume and scope. However, data may go missing during data acquisition or data processing. The traditional approach is to ignore the missing data, but this introduces errors when the incomplete data is later used for data mining and analysis.

Summary of the invention

[0003] The object of the present invention is to provide a data filling method, device and computer equipment based on a clustering algorithm, which solve the problems existing in the prior art.

[0004] To achieve the above object, the present invention provides a data filling method based on a clustering algorithm, characterized in that the method comprises the following steps:

[0005] Identify attr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F16/2458, G06K9/62
CPC: G06F16/215, G06F16/2465, G06F18/23
Inventor: 杨春春
Owner: CHINA PING AN PROPERTY INSURANCE CO LTD