
Multi-core clustering method for rapidly processing missing heterogeneous data

A multi-kernel clustering technique for multi-source heterogeneous data with missing values. It addresses two problems of directly filling missing values: the filled data may no longer follow the original data distribution, and the effective information extracted from the data is reduced.

Pending Publication Date: 2019-08-30
CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

AI Technical Summary

Problems solved by technology

However, this direct, blunt filling method does not guarantee that the data distribution after filling in missing values is consistent with that of the original data. As a result, the effective information extracted from the data is easily reduced, so a better filling method is needed to improve multi-kernel clustering performance.
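The distribution shift described above can be illustrated with a small numerical sketch. The feature, its mean and spread, and the 30% missing rate are hypothetical values chosen only for illustration; none of them come from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complete feature: 10,000 values drawn around 5.0.
x = rng.normal(loc=5.0, scale=1.0, size=10_000)

# Assume roughly 30% of the entries go missing at random.
mask = rng.random(x.shape) < 0.3

# Direct zero filling: every missing entry becomes 0.
x_zero = np.where(mask, 0.0, x)

observed_mean, observed_std = x[~mask].mean(), x[~mask].std()
filled_mean, filled_std = x_zero.mean(), x_zero.std()

print(f"observed entries: mean={observed_mean:.2f}, std={observed_std:.2f}")
print(f"after zero fill:  mean={filled_mean:.2f}, std={filled_std:.2f}")
```

In this toy setting the zero-filled feature has a lower mean and a larger spread than the observed entries, which is the kind of inconsistency the passage refers to.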



Examples


Embodiment Construction

[0037] Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings:

[0038] As shown in Figure 1, the input of the present invention is multi-source heterogeneous data with some missing values. The multi-kernel clustering method for rapidly processing missing heterogeneous data comprises four main steps. In the first step, the missing entries of the multi-source heterogeneous data are initialized by zero filling. In the second step, multiple base kernel functions are applied to the initialized data (multi-kernel learning) to generate a multi-kernel matrix. In the third step, multi-kernel clustering is performed on the generated multi-kernel matrix to produce pseudo-labels, and low-rank estimation is then used to update the missing values of each base kernel matrix that makes up the multi-kernel matrix. In the fourth step, the multi-kernel joint coefficients are learned based on the clusterin...
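The first three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the choice of base kernels (one linear and two RBF kernels), the trace normalization, the uniform weights, the spectral relaxation of kernel k-means, and all function names are assumptions introduced here. The low-rank update and the coefficient learning of the fourth step are not included.

```python
import numpy as np

def zero_fill(X):
    # Step 1: initialize missing entries (marked as NaN) with 0.
    return np.where(np.isnan(X), 0.0, X)

def base_kernels(X, gammas=(0.05, 0.01)):
    # Step 2: one base kernel matrix per kernel function.
    # Assumed kernels: linear, plus RBF at two illustrative bandwidths.
    G = X @ X.T
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G
    sq = np.maximum(sq, 0.0)  # guard against tiny negative round-off
    kernels = [G] + [np.exp(-g * sq) for g in gammas]
    n = X.shape[0]
    # Rescale each kernel to trace n so no single kernel dominates.
    return [K * n / np.trace(K) for K in kernels]

def combine(kernels, weights):
    # Multi-kernel matrix as a weighted sum of base kernels.
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_cluster(K, k, n_iter=50):
    # Step 3 (clustering part): relaxed kernel k-means.
    # Embed samples with the top-k eigenvectors of K, then run Lloyd's k-means.
    _, vecs = np.linalg.eigh(K)  # eigenvalues come back in ascending order
    H = vecs[:, -k:]
    H = H / np.maximum(np.linalg.norm(H, axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-point initialization of the centers.
    centers = [H[0]]
    for _ in range(1, k):
        d = np.min([np.sum((H - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(H[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = H[labels == j].mean(axis=0)
    return labels  # pseudo-labels

# Hypothetical toy data: two groups of 30 samples with 4 features each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(3.0, 0.5, (30, 4)), rng.normal(-3.0, 0.5, (30, 4))])
X[::3, 0] = np.nan  # every third sample loses its first feature

kernels = base_kernels(zero_fill(X))
weights = np.full(len(kernels), 1.0 / len(kernels))  # uniform initial weights
labels = kernel_cluster(combine(kernels, weights), k=2)
print(labels)
```

In the described method the pseudo-labels would then feed the low-rank update of each base kernel matrix and the learning of the joint coefficients; here the weights simply stay uniform.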


Abstract

The invention discloses a multi-kernel clustering method for rapidly processing missing heterogeneous data. The method comprises the following steps: 1, initializing the missing multi-source heterogeneous data by zero filling; 2, performing multi-kernel learning on the initialized multi-source heterogeneous data using a plurality of base kernel functions to generate a multi-kernel matrix; 3, performing multi-kernel clustering on the generated multi-kernel matrix to generate pseudo-labels, and then updating the missing values of each base kernel matrix forming the multi-kernel matrix using low-rank estimation; and 4, learning the multi-kernel joint coefficients with an extreme learning machine, based on the clustering result. The method uses multi-kernel clustering to learn heterogeneous data quickly and uses kernel completion to recover the information of the missing data, which improves clustering performance and addresses the inability of traditional multi-kernel clustering methods to process multi-source heterogeneous data effectively.
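The low-rank update of a base kernel matrix mentioned in step 3 can be sketched as follows. The patent's exact update rule is not reproduced on this page, so this sketch substitutes a generic truncated eigendecomposition; the function name `low_rank_update`, the per-sample boolean `missing` vector, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def low_rank_update(K, missing, rank):
    # Replace kernel entries that involve a sample with missing data
    # by the corresponding entries of a rank-`rank` approximation of K.
    K = (K + K.T) / 2.0                       # enforce symmetry
    vals, vecs = np.linalg.eigh(K)            # ascending eigenvalues
    vals_top = np.clip(vals[-rank:], 0.0, None)
    vecs_top = vecs[:, -rank:]
    K_low = (vecs_top * vals_top) @ vecs_top.T
    touched = missing[:, None] | missing[None, :]
    return np.where(touched, K_low, K)        # observed entries stay as they are

# Hypothetical example: 8 samples, 3 features, samples 1 and 5 incomplete.
rng = np.random.default_rng(0)
F = rng.normal(size=(8, 3))
missing = np.zeros(8, dtype=bool)
missing[[1, 5]] = True

F_init = F.copy()
F_init[missing, 0] = 0.0                      # zero-filled feature for incomplete samples
K_init = F_init @ F_init.T                    # base kernel built from zero-filled data

K_new = low_rank_update(K_init, missing, rank=2)
touched = missing[:, None] | missing[None, :]
print(np.abs(K_new - K_init)[touched].max())  # size of the largest change
```

Only the rows and columns belonging to incomplete samples are modified, so kernel entries computed from fully observed samples are preserved.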

Description

Technical field

[0001] The invention belongs to the field of data mining and machine learning and relates to a multi-kernel clustering method, in particular to a multi-kernel clustering method for processing missing heterogeneous data, which can be applied to Web data analysis, biological information analysis, financial investment analysis, intelligent medical analysis, and other fields.

Background

[0002] With the development of the computer field, the concept of "Internet +" has penetrated all walks of life. In the era of big data, data in these fields comes in different formats and from different sources, and tends to be multi-source and heterogeneous. Data from multiple data sources with different types, structures, and distributions is called "multi-source heterogeneous data". For example: the data analyzed when designing a recommendation system may simultaneously contain different types of data such as text, images, and videos from multiple social platforms such as Twitter, Fac...


Application Information

IPC(8): G06K9/62, G16B15/20
CPC: G16B15/20, G06F18/23213
Inventor 向凌云, 赵国汗, 王进, 曾道建, 李文军, 王磊
Owner CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY