Grey correlation clustering method based on LDTW distance

A clustering method using grey correlation technology, applied in fields such as instruments, character and pattern recognition, and computer parts, which can address problems such as deviations in mined information.

Inactive Publication Date: 2018-06-12
CHONGQING UNIV OF POSTS & TELECOMM
4 Cites, 6 Cited by
AI Technical Summary

Problems solved by technology

Existing methods still have flaws, which cause deviations when mining potential information from the data.

Examples

Embodiment 1

[0078] Table 1 gives a worked example of calculating the gray correlation degree based on LDTW distance. The data are China's gross domestic product X_0 and the output values of the primary industry X_1, the secondary industry X_2, and the tertiary industry X_3 from 2001 to 2005.

[0079] The initial-value images of the sequences for gross domestic product X_0 and the output values of the primary industry X_1, the secondary industry X_2, and the tertiary industry X_3 are, correspondingly:

[0080] X'_0 = (1, 1.0966, 1.2379, 1.4576, 1.6691)

[0081] X'_1 = (1, 1.0452, 1.1032, 1.3548, 1.4903)

[0082] X'_2 = (1, 1.0444, 1.2606, 1.4929, 1.7576)

[0083] X'_3 = (1, 1.1256, 1.2915, 1.4574, 1.6368)

[0084] The starting-point zeroing images of the sequences for gross domestic product X_0 and the output values of the primary industry X_1, the secondary industry X_2, and the tertiary industry X_3 are, correspondingly:

[0085]

[0086]

[0087]

[0088]

[0089] Table 1 China's 2001-2005 GDP and output value of var...
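
The two preprocessing operators named in this embodiment can be sketched as follows. This is a minimal illustration, assuming the standard grey-system definitions (the initial-value image divides every term by the first term, and the starting-point zeroing image subtracts the first term from every term). The function names are hypothetical, and the raw GDP values are taken from the complete X 0 row that appears in Table 4, since Table 1 is truncated in this excerpt.

```python
def initial_image(seq):
    """Initial-value image: divide every term by the first term."""
    first = seq[0]
    return [x / first for x in seq]


def zero_start_image(seq):
    """Starting-point zeroing image: subtract the first term from every term."""
    first = seq[0]
    return [x - first for x in seq]


# Raw GDP series X_0 (unit: 100 billion yuan), as listed in the X 0 row of Table 4.
x0_raw = [109.7, 120.3, 135.8, 159.9, 183.1]

# Rounded to four decimals, this reproduces the X'_0 values of paragraph [0080].
x0_init = [round(v, 4) for v in initial_image(x0_raw)]

# Zeroing image of the initial-value image (assumed order of operations).
x0_zero = [round(v, 4) for v in zero_start_image(x0_init)]

print(x0_init)
print(x0_zero)
```

Applying the zeroing operator to the initial-value image, rather than to the raw data, is an assumption made here because the paragraph introduces the zeroing images directly after the initial values.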

Embodiment 2

[0106] To examine how the gray relational degree model based on LDTW distance performs when sequence data are missing (that is, when the sequence lengths are inconsistent), part of the data in Table 1 is removed, as shown in Table 4. The initial values of the sequences are:

[0107] X'_0 = (1, 1.0966, 1.2379, 1.4576, 1.6691)

[0108] X'_1 = (1, 1.0452, 1.1032, 1.3548, 1.4903)

[0109] X'_2 = (1, 1.2606, 1.4929, 1.7576)

[0110] X'_3 = (1, 1.1256, 1.4574, 1.6368)

[0111] The starting-point zeroing images are:

[0112]

[0113]

[0114]

[0115]

[0116] Table 4 China's 2001-2005 GDP and output value of various industries (some values missing; unit: 100 billion yuan)

[0117]

Sequence   2001    2002    2003    2004    2005
X_0        109.7   120.3   135.8   159.9   183.1
X_1        15.5    16.2    17.1    21      23.1
X_2        49.5    --      62.4    73.9    87
X_3        44.6    50.2    --      65      73

[0118] As an achievable way, the distance ma...
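
Because the sequences in this embodiment have different lengths, a warping-based distance is needed to compare them. The sketch below shows one way to compute a DTW distance whose warping path is capped at a maximum number of cells, which is the general idea behind LDTW. It is not the patent's own formulation, which is truncated in this excerpt: the absolute difference as local cost, the function name `ldtw_distance`, and the parameter `max_len` are all assumptions made for illustration.

```python
import math


def ldtw_distance(a, b, max_len):
    """DTW distance over warping paths that visit at most max_len cells.

    Returns (distance, path_length). Assumes absolute difference as local cost.
    """
    n, m = len(a), len(b)
    if max_len < max(n, m):
        raise ValueError("max_len must be at least max(len(a), len(b))")
    max_len = min(max_len, n + m - 1)  # no monotone path can be longer
    inf = math.inf
    # cost[i][j][s]: cheapest path from (0, 0) to (i, j) visiting exactly s cells
    cost = [[[inf] * (max_len + 1) for _ in range(m)] for _ in range(n)]
    cost[0][0][1] = abs(a[0] - b[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            d = abs(a[i] - b[j])
            for s in range(2, max_len + 1):
                best = inf
                if i > 0:
                    best = min(best, cost[i - 1][j][s - 1])
                if j > 0:
                    best = min(best, cost[i][j - 1][s - 1])
                if i > 0 and j > 0:
                    best = min(best, cost[i - 1][j - 1][s - 1])
                cost[i][j][s] = d + best
    final = cost[n - 1][m - 1]
    # Pick the shortest path length that attains the minimum cost.
    length = min(range(1, max_len + 1), key=lambda s: final[s])
    return final[length], length


# Initial-value sequences from paragraphs [0107] and [0109]; X'_2 has one value missing.
x0 = [1, 1.0966, 1.2379, 1.4576, 1.6691]
x2 = [1, 1.2606, 1.4929, 1.7576]
dist, length = ldtw_distance(x0, x2, max_len=6)  # max_len=6 is an arbitrary choice
print(dist, length)
```

Tracking the path length as a third index of the table makes the cap explicit, at the price of extra memory proportional to `max_len`; the returned path length is available for any later step that needs it alongside the distance.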

Embodiment 3

[0149] The gray relational degree model based on LDTW distance in the present invention can cluster biological data sequences in bioinformatics and better analyze the similarity of biological sequences, for example by finding what similar proteins have in common, which is of great significance for deducing the biological functions of those proteins. Proteins with similar structures have similar functions, so proteins with similar functions are clustered into one group to help biologists study protein function.

[0150] The biological clustering method using the LDTW-distance gray relational degree model of the present invention includes clustering of yeast protein localization sites and clustering of abalone age. The yeast attributes include mcg: McGeoch's signal sequence recognition method; gvh: von Heijne's signal sequence recognition method; alm: score of the ALOM membrane-spanning region prediction program; mit: discriminant analysis score for amino ...

Abstract

The invention relates to the field of data mining, and in particular to a gray correlation clustering method based on LDTW distance, comprising the following steps: processing an original data set to obtain pre-processed sequences; constructing a reference sequence from the maximum value of each dimension in the pre-processed sequences; calculating the LDTW distance and the warping path length between each pre-processed sequence and the reference sequence; calculating the gray correlation degree between each pre-processed sequence and the reference sequence based on the LDTW distance; and dividing the critical value interval into a plurality of critical sections and, according to the gray correlation degree results, grouping two sequences into one class if their gray correlation degrees fall within the same critical section. The invention reduces the error of the similarity measure between two sequences and can help biologists study the function of proteins.
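
Two of the steps in the abstract, building the reference sequence and grouping by critical section, can be sketched as below. This is only an illustration under stated assumptions: the sequences are taken to be of equal length when the reference is built, the critical value interval is assumed to be [0, 1] split into equal-width sections, and the function names and the degree values are hypothetical placeholders rather than results from the patent. The LDTW-based gray correlation degree itself is not reproduced here because its formula does not appear in this excerpt.

```python
def reference_sequence(sequences):
    """Take the maximum of each dimension across equal-length sequences."""
    return [max(column) for column in zip(*sequences)]


def group_by_sections(degrees, n_sections, low=0.0, high=1.0):
    """Group sequence names whose degree falls in the same equal-width section."""
    width = (high - low) / n_sections
    groups = {}
    for name, degree in degrees.items():
        # Clamp so that a degree equal to `high` lands in the last section.
        index = min(int((degree - low) / width), n_sections - 1)
        groups.setdefault(index, []).append(name)
    return groups


# Reference sequence from two toy pre-processed sequences (made-up values).
print(reference_sequence([[1, 5, 2], [3, 4, 6]]))

# Placeholder gray correlation degrees against the reference (made-up values).
degrees = {"s1": 0.91, "s2": 0.88, "s3": 0.42}
print(group_by_sections(degrees, n_sections=5))
```

With five sections, the two placeholder sequences with degrees near 0.9 share a section and are grouped together, while the third falls in a different section and forms its own group.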

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to a gray relational clustering method for biological data based on dynamic time warping under limited warping path length (LDTW).

Background

[0002] With the massive growth of biological databases, more and more people use computer programs for automatic classification. Bioinformatics brings great hope, but it also brings both opportunities and challenges to data analysts. Tens of billions of data records are pouring into public databases, and analyzing them by experimental methods is time-consuming and expensive, so effective and fast computational methods are needed to analyze these data automatically. The huge biological information databases pose many challenging problems for data mining technology and also provide broad opportunities. Clustering analysis technology is an important ...

Application Information

IPC(8): G06K9/62
CPC: G06F18/2321, G06F18/22
Inventor: 代劲, 何雨虹, 宋娟, 吴朝文
Owner CHONGQING UNIV OF POSTS & TELECOMM