A large-scale data group searching method based on time sequence density clustering

A large-scale data group search method, applied in the field of information retrieval, that addresses problems such as insufficient system scalability and low computing efficiency. It reduces I/O and cross-domain communication traffic, improves data access efficiency, and improves search efficiency.

Pending Publication Date: 2019-05-03
SUN YAT SEN UNIV
Cites: 0; Cited by: 1

AI Technical Summary

Problems solved by technology

[0005] In summary, the problems with the existing technology are: the current density-based group search method fa...




Embodiment Construction

[0050] The accompanying drawings are for illustrative purposes only and should not be construed as limiting the patent.

[0051] To better illustrate this embodiment, some parts in the drawings may be omitted, enlarged, or reduced, and do not represent the size of the actual product.

[0052] Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.

[0053] The technical solutions of the present invention are further described below in conjunction with the accompanying drawings and embodiments.

[0054] As shown in Figure 1, a large-scale data group search method based on time-series density clustering includes the following steps:

[0055] S1: For a given node, define three node states and the original clusters; the three states are the initial state, the unexecuted state, and the executed state; the original clusters are the executed core points and the executed co...
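Step S1 can be pictured as a small data model. The following is a minimal sketch only: the names `NodeState`, `Node`, `is_core`, and `executed_core_points` are hypothetical and do not come from the patent, and only the "executed core points" part of the original-cluster definition is modeled because the rest of the sentence is truncated in this listing.

```python
from dataclasses import dataclass
from enum import Enum, auto


class NodeState(Enum):
    """The three node states named in step S1."""
    INITIAL = auto()
    UNEXECUTED = auto()
    EXECUTED = auto()


@dataclass
class Node:
    node_id: int
    state: NodeState = NodeState.INITIAL
    # Hypothetical flag: assumed to be set when a node qualifies as a core point.
    is_core: bool = False


def executed_core_points(nodes):
    """Return the ids of nodes that are both executed and core points."""
    return [n.node_id for n in nodes
            if n.state is NodeState.EXECUTED and n.is_core]


nodes = [
    Node(0, NodeState.EXECUTED, is_core=True),
    Node(1, NodeState.UNEXECUTED),
    Node(2, NodeState.EXECUTED, is_core=False),
    Node(3),  # defaults to the initial state
]
seeds = executed_core_points(nodes)
```

Using an `Enum` keeps the three states mutually exclusive, which matches the wording of S1; how nodes move between states is not specified in the visible text and is left out here.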


Abstract

The invention discloses a large-scale data group search method based on time-series density clustering. Starting from an online collection on the order of a hundred million nodes, the method preprocesses the collected nodes, constructs original clusters and a cluster graph expressing the relationships between nodes, and finds the group to which merged nodes belong according to the connectivity of representative cluster nodes. In each iteration of the algorithm, contribution scores of nodes at different moments are calculated, and range queries are executed on the nodes according to these scores. While the correctness of the final group discovery is preserved, the time-series density clustering method improves search efficiency on large-scale data networks. By reducing I/O and cross-domain communication traffic, the scheme addresses the key problems of intensive access to high-energy physics data and diversified data query requirements.
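The step of finding groups from a cluster graph can be illustrated with a union-find (disjoint-set) structure. This is a sketch under assumptions, not the patented procedure: it assumes each original cluster has an integer id and that an edge between two cluster ids means their representative nodes are linked. The function names `find` and `merge_clusters` are hypothetical.

```python
def find(parent, x):
    """Return the root of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def merge_clusters(num_clusters, edges):
    """Merge clusters joined by an edge; return a group id per cluster."""
    parent = list(range(num_clusters))
    for a, b in edges:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            # Attach the larger root id under the smaller one so that
            # each group is named after its lowest cluster id.
            parent[max(ra, rb)] = min(ra, rb)
    return [find(parent, c) for c in range(num_clusters)]


# Five original clusters; clusters 0-1-2 are linked, and so are 3-4.
groups = merge_clusters(5, [(0, 1), (1, 2), (3, 4)])
```

Here `groups` maps every original cluster to the smallest cluster id in its connected component, so two clusters belong to the same group exactly when their entries are equal.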

Description

Technical field

[0001] The invention belongs to the field of information retrieval, and more specifically relates to a large-scale data group search method based on time-series density clustering.

Background

[0002] The energy consumption of high-performance computing is one of the main bottlenecks in promoting large-scale supercomputing applications in China. High-energy physics workloads involve a huge amount of computation, yet there is currently no effective solution for tasks that arrive in batches. A clustering algorithm takes a set of nodes and classifies them into groups (also called clusters) according to the similarity of their attributes. Current clustering methods can be divided into partition-based methods (such as the K-MEANS algorithm), hierarchy-based methods (such as the BIRCH algorithm), and density-based methods (such as the DBSCAN algorithm). Among them, the density-based clustering method ...
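For readers unfamiliar with density-based clustering, the following is a minimal sketch of the classic DBSCAN algorithm named in the background, not of the method claimed in this patent. The parameter names `eps` and `min_pts` follow common usage, and the brute-force `range_query` helper is an illustrative assumption; a real system would use a spatial index.

```python
import math

NOISE = -1
UNVISITED = None


def range_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = range_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = range_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

With these sample points the two tight triples form two clusters and the isolated point at (50, 50) is labeled as noise. Each call to `range_query` scans every point, which is the kind of cost a large-scale method would need to reduce.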


Application Information

IPC(8): G06K9/62, G06F16/901
Inventor: 姚嘉豪
Owner: SUN YAT SEN UNIV