A Parallelized Frequent Probabilistic Subgraph Search Method Based on Merge Clustering

A search method for probabilistic graph data, applicable to structured data retrieval, special-purpose data processing applications, instruments, and similar fields. It addresses problems such as high time and space complexity, and aims to preserve clustering accuracy, improve scalability, and shorten computation time.

Active Publication Date: 2018-05-18
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

[0005] Purpose of the invention: In view of the problems and deficiencies of the prior art, the present invention provides a parallelized frequent probability subgraph search method based on merge clustering. While preserving computational accuracy, it addresses the excessive time and space complexity that simple hierarchical clustering incurs on large-scale probability graph data.

Embodiment Construction

[0030] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments serve only to illustrate the invention and are not intended to limit its scope. After reading this disclosure, modifications of the invention in various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of the present application.

[0031] In a concrete implementation, the present invention comprises the following stages:

[0032] Step 1: preprocess the probability subgraphs. From the input probabilistic network (probability graph), all probability subgraphs of a given node size are identified without repetition or omission. Using an implementation built on the Spark parallel framework, the resulting probability subgraphs are stored in the HDFS file system, and all probability subgraph files are then loaded into ...
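
The enumeration in Step 1 can be illustrated with a minimal single-machine sketch. The excerpt does not state how the subgraphs are enumerated, so everything below is an assumption made for illustration: a "probability subgraph" is taken to mean a connected induced subgraph whose edges carry existence probabilities, the graph is a dictionary keyed by `frozenset` node pairs, and the function name `connected_subgraphs` is hypothetical. It uses brute-force enumeration over node combinations rather than Spark, which is only practical for small graphs, but it shows why each subgraph appears exactly once: every node set is generated once by `combinations`.

```python
from itertools import combinations


def connected_subgraphs(edges, k):
    """Yield (nodes, sub_edges) for every connected induced subgraph with k nodes.

    edges: dict mapping frozenset({u, v}) -> existence probability of that edge
           (assumed representation; self-loops are not supported).
    """
    if k < 1:
        return
    adjacency = {}
    for edge in edges:
        u, v = tuple(edge)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Each k-node set is produced exactly once, so no subgraph is repeated.
    for nodes in combinations(sorted(adjacency), k):
        node_set = set(nodes)
        # Depth-first search restricted to the candidate node set.
        seen, stack = {nodes[0]}, [nodes[0]]
        while stack:
            current = stack.pop()
            for neighbour in adjacency[current] & node_set:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        if seen == node_set:  # keep only connected candidates
            sub_edges = {e: p for e, p in edges.items() if e <= node_set}
            yield nodes, sub_edges


# Hypothetical four-node probability graph used only for illustration.
example_edges = {
    frozenset({"a", "b"}): 0.9,
    frozenset({"b", "c"}): 0.5,
    frozenset({"c", "d"}): 0.7,
    frozenset({"a", "c"}): 0.4,
}
subgraphs = list(connected_subgraphs(example_edges, 3))
for nodes, sub_edges in subgraphs:
    print(nodes, len(sub_edges))
```

In a Spark setting, the candidate node sets would presumably be distributed across workers and the results written to HDFS, but that part is not shown in the excerpt and is not sketched here.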


Abstract

The invention discloses a parallelized frequent probability subgraph search method based on merge clustering. Existing frequent subgraph search methods consume too much time and space to meet the requirements of big data environments. To address this, the method comprises the following steps: mapping each probability subgraph onto a circuit topology and processing it with the node voltage method; clustering the probability subgraphs by merge clustering to reduce the time cost; and finally implementing the method on the Spark framework to further improve computation speed and scalability.
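
The node voltage method mentioned in the abstract is the standard nodal analysis technique from circuit theory: fix one node as ground, inject a current at another, and solve the conductance system G v = i for the remaining node voltages. The sketch below shows that technique only. How the patent maps a probability subgraph onto a circuit is not given in this excerpt, so treating each edge probability as a conductance is purely an assumption for illustration, as are the function name `node_voltages` and the choice of source and ground nodes.

```python
def node_voltages(nodes, conductances, source, ground, current=1.0):
    """Solve G v = i for node voltages with `ground` fixed at 0 V.

    conductances: dict mapping (u, v) tuples to edge conductance.
    ASSUMPTION: edge probabilities are used directly as conductances.
    The circuit must be connected, otherwise the system is singular.
    """
    free = [n for n in nodes if n != ground]
    index = {n: i for i, n in enumerate(free)}
    size = len(free)
    # Augmented matrix [G | i] of the reduced conductance system.
    matrix = [[0.0] * (size + 1) for _ in range(size)]
    for (u, v), g in conductances.items():
        for a, b in ((u, v), (v, u)):
            if a != ground:
                matrix[index[a]][index[a]] += g
                if b != ground:
                    matrix[index[a]][index[b]] -= g
    matrix[index[source]][size] = current  # current injected at the source

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(matrix[r][col]))
        matrix[col], matrix[pivot] = matrix[pivot], matrix[col]
        for row in range(size):
            if row != col:
                factor = matrix[row][col] / matrix[col][col]
                for k in range(col, size + 1):
                    matrix[row][k] -= factor * matrix[col][k]

    voltages = {n: matrix[index[n]][size] / matrix[index[n]][index[n]] for n in free}
    voltages[ground] = 0.0
    return voltages


# Hypothetical three-node chain a - b - c with probabilities 0.5 and 0.25.
voltages = node_voltages(
    ["a", "b", "c"],
    {("a", "b"): 0.5, ("b", "c"): 0.25},
    source="a",
    ground="c",
)
print(voltages)
```

Under these assumptions, the resulting voltage vector could serve as a numeric signature of a subgraph for later clustering, though whether the patent uses it that way cannot be confirmed from the text shown here.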

Description

Technical field

[0001] The invention relates to a parallelized frequent probability subgraph mining method based on merge clustering, which enables rapid mining and identification of frequent probability subgraphs in large-scale probability networks, and belongs to the technical field of computer data mining.

Background

[0002] With the continuous development of information and Internet technologies, a large amount of network data has been generated, such as social networks and biological networks, and these networks can be represented by graph models. How to mine graph datasets efficiently has become one of the hot topics in data mining research. In practical applications, many graph data are probabilistic. For example, in bioinformatics, biological network data obtained through research usually contain unavoidable experimental errors or noise; at the same time, the process of biolog...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/285
Inventor: 杨鹏, 顾梁, 王春艳
Owner SOUTHEAST UNIV