Data segmenting method and system of distributed graph calculating system

A data segmentation and graph computing technology, applied in the field of data processing, that addresses low data processing efficiency by saving computing overhead and reducing communication overhead between processing hosts.

Active Publication Date: 2015-01-14
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

[0006] In view of the above existing problems, the present invention provides a data segmentation method and system of a distributed graph computing system, which is used t



Examples


Example Embodiment

[0024] Figure 1 is a flowchart of Embodiment 1 of a data segmentation method for a distributed graph computing system of the present invention. As shown in Figure 1, the method includes:

[0025] Step 101: Determine, according to a preset algorithm, the similarity measurement value between each data node in the data to be processed and each of its first neighboring nodes;

[0026] The method provided in this embodiment applies to scenarios in which a distributed data processing system, containing multiple processing hosts, segments and stores large-scale graph-structured data. Note that after the system receives a large amount of data to be processed, an existing distributed processing method can be used to pre-distribute that data across the multiple processing hosts, but in the method provided in this embodiment the processing can be performed on the data to be processed...
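Step 101 leaves the "preset algorithm" unspecified. As a minimal sketch only, the following assumes Jaccard similarity over closed neighborhoods as a stand-in measure; the names `jaccard_similarity` and `neighbor_similarities`, and the adjacency-set graph representation, are hypothetical and not taken from the patent.

```python
from typing import Dict, Hashable, Set, Tuple

# Assumed representation: an undirected graph as node -> set of neighbors.
Graph = Dict[Hashable, Set[Hashable]]


def jaccard_similarity(graph: Graph, u: Hashable, v: Hashable) -> float:
    """Jaccard similarity of the closed neighborhoods of u and v.

    This is an assumed stand-in for the patent's unspecified preset algorithm.
    """
    nu = graph[u] | {u}  # closed neighborhood: the node plus its neighbors
    nv = graph[v] | {v}
    return len(nu & nv) / len(nu | nv)


def neighbor_similarities(graph: Graph) -> Dict[Tuple[Hashable, Hashable], float]:
    """Similarity value for every (node, first neighbor) pair, in both directions."""
    return {
        (u, v): jaccard_similarity(graph, u, v)
        for u in graph
        for v in graph[u]
    }


if __name__ == "__main__":
    example = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c"},
    }
    for pair, value in sorted(neighbor_similarities(example).items()):
        print(pair, round(value, 2))
```

Any symmetric measure over neighboring nodes could be substituted here; only the shape of the result (one value per node and first-neighbor pair) matters for the later steps.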


Abstract

The invention provides a data segmentation method and system for a distributed graph computing system. The method includes: determining a similarity measurement value between each data node in the data to be processed and each of its first adjacent nodes; obtaining the number of occurrences of the label of each first adjacent node, and determining whether at least two labels occur the same number of times; if so, determining the second adjacent nodes corresponding to those labels, and determining the label of each data node according to the similarity measurement value between that data node and its corresponding second adjacent nodes; and dividing data nodes with the same label into the same community and storing data nodes belonging to the same community on the same processing host. Because community division is label-based and takes the similarity between data nodes fully into account, computing overhead is saved; because closely related data nodes are placed on the same processing host, communication overhead between processing hosts is reduced.
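The flow in the abstract can be sketched as label propagation with similarity-based tie-breaking. The sketch below is illustrative, not the patented implementation, and rests on several assumptions: every node starts with its own identifier as its label, ties are considered among the most frequent labels, the tied neighbor with the highest similarity wins, updates run in a fixed node order for a bounded number of rounds, and communities are placed on hosts round-robin. All function names and the `sim` dictionary keyed by `(node, neighbor)` are hypothetical.

```python
from collections import Counter, defaultdict


def choose_label(node, graph, labels, sim):
    """Pick a label for `node` from the labels of its first adjacent nodes."""
    counts = Counter(labels[n] for n in graph[node])
    if not counts:
        return labels[node]  # isolated node keeps its own label
    top = max(counts.values())
    tied = [lab for lab, c in counts.items() if c == top]
    if len(tied) == 1:
        return tied[0]
    # Tie: the neighbors carrying a tied label play the role of the
    # "second adjacent nodes". Assumption: the most similar one wins;
    # str(n) only makes the choice deterministic when similarities are equal.
    candidates = [n for n in graph[node] if labels[n] in tied]
    best = max(candidates, key=lambda n: (sim[(node, n)], str(n)))
    return labels[best]


def propagate_labels(graph, sim, max_rounds=10):
    """Repeat label updates until nothing changes or max_rounds is reached."""
    labels = {n: n for n in graph}  # assumption: initial label = node id
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):  # fixed order keeps runs reproducible
            new_label = choose_label(node, graph, labels, sim)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def assign_hosts(labels, num_hosts):
    """Store all nodes of one community (same label) on the same host."""
    communities = defaultdict(list)
    for node, lab in labels.items():
        communities[lab].append(node)
    placement = {}
    for i, lab in enumerate(sorted(communities)):
        for node in communities[lab]:
            placement[node] = i % num_hosts  # assumed round-robin placement
    return placement
```

A caller would build `sim` with whatever similarity measure the system uses, run `propagate_labels`, and pass the result to `assign_hosts`. Keeping each community on one host is what reduces cross-host communication during later graph computation.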

Description

Technical field

[0001] The invention belongs to the field of data processing, and in particular relates to a data segmentation method and system of a distributed graph computing system.

Background

[0002] In recent years, the massive data generated in many fields, represented by the Internet, high-energy physics, and computational biology, has placed higher demands on data processing systems. Within this massive data, graph-structured data composed of nodes and edges is an important data structure that can effectively represent data relationships in many different fields. For example, social network data can be represented as graph-structured data in which nodes represent users and edges represent the relationships between users: if two users follow each other, there is an edge between the two corresponding nodes. Similarly, the web page data of the Internet can also be represented as graph-structured data. Nodes represent web pa...


Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 李博, 宋骐, 李建欣, 于伟仁
Owner: BEIHANG UNIV