Diversity graph sorting method for large-scale graph data based on Spark

A diversity graph sorting technology, applied in the fields of electronic digital data processing, digital data information retrieval, and special data processing applications. It addresses the problem that large-scale graph data cannot be processed effectively, and offers fast processing, an intuitive model, good scalability, and high efficiency.

Active Publication Date: 2019-05-03
YUNNAN UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to overcome the deficiencies of the prior art by providing a Spark-based diversity graph sorting method for large-scale graph data. The method addresses the inability of existing diversity graph sorting techniques to process large-scale graph data effectively, and provides technical support for diversity graph sorting of large-scale graph data and its applications.

Examples

Embodiment

[0040] The Spark-based diversity graph sorting method for large-scale graph data of the present invention consists of two parts: (1) a calculation preparation part, which first executes personalized PageRank to obtain the set of query-related nodes and each node's personalized PageRank value (abbreviated as ppr), and then collects the neighbor information of the nodes on the graph, laying the foundation for calculating the distance between nodes; (2) a calculation implementation part, which uses the ppr-weighted distance values between nodes to obtain, through k iterations, a top-k node ranking that combines relevance and diversity.
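The two parts described in [0040] can be illustrated with a minimal single-machine sketch in plain Python. It is not the distributed Spark implementation the invention describes, and several details are assumptions made only for illustration: the function names, the damping factor `alpha`, the use of Jaccard distance over neighbor sets as the distance between nodes, and the scoring rule (ppr multiplied by the minimum distance to the nodes already selected). The excerpt does not state the actual distance formula or weighting.

```python
from collections import defaultdict


def personalized_pagerank(edges, query, alpha=0.85, iters=50):
    """Power iteration for personalized PageRank restarting at `query`.

    `alpha` and `iters` are illustrative defaults, not values from the source.
    """
    out = defaultdict(list)
    nodes = set()
    for u, v in edges:
        out[u].append(v)
        nodes.update((u, v))
    ppr = {n: 0.0 for n in nodes}
    ppr[query] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            if out[u]:
                share = alpha * ppr[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # A node without out-edges sends its mass back to the query.
                nxt[query] += alpha * ppr[u]
        nxt[query] += 1.0 - alpha  # restart probability
        ppr = nxt
    return ppr


def collect_neighbors(edges):
    """Collect each node's neighbor set, ignoring edge direction."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def jaccard_distance(a, b, nbrs):
    """Assumed distance: 1 - Jaccard similarity of closed neighborhoods."""
    na, nb = nbrs[a] | {a}, nbrs[b] | {b}
    return 1.0 - len(na & nb) / len(na | nb)


def diverse_top_k(ppr, nbrs, k):
    """Greedy selection over at most k iterations.

    Assumed score: ppr(n) * min distance from n to the already selected nodes.
    The first pick is simply the node with the highest ppr.
    """
    candidates = {n for n, score in ppr.items() if score > 0}
    selected = []
    while candidates and len(selected) < k:
        def score(n):
            if not selected:
                return ppr[n]
            return ppr[n] * min(jaccard_distance(n, s, nbrs) for s in selected)

        best = max(sorted(candidates), key=score)  # sorted for stable ties
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy undirected graph, stored as edges in both directions (hypothetical data).
pairs = [(0, 1), (0, 2), (1, 2), (0, 3), (3, 4)]
edges = pairs + [(v, u) for u, v in pairs]

ppr = personalized_pagerank(edges, query=0)
nbrs = collect_neighbors(edges)
top = diverse_top_k(ppr, nbrs, k=3)
print(top)
```

On this toy graph the second node selected is node 3 rather than node 1 or 2, even though nodes 1 and 2 have slightly higher ppr values: their neighborhoods overlap heavily with that of the query node 0, so the assumed distance weighting penalizes them. This is the trade-off between relevance and diversity that the k-iteration selection is meant to capture.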

[0041] The present invention is described in detail below with reference to an example. As shown in Figure 1, it includes the following steps:

[0042] (1) Obtain the query-related node set via personalized PageRank

[0043] (1.1) Read the edge table file through the GraphLoader.e...

Abstract

The invention discloses a Spark-based diversity graph sorting method for large-scale graph data. Taking diversity graph sorting of large-scale graph data as its goal and a method for measuring the distance between nodes in the graph data as its basis, the method combines the classic personalized PageRank algorithm with a distance-based diversity measure to sort the graph data. The method is scalable and efficient, meets the data storage and computation requirements of diversity graph sorting on massive graph data, and provides technical support for key open problems in the analysis, processing, and mining of massive graph data.

Description

Technical field

[0001] The invention belongs to the technical field of data mining and information retrieval, and more particularly relates to a Spark-based diversity graph sorting method for large-scale graph data.

Background art

[0002] Ranking is one of the basic tasks of information retrieval, data mining, and social network analysis. In an information retrieval system, a good sorting method ensures that results with high relevance to the user's query and low information redundancy are presented in a limited display space. This minimizes the user's query abandonment rate and is of great significance for improving the user experience of information retrieval services.

[0003] Graph data consists of a large number of nodes and of edges representing the relationships between nodes. Because a graph lacks an explicit order, graph sorting is particularly critical in graph data analysis and application. Existing graph da...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/2457, G06F16/2458
CPC: G06F16/24578, G06F16/2465
Inventor: 李劲, 岳昆, 胡矿, 王钰杰, 高仁尚
Owner: YUNNAN UNIV