Data node similarity calculation method and device

A data node similarity technology applied in the field of big data computing. It addresses problems such as unguaranteed computing time, hard disks on most cluster nodes being unable to sustain the load, and task failures. By reducing the number of nodes involved and the amount of data replication, it improves computing efficiency and the success rate.

Active Publication Date: 2020-09-04
CHINA TELECOM CORP LTD
Cites: 3 | Cited by: 0
AI Technical Summary

Problems solved by technology

[0004] However, the two association operations have become the main performance bottleneck. During the two association processes, network data transmission and disk reads and writes increase exponentially, which the hard disks of most nodes in the cluster cannot sustain. Task failures often occur, so computing time cannot be guaranteed.


Embodiment Construction

[0025] Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that the relative arrangements of components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

[0026] At the same time, it should be understood that, for the convenience of description, the sizes of the various parts shown in the drawings are not drawn according to the actual proportional relationship.

[0027] The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way to be taken as a limitation of the invention, its application or its uses.

[0028] Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be considered part of the...

Abstract

The invention discloses a data node similarity calculation method and apparatus, and relates to the field of big data computing. The method comprises the steps of: removing the mapping relationships of repeated data nodes with similar associations from a data node association table, thereby forming a simplified data node association table; collecting and partitioning the associated data nodes in the simplified data node association table to form an associated route table; establishing an association relationship between the relationship set in each partition of the associated route table and the data node eigenvectors; and calculating the similarity between the data nodes according to the association relationship. The number of nodes and the amount of data replication in the data node similarity calculation process are greatly reduced, so that the efficiency and success rate of the data node similarity calculation are improved.
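The four steps in the abstract can be sketched on a single machine as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the association table is assumed to be a list of node-ID pairs, "repeated" mappings are assumed to mean duplicate, mirrored or self-referencing pairs, partitioning is assumed to be a hash of the node ID, and cosine similarity is assumed as the measure. All function names (`simplify`, `build_route_table`, `node_similarities`) are hypothetical.

```python
import math
import zlib
from collections import defaultdict


def simplify(association_table):
    """Step 1 (assumed reading): drop self-links and duplicate or mirrored pairs."""
    seen = set()
    for a, b in association_table:
        if a == b:
            continue
        seen.add((a, b) if a < b else (b, a))
    return sorted(seen)


def partition_of(node, num_partitions):
    """Deterministic hash partitioning (an assumption; the source names no scheme)."""
    return zlib.crc32(node.encode("utf-8")) % num_partitions


def build_route_table(pairs, num_partitions):
    """Step 2: collect each node's associated nodes, then place each set in a partition."""
    related = defaultdict(set)
    for a, b in pairs:
        related[a].add(b)  # each pair is stored once, under its smaller key
    partitions = defaultdict(dict)
    for node, others in related.items():
        partitions[partition_of(node, num_partitions)][node] = others
    return partitions


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def node_similarities(association_table, features, num_partitions=2):
    """Steps 3 and 4: join each partition's relationship sets with the
    feature vectors, then score every associated pair."""
    route_table = build_route_table(simplify(association_table), num_partitions)
    result = {}
    for part in route_table.values():
        for node, others in part.items():
            for other in others:
                result[(node, other)] = cosine(features[node], features[other])
    return result


# Toy input: five raw mappings that reduce to two distinct associations.
table = [("a", "b"), ("b", "a"), ("a", "b"), ("a", "c"), ("c", "c")]
features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
sims = node_similarities(table, features)
```

In this toy input, simplification cuts five mappings down to two, so only two feature-vector lookups and comparisons are needed. That is the kind of reduction in replicated data the abstract attributes to the method.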

Description

Technical field

[0001] The invention relates to the field of big data computing, in particular to a data node similarity calculation method and device.

Background technique

[0002] Node similarity calculation plays an increasingly prominent role in the era of big data. It compares the data correlation between distributed data nodes and, through association logic, carries out similarity identification, comparison and aggregation. It has a wide range of applications in information retrieval, data mining and other fields. With the explosive growth in the number of Internet users and the volume of content, the demand for similarity calculation on large-scale data has become increasingly strong. Under the traditional MapReduce framework, similarity calculation usually uses a node traversal mode to compare and summarize similarities, which results in heavy computation and generates a large number of intermediate process data tables, wh...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2458, G06F16/22, G06F16/2455, G06F16/27
CPC: G06F16/2228, G06F16/2456, G06F16/2465, G06F16/27, G06F2216/03
Inventor: 武娟, 庞涛, 钱锋, 刘晓军, 陈学亮
Owner: CHINA TELECOM CORP LTD