TreeMap-Based Primary and Secondary Keyword Self-Sorting Algorithm for Two-Dimensional Uncertain-Length Data

A technology for sorting two-dimensional uncertain-length data by primary and secondary keywords, applied in the field of data processing. It addresses the problems that the index type is uncertain, the data length is uncertain, and MapReduce sorting is not suited to sorting by primary and secondary keywords. Its effects are high efficiency, convenient ordered storage, and fewer sorting operations.

Active Publication Date: 2018-05-29
四川医科大学

AI Technical Summary

Problems solved by technology

Two-dimensional data of uncertain length must be sorted by primary and secondary keywords. Because the data length is uncertain, the index type is uncertain (it can be a string or an integer), and the sort is driven by the primary and secondary keywords rather than by the data itself, traditional sorting algorithms are not suitable for primary and secondary keyword sorting of two-dimensional uncertain-length data.
The Map stage requires a determinate data source (a direct or indirect file). Two-dimensional data of uncertain length cannot provide such a source, so MapReduce sorting is likewise not suitable for primary and secondary keyword sorting of this data.

Examples

Embodiment Construction

[0041] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0042] As shown in Figure 1, the TreeMap-based primary and secondary keyword self-sorting algorithm for two-dimensional uncertain-length data disclosed by the present invention comprises the following steps:

[0043] Step 1. Organize the separator set in ascending order of separator usage frequency

[0044] The present invention converts the <primary keyword, secondary keyword> pair of the two-dimensional uncertain-length data into a TreeMap Key, so that the two-dimensional uncertain-length data cached in the TreeMapBuffer can be accessed or traversed through the Key. Conversely, to support applications built on the TreeMapBuffer, such as finding the primary and secondary keywords of the largest data, the primary and secondary keywords must be parsed back out of the Key. In order to c...
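The Key round trip described in [0044] can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the source text is truncated, so the Key layout (primary keyword, separator, secondary keyword), the rule for picking a separator (the first candidate that occurs in no keyword), and the names `chooseSeparator`, `makeKey`, `parseKey`, and `sampleBuffer` are all hypothetical. The candidate list is assumed to be already ordered as in Step 1.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CompositeKeySketch {

    /** Returns the first candidate separator that occurs in none of the keywords (assumed rule). */
    public static String chooseSeparator(List<String> candidates, List<String> keywords) {
        for (String sep : candidates) {
            boolean clash = false;
            for (String keyword : keywords) {
                if (keyword.contains(sep)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                return sep;
            }
        }
        throw new IllegalArgumentException("every candidate separator occurs in some keyword");
    }

    /** Builds the composite Key: primary keyword, separator, secondary keyword (assumed layout). */
    public static String makeKey(String primary, String sep, String secondary) {
        return primary + sep + secondary;
    }

    /** Splits a composite Key back into {primary, secondary}. */
    public static String[] parseKey(String key, String sep) {
        int i = key.indexOf(sep);
        return new String[] {key.substring(0, i), key.substring(i + sep.length())};
    }

    /** Fills a TreeMap buffer with sample records; each Value is an array of a different length. */
    public static TreeMap<String, double[]> sampleBuffer(String sep) {
        TreeMap<String, double[]> buffer = new TreeMap<>();
        buffer.put(makeKey("ward-B", sep, "2018"), new double[] {7.5});
        buffer.put(makeKey("ward-A", sep, "2018"), new double[] {3.0, 4.0, 5.0});
        buffer.put(makeKey("ward-A", sep, "2017"), new double[] {1.0, 2.0});
        return buffer;
    }

    public static void main(String[] args) {
        // Hypothetical candidate set; low control characters come first.
        List<String> candidates = List.of("\u0001", "\u0002", "|");
        String sep = chooseSeparator(candidates, List.of("ward-A", "ward-B", "2017", "2018"));
        TreeMap<String, double[]> buffer = sampleBuffer(sep);

        // The TreeMap keeps entries sorted by Key, so traversal is already ordered.
        for (Map.Entry<String, double[]> e : buffer.entrySet()) {
            String[] parts = parseKey(e.getKey(), sep);
            System.out.println(parts[0] + " / " + parts[1] + " -> " + e.getValue().length + " values");
        }

        // Parse the largest Key back into its two keywords.
        String[] last = parseKey(buffer.lastKey(), sep);
        System.out.println("largest key: " + last[0] + " / " + last[1]);
    }
}
```

Plain string comparison of a concatenated Key reproduces primary-then-secondary order only when the separator sorts below every character that can appear in a keyword, and integer keywords written as decimal strings compare lexicographically unless padded. The sketch sidesteps both by using a low control character and string keywords.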

Abstract

The present invention discloses a TreeMap-based primary and secondary keyword self-sorting algorithm for two-dimensional uncertain-length data. The algorithm comprises: selecting, from a separator set, a separator that can distinguish the primary keyword from the secondary keyword; constructing the Key and the Value of a TreeMap from, respectively, the <primary keyword, secondary keyword> pair of the two-dimensional uncertain-length data (the primary and secondary keywords may each be of integer or string type) and the data corresponding to that pair; and inserting the two-dimensional uncertain-length data into a TreeMap buffer by means of the separator, the Key and the Value. The algorithm can be applied to data association, online data acquisition / collection, and analysis of data by primary keyword (such as a sum or an average value) in the Reduce stage of MapReduce. Inserting the two-dimensional uncertain-length data into the TreeMap buffer is what achieves sorting by primary and secondary keywords.
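As one illustration of the per-primary-keyword analysis mentioned in the abstract, the Java sketch below computes a sum and an average for each primary keyword in a single in-order pass over a TreeMap buffer. The Key layout (primary keyword, separator, secondary keyword), the separator value `SEP`, and the class and method names are assumptions made for the example, not details taken from the patent.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PrimaryKeyStats {

    /** Hypothetical separator between primary and secondary keyword inside a Key. */
    public static final String SEP = "\u0001";

    /**
     * Walks the buffer in Key order and accumulates {sum, count} per primary keyword.
     * Entries that share a primary keyword are adjacent because the buffer is sorted.
     */
    public static Map<String, double[]> sumAndCount(TreeMap<String, double[]> buffer) {
        Map<String, double[]> stats = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : buffer.entrySet()) {
            String key = e.getKey();
            String primary = key.substring(0, key.indexOf(SEP));
            double[] acc = stats.computeIfAbsent(primary, k -> new double[2]);
            for (double v : e.getValue()) {
                acc[0] += v;   // running sum
                acc[1] += 1;   // running count
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        TreeMap<String, double[]> buffer = new TreeMap<>();
        buffer.put("ward-B" + SEP + "2018", new double[] {7.5});
        buffer.put("ward-A" + SEP + "2018", new double[] {3.0, 4.0, 5.0});
        buffer.put("ward-A" + SEP + "2017", new double[] {1.0, 2.0});

        for (Map.Entry<String, double[]> e : sumAndCount(buffer).entrySet()) {
            double sum = e.getValue()[0];
            double count = e.getValue()[1];
            System.out.println(e.getKey() + ": sum=" + sum + ", average=" + (sum / count));
        }
    }
}
```

Because the buffer is already ordered, the aggregation needs no separate sort step; the `LinkedHashMap` simply preserves the order in which primary keywords are first encountered.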

Description

Technical field

[0001] The invention relates to a data processing method, in particular to a TreeMap-based primary and secondary keyword self-sorting algorithm for two-dimensional uncertain-length data.

Background technique

[0002] In this patent, two-dimensional data of uncertain length refers to data whose dimension is fixed but whose length is not. Its two dimensions are a primary keyword and a secondary keyword, each of which may be an integer or a string; the data is accessed through a <primary keyword, secondary keyword> pair. The sorting requirement for such data is: sort by the primary keyword first, then sort by the secondary keyword when the primary keywords are equal.

[0003] Sorting is the operation of adjusting a set of unordered data (records) into ordered (ascending or descending) data. Existing sorting methods include traditional sorting algorithms, spreadsheet sorting in office automation, and MapRed...
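The ordering rule stated in [0002] (primary keyword first, secondary keyword as the tie-breaker) can be illustrated with a plain Java comparator. This sketch deliberately uses an explicit pair type and a comparator rather than the composite TreeMap Key of the invention; the `Row` class, its field types (a string primary keyword and an integer secondary keyword), and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PrimarySecondaryOrder {

    /** Hypothetical record: a <primary, secondary> pair plus data of arbitrary length. */
    public static final class Row {
        final String primary;
        final int secondary;
        final int[] data;

        public Row(String primary, int secondary, int... data) {
            this.primary = primary;
            this.secondary = secondary;
            this.data = data;
        }
    }

    /** Compare by primary keyword, then by secondary keyword when the primaries are equal. */
    public static final Comparator<Row> ORDER =
            Comparator.comparing((Row r) -> r.primary).thenComparingInt(r -> r.secondary);

    /** Returns "primary:secondary" labels of the rows in sorted order, leaving the input untouched. */
    public static List<String> sortedLabels(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(ORDER);
        List<String> labels = new ArrayList<>();
        for (Row r : copy) {
            labels.add(r.primary + ":" + r.secondary);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("b", 2, 1),
                new Row("a", 10),
                new Row("a", 9, 4, 5));
        System.out.println(sortedLabels(rows));
    }
}
```

Keeping the secondary keyword as an `int` makes 9 sort before 10, which a lexicographic comparison of the strings "9" and "10" would not.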

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/182, G06F16/2282
Inventor: 胡自权徐勇龙汉安尹德辉夏纪毅
Owner: 四川医科大学