Data ranking method and device, server and storage medium

A data sorting technology, applied in the field of data processing, that addresses the problem that multiple window partitions are not configured to sort data jointly.

Active Publication Date: 2018-11-02
GUANGZHOU HUYA INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] In the process of implementing the present invention, the inventor found that the prior art has the following defect: when a window function is used directly to sort data on Hive, multiple window partitions are not configured to sort the data jointly; only a single window partition is started, and it sorts the full data set.

Examples

Embodiment 1

[0028] Figure 1 is a flowchart of a data sorting method provided in Embodiment 1 of the present invention. The method in this embodiment can be executed by a data sorting device, which can be implemented in hardware and/or software and can generally be integrated in a server or a server cluster, for example, a server or server cluster corresponding to the Hadoop distributed file system. The method of this embodiment specifically includes:

[0029] 101. Search for data to be sorted according to the data attributes in the data sorting instruction, where the data to be sorted includes data attributes and values.
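Step 101 can be pictured with a minimal Python sketch. It assumes each record is a dictionary with hypothetical `attribute` and `value` fields and that the sorting instruction simply names the attribute to select; neither the field names nor the sample data come from the patent.

```python
from typing import Dict, List

# Hypothetical record layout: each row carries a data attribute and a value.
Record = Dict[str, object]


def find_data_to_sort(rows: List[Record], attribute: str) -> List[Record]:
    """Return the rows whose attribute matches the one named in the sorting instruction."""
    return [row for row in rows if row["attribute"] == attribute]


# Illustrative sample data only.
rows = [
    {"attribute": "watch_time", "value": 12.5},
    {"attribute": "gift_count", "value": 3},
    {"attribute": "watch_time", "value": 7.0},
]
to_sort = find_data_to_sort(rows, "watch_time")
```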

[0030] Those skilled in the art will understand that when an SQL (Structured Query Language) window function is used to sort data in the Hive data warehouse on the Hadoop distributed file system, multiple window partitions are not configured to sort the data jointly; only one window partition is started to sort all the...
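As a rough illustration of the single-partition behaviour described here, the following Python sketch emulates a ranking window function with an ordering but no partitioning clause, in the spirit of `ROW_NUMBER() OVER (ORDER BY value)`. The function name and sample values are hypothetical, and the sketch only models the logical result, not Hive's execution.

```python
from typing import List, Tuple


def row_number_single_partition(values: List[float]) -> List[Tuple[float, int]]:
    """Emulate a row-number window function with no partitioning clause.

    Every row belongs to the same window partition, so a single sort
    has to process the full data set.
    """
    ordered = sorted(values)
    return [(v, position) for position, v in enumerate(ordered, start=1)]


ranked = row_number_single_partition([42.0, 7.0, 19.0, 3.0])
```

Because all rows share one partition, the work cannot be spread across several workers, which is the bottleneck the embodiments set out to remove.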

Embodiment 2

[0053] Figure 2 is a flowchart of a data sorting method provided by Embodiment 2 of the present invention. This embodiment is optimized on the basis of the above embodiments. It gives a method of obtaining the normalized value by truncating the initial normalized value to a set number of significant digits, and a specific implementation of dividing the data to be sorted according to the first preset number or the second preset number.

[0054] Correspondingly, the method in this embodiment specifically includes:

[0055] 201. Search for data to be sorted according to the data attributes in the data sorting instruction, where the data to be sorted includes data attributes and values.

[0056] 202. Standardize the values of the data to be sorted to obtain the corresponding initial normalized values, truncate each initial normalized value to the set number of significant digits to obtain each standardized value, and sort the st...

Embodiment 3

[0073] Figure 3 is a flowchart of a data sorting method provided by Embodiment 3 of the present invention. This embodiment is optimized on the basis of the above embodiments. It describes in detail a method of dividing the data to be sorted into data partitions to be sorted according to the first preset number and using the partition number as the total data position information.

[0074] Correspondingly, the method in this embodiment specifically includes:

[0075] 301. Search for data to be sorted according to the data attributes in the data sorting instruction, where the data to be sorted includes data attributes and values.

[0076] 302. Standardize the values of the data to be sorted to obtain the corresponding initial normalized values, truncate each initial normalized value to the set number of significant digits to obtain the standardized values, and sort the standardized values according to the data sorting instruction.
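The standardize-then-truncate step can be sketched as follows. The text shown here does not say which standardization rule is used, so min-max scaling into [0, 1] is an assumption of this sketch, as are the helper names `standardize` and `truncate` and the choice of two digits.

```python
import math
from typing import List


def standardize(values: List[float]) -> List[float]:
    """Min-max scale values into [0, 1]; the scaling rule is an assumption."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def truncate(x: float, digits: int) -> float:
    """Keep `digits` decimal places by cutting off the rest (no rounding).

    Floating-point edge cases are ignored in this sketch.
    """
    factor = 10 ** digits
    return math.floor(x * factor) / factor


values = [5.0, 20.0, 12.5, 17.0]
initial = standardize(values)                    # initial normalized values
normalized = [truncate(x, 2) for x in initial]   # truncated to 2 digits
# Indices of the original rows in ascending order of the truncated value.
order = sorted(range(len(values)), key=lambda i: normalized[i])
```

Truncating to a fixed number of digits groups nearby values together, which is one plausible reason to do it before dividing the data into partitions.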

[0077] 303. ...

Abstract

The embodiment of the invention discloses a data ranking method and device, a server and a storage medium. The data ranking method comprises the following steps: finding data to be ranked according to data properties in a data ranking command, wherein the data to be ranked include data properties and values; partitioning the data to be ranked into two or more data partitions to be ranked according to a ranking result of standard values corresponding to the values of the data to be ranked, and determining partition dimension identifiers and total data position information corresponding to the data partitions to be ranked; and calling the data partitions to be ranked according to the partition dimension identifiers into one window partition to be ranked, and determining a sequencing result through the window partition to be ranked and the total data position information. According to the technical scheme in the embodiment of the invention, a plurality of window partitions make responses to a ranking window function for a Hive data warehouse, and the ranking speed of data stored in the Hive data warehouse is increased.
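The overall idea of the abstract can be sketched in Python under stated assumptions: partitions are equal-width ranges of the min-max standardized value, the partition index plays the role of the partition dimension identifier, and the total data position information is taken to be the number of rows in all lower-numbered partitions. These choices, and the function name `partitioned_rank`, are illustrative and are not confirmed by the patent text.

```python
from typing import Dict, List, Tuple


def partitioned_rank(values: List[float], num_partitions: int) -> List[Tuple[float, int]]:
    """Rank values in ascending order by ranking range partitions independently."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal

    # Assign each value to a partition by its standardized value (assumed rule).
    partitions: Dict[int, List[float]] = {p: [] for p in range(num_partitions)}
    for v in values:
        std = (v - lo) / span
        pid = min(int(std * num_partitions), num_partitions - 1)
        partitions[pid].append(v)

    # Position offset of each partition: rows held by lower-numbered partitions.
    offsets: Dict[int, int] = {}
    running = 0
    for pid in range(num_partitions):
        offsets[pid] = running
        running += len(partitions[pid])

    # Each partition is sorted on its own; in this sketch the loop stands in
    # for separate window partitions working independently.
    result: List[Tuple[float, int]] = []
    for pid in range(num_partitions):
        for local_rank, v in enumerate(sorted(partitions[pid]), start=1):
            result.append((v, offsets[pid] + local_rank))
    return result


ranked = partitioned_rank([42.0, 7.0, 19.0, 3.0, 28.0, 11.0], num_partitions=3)
```

Because the partitions cover disjoint, ordered value ranges, adding a partition's offset to the local rank yields the same global rank a single full sort would produce.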

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data processing, and in particular to a data sorting method, device, server and storage medium.

Background technique

[0002] Hive is a data warehouse tool based on the Hadoop distributed file system. It can map structured data files into a database table and provide a simple SQL query function. It can convert SQL statements into MapReduce tasks for execution. Its advantage is that the learning cost is low, and simple MapReduce statistics can be quickly realized through SQL-like statements, without the need to develop special MapReduce applications, which is very suitable for statistical analysis of data warehouses.

[0003] In the process of realizing the present invention, the inventor found that the prior art has the following defects: when directly using the window function to sort data on Hive, multiple window partitions will not be configured to jointly sort the data, and o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/24
CPC: G06F40/18
Inventor: 曾志华, 仇贲
Owner: GUANGZHOU HUYA INFORMATION TECH CO LTD