Method, device and equipment for processing data skew and storage medium

A technique for processing data skew, applied in the field of data processing. It addresses the increased data processing resources, reduced operating efficiency, and longer running time associated with existing approaches.

Pending Publication Date: 2020-04-21
CHINA PING AN PROPERTY INSURANCE CO LTD

AI Technical Summary

Problems solved by technology

[0004] Because a merged file must be split again to obtain a single original input file, the additional data processing resources this requires lengthen the running time, and the increase in Map data increases data processing time and lowers operating efficiency. As a result, the data skew processing problem cannot be effectively solved.

Examples

Embodiment Construction

[0089] It should be understood that the specific embodiments described here are only used to explain the present application, not to limit it. The terms "first", "second" and the like in the specification and claims of the present application and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or modules is not necessarily limited to those steps or modules expressly listed, but may include other steps or modules that are not clearly l...

Abstract

The invention relates to the field of big data, and provides a method, device and equipment for processing data skew, and a storage medium. The method comprises the following steps: presetting the capacity of a to-be-stored space, and setting the data types and target number in the to-be-stored space; partitioning the to-be-stored space according to a preset rule, based on the capacity and the data types, to obtain the size and number of sub-storage spaces corresponding to each data type; determining, through a partition rule, the size and number of target storage spaces in each partition according to the size of each sub-storage space and the target number corresponding to the data type; setting a random number for each partition according to the number of target storage spaces in that partition; marking each part of the data in the to-be-stored space with the random number and a preset judgment condition; and analyzing the content marked on each part of the data through a random grouping function, so as to store the data corresponding to each data type into the corresponding target storage space. With this scheme, the data skew processing problem can be effectively solved.
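The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not specify the "preset rule", the "partition rule", or the "preset judgment condition", so the sketch assumes a proportional split of capacity between data types and an equal division of each sub-storage space, and it omits the judgment condition. The names `plan_target_spaces` and `assign_records` are hypothetical.

```python
import random
from collections import defaultdict


def plan_target_spaces(capacity, type_shares, target_counts):
    """Split `capacity` between data types, then divide each share.

    Assumption: the split is proportional to `type_shares`, and each
    sub-storage space is divided equally into `target_counts[t]` target
    storage spaces. The source does not state either rule.
    """
    total_share = sum(type_shares.values())
    plan = {}
    for data_type, share in type_shares.items():
        sub_size = capacity * share // total_share
        count = target_counts[data_type]
        plan[data_type] = {"count": count, "size": sub_size // count}
    return plan


def assign_records(records, plan, rng):
    """Mark each record with a random number and group by (type, number).

    The random number is drawn from 0..count-1, where count is the number
    of target storage spaces planned for the record's data type, so a
    heavily populated type is spread over several target spaces.
    """
    buckets = defaultdict(list)
    for data_type, payload in records:
        mark = rng.randrange(plan[data_type]["count"])
        buckets[(data_type, mark)].append(payload)
    return buckets


if __name__ == "__main__":
    # Hypothetical workload: type "a" is nine times larger than type "b".
    plan = plan_target_spaces(1000, {"a": 3, "b": 1}, {"a": 3, "b": 1})
    records = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
    buckets = assign_records(records, plan, random.Random(0))
    for key in sorted(buckets):
        print(key, len(buckets[key]))
```

Giving the larger data type more target storage spaces means its records are divided among several groups of roughly similar size, rather than landing in one oversized group, which is the general idea behind random marking as a skew mitigation.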

Description

Technical field

[0001] The present application relates to the field of data processing, and in particular to a method, device, equipment and storage medium for processing data skew.

Background

[0002] With the rapid development of technologies such as the Internet of Things, cloud computing, and network bandwidth, big data computing is widely used. In big data computing, the transmission, storage, or processing of massive amounts of data often leads to data skew. The existing data warehouse tool Hive has no adjustable parameters or callable functions that directly solve data skew when reading data. Consequently, when Hive and other big data computing engines read skewed Hive table data, the skew cannot be handled in a timely and effective manner, so the entire task fails to complete within the stipulated time limit and business requirements are not met.

[0003] In the current data skew processing, by detect...

Application Information

IPC(8): G06F16/22, G06F16/27, G06F16/28
CPC: G06F16/2282, G06F16/278, G06F16/283, Y02D10/00
Inventor: 余可帆
Owner: CHINA PING AN PROPERTY INSURANCE CO LTD