Method and device for data processing

A data processing method and device, applied in the field of data processing, that address the long time needed to load data. The effects are fewer and shorter garbage collections, more efficient data reading, and more efficient data processing.

Status: Inactive
Publication Date: 2017-08-04
BOYAA ONLINE GAME DEV SHENZHEN
3 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0004] A data processing method is therefore needed that addresses the long time Spark takes to load data from disk and improves data loading efficiency.




Embodiment Construction

[0061] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0062] Figure 1 is a flowchart of a data processing method in an embodiment. As shown in Figure 1, the data processing method includes the following steps 102 to 112:

[0063] Step 102, configuring the memory distributed file system and the Spark cluster, and associating the memory distributed file system with the Spark cluster.

[0064] Specifically, the memory distributed file system is a distributed storage system based on off-heap memory. Off-heap memory separates data storage from data processing: the data is stored outside the JVM (Java Virtual Machine) heap space of...


Abstract

The invention relates to a method and a device for data processing. The method comprises the following steps: configuring a memory distributed file system and a Spark cluster, and associating the memory distributed file system with the Spark cluster; starting and initializing the Spark cluster; reading source data from the memory distributed file system; preprocessing the source data to obtain preprocessed data; analyzing the preprocessed data to obtain a data analysis result; and storing the data analysis result in the memory distributed file system. Because the method and device read the source data from the memory distributed file system, which provides external readers with a cross-cluster file sharing service at memory-level speed, data reading efficiency is improved. The memory distributed file system also separates data storage from data processing, which reduces the frequency and duration of garbage collection during data processing and improves data processing efficiency.
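
The read, preprocess, analyze, and store steps listed above can be sketched in plain Python. This is only an illustrative sketch, not the patented implementation: the `InMemoryFileSystem` class is a hypothetical single-process stand-in for the memory distributed file system, ordinary functions stand in for Spark jobs, and the cleaning and counting rules, the paths, and the sample records are all assumptions chosen for the example.

```python
from collections import Counter


class InMemoryFileSystem:
    """Hypothetical stand-in for the memory distributed file system."""

    def __init__(self):
        self._files = {}

    def write(self, path, records):
        # Store a copy so later changes by the caller do not affect stored data.
        self._files[path] = list(records)

    def read(self, path):
        # Return a copy so processing code cannot mutate stored data.
        return list(self._files[path])


def preprocess(records):
    """Drop blank records and normalise case and whitespace (assumed rules)."""
    return [r.strip().lower() for r in records if r.strip()]


def analyze(records):
    """Count occurrences of each record (assumed analysis)."""
    return dict(Counter(records))


def run_pipeline(fs, source_path, result_path):
    source = fs.read(source_path)                   # read source data
    cleaned = preprocess(source)                    # preprocess the source data
    result = analyze(cleaned)                       # analyze the preprocessed data
    fs.write(result_path, sorted(result.items()))   # store the analysis result
    return result


# Hypothetical paths and sample data.
fs = InMemoryFileSystem()
fs.write("/logs/events", ["Login", " login ", "", "Logout"])
result = run_pipeline(fs, "/logs/events", "/results/event_counts")
```

In this sketch both the source data and the result live in the same store, mirroring the method's choice to read from and write back to the memory distributed file system rather than disk.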

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to a data processing method and device.

Background technique

[0002] With the development of computer and network technology and the spread of electronic devices, more and more users are online, which generates a large amount of data. To quickly extract real business value from massive data, a high-performance data analysis application system is needed.

[0003] Spark is a memory-based iterative processing framework with high throughput and strong fault tolerance. Thanks to its internal scheduling mechanism and fast distributed computing capabilities, it can perform iterative calculations very quickly, and it can achieve quasi-real-time analysis and processing for some specific services. However, when Spark processes data, if the source data is stored on the disk, it needs to load the data ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/182
Inventor: 郑壮杰
Owner: BOYAA ONLINE GAME DEV SHENZHEN