
A data reading method and device

A metadata-based data reading technology, applied in the computer field, that addresses a slow data reading process and high memory consumption in the main thread.

Active Publication Date: 2021-06-08
BEIJING GRIDSUM TECH CO LTD
Cites: 7, Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] However, when data is read by running the above-mentioned SQL in Spark SQL, all the metadata of the entire data table must be read into the main thread of the distributed system. The whole data reading process is very slow, and the memory consumption of the main thread is high.




Detailed Description of Embodiments

[0047] Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments, the invention can be implemented in various forms and is not limited to the embodiments set forth here. Rather, these embodiments are provided so that the invention can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.

[0048] An embodiment of the present invention provides a data reading method, as shown in Figure 1. Instead of obtaining the data directly in the main thread of the distributed system, the method configures the files that need to be loaded in a file list in advance, and loads the content to be read from the metadata according to the path information in the file list. This increases the data reading speed and reduces the memory consumption of the main thread. The method provides the follo...



Abstract

The invention discloses a data reading method and device in the technical field of computers. Its main purpose is to increase the data reading speed and reduce the memory consumption of the main thread. The main technical solution is: obtain a list of files to be loaded, in which the path information of each file to be loaded in the metadata is recorded; load the corresponding file content from the metadata according to the path information in the file list; and perform data processing on the file content to obtain file content with the same data structure as in the metadata, generating a data reading result. The invention is mainly used for data reading.
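
The three steps named in the abstract (obtain the file list, load content by path, bring the content into a common data structure) can be illustrated with a minimal Python sketch. This is not the patented implementation and does not use Spark. The file-list format (one path per line), the JSON Lines file content, the `SCHEMA` tuple, and every function name here are assumptions made only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema standing in for "the data structure in the metadata".
SCHEMA = ("id", "name")


def get_file_list(list_path):
    """Step 1: read the file list; each non-empty line is assumed to be one path."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_file(path):
    """Step 2: load the content of one listed file (JSON Lines is assumed)."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def normalize(records, schema=SCHEMA):
    """Step 3: project each record onto the schema; missing fields become None."""
    return [{col: rec.get(col) for col in schema} for rec in records]


def read_data(list_path):
    """Combine the three steps into one data reading result."""
    result = []
    for path in get_file_list(list_path):
        result.extend(normalize(load_file(path)))
    return result


def demo():
    """Build a throwaway file list and data file, then read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        part = Path(tmp) / "part-0.json"
        part.write_text(
            '{"id": 1, "name": "a", "extra": true}\n{"id": 2}\n',
            encoding="utf-8",
        )
        file_list = Path(tmp) / "files.txt"
        file_list.write_text(f"{part}\n", encoding="utf-8")
        return read_data(file_list)


if __name__ == "__main__":
    print(demo())
```

In this sketch only the files named in the list are opened, so the reader never has to enumerate every file behind the table, which is the effect the abstract attributes to the file list.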

Description

Technical field

[0001] The present invention relates to the field of computer technology, and more particularly to a data reading method and device.

Background

[0002] With the arrival of the global information age, massive data from multimedia and the Internet has begun to spread to every industry. Traditional database technologies struggle to handle massive data, especially unstructured content, and processing and analyzing such big data has become an important and urgent need.

[0003] Big data processing platforms have evolved from the early Hadoop and HBase to the later SQL-based Hive, Spark, and others, and Spark SQL is now in common use. Spark SQL is a component of Spark, the part of the Apache Spark big data framework that processes structured data and queries Spark data. With Spark SQL, ETL operations can be performed on data in different formats (such as JSON, Parquet, and databases), followed by specific query operations.

[0004...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/27, G06F16/25, G06F9/50
CPC: G06F9/5016, G06F16/25, G06F16/27
Inventor: 陈克凡
Owner: BEIJING GRIDSUM TECH CO LTD