Small file merge method and data acquisition system

A small file merging and data query technology, applied in the database field, that addresses the large I/O overhead incurred when merging small files, with the effects of improving merge efficiency and reducing startup time.

Active Publication Date: 2017-01-04
BEIJING GRIDSUM TECH CO LTD
6 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

[0006] Embodiments of the present invention provide a small file merging method and a data query system that at least solve the technical problem of the large I/O overhead caused by storing intermediate temporary results while merging small files.

Embodiment Construction

[0027] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, shall fall within the protection scope of the present invention.

[0028] It should be noted that the terms "first" and "second" in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and do not necessarily describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate c...

Abstract

The invention discloses a small file merge method and a data acquisition system. The small file merge method comprises the following steps: the data acquisition system receives a merge request from a client; the data acquisition system generates a merge task according to the merge request; and the data acquisition system distributes the merge task to a data merge process running on a data storage node of a database, where the data merge process merges the small files stored on that data storage node to obtain merged files. The data merge process is a process of the data acquisition system. The method solves the technical problem of the large I/O overhead caused by storing intermediate temporary results while merging small files.
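The three steps in the abstract (receive a merge request, generate a merge task, distribute it to a merge process on the storage node) can be sketched in Python as follows. This is only an illustrative in-memory model, not the patented implementation: the names `MergeTask`, `MergeProcess`, `DataSystem`, the request fields `nodes` and `max_small_file_bytes`, and the size-threshold rule for what counts as a "small file" are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MergeTask:
    """Hypothetical task generated from a client's merge request."""
    node_id: str
    max_small_file_bytes: int  # assumed rule: files at or below this size are "small"


@dataclass
class MergeProcess:
    """Stands in for the data merge process running on one storage node."""
    node_id: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def run(self, task: MergeTask) -> str:
        # Pick the small files held by this node, in a stable order.
        small = sorted(name for name, data in self.files.items()
                       if len(data) <= task.max_small_file_bytes)
        if len(small) < 2:
            return ""  # nothing worth merging on this node
        # Merge locally on the node that already stores the files.
        merged = b"".join(self.files.pop(name) for name in small)
        merged_name = f"merged-{self.node_id}"
        self.files[merged_name] = merged
        return merged_name


class DataSystem:
    """Toy coordinator that owns one merge process per storage node."""

    def __init__(self, processes: List[MergeProcess]) -> None:
        self.processes = {p.node_id: p for p in processes}

    def handle_merge_request(self, request: dict) -> Dict[str, str]:
        # Steps 1 and 2: receive the request and generate one task per node.
        tasks = [MergeTask(node_id, request["max_small_file_bytes"])
                 for node_id in request["nodes"]]
        # Step 3: hand each task to the merge process on the matching node.
        return {t.node_id: self.processes[t.node_id].run(t) for t in tasks}


if __name__ == "__main__":
    node = MergeProcess("dn1", {"a": b"12", "b": b"34", "big": b"x" * 100})
    system = DataSystem([node])
    print(system.handle_merge_request(
        {"nodes": ["dn1"], "max_small_file_bytes": 10}))
    print(sorted(node.files))
```

In this sketch the merge runs inside the object that holds the files, which mirrors the abstract's point that the merge process runs on the data storage node itself.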

Description

Technical field

[0001] The invention relates to the field of databases, and in particular to a method for merging small files and a data query system.

Background

[0002] For data files stored in the Hadoop Distributed File System (HDFS), the address information of each data file is mapped into the memory of the Namenode. When a client reads a data file in HDFS, it first looks up the file's address information in Namenode memory; the Namenode returns the corresponding address information, and the client then reads the data file directly from the Datanode using that address. As the data in HDFS grows, the number of files increases and so does the address information that must be mapped, which occupies a large amount of Namenode memory. If an HDFS system contains a large number of small files, a large amount of Namenode memory will be consumed. ...
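The read path and the memory pressure described in the background can be illustrated with a toy Python model. It does not use the real HDFS API: `ToyNamenode`, `ToyDatanode`, and `client_read` are hypothetical names, and the model keeps one in-memory entry per file purely to show that the mapping grows with the file count.

```python
from typing import Dict


class ToyDatanode:
    """Holds file contents; stands in for an HDFS Datanode."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.blobs: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.blobs[path] = data

    def read(self, path: str) -> bytes:
        return self.blobs[path]


class ToyNamenode:
    """Keeps a path -> datanode mapping in memory, one entry per file."""

    def __init__(self) -> None:
        self.locations: Dict[str, str] = {}

    def register(self, path: str, node_id: str) -> None:
        self.locations[path] = node_id

    def lookup(self, path: str) -> str:
        return self.locations[path]


def client_read(namenode: ToyNamenode,
                datanodes: Dict[str, ToyDatanode],
                path: str) -> bytes:
    # The client first asks the namenode where the file lives...
    node_id = namenode.lookup(path)
    # ...then reads the data directly from that datanode.
    return datanodes[node_id].read(path)


if __name__ == "__main__":
    namenode = ToyNamenode()
    datanodes = {"dn1": ToyDatanode("dn1")}
    # Store many small files: every one adds an entry to namenode memory.
    for i in range(1000):
        path = f"/logs/part-{i}"
        datanodes["dn1"].write(path, b"x")
        namenode.register(path, "dn1")
    print(len(namenode.locations))
    print(client_read(namenode, datanodes, "/logs/part-7"))
```

The number of entries in `namenode.locations` depends on how many files exist rather than on how much data they hold, which is why many small files are costly for the Namenode.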

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/113, G06F16/14, G06F16/182
Inventor: 谢宁
Owner: BEIJING GRIDSUM TECH CO LTD