Resource management method for massive unstructured data

An unstructured-data management technology, applied in the field of resource management for massive unstructured data, that addresses time-consuming processing and achieves effective organization and management, efficient querying, and improved data processing efficiency.

Inactive Publication Date: 2016-06-15
ENC DATA SERVICE CO LTD

AI Technical Summary

Problems solved by technology

[0008] To address the technical problem that urban unstructured data is growing ever larger and more time-consuming to process, the present invention proposes a resource management method for massive unstructured data that can effectively organize and manage such data.



Examples


Embodiment Construction

[0019] The present invention is described in detail below with reference to the accompanying drawings. The following examples do not limit the invention. Changes and advantages that would occur to those skilled in the art, without departing from the spirit and scope of the inventive concept, are all included in the present invention.

[0020] Figure 1 is a flow chart of data processing according to the method of the present invention. It comprises three parts: the data storage process; the creation of the metadata table, the data index tables, and the metadata index; and the data request handling process.

[0021] The data storage process is described in detail below. It includes creating data tables on HBase and storing the original data on HBase or HDFS as required. As shown in Figure 1, the details are as follows:

[0022] Step a1: First, create a corresponding data table on HBase according to the data to be uploaded. The content of the created data tab...
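The storage step can be illustrated with a minimal Python sketch. It does not use a real HBase or HDFS client: `InMemoryStore` is a hypothetical in-memory stand-in, and the threshold value, the column names, and the HDFS path layout are all assumptions made for illustration. The source states only that the storage location depends on file size and that HDFS suits large files; keeping a pointer row in HBase for files placed on HDFS is likewise an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff; the source does not give a concrete size.
SIZE_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class InMemoryStore:
    """In-memory stand-in for HBase tables and HDFS files (not a real client)."""
    hbase: dict = field(default_factory=dict)  # table -> {row_key: {column: value}}
    hdfs: dict = field(default_factory=dict)   # path -> bytes

    def create_table(self, table):
        # Step a1: make sure the data table exists before uploading.
        self.hbase.setdefault(table, {})

    def store(self, table, row_key, payload, threshold=SIZE_THRESHOLD_BYTES):
        """Store payload by size and return the backend that holds the bytes."""
        self.create_table(table)
        if len(payload) <= threshold:
            # Small file: keep the bytes directly in the HBase row.
            self.hbase[table][row_key] = {"data:content": payload}
            return "hbase"
        # Large file: write to HDFS and keep only a pointer in HBase
        # (the pointer column is an assumption of this sketch).
        path = f"/data/{table}/{row_key}"
        self.hdfs[path] = payload
        self.hbase[table][row_key] = {"data:hdfs_path": path}
        return "hdfs"
```

With this layout a reader always starts from the HBase row and follows the pointer only when the content column is absent, so callers do not need to know in advance where a file was placed.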


Abstract

The invention provides a resource management method for massive unstructured data. The method comprises the following steps: (a) the storage mode for each file of unstructured data is determined by the file's size, and the file is stored on HDFS or in HBase accordingly; (b) the metadata of the data is stored in HBase, and metadata index tables are built from the themes, tags, and other information in the metadata to speed up queries; (c) when metadata is queried, the metadata index tables are searched by the theme or tag of interest, so that the relevant data table is located quickly; (d) when unstructured data records are queried, the data index table corresponding to the data table is found according to the naming rule for data index tables, the semantic tags of the data are looked up in that index table to obtain the primary keys (row keys) of the required records, and the records are then located quickly in the data table by those keys. With this method, massive unstructured data can be organized and managed effectively and queried quickly and efficiently.
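The two lookup paths in steps (b) to (d) can be sketched in Python as follows. This is an illustrative in-memory model, not the patented implementation and not an HBase API: the `Catalog` class, its method names, and the `<table>_index` naming rule are all hypothetical, since the source does not state the actual naming rule or table schemas.

```python
from collections import defaultdict


def index_table_name(data_table):
    # Hypothetical naming rule; the source only says that such a rule exists.
    return f"{data_table}_index"


class Catalog:
    """In-memory model of metadata index tables and data index tables."""

    def __init__(self):
        self.metadata_index = defaultdict(set)  # theme or tag -> data table names
        self.tables = defaultdict(dict)         # table name -> {key: value}

    def register_table(self, data_table, themes_and_tags):
        # Step (b): index the table's metadata by theme and tag.
        for key in themes_and_tags:
            self.metadata_index[key].add(data_table)

    def put_record(self, data_table, row_key, record, semantic_tags):
        # Store the record, then map each semantic tag to its row key
        # in the data index table derived from the naming rule.
        self.tables[data_table][row_key] = record
        index = self.tables[index_table_name(data_table)]
        for tag in semantic_tags:
            index.setdefault(tag, []).append(row_key)

    def find_tables(self, theme_or_tag):
        # Step (c): locate data tables through the metadata index.
        return sorted(self.metadata_index.get(theme_or_tag, set()))

    def find_records(self, data_table, semantic_tag):
        # Step (d): index table -> row keys -> records in the data table.
        index = self.tables.get(index_table_name(data_table), {})
        rows = self.tables.get(data_table, {})
        return [rows[key] for key in index.get(semantic_tag, [])]
```

A query therefore touches only the index entry for the requested tag and the matching rows, rather than scanning the whole data table.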

Description

Technical field

[0001] The invention relates to the field of the distributed database HBase and the distributed file system HDFS, and in particular to a resource management method for massive unstructured data.

Background art

[0002] HDFS stands for Hadoop Distributed Filesystem; it is the flagship file system of Hadoop. Its design derives from the Google File System (GFS), and it suits a write-once, read-many access pattern, which matches the application scenarios of urban multi-source data. It is a distributed file system suited to storing large files and can serve as a data source for Hadoop and Spark.

[0003] HBase is an open-source distributed database modeled on Google Bigtable. It is not a traditional relational database. It was originally designed to overcome the theoretical and practical limitations of traditional relational databases when handling massive data. Since the underlying data of HBase is ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/134, G06F16/148, G06F16/182
Inventor: 张善海, 熊贵喜, 蔡朝辉, 杜博文, 凌萍, 谢志普
Owner: ENC DATA SERVICE CO LTD