Big data indexing method and system

A big data indexing technology, applied to database indexing in big data settings, that addresses complex storage problems and improves indexing efficiency and space utilization.

Active Publication Date: 2013-07-10
TSINGHUA UNIV
2 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0003] Big data storage that supports temporal continuity is a very complex problem.



Examples


Embodiment 1

[0038] Embodiment 1: Application of pLSM index engine unit in wireless sensor network

[0039] The Internet of Things contains sensors in large numbers and of many types. One specific application scenario is a deep-sea exploration project. A large number of sensors are distributed over an ocean area to collect information such as seawater velocity, temperature, and salinity in real time. The information is organized into structured and unstructured data, and each sensor regularly sends the data it collects to the data center server unit through a gateway. The database management system of the data center server unit supports dynamic insertion and real-time analysis of the data streams by building pLSM indexes.

[0040] In this embodiment, the data stream is stored as key-value pairs. The key of each record is the UID uniformly assigned by the database management system, and the value is vari...

Embodiment 2

[0053] Embodiment 2: Steps of related index operations supported by the pLSM index engine unit

[0054] 1. Insertion: first insert the record into the memory component C0. If C0 reaches its capacity limit after the insertion, check whether the external memory component C1 has also reached its capacity limit. If C1 has reached its limit, all records in C1 are rolled into C2. Otherwise, the records in C0 are inserted into C1 as a batch, C0 is cleared, and new records are again inserted into C0.
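The insertion steps above can be sketched roughly as follows. This is a toy model, not the patented implementation: the class name `ThreeLevelIndex`, the dictionary-backed components, and the capacity values are all assumptions made for illustration. The sketch follows the wording of the paragraph literally, so when C1 is full the roll into C2 happens first and the C0 batch is flushed on the following insertion.

```python
class ThreeLevelIndex:
    """Toy model of the C0 -> C1 -> C2 insertion path.

    Dictionaries stand in for the real components, and the capacity
    limits are hypothetical values chosen only for the example.
    """

    def __init__(self, c0_limit=2, c1_limit=4):
        self.c0_limit = c0_limit
        self.c1_limit = c1_limit
        self.c0 = {}  # memory component
        self.c1 = {}  # first external memory component
        self.c2 = {}  # second external memory component

    def insert(self, key, value):
        # Step 1: every record goes into the memory component first.
        self.c0[key] = value
        if len(self.c0) < self.c0_limit:
            return
        # Step 2: C0 is full, so look at C1.
        if len(self.c1) >= self.c1_limit:
            # C1 is full as well: roll all of its records into C2.
            self.c2.update(self.c1)
            self.c1.clear()
        else:
            # C1 has room: move the C0 batch into C1 and clear C0.
            self.c1.update(self.c0)
            self.c0.clear()
```

With `c0_limit=2` and `c1_limit=4`, inserting six records leaves the first four in C2 and the last two still waiting in C0; the seventh insertion then flushes C0 into the now empty C1.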

[0055] 2. Query: to query the data whose key is 100, first search the memory component C0 and return the corresponding value if it is found. Otherwise, check at the same time whether key 100 tests true in Delete Filter module 1 and in Delete Filter module 2, which correspond to the external memory components C1 and C2. If both results are true, the search fails and the query ends. Otherwise, turn to the Bloom Filter module of the component whose result is false ...
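A minimal sketch of this query path is given below. Several parts are assumptions, since the source text is truncated: the Delete Filter is modeled as a plain Python set, the Bloom filter is a simple SHA-256-based one, and the step after the Bloom Filter check (searching the component's records when the filter reports a possible hit) is a guess at the elided text. The names `BloomFilter`, `ExternalComponent`, and `query` are hypothetical.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ExternalComponent:
    """Stand-in for C1 or C2 with its Bloom Filter and Delete Filter."""

    def __init__(self):
        self.records = {}
        self.bloom = BloomFilter()
        self.deleted = set()  # assumption: a set models the Delete Filter

    def put(self, key, value):
        self.records[key] = value
        self.bloom.add(key)

    def mark_deleted(self, key):
        self.deleted.add(key)


def query(key, c0, c1, c2):
    # Search the memory component first.
    if key in c0:
        return c0[key]
    # Keep only the components whose Delete Filter result is false.
    candidates = [c for c in (c1, c2) if key not in c.deleted]
    if not candidates:
        return None  # both Delete Filters are true: the search fails
    # Assumed continuation: consult the Bloom Filter, then the records.
    for component in candidates:
        if component.bloom.might_contain(key) and key in component.records:
            return component.records[key]
    return None
```

The final `key in component.records` check keeps the result correct even when the Bloom filter returns a false positive.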



Abstract

The invention discloses a big data indexing method and system and relates to the technical field of database indexing. The method comprises the following steps: sending the data generated by all terminal devices connected to a data center server unit into a NoSQL (Not Only Structured Query Language) database unit; establishing a pLSM (physical Log-Structured Merge) indexing engine unit; and performing index operations on behalf of a user. The system comprises the data center server unit, the NoSQL database unit and the pLSM indexing engine unit. The data center server unit receives the data generated by all connected terminal devices and sends it to the NoSQL database unit. The NoSQL database unit stores the data sent by the data center server unit. The pLSM indexing engine unit uses a COLA (Cache-Oblivious Lookahead Array) as its external memory component and a Skip List as its internal memory component, and adds Delete Filter modules in internal memory to support delete operations on the external memory component.
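To give a feel for the external memory component named in the abstract, the following sketch shows a simplified, in-memory version of the basic COLA idea: level k holds either zero or 2^k sorted keys, and an insertion merges full levels upward much like carrying in a binary counter. It omits the lookahead pointers that give the structure its name, assumes distinct keys, and is not taken from the patent; the class name `BasicCOLA` is hypothetical.

```python
from bisect import bisect_left
from heapq import merge


class BasicCOLA:
    """Simplified COLA: sorted levels of size 2**k, no lookahead pointers."""

    def __init__(self):
        # levels[k] is either an empty list or a sorted list of 2**k keys.
        self.levels = []

    def insert(self, key):
        carry = [key]
        for k, level in enumerate(self.levels):
            if not level:
                # First empty level absorbs the merged run.
                self.levels[k] = carry
                return
            # Level is full: merge it into the carry and empty it.
            carry = list(merge(level, carry))
            self.levels[k] = []
        # Every existing level was full, so open a new, larger one.
        self.levels.append(carry)

    def contains(self, key):
        # Without lookahead pointers, binary-search each level in turn.
        for level in self.levels:
            i = bisect_left(level, key)
            if i < len(level) and level[i] == key:
                return True
        return False
```

Because each merge reads and writes sorted runs sequentially, this layout suits batch-oriented external storage; the lookahead pointers of the full structure exist to speed up the per-level searches shown in `contains`.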

Description

Technical field

[0001] The invention relates to the technical field of database indexing for big data, and in particular to a big data indexing method and system.

Background

[0002] As informatization proceeds, the volume of personal and enterprise data is growing rapidly. With the rise of social networks, e-commerce and Internet of Things technologies, mobile terminals, sensors and traditional devices are continuously generating unstructured data of many kinds. In 2011, the McKinsey Global Institute published a research report stating that data has penetrated into various industries and business functions and has gradually become an important factor of production in human society. By the end of 2011, the total amount of global data had reached 1.9 ZB (1 ZB = 1×10^12 GB); it was projected to reach 8 ZB by 2015 and approximately 35 ZB by 2020. In the face of huge and rapidly growing data, efficient management and analysis of data has beco...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 张勇, 王津, 高旸, 邢春晓
Owner: TSINGHUA UNIV