Method and device for caching massive data

A high-speed massive-data technology applied in the field of data processing. It addresses the data redundancy of existing approaches and their poor fit for data processing on statistical analysis platforms, improving efficiency and breaking through capacity limitations.

Active Publication Date: 2016-12-07
深圳希施玛数据科技有限公司

AI Technical Summary

Problems solved by technology

[0003] The embodiments of the present invention provide a method for caching massive data. It addresses a problem in the prior art: storing and processing massive data in a relational database tends to produce a large amount of data redundancy and is poorly suited to data processing on a statistical analysis platform.


Examples

Embodiment 1

[0024] Figure 1 shows the implementation flow of the massive data caching method provided by the first embodiment of the present invention. The process is described in detail as follows:

[0025] In step S101, the original data acquired from the database is converted into data in a standardized matrix format, and the converted data is cached in mat file format.

[0026] In this embodiment, the data in the standardized matrix format is MATLAB standard matrix data, and the mat file is the standard format for MATLAB data storage. A mat file is a standard binary file; the data can also be saved and loaded as ASCII text.

[0027] Converting the raw data obtained from the database into the standardized matrix format specifically includes:

[0028] Obtaining data in cell format from the database. Specifically, the original data is obtained from the database through multiple data adapters and arrives on the MATLAB platform in a cell matrix format, eac...
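The conversion from record-style cell data to a matrix can be illustrated with a minimal sketch. The source text is truncated here, so the details are assumptions: the `(date, code, value)` record layout, the function name `cells_to_matrix`, and the use of NaN for absent entries are all hypothetical, and Python is used only as a stand-in for the MATLAB platform the embodiment names.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (date, code, value). Not taken from the patent.
Row = Tuple[str, str, float]


def cells_to_matrix(rows: Sequence[Row]) -> Tuple[List[str], List[str], List[List[float]]]:
    """Pivot record-style rows into a dense matrix plus its axis labels."""
    dates = sorted({r[0] for r in rows})   # row labels
    codes = sorted({r[1] for r in rows})   # column labels
    date_idx = {d: i for i, d in enumerate(dates)}
    code_idx = {c: j for j, c in enumerate(codes)}

    nan = float("nan")                     # marks cells with no record
    matrix = [[nan] * len(codes) for _ in dates]
    for date, code, value in rows:
        matrix[date_idx[date]][code_idx[code]] = float(value)
    return dates, codes, matrix
```

Storing each date and code once as an axis label, rather than once per record, is one way a matrix layout can reduce the redundancy of record-type storage.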

Embodiment 2

[0055] Figure 2 shows an example of the architecture of the massive data cache device provided by the second embodiment of the present invention. As shown in Figure 2, the architecture comprises, from bottom to top, a database layer, an adapter layer, a data interface routing layer, a data extraction and cache scheduling layer, and a graphical user interaction layer. The specific role of each layer is as follows:

[0056] 1) Database layer

[0057] The database layer mainly stores original data through the database, and extracts the original data through a series of stored procedures.

[0058] 2) Adapter layer

[0059] The adapter layer mainly extracts original data from the database and converts the extracted original data into MATLAB standard matrix data. The adapter layer includes a plurality of data adapters, each of which handles one type of data.
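The idea of one adapter per data type can be sketched as a small registry. This is an illustration only: the names `ADAPTERS`, `register`, `extract`, and the `"price"` and `"volume"` data types are hypothetical, and Python stands in for the MATLAB platform described in the text.

```python
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]
Adapter = Callable[[Sequence[tuple]], Matrix]

# Registry mapping a data type name to the adapter that handles it.
ADAPTERS: Dict[str, Adapter] = {}


def register(data_type: str) -> Callable[[Adapter], Adapter]:
    """Decorator that registers an adapter for one type of data."""
    def wrap(fn: Adapter) -> Adapter:
        ADAPTERS[data_type] = fn
        return fn
    return wrap


@register("price")
def price_adapter(raw_rows: Sequence[tuple]) -> Matrix:
    # Hypothetical: prices arrive as strings and become floats.
    return [[float(v) for v in row] for row in raw_rows]


@register("volume")
def volume_adapter(raw_rows: Sequence[tuple]) -> Matrix:
    # Hypothetical: volumes arrive as integer strings.
    return [[float(int(v)) for v in row] for row in raw_rows]


def extract(data_type: str, raw_rows: Sequence[tuple]) -> Matrix:
    """Route raw rows to the adapter registered for their data type."""
    try:
        adapter = ADAPTERS[data_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {data_type!r}") from None
    return adapter(raw_rows)
```

Keeping the adapters behind a single `extract` entry point means a new data type can be supported by registering one more function, without changing the callers.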

[0060] The specific application of the adapter layer is as follows: fi...

Embodiment 3

[0073] Figure 3 shows the composition of the massive data cache device provided by the third embodiment of the present invention. For convenience of description, only the parts related to this embodiment are shown.

[0074] The massive data cache device can be a software unit, a hardware unit, or a combination of software and hardware running in a terminal device. It can also be integrated into the terminal device as an independent plug-in, or run within the application system of the terminal device.

[0075] The massive data cache device includes a first cache unit 31, a judgment unit 32, a data acquisition unit 33, and a data filling unit 34. The specific functions of each unit are as follows:

[0076] The first cache unit 31 is used to convert the raw data obtained in the database into data in a standardized matrix format, and cache the converted data in a mat file format;

[0077] A judging unit 32, configured to jud...


Abstract

The present invention is applicable to the technical field of data processing and provides a method and device for caching massive data. The method includes: converting the original data obtained from the database into data in a standardized matrix format and caching the converted data in mat file format; when a user's data request is received, judging whether the cached mat file contains all of the data corresponding to the request; if not, using a statistical model algorithm to obtain the missing data from the database; converting the missing data into the standardized matrix format, filling it into the mat file, and caching it in mat file format; and feeding the data corresponding to the request back to the user from the filled mat file. The invention can greatly reduce data storage redundancy and improve the efficiency of data storage and reading.
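The request flow in the abstract (check the cache, fetch only what is missing, fill the cache, then answer from it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a pickle file stands in for the mat file, the key-to-column cache layout is hypothetical, and `fetch_from_db` is a hypothetical callable representing whatever retrieves the missing data from the database.

```python
import os
import pickle
import tempfile
from typing import Callable, Dict, Iterable, List

# Hypothetical cache layout: key -> one column of matrix data.
Cache = Dict[str, List[float]]


def load_cache(path: str) -> Cache:
    """Read the cache file, or return an empty cache if none exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as fh:
        return pickle.load(fh)


def save_cache(path: str, cache: Cache) -> None:
    """Persist the whole cache (a pickle file stands in for the mat file)."""
    with open(path, "wb") as fh:
        pickle.dump(cache, fh)


def get_data(path: str, keys: Iterable[str],
             fetch_from_db: Callable[[List[str]], Cache]) -> Cache:
    """Serve a request from the cache, filling in missing keys first."""
    wanted = list(keys)
    cache = load_cache(path)
    missing = [k for k in wanted if k not in cache]
    if missing:
        # Only the missing part is requested from the database.
        cache.update(fetch_from_db(missing))
        save_cache(path, cache)
    return {k: cache[k] for k in wanted}
```

In this sketch a second request that overlaps the first only triggers a database fetch for the keys that are not yet cached.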

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method and device for caching massive data.

Background

[0002] With the advent of the big data era, the demand for processing and storing massive data is increasing. At present, relational databases are mainly used to store and process massive data. However, because a relational database stores record-type data, it tends to contain a large amount of redundant data, and the data must be converted before it can be used in statistical analysis, which is very inconvenient. Moreover, requesting different data from a relational database requires joining different tables and calling separate query interfaces, so data acquisition is inefficient. In addition, although the relational database provides a memory-level cache function, so that requested data is cached in the server's memory and the next identical request is much faster....


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/06, G06F17/30
Inventor: 林健武, 李倬, 杨波, 凌宗平
Owner: 深圳希施玛数据科技有限公司