A scalable learning index method and system

An indexing and storage system technology, applied to scalable learning index methods and systems. It addresses the difficulty of adding or removing models and data, reduced data access efficiency, and large space and time overhead, with the effects of ensuring data availability, improving scalability, and reducing time overhead.

Active Publication Date: 2022-05-20
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, because the models depend heavily on one another, changing one model changes the others, which makes it difficult to add or remove a particular part of the models and data.
[0005] (2) Expensive overhead: Because the high dependency between models prevents data from being inserted easily, some learning index methods keep newly inserted data in a buffer so that it does not disturb the original data distribution. As a result, every data access must consult two structures (i.e., the original structure and the buffer), which greatly reduces data access efficiency.
To separate data and store it elsewhere, existing learning index methods either split apart the data covered by multiple models or build a data conversion table to separate and migrate the data. Both approaches require retraining and rebuilding, and introduce substantial space and time overhead.
[0006] Overall, existing learning index methods suffer from poor scalability.
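
The two-structure access cost described in [0005] can be illustrated with a minimal Python sketch. The class name `BufferedIndex` and its methods are hypothetical, and a plain binary search stands in for the learned model over the original data; this is an illustration of the buffer approach under those assumptions, not any particular existing method.

```python
import bisect


class BufferedIndex:
    """Hypothetical buffer-based index: trained data plus an insert buffer."""

    def __init__(self, sorted_keys):
        # Original structure: the data the models were trained on.
        # (Binary search is an assumed stand-in for model prediction.)
        self.base = list(sorted_keys)
        # Buffer: newly inserted keys, kept sorted, never merged into base.
        self.buffer = []

    def insert(self, key):
        # New data goes to the buffer so the trained distribution is untouched.
        bisect.insort(self.buffer, key)

    @staticmethod
    def _contains(arr, key):
        i = bisect.bisect_left(arr, key)
        return i < len(arr) and arr[i] == key

    def lookup(self, key):
        """Return (found, number_of_structures_probed)."""
        if self._contains(self.base, key):
            return True, 1
        # A miss in the original structure forces a second probe.
        return self._contains(self.buffer, key), 2


idx = BufferedIndex([1, 5, 9])
idx.insert(7)
print(idx.lookup(5))  # found in the original structure
print(idx.lookup(7))  # found only after probing the buffer as well
```

Any key that was inserted after training, and any key that is absent, costs two probes in this sketch, which is the inefficiency the paragraph above points to.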



Embodiment Construction

[0062] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0063] In the present invention, the terms "first", "second" and the like (if any) in the description and drawings are used to distinguish similar objects, and do not necessarily describe a specific order or sequence.

[0064] A traditional tree index model, as shown in (a) of Figure 1, can be regarded as a prediction model, that is, one that predicts the location of the data storage...



Abstract

The invention discloses a scalable learning index method and system, belonging to the field of computer data storage. The method includes: sampling key-value pairs in a key-value storage system to obtain an ordered training data set; using the training data set to train multiple linear regression models, where each linear regression model indexes the data in one data interval and the data intervals covered by the models do not overlap; recording each linear regression model as a <key, model> pair, where key is the maximum data item in the range covered by the model and model holds the model parameters; and using hierarchical bucket structures to process newly inserted data. Each hierarchical bucket structure corresponds to one data item that participated in training. Each hierarchical bucket structure includes a parent bucket whose data is ordered; each parent bucket corresponds to a sub-bucket whose data is also ordered, and the data in the sub-bucket is smaller than the data in the corresponding parent bucket. The invention can effectively improve the scalability of the learning index.
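
The model-building and lookup steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `fit_line` and `PiecewiseLinearIndex`, the fixed `segment_size` partitioning, and the local scan used to correct a prediction are all hypothetical choices, and the hierarchical bucket structure for insertions is left out because the abstract does not specify it in enough detail.

```python
import bisect


def fit_line(keys, positions):
    """Least-squares fit of position ~ slope * key + intercept."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = sum(positions) / n
    var = sum((k - mean_k) ** 2 for k in keys)
    if var == 0:  # single key (or identical keys): constant prediction
        return 0.0, mean_p
    cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(keys, positions))
    slope = cov / var
    return slope, mean_p - slope * mean_k


class PiecewiseLinearIndex:
    """Non-overlapping linear models, each recorded as <max key, model>."""

    def __init__(self, sorted_keys, segment_size=4):
        self.keys = list(sorted_keys)
        self.max_keys = []  # the "key" part of each <key, model> pair
        self.models = []    # the "model" part: (slope, intercept, start, end)
        # Assumption: equal-sized segments; the source does not say how
        # the data intervals are chosen.
        for start in range(0, len(self.keys), segment_size):
            end = min(start + segment_size, len(self.keys))
            segment = self.keys[start:end]
            slope, intercept = fit_line(segment, range(start, end))
            self.max_keys.append(segment[-1])
            self.models.append((slope, intercept, start, end))

    def lookup(self, key):
        """Return the position of key in the sorted data, or None."""
        # The first model whose maximum key is >= key covers this key.
        i = bisect.bisect_left(self.max_keys, key)
        if i == len(self.models):
            return None
        slope, intercept, start, end = self.models[i]
        # Predict a position, clamp it to the model's own interval.
        pos = max(start, min(end - 1, int(round(slope * key + intercept))))
        # Correct the prediction with a short scan inside the interval.
        while pos > start and self.keys[pos] > key:
            pos -= 1
        while pos < end - 1 and self.keys[pos] < key:
            pos += 1
        return pos if self.keys[pos] == key else None


index = PiecewiseLinearIndex([2, 3, 5, 8, 13, 21, 34, 55, 89], segment_size=3)
print(index.max_keys)
print(index.lookup(13), index.lookup(4))
```

Because each model only covers its own interval and is located through its maximum key, a model and its data can be retrained or replaced in this sketch without touching the other entries.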

Description

Technical field

[0001] The invention belongs to the field of computer data storage, and more specifically relates to a scalable learning index method and system.

Background

[0002] In today's big data era, how to store and access data efficiently has become an important concern across many fields. Computer systems usually use various index structures to store and access data efficiently according to different requirements, among which the tree index structure is an important structure for serving range requests. Many existing methods, such as CSS-Tree, CSB+-tree, and FAST, use memory, cache, or SIMD (Single Instruction, Multiple Data) optimizations so that the tree structure provides fast data access, but these structures usually take up a lot of memory space. Once these structures overflow the limited memory because of ever-increasing data, the efficiency of data access is seriously reduced.

[0003] The exist...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N20/00, G06F16/901
CPC: G06N20/00, G06F16/901
Inventor: 华宇, 李鹏飞
Owner: HUAZHONG UNIV OF SCI & TECH