Data indexing method based on string suffixes

A data indexing technology based on string suffixes, applied in the field of data indexing. It addresses problems such as complex index construction, long index update delays, and the inability to achieve transaction consistency, thereby improving query efficiency and avoiding bottlenecks.

Active Publication Date: 2017-10-24
成都索贝视频云计算有限公司
Cites: 6; Cited by: 11

AI Technical Summary

Problems solved by technology

[0008] This method not only shares the problems of the second method; its index update delay is also long, transaction consistency cannot be achieved, and the index is very complicated to build.


Embodiment Construction

[0037] The technical solution of the present invention will be further described in detail below in conjunction with the accompanying drawings, but the protection scope of the present invention is not limited to the following description.

[0038] As shown in Figure 1,

[0039] A data indexing method based on string suffixes comprises the following two parts:

[0040] [The process of creating an index]

[0041] S1: Write data

[0042] Modify or insert new data to form a new table, enable transaction locks, and lock the new table to avoid dirty data.

[0043] Synchronize the new table data to the old table, and ensure data consistency through timestamps between the new table and the old table.

[0044] Copy the updated data to the index buffer, extract the metadata and row ID, and add a string suffix.

[0045] The index buffer serves two main functions: first, to protect the original data; second, to extract and store metadata, and build a...
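The staging step in [0044] can be sketched roughly as follows. This is only an illustration under assumptions: the names `stage_row`, `buffer`, and the `id` and `title` fields are hypothetical and do not appear in the patent, and "add a string suffix" is read here as producing one entry per character position of the indexed string.

```python
def stage_row(buffer, row, text_column):
    """Copy a row into the index buffer and emit (suffix, row_id) pairs.

    Hypothetical helper: the field names and the buffer layout are
    assumptions made for illustration only.
    """
    staged = dict(row)          # work on a copy; the original row is untouched
    row_id = staged["id"]       # row ID extracted from the copied data
    text = staged[text_column]
    # One entry per character position, i.e. every suffix of the string.
    for start in range(len(text)):
        buffer.append((text[start:], row_id))


buffer = []
stage_row(buffer, {"id": 7, "title": "视频云"}, "title")
print(buffer)  # [('视频云', 7), ('频云', 7), ('云', 7)]
```

Splitting at every character rather than at word boundaries is what lets the index avoid word segmentation altogether.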


Abstract

The invention discloses a data indexing method based on string suffixes. The method comprises an index creation step and a data indexing step. The index creation step comprises the following substeps: S1, data is written, the metadata and row ID are extracted, and the string suffixes are added; S2, an index is created; and S3, transaction judgment is performed, in which the data writing transaction is judged: if writing succeeds, a transaction lock is opened, and if writing fails, logical deletion is performed and the data is recovered. The data indexing step comprises the following substeps: S01, index matching is performed; S02, an index pointer list is acquired, in which the encoded value of the index is quickly located in a B+ tree, and the leaf nodes of that value form the index pointer list containing the keywords of the index; and S03, the data is read, and the index result is found according to the index pointer array. Starting from the language systems of non-Latin language families, the method uses a string suffix sorting algorithm and a B+ tree algorithm to construct an index whose unit is the character, solving the efficiency and accuracy problems of fuzzy queries.
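As a rough illustration of steps S2 and S01 to S03, the sketch below builds a sorted list of character-level suffixes and answers a substring query by binary search. It is not the patented implementation: a sorted Python list with `bisect` stands in for the B+ tree, the encoding step is omitted, and the names `build_index`, `search`, and the sample rows are assumptions.

```python
import bisect


def build_index(rows):
    """Return a sorted list of (suffix, row_id) pairs for all rows.

    rows maps row_id -> text. The sorted list is a stand-in for the
    B+ tree described in the abstract (an assumption of this sketch).
    """
    entries = [
        (text[start:], row_id)
        for row_id, text in rows.items()
        for start in range(len(text))
    ]
    entries.sort()
    return entries


def search(index, keyword):
    """Return the sorted row IDs whose text contains keyword."""
    if not keyword:
        return []
    # First entry whose suffix is >= keyword; suffixes that start with
    # keyword form one contiguous run beginning at this position.
    pos = bisect.bisect_left(index, (keyword,))
    row_ids = set()
    while pos < len(index) and index[pos][0].startswith(keyword):
        row_ids.add(index[pos][1])
        pos += 1
    return sorted(row_ids)


rows = {1: "视频云计算", 2: "云计算平台", 3: "数据索引"}
index = build_index(rows)
print(search(index, "云计算"))  # [1, 2]
print(search(index, "索引"))    # [3]
```

Because every substring of a text is a prefix of one of its suffixes, a fuzzy "contains" query reduces to a prefix lookup in sorted order, which is the kind of lookup an ordered structure such as a B+ tree handles efficiently.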

Description

Technical field

[0001] The invention relates to the field of data indexing, in particular to a data indexing method based on character string suffixes.

Background

[0002] At present, there are three main methods of fuzzy data query:

[0003] The first uses the "like" fuzzy matching query function provided by the database itself.

[0004] Although this method is simple and easy to use, it cannot use indexes. When the amount of data is small, this is tolerable; but once the amount of data grows even slightly, queries become very slow, which makes it difficult to meet the needs of the converged media era.

[0005] The second uses extended functions of the database, such as the DB full-text index (for example, the fulltext match function provided in mysql).

[0006] The shortcomings of this method are mainly due to the differences in language and culture at home and abroad, and there are great defects in Chinese word segmentation, which cannot well support the retr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2228
Inventor: 吴春中, 张浩阳
Owner: 成都索贝视频云计算有限公司