
Suffix array construction method

A suffix array construction technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the high space complexity, slow running speed, and limited applicability of existing methods, and achieves fast running speed, easy implementation, and low space consumption.

Status: Inactive
Publication Date: 2011-06-01
农革

AI Technical Summary

Problems solved by technology

Existing linear-time suffix array construction algorithms run slowly and have high space complexity [3, 4, 5, 7, 8], which limits their practical application.



Embodiment Construction

[0057] The present invention will be further elaborated below in conjunction with the accompanying drawings.

[0058] As shown in Figure 1, the present invention proposes a novel linear-time suffix array construction method (SA-IS) that overcomes the shortcomings of existing linear-time suffix array construction algorithms. The pseudocode for each step in the flow chart is given below. Each array is stored left to right, that is, the first element is leftmost and the last element is rightmost.

[0059] SA-IS (S, SA)

[0060] S: input string; (length is n characters, including n1 LMS substrings)

[0061] SA: suffix array of S;

[0062] S1: integer array; (record the new character string formed after renaming each LMS substring in S, the length is n1)

[0063] SA1: Suffix array of S1

[0064] t: Boolean array; (record the type of each character in S, the length is n)

[0065] P1: integer arr...
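The roles of the arrays t and P1 can be illustrated with a short Python sketch. It is not the patent's pseudocode: the function names `classify_types` and `lms_positions` are hypothetical, and the sketch assumes that the input ends with a unique sentinel character smaller than every other character (here `$`), which is treated as S-type.

```python
def classify_types(s):
    """Return t, where t[i] is True if suffix i is S-type, False if L-type.

    Assumption: s is non-empty and ends with a unique smallest sentinel.
    """
    n = len(s)
    t = [False] * n
    t[n - 1] = True  # the sentinel suffix is S-type by convention
    # Scan right to left, comparing S[i] with S[i+1].
    for i in range(n - 2, -1, -1):
        if s[i] < s[i + 1]:
            t[i] = True
        elif s[i] == s[i + 1]:
            t[i] = t[i + 1]  # equal characters inherit the type to the right
    return t


def lms_positions(t):
    """Return P1: every i where an S-type character follows an L-type one."""
    return [i for i in range(1, len(t)) if t[i] and not t[i - 1]]


t = classify_types("mmiissiissiippii$")
print("".join("S" if is_s else "L" for is_s in t))  # LLSSLLSSLLSSLLLLS
print(lms_positions(t))  # [2, 6, 10, 16]
```

Storing t as a list of booleans mirrors the Boolean array in the parameter list; a bit vector would use less space in a lower-level language.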


Abstract

The invention discloses a method for constructing a suffix array in linear time. The method comprises the following steps: 1) scanning a character string S from right to left, comparing each pair of adjacent characters S[i] and S[i+1] to obtain the type of each character and of its suffix, and recording the types in an array t; 2) scanning the array t from left to right to find all positions where an LMS character appears, obtaining the start pointers of all LMS substrings, and recording them in P1; 3) sorting all the LMS substrings in S using the pointer array P1 and the arrays B and SA; 4) renaming each LMS substring in S according to the sorted result of step 3 to form a new, shorter string S1; 5) if every character in S1 is unique, directly computing the suffix array SA1 of S1 by sorting its suffixes, and otherwise recursively calling the SA-IS algorithm with S1 and SA1 as input parameters; 6) inducing the suffix array SA of S from the suffix array SA1 of S1; and 7) returning.
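The seven steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patent's pseudocode: the helper names (`sa_is`, `induce`, `bucket_bounds`, `suffix_array`) are hypothetical, the input is first mapped to integers with a unique smallest sentinel 0 appended, and the arrays are allocated freshly on each call rather than reused to save space as a tuned implementation would.

```python
def sa_is(s, alphabet_size):
    """Suffix array of s, a list of ints ending with a unique smallest 0."""
    n = len(s)
    if n == 1:
        return [0]
    # Step 1: classify right to left (True = S-type, False = L-type).
    t = [False] * n
    t[n - 1] = True
    for i in range(n - 2, -1, -1):
        t[i] = s[i] < s[i + 1] or (s[i] == s[i + 1] and t[i + 1])
    # Step 2: P1 holds the start of every LMS substring, left to right.
    p1 = [i for i in range(1, n) if t[i] and not t[i - 1]]
    counts = [0] * alphabet_size
    for c in s:
        counts[c] += 1

    def bucket_bounds(tails):
        # B: first (or last) index of each character's bucket in SA.
        bounds, total = [], 0
        for c in counts:
            bounds.append(total + c - 1 if tails else total)
            total += c
        return bounds

    def induce(lms_order):
        sa = [-1] * n
        b = bucket_bounds(tails=True)
        for i in reversed(lms_order):  # seed LMS suffixes at bucket tails
            sa[b[s[i]]] = i
            b[s[i]] -= 1
        b = bucket_bounds(tails=False)
        for k in range(n):  # left to right: induce L-type suffixes
            j = sa[k] - 1
            if sa[k] > 0 and not t[j]:
                sa[b[s[j]]] = j
                b[s[j]] += 1
        b = bucket_bounds(tails=True)
        for k in range(n - 1, -1, -1):  # right to left: induce S-type
            j = sa[k] - 1
            if sa[k] > 0 and t[j]:
                sa[b[s[j]]] = j
                b[s[j]] -= 1
        return sa

    # Step 3: sort the LMS substrings by induced sorting.
    sa = induce(p1)
    sorted_lms = [i for i in sa if i > 0 and t[i] and not t[i - 1]]
    # Step 4: rename each LMS substring by its rank to form S1.
    end = {a: b for a, b in zip(p1, p1[1:])}
    end[p1[-1]] = n - 1  # the sentinel's LMS substring is itself
    names = {sorted_lms[0]: 0}
    name, prev = 0, sorted_lms[0]
    for cur in sorted_lms[1:]:
        if s[prev:end[prev] + 1] != s[cur:end[cur] + 1]:
            name += 1
        names[cur] = name
        prev = cur
    s1 = [names[i] for i in p1]
    # Step 5: sort S1 directly if all names are unique, else recurse.
    if name + 1 == len(s1):
        sa1 = [0] * len(s1)
        for idx, nm in enumerate(s1):
            sa1[nm] = idx
    else:
        sa1 = sa_is(s1, name + 1)
    # Steps 6 and 7: induce SA from SA1 and return it.
    return induce([p1[k] for k in sa1])


def suffix_array(text):
    """Map text to integers, append sentinel 0, and drop it from the result."""
    rank = {c: r + 1 for r, c in enumerate(sorted(set(text)))}
    s = [rank[c] for c in text] + [0]
    return sa_is(s, len(rank) + 1)[1:]


print(suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
```

The same `induce` routine serves both step 3 (seeded with the LMS positions in text order) and step 6 (seeded in the order given by SA1), which is what keeps the sketch short.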

Description

Technical field

[0001] The invention relates to a method for constructing the suffix array of a string, in particular to a method by which a computer automatically constructs the suffix array of a string in linear time.

Background

[0002] The suffix array of a string is a space-saving alternative to the suffix tree. It was first proposed by Manber and Myers [1, 2] and supports algorithms equivalent to those on suffix trees in less space. Suffix arrays are used extensively in applications such as data indexing and pattern matching. The present invention provides a new suffix array construction algorithm that constructs the suffix array of any given string in linear time.

[0003] The following terms are used herein.

[0004] Character set: A character set ∑ is a totally ordered set, that is, any two different elements α and β in ∑ can be compared: either α<β or α>β. Th...
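As a point of reference for the background above, the following Python sketch shows what a suffix array contains and how it supports pattern matching. It builds the array by plain comparison sorting, which is not linear time and is not the invention's method. The function names `naive_suffix_array` and `find_occurrences` are hypothetical.

```python
def naive_suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern in text via binary search on sa."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


sa = naive_suffix_array("banana")
print(sa)  # [5, 3, 1, 0, 4, 2]
print(find_occurrences("banana", sa, "ana"))  # [1, 3]
```

Because the suffixes are sorted, all suffixes that begin with the pattern occupy one contiguous range of the array, so two binary searches locate every occurrence.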


Application Information

IPC(8): G06F17/30
Inventor: 农革
Owner: 农革