A method and system for correctness verification of suffix array and longest common prefix

A technique for verifying the correctness of a suffix array and its longest common prefix array, applied in electrical digital data processing, natural language data processing, instruments, etc. It solves the problem that the correctness of the suffix array and of the longest common prefix cannot be verified at the same time, and it reduces the time and space overhead.

Active Publication Date: 2020-08-18
SYSU CMU SHUNDE INT JOINT RES INST +1
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a correctness verification method and system for a suffix array and the longest common prefix, overcoming the problem that the correctness of the suffix array and of the longest common prefix cannot be verified simultaneously.

Examples

Embodiment 1

[0026] The basic idea of this embodiment is as follows. First, after SA and LCPA are constructed, scan SA from left to right to find all LMS suffixes and, at the same time, calculate the LCP value between adjacent LMS suffixes. Second, following the definition of LCP, use a fingerprint function to calculate the fingerprint value of the longest common prefix of each pair of adjacent LMS suffixes, and save the first character to the right of each longest common prefix; the LMS suffixes and their LCP values are verified by checking that the fingerprint values are the same and the saved characters are different. Then, use the LMS suffixes and their LCP values to inductively sort the L-type suffixes and their LCP values, and use the L-type suffixes and their LCP values to inductively sort the S-type suffixes and their LCP values. Finally, compare the known SA with the newly calculated SA1, and the known LCPA with the newly calculated LCPA1. If the comparisons of the two groups are exactly the same, it means that SA and LCPA ...
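The fingerprint check described in this paragraph can be illustrated with a minimal Python sketch. It is not the patent's implementation: the Karp-Rabin style polynomial fingerprint, the constants `MOD` and `BASE`, and the helper names `prefix_fingerprints`, `fp` and `check_pair` are all assumptions made for illustration. The check is probabilistic, since different substrings can in principle share a fingerprint.

```python
MOD = (1 << 61) - 1   # assumed modulus (a Mersenne prime)
BASE = 256            # assumed base; a real verifier would likely pick it at random


def prefix_fingerprints(T: str):
    """Return H, P where H[i] is the fingerprint of T[:i] and P[i] = BASE**i mod MOD."""
    n = len(T)
    H = [0] * (n + 1)
    P = [1] * (n + 1)
    for i, ch in enumerate(T):
        H[i + 1] = (H[i] * BASE + ord(ch) + 1) % MOD
        P[i + 1] = (P[i] * BASE) % MOD
    return H, P


def fp(H, P, i: int, l: int) -> int:
    """Fingerprint of the substring T[i:i+l], derived from the prefix fingerprints."""
    return (H[i + l] - H[i] * P[l]) % MOD


def check_pair(T: str, H, P, a: int, b: int, l: int) -> bool:
    """Check the claim 'suffix a sorts before suffix b and their LCP is exactly l'.

    The common prefixes must have equal fingerprints, and the first characters
    to the right of the common prefix must differ in the claimed order.
    """
    n = len(T)
    if a + l > n or b + l > n:
        return False
    if fp(H, P, a, l) != fp(H, P, b, l):
        return False
    ca = T[a + l] if a + l < n else ""   # "" sorts before any character
    cb = T[b + l] if b + l < n else ""
    return ca < cb
```

For example, with `T = "banana$"` the suffixes starting at positions 3 (`ana$`) and 1 (`anana$`) share the prefix `ana`, so `check_pair(T, H, P, 3, 1, 3)` accepts the claimed LCP of 3, while a claimed LCP of 2 is rejected because the two saved characters are equal.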

Embodiment 2

[0057] This embodiment of the present invention also provides a correctness verification system for a suffix array and the longest common prefix. As shown in the schematic structure diagram of Figure 2, the system includes:

[0058] File reading and writing module 1, used to read and write the character string, SA, SA1, LCPA and LCPA1 files;

[0059] L/S suffix identification module 2, used to identify whether each suffix of the character string is L-type or S-type;

[0060] LMS suffix identification module 3, used to identify the LMS suffixes of the character string;

[0061] LMS suffix and LCP value correctness verification module 4. Its main function is to scan SA from left to right, obtain the LMS suffixes in it and use RMQ to calculate their LCP values, then use the fingerprint function to calculate the fingerprint value of the common prefix of adjacent LMS suffixes and save the first character to the right of their common prefix. If the fingerprint functi...
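The scan performed by module 4 relies on the fact that the LCP of two suffixes at ranks i < j in SA equals the minimum of LCPA[i+1..j]. The following Python sketch illustrates this under stated assumptions: it replaces a general RMQ structure with a running minimum (enough for consecutive ranges in one left-to-right pass), and the function name `lms_with_lcp` and the boolean list `is_lms` are hypothetical.

```python
def lms_with_lcp(SA: list[int], LCPA: list[int], is_lms: list[bool]):
    """Scan SA left to right; return the LMS suffixes in SA order and, for each,
    the LCP with the previous LMS suffix (0 for the first one).

    is_lms[p] says whether the suffix starting at position p is an LMS suffix.
    """
    lms, lcps = [], []
    running_min = None   # minimum of LCPA over the ranks since the last LMS suffix
    for k, pos in enumerate(SA):
        if k > 0:
            running_min = LCPA[k] if running_min is None else min(running_min, LCPA[k])
        if is_lms[pos]:
            lcps.append(running_min if lms else 0)
            lms.append(pos)
            running_min = None   # start a new range after this LMS suffix
    return lms, lcps
```

For `T = "banana$"`, with `SA = [6, 5, 3, 1, 0, 4, 2]`, `LCPA = [0, 0, 1, 3, 0, 0, 2]` and LMS positions 1, 3 and 6, the scan yields the LMS suffixes `[6, 3, 1]` with LCP values `[0, 0, 3]`.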

Abstract

The invention relates to a correctness verification method and system for a suffix array and a longest common prefix array. The method includes the following steps: T is scanned once from right to left, the character T[i] is compared with its successor T[i+1] according to the definition of suffix types, and the type of the character T[i] and of the suffix suf(T, i) is calculated and recorded in t[i]; all elements of SA1 and LCPA1 are initialized to -1; SA is scanned once from left to right, and all LMS suffixes and their LCP values in SA are found according to the array t and recorded in order in SA1 and LCPA1 respectively; the adjacent LMS suffixes in SA1 and their LCP values are verified for correctness according to the character string T, the array t, SA1 and LCPA1; the L-type suffixes and their LCP values are inductively sorted according to the character string T, the array t, B, C, SA1 and LCPA1; the S-type suffixes and their LCP values are inductively sorted according to the character string T, the array t, B, C, SA1 and LCPA1; finally, SA, SA1, LCPA and LCPA1 are each scanned once in sequence and compared, and if SA is identical to SA1 and LCPA is identical to LCPA1, then SA and LCPA of T are correct.
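The first steps of the abstract, the right-to-left type scan and the extraction of LMS suffixes from SA, can be sketched in Python as follows. This is a minimal sketch, not the patent's code: it assumes that T ends with a unique sentinel character smaller than all others (the usual induced-sorting convention, not stated in the abstract), and the function names are hypothetical.

```python
def classify(T: str) -> list[str]:
    """Scan T right to left and record each suffix type in t ('L' or 'S').

    Assumes T ends with a unique smallest sentinel, which is S-type by convention.
    """
    n = len(T)
    t = ["S"] * n
    for i in range(n - 2, -1, -1):
        # suf(T, i) is L-type if it is larger than suf(T, i + 1)
        if T[i] > T[i + 1] or (T[i] == T[i + 1] and t[i + 1] == "L"):
            t[i] = "L"
    return t


def is_lms(t: list[str], i: int) -> bool:
    """An LMS suffix is an S-type suffix whose left neighbour is L-type."""
    return i > 0 and t[i] == "S" and t[i - 1] == "L"


def lms_in_sa_order(SA: list[int], t: list[str]) -> list[int]:
    """Scan SA left to right and keep only the LMS suffixes, preserving their order."""
    return [p for p in SA if is_lms(t, p)]
```

For `T = "banana$"` the type array is `LSLSLLS`, and scanning `SA = [6, 5, 3, 1, 0, 4, 2]` gives the LMS suffixes `[6, 3, 1]`.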

Description

Technical field

[0001] The invention relates to the field of correctness verification of arrays, and more specifically, to a correctness verification method and system for a suffix array and the longest common prefix.

Background technique

[0002] Given any string T, the string consisting of all characters from some position of T to its end is called a suffix of T. A string of length n therefore has n suffixes; sorting these n suffixes in lexicographical order and storing their starting positions in an integer array yields the suffix array SA of the string. The longest common prefix (LCP) is the number of characters in the common prefix of two adjacent suffixes in the suffix array. The longest common prefix array LCPA of a string of length n also has length n: its first element is 0, and the n-1 LCP values corresponding to the n suffixes are stored sequentially from the second element. The combination of the ...
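The two definitions above can be made concrete with a short Python sketch. It builds SA and LCPA directly from the definitions by brute force, which is only practical for small inputs and is not the construction or verification method of the invention; the function names are illustrative.

```python
def suffix_array(T: str) -> list[int]:
    """Sort the starting positions of all suffixes by the suffix text (naive)."""
    return sorted(range(len(T)), key=lambda i: T[i:])


def lcp_array(T: str, SA: list[int]) -> list[int]:
    """LCPA[0] = 0; LCPA[k] is the common-prefix length of suffixes SA[k-1] and SA[k]."""
    n = len(T)
    LCPA = [0] * n
    for k in range(1, n):
        a, b = SA[k - 1], SA[k]
        l = 0
        while a + l < n and b + l < n and T[a + l] == T[b + l]:
            l += 1
        LCPA[k] = l
    return LCPA
```

For `T = "banana"` the sorted suffixes are `a`, `ana`, `anana`, `banana`, `na`, `nana`, so SA is `[5, 3, 1, 0, 4, 2]` and LCPA is `[0, 1, 3, 0, 0, 2]`.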

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/194
CPC: G06F40/194
Inventor: 韩凌波, 农革, 吴裔
Owner: SYSU CMU SHUNDE INT JOINT RES INST