65 results about "Suffix array" patented technology

In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among other things, full-text indices, data compression algorithms, and bibliometrics. Suffix arrays were introduced by Manber & Myers (1990) as a simple, space-efficient alternative to suffix trees. They had been discovered independently by Gaston Gonnet in 1987 under the name PAT array (Gonnet, Baeza-Yates & Snider 1992).
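As an illustration of the definition above, here is a minimal Python sketch that builds a suffix array by directly sorting suffix start positions and then looks up a pattern by binary search. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive construction is chosen for readability; it is not the algorithm of Manber & Myers or of any patent listed here.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order."""
    # Naive construction: sort positions by the suffix that starts there.
    # Simple but slow (roughly O(n^2 log n)); fine for short illustrative inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions at which pattern occurs in text."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with pattern.
    return sorted(sa[start:lo])


sa = build_suffix_array("banana")
print(sa)                                     # [5, 3, 1, 0, 4, 2]
print(find_occurrences("banana", sa, "ana"))  # [1, 3]
```

Because the suffixes are sorted, every suffix that starts with the pattern occupies one contiguous range of the array, which is why two binary searches are enough to find all occurrences.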

Correctness verification method and system of suffix array and longest common prefix

Active · CN107015952A · Implement correctness verification · Reduce time and space overhead · Natural language data processing · Special data processing applications · Array data structure · Validation methods
The invention relates to a method and system for verifying the correctness of a suffix array (SA) and its longest common prefix array (LCPA). The method includes the following steps: T is scanned once from right to left, each character T[i] is compared with its successor T[i+1] according to the definition of suffix types, and the type of the character T[i] and of the suffix suf(T, i) is computed and recorded in t[i]; all elements of SA1 and LCPA1 are initialized to -1; SA is scanned once from left to right, and all LMS suffixes in SA and their LCP values are found using the array t and recorded in order in SA1 and LCPA1 respectively; the adjacent LMS suffixes in SA1 and their LCP values are verified for correctness using the character string T, the array t, SA1 and LCPA1; the L-type suffixes and their LCP values are induced-sorted using the character string T, the array t, B, C, SA1 and LCPA1; the S-type suffixes and their LCP values are induced-sorted using the character string T, the array t, B, C, SA1 and LCPA1; finally, SA, SA1, LCPA and LCPA1 are each scanned once in sequence and compared, and if SA is identical to SA1 and LCPA is identical to LCPA1, then the SA and LCPA of T are correct.
Owner:SYSU CMU SHUNDE INT JOINT RES INST +1
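The first step of the abstract above, the right-to-left type scan, can be sketched in Python as follows. This is a minimal sketch under assumptions that the abstract does not state: it uses the usual induced-sorting definitions (a suffix is S-type if it is lexicographically smaller than the suffix that follows it, L-type if larger, and an LMS suffix is an S-type suffix whose left neighbour is L-type), and it assumes T ends with a unique smallest sentinel character such as `$`. The function names are hypothetical, and the verification and induced-sorting steps of the patent are not reproduced.

```python
def classify_suffixes(text: str) -> list[str]:
    """Return t, where t[i] is 'S' or 'L' for the suffix starting at position i.

    Assumes text ends with a unique sentinel smaller than every other character.
    """
    n = len(text)
    # The sentinel suffix is S-type by convention, so start with all 'S'.
    types = ["S"] * n
    # Scan right to left, comparing each character with its successor.
    for i in range(n - 2, -1, -1):
        if text[i] > text[i + 1]:
            types[i] = "L"
        elif text[i] == text[i + 1] and types[i + 1] == "L":
            # Equal characters: the suffix inherits the type of the next suffix.
            types[i] = "L"
    return types


def lms_positions(types: list[str]) -> list[int]:
    """Return positions of LMS suffixes: S-type with an L-type left neighbour."""
    return [i for i in range(1, len(types))
            if types[i] == "S" and types[i - 1] == "L"]


t = classify_suffixes("banana$")
print("".join(t))        # LSLSLLS
print(lms_positions(t))  # [1, 3, 6]
```

A single right-to-left pass suffices because the type of suffix i depends only on T[i], T[i+1], and the already-computed type of suffix i+1.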

Classification method, device and system based on platelet differentially expressed gene marker

The invention belongs to the field of computer technology and provides a classification method, device and system based on a platelet differentially expressed gene marker. The method comprises the following steps: a sequencing read sequence of a target sample platelet transcriptome is acquired; an alignment result between the sequencing read sequence and the human genome is obtained using a suffix array search algorithm and a sequence splitting/searching/extending strategy; a gene expression estimation value is determined by the maximum likelihood method; the gene expression difference between a positive sample set and a negative sample set is obtained by a linear statistical method; a hyperplane expression is constructed from the positive sample set and the negative sample set; entity gene expression estimation values are classified according to the hyperplane expression and the support vector machine principle. With this classification method, device and system, the differentially expressed gene marker can be identified quickly and accurately, and the classification precision for the corresponding individuals of a group is improved.
Owner:张渠

A suffix array indexing method and apparatus for real-time data stream

Active · CN109299152A · Solve the problem of low retrieval efficiency · Retrieval does not affect · Digital data information retrieval · Special data processing applications · Real-time data · Array data structure
The invention discloses a suffix array indexing method for a real-time data stream. The method comprises the following steps: a server receives the real-time data stream, extracts the source data, and preprocesses the source data into documents; each document is parsed and distributed by domain, and each domain receives its source data and starts an independent thread to index and store the data; a domain consists of a plurality of segments. After receiving the source data, the domain object writes it directly into the segments, sets the segment source data update signal, and returns a response. If all domains of the document return a response, the response information is returned to the client; the suffix array construction tool listens for the segment source data update signal in the background, automatically constructs the suffix array for the segment source data, and generates the segment suffix array; the segment source data, the segment suffix array, and the segment information are linked into a full suffix array index, at which point the source data has been indexed successfully. The invention can index heterogeneous data in real time without word segmentation, and generates the index asynchronously to shorten response time. The invention is suitable for the field of data indexing.
Owner:SUN YAT SEN UNIV

Short message search method and system based on suffix arrays

The invention relates to a short message search method based on suffix arrays. The method comprises the following steps: S1, a suffix array is constructed for each short message in a short message list, and all suffix array items in all the constructed suffix arrays are then sorted; S2, when a keyword for searching for a short message is received, the characters of the keyword are used one by one, in the order received, as indexes for binary search; S3, the i-th character of the keyword is used as an index for a binary search over all the sorted suffix array items, and the suffix arrays corresponding to the suffix array items whose first character equals the index are taken as the i-th search result; S4, i is incremented (i = i + 1), the i-th character of the keyword is used as an index for a binary search over the suffix array items contained in the (i-1)-th search result, and the suffix arrays corresponding to the suffix array items whose first character equals the index are taken as the i-th search result; and S5, step S4 is repeated until i is greater than n, at which point the short messages corresponding to the last search result are output as the short message search result, where n is the number of characters in the keyword.
Owner:SYSU CMU SHUNDE INT JOINT RES INST +1
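The character-by-character narrowing described in steps S1 to S5 can be sketched in Python as follows. This is one possible reading of the abstract, not the patented implementation: the names `build_index` and `search` are hypothetical, every suffix is stored as a string together with the index of its message (simple, but quadratic in memory for long messages), and the i-th keyword character is compared against the i-th character of each remaining suffix.

```python
def build_index(messages: list[str]) -> list[tuple[str, int]]:
    """S1: collect every suffix of every message, tagged with its message index, and sort."""
    items = [(msg[k:], idx)
             for idx, msg in enumerate(messages)
             for k in range(len(msg))]
    items.sort()
    return items


def search(items: list[tuple[str, int]], keyword: str) -> list[int]:
    """S2-S5: narrow the sorted range one keyword character at a time."""
    lo, hi = 0, len(items)
    for depth, ch in enumerate(keyword):
        def char_at(j: int) -> str:
            # Character of suffix j at this depth; "" if the suffix is too
            # short, which sorts before every real character.
            return items[j][0][depth:depth + 1]

        # Lower bound of ch within the current range [lo, hi).
        a, b = lo, hi
        while a < b:
            mid = (a + b) // 2
            if char_at(mid) < ch:
                a = mid + 1
            else:
                b = mid
        new_lo = a

        # Upper bound of ch within the current range.
        b = hi
        while a < b:
            mid = (a + b) // 2
            if char_at(mid) <= ch:
                a = mid + 1
            else:
                b = mid

        lo, hi = new_lo, a
        if lo == hi:
            return []  # no suffix matches the keyword so far
    # Messages owning at least one suffix that starts with the keyword.
    return sorted({idx for _, idx in items[lo:hi]})


messages = ["meet at noon", "see you soon", "lunch at one"]
index = build_index(messages)
print(search(index, "at"))   # [0, 2]
print(search(index, "oon"))  # [0, 1]
```

Since each step only searches inside the previous result range, results can be refined as each new character of the keyword arrives, which matches the incremental behaviour the abstract describes.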