Finding a best matching string among a set of strings

A technique for finding the best matching character string, applied in the fields of electrical digital data processing, digital information retrieval, instruments, etc.

Active Publication Date: 2014-11-19
IBM CORP

AI Technical Summary

Problems solved by technology

However, exact methods are still too slow for many high-complexity multiple sequence alignment problems (even when executed on a graphics processing unit (GPU)), and the computational cost of pairwise sequence alignment is quadratic in the sequence length.

Examples


Embodiment Construction

[0122] In the following, like-numbered elements in the figures are either similar elements or perform an equivalent function. Elements that have been discussed previously are not necessarily discussed in later figures if their function is equivalent.

[0123] Those skilled in the art will appreciate that aspects of the present invention can be implemented as a system, method or computer program product. Accordingly, aspects of the present invention can take the form of an entirely hardware implementation, an entirely software implementation (including firmware, resident software, microcode, etc.), or an implementation combining hardware and software, all of which may generally be referred to herein as a "circuit," "module," or "system." Furthermore, in some embodiments, aspects of the present invention can also take the form of a computer program product embodied in one or more computer-readable media having computer...

Abstract

In one aspect, the invention relates to a method for finding a best matching string s_opt among a set Z of strings for a reference string t, the method comprising:
- representing (101), for each pair of strings s_z and t, a dynamic programming problem for calculating a final alignment score sc_z as a matrix A_z (300) of cells, each cell A_z(i,j) (316) representing an intermediate result F_z(i,j) to be calculated;
- calculating (102) a current optimal alignment boundary threshold (oab-threshold);
- for each string s_z of the set of strings Z, executing:
  - calculating (104) a prospective final alignment score P_z(i,j) of a candidate alignment of the strings s_z and t for a cell A_z(i,j);
  - determining (105) whether P_z(i,j) improves the current oab-threshold;
  - if no improvement is determined, aborting (106) the calculation and prohibiting any extension of the candidate alignment covering said cell A_z(i,j);
  - if an improvement is determined, continuing (107) the calculation of the final alignment score sc_z for said string s_z.
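
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation; several details are assumptions made only for illustration: global (Needleman-Wunsch style) alignment, a +1/-1/-1 scoring scheme, an oab-threshold equal to the best final score found so far, and a prospective score formed by adding a simple upper bound on the unaligned remainder to the intermediate result. The names `upper_bound_rest`, `pruned_score` and `best_match` are hypothetical.

```python
from math import inf

# Illustrative scoring scheme (an assumption, not taken from the patent).
MATCH, MISMATCH, GAP = 1, -1, -1


def upper_bound_rest(rows_left, cols_left):
    """Upper bound on the score still attainable from a cell:
    match every remaining pair and pay gaps only for the length difference."""
    return min(rows_left, cols_left) * MATCH + abs(rows_left - cols_left) * GAP


def pruned_score(s, t, threshold):
    """Global alignment score of s and t, or None if it cannot exceed threshold."""
    n, m = len(s), len(t)
    prev = None
    for i in range(n + 1):
        cur = [-inf] * (m + 1)  # -inf marks a cell whose extension is prohibited
        for j in range(m + 1):
            if i == 0 and j == 0:
                f = 0
            else:
                f = -inf
                if i > 0:
                    f = max(f, prev[j] + GAP)          # gap in t
                if j > 0:
                    f = max(f, cur[j - 1] + GAP)       # gap in s
                if i > 0 and j > 0:
                    sub = MATCH if s[i - 1] == t[j - 1] else MISMATCH
                    f = max(f, prev[j - 1] + sub)      # match or mismatch
            # Prospective final score: intermediate result plus optimistic rest.
            if f + upper_bound_rest(n - i, m - j) > threshold:
                cur[j] = f
        if all(v == -inf for v in cur):
            return None  # every candidate alignment was pruned: abort early
        prev = cur
    return prev[m] if prev[m] > -inf else None


def best_match(strings, t):
    """Return (best string, its score); the threshold tightens as scores are found."""
    best, threshold = None, -inf
    for s in strings:
        score = pruned_score(s, t, threshold)
        if score is not None:
            best, threshold = s, score
    return best, threshold
```

For example, `best_match(["kitten", "sitting", "mitten"], "sitting")` scores "kitten" in full, raises the threshold when "sitting" is scored, and then abandons "mitten" in the first matrix row because even its most optimistic score cannot exceed the threshold. Because the bound never underestimates the attainable score, pruning in this sketch does not change which string is returned.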

Description

Technical field

[0001] The present invention relates to the field of data processing, and more particularly to the field of dynamic programming.

Background art

[0002] Finding the best matching string, and the corresponding optimal alignment score, in a set of strings (usually stored in a database) with respect to a reference string (or 'query string') is a common problem in many technical fields, especially in bioinformatics and text analysis. Various sequence alignment algorithms are known for aligning two or more biological sequences (such as protein, DNA or RNA sequences). They can be used, for example, to determine evolutionarily conserved sequence homologies, or to identify text documents stored in a database or on the Internet that are highly similar to a reference text.

[0003] Sequence (or 'string') alignment algorithms are computationally very complex. Since identifying optimal alignments between more ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G16B45/00
CPC: G06F17/30, G06F16/90344, G06F16/3334, G06F16/3344, G16B45/00, G06F7/24, G06F2207/228, G06F17/16, G06F17/17
Inventor: M·基斯泽基斯, K·扎泽伊克基, T·德泽埃德泽克基, G·库库塞恩斯基
Owner: IBM CORP