Biological sequencing

A biological sequencing technology, applied in bioinformatics, sequence analysis, and instrumentation, that addresses problems such as missing data, errors, and graph representations that cannot capture all the necessary information, with the effects of reducing errors, increasing speed, and reducing complexity.

Pending Publication Date: 2021-10-19
BIOCLUE BV

AI Technical Summary

Problems solved by technology

First, even the best-assembled reference genomes contain deletions and errors.
Second, no suitable graph representation has been found that captures all the information needed to counteract problems arising later, when graph mapping is performed.
Neither De Bruijn graphs, directed graphs, nor bidirected graphs can accurately represent chains.
Third, although it appears possible to create reference groups using current technology, the constructed groups are largely unusable in practice because they lack structural coordinates.
This makes it impossible to provide the accuracy or positional data needed to construct usable pan-genome maps.
Furthermore, k-mers lack the specificity to distinguish multidimensional parameters in genetic information.



Examples


Embodiment 1

[0182] Example 1: Sequencing according to an embodiment of the invention

[0183] By way of illustration, and without embodiments of the present invention being limited thereto, Figure 7 shows an example of a possible sequencing implementation. The figure illustrates the possible method steps of a sequencing method according to an embodiment of the invention. The method comprises, after obtaining at least a first read of a biopolymer or biopolymer fragment, and typically while further reads of the biopolymer or biopolymer fragment to be sequenced are still being received, parsing the incoming (e.g., received) reads with fingerprints, called HYFTs™. After parsing, an alignment (e.g., matching) can be performed in order to obtain a map representing the sequence of the biopolymer or biopolymer fragment. Alignment can be performed against a directed graph, such as a directed acyclic graph. The latter may be a universal genome reference map, but embodiments are not li...
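The parsing step described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the fingerprint strings in `FINGERPRINTS` are invented placeholders (the actual HYFT™ patterns are not given in this text), and the function names `parse_read` and `parse_stream` are hypothetical. Plain substring search stands in for whatever matching the real method uses, and the subsequent graph alignment is not shown.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical fingerprint strings, for illustration only.
FINGERPRINTS = ["ACGT", "GGA"]


def parse_read(read: str, fingerprints: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (position, fingerprint) for every occurrence in the read,
    sorted by position. Overlapping occurrences are reported."""
    hits = []
    for fp in fingerprints:
        start = read.find(fp)
        while start != -1:
            hits.append((start, fp))
            # Advance by one unit so overlapping occurrences are found too.
            start = read.find(fp, start + 1)
    return sorted(hits)


def parse_stream(
    reads: Iterable[str], fingerprints: Iterable[str]
) -> Iterator[Tuple[str, List[Tuple[int, str]]]]:
    """Parse reads one by one as they arrive, so parsing can proceed
    while further reads are still being received."""
    fps = list(fingerprints)
    for read in reads:
        yield read, parse_read(read, fps)
```

Because `parse_stream` is a generator, it can consume reads lazily from any iterable, which loosely mirrors parsing during the further receipt of reads.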

Embodiment 2

[0184] Example 2: Processing of a protein database

[0185] Example 2a: Analysis of the HYFT™ fingerprints found in a protein database

[0186] To illustrate the ubiquity of HYFT™ fingerprints in biological sequence databases, the Protein Data Bank (PDB), taken as an example of a large, commonly available biological sequence database, was processed according to the invention using a repository of fingerprint data strings obtained as described above. The results were analyzed with respect to various indicators, a selection of which is given below.

[0187] Figure 12 and Figure 13 show the HYFT™ coverage (in %) of processed protein sequences of lengths up to 50 and lengths up to 5000+, respectively. Here, coverage is the share of the sequence units in the total sequence length that is attributed to HYFT™ fingerprints. In other words, coverage is the combined length of one or more first parts divided by the total sequence length. ...
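The coverage definition above (combined length of the fingerprint-attributed parts divided by the total sequence length) can be computed along these lines. This is a hedged sketch: the function name `coverage_percent` and the `(start, length)` representation of occurrences are assumptions, and so is the choice to count overlapping occurrences only once, which the text does not specify.

```python
from typing import Iterable, Tuple


def coverage_percent(sequence: str, occurrences: Iterable[Tuple[int, int]]) -> float:
    """Coverage in %: sequence units attributed to fingerprints divided by
    the total sequence length.

    occurrences: (start, length) pairs of fingerprint matches.
    Assumption: units covered by several occurrences are counted once.
    """
    n = len(sequence)
    if n == 0:
        return 0.0
    # Clip each occurrence to the sequence bounds and sort by start.
    intervals = sorted((max(0, s), min(n, s + l)) for s, l in occurrences)
    covered = 0
    cur_start = cur_end = None
    for s, e in intervals:
        if s >= e:
            continue  # empty after clipping
        if cur_end is None or s > cur_end:
            # Disjoint from the current run: close it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / n
```

For a sequence of 10 units with occurrences at `(0, 3)`, `(2, 3)` and `(8, 2)`, the first two overlap and merge into 5 covered units, the third adds 2, giving 7 of 10 units covered.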

Embodiment 3

[0197] Example 3: Comparison between sequence searches known in the prior art and those described herein


Abstract

In a first aspect, the present invention relates to a method for sequencing a biopolymer or biopolymer fragment, taking into account information contained in a repository of fingerprint data strings, the method comprising: (a) obtaining at least one read for the biopolymer or biopolymer fragment using a sequencer, and (b) processing the read by the computer-implemented steps of: (b1) searching the read for occurrences of one or more of the characteristic biological subsequences represented by the fingerprint data strings and (b2) validating or rejecting the read by, for each occurrence, determining whether or not a sequence unit consecutive to the characteristic biological subsequence conforms with the combinatory data in the repository, and/or (b1') searching a head and/or tail of the read for an occurrence of one of the characteristic biological subsequences represented by the fingerprint data strings and (b2') predicting one or more consecutive sequence units of the read from the combinatory data in the repository.
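Steps (b1)/(b2) and (b1')/(b2') of the abstract might be sketched in Python as below. Everything specific here is an assumption made for illustration: the repository is modeled as a hypothetical dictionary mapping each fingerprint string to the set of sequence units allowed to follow it (a simple stand-in for the "combinatory data", whose real form is not described in this text), the example entries are invented, and only the tail of the read is handled in the prediction step.

```python
from typing import Dict, Optional, Set

# Hypothetical repository: fingerprint -> units allowed to follow it.
# The real combinatory data is not specified in the text.
REPOSITORY: Dict[str, Set[str]] = {"ACGT": {"A", "G"}, "GGA": {"T"}}


def validate_read(read: str, repo: Dict[str, Set[str]]) -> bool:
    """(b1) + (b2): find every fingerprint occurrence and reject the read
    if the unit right after an occurrence is not allowed by the repository."""
    for fp, allowed in repo.items():
        start = read.find(fp)
        while start != -1:
            nxt = start + len(fp)
            # An occurrence at the very end of the read has no next unit.
            if nxt < len(read) and read[nxt] not in allowed:
                return False
            start = read.find(fp, start + 1)
    return True


def predict_next(read: str, repo: Dict[str, Set[str]]) -> Optional[Set[str]]:
    """(b1') + (b2'): if the tail of the read is a fingerprint, return the
    candidate units that may follow; otherwise return None."""
    for fp, allowed in repo.items():
        if read.endswith(fp):
            return set(allowed)
    return None
```

With this toy repository, `validate_read("TTACGTC", REPOSITORY)` rejects the read because `C` is not listed after `ACGT`, while `predict_next("TTACGT", REPOSITORY)` returns the candidate set `{"A", "G"}`.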

Description

Technical field

[0001] The present invention relates to the processing of biological sequence information, and more particularly to the generation of said biological sequence information, e.g., by sequencing and/or sequence assembly. Systems and methods are provided for generating biological sequence information during a sequencing process.

Background

[0002] Biological sequencing has advanced at an astonishing pace over the past few decades, making possible the Human Genome Project, which achieved the complete sequencing of the human genome more than 15 years ago. Driving this development required numerous technological advances, from sample preparation and sequencing methods to data acquisition, processing, and analysis. At the same time, new scientific fields have emerged and developed, including genomics, proteomics, and bioinformatics.

[0003] Driven by the emphasis on data acquisition in the post-genomic era, this development has led to the...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/20
CPC: G16B30/20, G16B20/20, G16B50/10, G16B50/50, G16B15/00, G16B30/10, G16B50/30
Inventor: D. Van Hyfte, A. Van Hyfte, I. Brands, E. Van Hyfte
Owner: BIOCLUE BV