Method and device for detecting human genome virus integration site

A technology for detecting virus integration sites in the human genome, applied in fields such as genomics, proteomics, and instruments. It addresses the complex construction process, cumbersome processing, and unguaranteed accuracy of existing methods, and achieves low computing resource requirements, a reduced probability of mismatches, and high practical value.

Active Publication Date: 2020-04-03
GUANGZHOU JINYUDA LOGISTICS CO LTD
5 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

[0004] At present, the more common detection method is the blastn method, which aligns NGS reads against human and viral nucleic acid libraries. It mainly detects which viruses are integrated into the human genome. However, because the sample library is large and the alignment is relatively sensitive, blastn produces complex and varied alignments with many results, which makes processing cumbersome and yields many false positives and false negatives. More importantly, the integration site information has to be reprocessed and calculated separately.
[0005] Another method processes soft-clipping reads. Software such as ViralFusionSeq, Virus-Clip, HGT-ID, and VirTect is based on this approach. However, the soft-clipping reads produced by the alignment software bwa or bowtie2 do not take alignment of both ends into account. This makes subsequent processing more difficult and cannot ensure that the two ends of a read align to the human genome and the viral genome respectively.
[0006] Other methods take the reads that did not align to the human genome, perform de novo assembly, and then align the result. This preserves the length and integrity of the viral sequence, but assembly requires a lot of memory and computing resources, the constructed contigs admit many possibilities, and accuracy cannot be guaranteed. Software based on this method includes ViralFusionSeq and VirusFinder.
[0007] There are also methods that combine the human reference genome with viral genomes to reconstruct a new mixed reference genome for determining integration sites, such as VirusFinder and VERSE. The accuracy of the reconstructed reference genome is a concern, and these methods also cannot locate the integration site quickly and accurately.



Examples


Embodiment 1

[0045] A method for detecting human genome virus integration sites is carried out according to the following steps; its technical route is shown in Figure 1.

[0046] 1. Pre-processing.

[0047] To ensure that the fastq sequences used for alignment meet the quality requirements, the fastq files of the sequencing results are filtered with the fastq quality-control software fastp.
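The filtering step can be sketched as a small helper that assembles a paired-end fastp command line. This is only an illustration: the source does not state which fastp options are used, so the call relies on fastp defaults, and the file names and the `build_fastp_command` helper are hypothetical.

```python
from pathlib import Path


def build_fastp_command(read1, read2, out_dir):
    """Assemble a paired-end fastp call that uses fastp's default filters.

    The output file names are assumptions made for this sketch; the patent
    text does not specify them or any filtering thresholds.
    """
    out = Path(out_dir)
    return [
        "fastp",
        "--in1", str(read1),                        # raw forward reads
        "--in2", str(read2),                        # raw reverse reads
        "--out1", str(out / "clean_R1.fastq.gz"),   # filtered forward reads
        "--out2", str(out / "clean_R2.fastq.gz"),   # filtered reverse reads
        "--json", str(out / "fastp.json"),          # machine-readable QC report
        "--html", str(out / "fastp.html"),          # human-readable QC report
    ]


cmd = build_fastp_command("sample_R1.fastq.gz", "sample_R2.fastq.gz", "qc")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where fastp is installed; building the command separately keeps the sketch runnable without it.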

[0048] 2. Genome alignment.

[0049] As shown in Figure 2, when a virus is integrated into the human genome, the reads obtained by sequencing fall into four situations:

[0050] A: Paired reads can be completely aligned to the human genome.

[0051] B: Paired reads can be completely aligned to the viral genome.

[0052] C: One of the paired reads can be completely aligned to the human genome, and the other can be completely aligned to the viral genome.

[0053] D: One of the paired reads can be...
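As a rough illustration of situations A to C, the sketch below labels a read pair from the genome to which each mate aligns completely. The `Target` enum and `classify_pair` function are hypothetical names, not part of the patent; because the description of situation D is cut off in this excerpt, every other combination is simply reported as unclassified rather than guessed.

```python
from enum import Enum


class Target(Enum):
    HUMAN = "human"   # mate aligns completely to the human genome
    VIRUS = "virus"   # mate aligns completely to the viral genome
    OTHER = "other"   # mate does not align completely to either genome


def classify_pair(mate1, mate2):
    """Return 'A', 'B' or 'C' for the situations listed above.

    Any other combination (including the truncated situation D) is
    returned as 'unclassified'.
    """
    targets = {mate1, mate2}
    if targets == {Target.HUMAN}:
        return "A"
    if targets == {Target.VIRUS}:
        return "B"
    if targets == {Target.HUMAN, Target.VIRUS}:
        return "C"
    return "unclassified"


print(classify_pair(Target.HUMAN, Target.VIRUS))
```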

Embodiment 2

[0271] A device for detecting virus integration sites in human genome, including a data acquisition module, an analysis module and an output module.

[0272] The data acquisition module is used to acquire the data obtained by genetic testing.

[0273] The analysis module analyzes the human genome virus integration site according to the method described in Example 1, and analyzes the virus species to obtain the name of the inserted virus.

[0274] The output module is used to output and display the results obtained by the analysis module.
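The three modules can be pictured as a simple pipeline. The sketch below is only a structural illustration under stated assumptions: `run_device` and the stub functions are hypothetical, and the analysis stub returns placeholder data rather than implementing the method of Example 1.

```python
def run_device(acquire, analyze, output):
    """Chain the three modules: acquisition -> analysis -> output."""
    data = acquire()            # data acquisition module
    results = analyze(data)     # analysis module
    output(results)             # output module
    return results


# Hypothetical stand-ins used only to show the data flow.
def acquire_stub():
    return ["read-1", "read-2"]


def analyze_stub(reads):
    # Placeholder result: one record per read, no real analysis performed.
    return [{"read_id": r, "site": None, "virus": None} for r in reads]


shown = []
results = run_device(acquire_stub, analyze_stub, shown.extend)
print(results)
```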

Embodiment 3

[0276] For comparative verification, the same data were analyzed both with the method of Example 1 and with HGT-ID software, which uses the soft-clipping reads approach.

[0277] 1. Method.

[0278] The same raw sequencing data from gene detection were analyzed with the method of Example 1 and with HGT-ID software, respectively.

[0279] 2. Results.

[0280] (1) Analysis results of the method in Example 1.

[0281] Table 1. Output Result 1

[0282]
Read ID               chr:start-end:chain           Insertion site    Virus type
SRR1609136-17458832   chr8:128230627-128230679:1    chr8:128230627    Alphapapillomavirus 7
SRR1609136-25424758   chr8:128230627-128230680:1    chr8:128230627    Alphapapillomavirus 7
SRR1609136-26787992   chr8:128230627-128230678:1    chr8:128230627    Alphapapillomavirus 7
SRR1609136-30119552   chr8:128230627-128230675:1    chr8:128230627    Alphapapillomavirus 7
SRR1609136-74414072   chr8:128230627-128230674:1    chr8:128230627    Alphapap...
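One way to read Table 1 is to count how many reads support each insertion site. The sketch below does this for the four rows that are complete in this excerpt (the fifth row is truncated and left out). The `parse_region` helper is hypothetical, and treating the last field of the region column (labelled "chain") as a strand indicator is an assumption.

```python
from collections import Counter

# The four complete rows of Table 1:
# (read ID, chr:start-end:chain, insertion site, virus type)
rows = [
    ("SRR1609136-17458832", "chr8:128230627-128230679:1", "chr8:128230627", "Alphapapillomavirus 7"),
    ("SRR1609136-25424758", "chr8:128230627-128230680:1", "chr8:128230627", "Alphapapillomavirus 7"),
    ("SRR1609136-26787992", "chr8:128230627-128230678:1", "chr8:128230627", "Alphapapillomavirus 7"),
    ("SRR1609136-30119552", "chr8:128230627-128230675:1", "chr8:128230627", "Alphapapillomavirus 7"),
]


def parse_region(field):
    """Split 'chr:start-end:chain' into (chromosome, start, end, chain)."""
    chrom, span, chain = field.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end), chain


# Count supporting reads per (insertion site, virus type) pair.
support = Counter((site, virus) for _, _, site, virus in rows)

for (site, virus), n_reads in support.items():
    print(site, virus, n_reads)
```

All four rows point to the same site, so the tally has a single entry.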


Abstract

The invention relates to a method and a device for detecting virus integration sites in the human reference genome, and belongs to the technical field of gene detection bioinformatics analysis. The method comprises the steps of genome alignment, sliding cutting, short sequence alignment and integration. Reads that do not align to the human reference genome and a virus genome at the same time are subjected to sliding cutting, and the resulting sub-sequences (short sequences) are aligned again. Through clustering, correlation and covariance processing of the alignment positions and the order of the sub-sequences cut from a given read, all highly probable alignment positions can be listed during the analysis, so that reads aligning to both the human reference genome and a virus genome can be found and positioned accurately, with an error within 3 bp. The method has low computing resource requirements and high running speed, and has relatively high practical value.
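The sliding cutting step can be illustrated with a minimal sketch that cuts a read into overlapping sub-sequences and records where each one starts, so that the order of the pieces is preserved for later processing. The window length and step size are not given in this excerpt; the defaults below and the `sliding_cut` name are assumptions chosen only for illustration.

```python
def sliding_cut(read, window=20, step=5):
    """Cut a read into overlapping sub-sequences of length `window`.

    Returns (offset, sub_sequence) tuples in read order. A read shorter
    than the window yields an empty list.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, read[start:start + window])
        for start in range(0, len(read) - window + 1, step)
    ]


pieces = sliding_cut("ACGTACGTAC", window=4, step=3)
print(pieces)  # [(0, 'ACGT'), (3, 'TACG'), (6, 'GTAC')]
```

Keeping the offset next to each sub-sequence is one simple way to retain the order information that the abstract says is used alongside the alignment positions.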

Description

Technical field

[0001] The invention relates to the technical field of gene detection bioinformatics analysis, in particular to a method and device for detecting human genome virus integration sites.

Background

[0002] Many viruses that infect humans can integrate their genomes into the human genome, and the interaction between viral genes and the human host genome can lead to diseases such as cancer and AIDS. Studies have shown that 10-15% of cancers are caused by viral infections, such as HPV or HBV, and the integration of these viruses into the human genome is likely to be the main cause of cancer.

[0003] Accurate detection of viral integration sites can provide useful information on virus-related cancer pathogenesis, tumor evolution, and tumor treatment, and next-generation sequencing (NGS) can provide technical and data support for the detection of viral genome integration into the human genome.

[0004] At present, the more common detection method is the ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30, G16B40/30
CPC: G16B20/30, G16B40/30
Inventor 蒙裕欢关宇佳严慧孟博于世辉
Owner GUANGZHOU JINYUDA LOGISTICS CO LTD