A single-sample copy number variation detection method based on second-generation sequencing

By constructing a CNV-negative sample reference set and applying hidden Markov models and filtering methods, the method addresses the poor cross-platform applicability and high cost of second-generation sequencing (NGS) in single-sample CNV detection. It enables efficient and sensitive single-sample CNV detection that is applicable to various NGS platforms and cancer types.

CN116189763B · Active · Publication Date: 2026-06-16 · AMOY DIAGNOSTICS CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: AMOY DIAGNOSTICS CO LTD
Filing Date: 2023-02-21
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

In single-sample CNV detection, existing second-generation sequencing approaches suffer from high cost, a limited detection range, large sample volume requirements, detection results that are strongly affected by the other samples in the batch, and poor cross-platform applicability. As a result, flexible and rapid single-sample CNV detection is not possible.

Method used

A reference set is constructed by merging sequencing data from multiple CNV-negative samples. A hidden Markov model predicts the CNV status of the sample to be tested, the optimal reference subset is selected, and the data features are normalized. Single-sample CNV detection is then achieved by combining a Naive Bayes-Gaussian model with a filtering method.
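The reference-subset selection and normalization step can be illustrated with a minimal Python sketch. The summary does not state the similarity measure or the normalization formula, so everything specific here is an assumption made for illustration: Pearson correlation as the similarity score, a per-probe log2 ratio as the normalized feature, and the hypothetical function names `select_reference_subset` and `normalize`. The depth values are made-up toy numbers, not data from the patent.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_reference_subset(test_depth, reference_set, negative_idx, k):
    """Return the names of the k reference samples most similar to the test sample.

    Similarity is computed only over probes predicted negative in the first
    HMM pass (negative_idx). Pearson correlation is an assumed choice.
    """
    test_sub = [test_depth[i] for i in negative_idx]
    scored = []
    for name, depth in reference_set.items():
        ref_sub = [depth[i] for i in negative_idx]
        scored.append((pearson(test_sub, ref_sub), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def normalize(test_depth, reference_set, subset, negative_idx):
    """Per-probe log2 ratio of the test sample against the chosen subset.

    Each sample is first scaled by its mean depth over the negative probes,
    so that a real gain or loss does not distort the scaling factor.
    All depths are assumed to be strictly positive.
    """
    def scale(depth):
        base = mean(depth[i] for i in negative_idx)
        return [v / base for v in depth]

    test_scaled = scale(test_depth)
    refs_scaled = [scale(reference_set[name]) for name in subset]
    ratios = []
    for i, value in enumerate(test_scaled):
        ref_value = mean(ref[i] for ref in refs_scaled)
        ratios.append(math.log2(value / ref_value))
    return ratios


# Toy data: four probes, three CNV-negative reference samples.
reference_set = {
    "refA": [100, 200, 150, 120],
    "refB": [90, 210, 160, 110],
    "refC": [200, 100, 120, 150],
}
test_depth = [105, 195, 300, 118]   # probe 2 has roughly doubled depth
negative_idx = [0, 1, 3]            # probes predicted negative in the first pass

subset = select_reference_subset(test_depth, reference_set, negative_idx, k=2)
ratios = normalize(test_depth, reference_set, subset, negative_idx)
```

In this toy case the two reference samples whose depth profile tracks the test sample are chosen, and the doubled probe stands out with a clearly positive log2 ratio while the remaining probes stay near zero.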

Benefits of technology

The method achieves stable detection across different NGS platforms, batches, and cancer types, with high throughput and high resolution. It can sensitively detect exon-level CNVs while reducing resource and cost requirements and improving detection accuracy and flexibility.

✦ Generated by Eureka AI based on patent content.

Figure CN116189763B_ABST

Abstract

The application provides a single-sample copy number variation (CNV) detection method based on second-generation sequencing, comprising the following steps: merging sequencing data of multiple CNV-negative samples, obtained with a second-generation sequencing technology, to obtain a CNV-negative reference set; training a hidden Markov model on sequencing samples with labeled CNVs, predicting the CNV state of each probe of the sample to be tested, and selecting the gene sequences corresponding to the probes predicted as negative to obtain an aligned sample of the sample to be tested; selecting, from the CNV-negative reference set, the subset with the highest similarity to the aligned sample as the optimal reference subset, and statistically normalizing the data features of each probe of the sample to be tested using that subset; predicting on the normalized data features with the trained hidden Markov model and re-labeling each probe as negative (Negative), gain (Gain), or loss (Loss) to obtain the full CNV regions; and filtering the full CNV regions to obtain the real CNV regions. The method is simple, accurate, and highly adaptable, and it depends neither on a negative control sample for the sample to be tested nor on a negative sample reference set from the same batch.
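The probe-labeling and region-calling steps can be sketched as follows. This is a hedged illustration, not the patented implementation: the abstract trains the hidden Markov model on CNV-labeled samples, whereas the emission means, `SIGMA`, and `STAY` below are hand-picked hypothetical values, the observed feature is assumed to be a per-probe log2 ratio, and the minimum-length rule in `call_regions` is a simple stand-in for the patent's filtering step, whose details are not given here.

```python
import math

STATES = ("Loss", "Negative", "Gain")

# Hypothetical parameters chosen by hand for illustration only.
MEANS = {"Loss": -1.0, "Negative": 0.0, "Gain": 0.58}  # expected log2 ratio per state
SIGMA = 0.25   # shared emission standard deviation
STAY = 0.9     # probability of keeping the same state between adjacent probes


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_trans(prev, cur):
    """Log transition probability with a uniform split among state changes."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(ratios):
    """Most likely state sequence for a list of per-probe log2 ratios."""
    if not ratios:
        return []
    score = {s: math.log(1 / len(STATES)) + log_gauss(ratios[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []
    for x in ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log_trans(p, s))
            new_score[s] = score[prev] + log_trans(prev, s) + log_gauss(x, MEANS[s], SIGMA)
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def call_regions(path, min_probes=2):
    """Merge runs of identical non-negative labels into (start, end, state) regions.

    Runs shorter than min_probes are dropped (a placeholder filter).
    """
    regions = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "Negative" and i - start >= min_probes:
                regions.append((start, i - 1, path[start]))
            start = i
    return regions


# Toy normalized log2 ratios for ten consecutive probes.
ratios = [0.02, -0.05, 0.61, 0.55, 0.60, 0.03, -0.02, -0.95, -1.10, 0.0]
path = viterbi(ratios)
regions = call_regions(path)
```

Decoding with Viterbi rather than thresholding each probe independently lets neighboring probes support each other, which is the usual motivation for using a hidden Markov model on probe-level depth data.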