Clustering method and system for unnormalized single-cell transcriptome sequencing data

The technology applies to the field of clustering methods and systems for non-standardized single-cell transcriptome sequencing data. It addresses inaccurate final clustering caused by poor clustering results entering cluster fusion, and it improves the accuracy and performance of cluster fusion.

Active Publication Date: 2022-07-12
NANKAI UNIV
10 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The inventors found that the traditional cluster-fusion-based clustering method for single-cell transcriptome sequencing data introduces many poor clustering results when generating the set of clustering results that participate in cluster fusion, which makes the final clustering results inaccurate.

Examples

Embodiment 1

[0041] As shown in Figure 1, Embodiment 1 of the present disclosure provides a clustering method for non-standardized single-cell transcriptome sequencing data, including the following steps:

[0042] S1: Single-cell RNA sequencing data is stored in a matrix. The two dimensions of the matrix represent cells and genes, respectively, and the matrix value represents the expression of a gene in a cell.

[0043] After obtaining the input matrix, first select genes that are differentially expressed between cells for subsequent analysis; specifically, select the genes with higher coefficients of variation (standard deviation divided by mean).
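
The gene-selection step can be illustrated with a minimal sketch. It takes the coefficient of variation in its usual sense (standard deviation divided by mean) and assumes a cells-by-genes layout; the function names, the `top_k` parameter, and the toy counts are hypothetical and do not come from the patent text.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by mean; 0.0 if the mean is 0."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else 0.0


def select_variable_genes(matrix, gene_names, top_k):
    """Return the top_k gene names ranked by coefficient of variation.

    matrix: list of rows, one row per cell and one column per gene
    (an assumed layout; the text only says the two axes are cells and genes).
    """
    columns = list(zip(*matrix))  # one tuple of expression values per gene
    scored = [(coefficient_of_variation(col), name)
              for col, name in zip(columns, gene_names)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy counts: 4 cells x 3 genes (hypothetical values).
counts = [
    [10, 0, 5],
    [10, 8, 5],
    [10, 0, 6],
    [10, 9, 4],
]
selected = select_variable_genes(counts, ["geneA", "geneB", "geneC"], top_k=2)
print(selected)
```

Here geneA is constant across cells, so it ranks last and is dropped. How many genes to keep is a tuning choice that the visible text does not specify.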

[0044] S2: Use UMAP to perform dimensionality reduction on the matrix processed in S1.

[0045] In the high-dimensional part, the following formula is used to model the similarity between cells:

[0046]

[0047] where ρ_i is the distance from i to its closest data point, and d can be any generalized distance that satisfies symmetr...
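
For reference, UMAP as published models high-dimensional similarity as follows. This is the standard formulation and not necessarily the exact formula of [0046]; the per-point scale σ_i and the symmetrization step are assumptions, since neither is defined in the text above.

```latex
% Directed similarity from point i to point j (standard UMAP form).
p_{j \mid i} = \exp\!\left( -\,\frac{d(x_i, x_j) - \rho_i}{\sigma_i} \right),
\qquad
\rho_i = \min_{j \neq i} d(x_i, x_j)

% Symmetrized similarity used to build the high-dimensional graph.
p_{ij} = p_{j \mid i} + p_{i \mid j} - p_{j \mid i}\, p_{i \mid j}
```

Subtracting ρ_i gives the nearest neighbor of each cell a similarity of 1, so every cell is connected to at least one other cell.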

Embodiment 2

[0069] Embodiment 2 of the present disclosure provides a clustering system for non-standardized single-cell transcriptome sequencing data, including:

[0070] a data acquisition module, configured to: acquire single-cell transcriptome sequencing data;

[0071] a preprocessing module, configured to: preprocess the acquired sequencing data;

[0072] a preliminary clustering module, configured to: perform dimensionality reduction and clustering on the preprocessed sequencing data to obtain clustering results;

[0073] a clustering elimination module, configured to: sort the clustering results by Spearman correlation in ascending or descending order, locate the gap where the Spearman correlation changes the most, and delete the clustering results on the low-correlation side of that gap;

[0074] a hierarchical clustering module, configured to: take the average value of the equivalence relation matrix of each clustering result after deletio...
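
The elimination and hierarchical clustering modules can be sketched together, under several assumptions: each clustering result already has one Spearman correlation score (how that score is computed is not visible in this excerpt), the equivalence relation matrix holds 1 where two cells share a cluster and 0 otherwise, and average linkage with a fixed number of clusters stands in for the unspecified hierarchical clustering settings. All function names and the toy data are hypothetical.

```python
def keep_above_largest_gap(scores):
    """Indices of clustering results on the high side of the largest gap."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending
    gaps = [scores[order[k + 1]] - scores[order[k]]
            for k in range(len(order) - 1)]
    if not gaps or max(gaps) == 0:
        return list(range(len(scores)))  # nothing to separate, keep all
    cut = gaps.index(max(gaps))  # position of the largest jump
    return sorted(order[cut + 1:])


def equivalence_matrix(labels):
    """1.0 where two cells share a cluster in this result, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mean_matrix(matrices):
    """Element-wise average of equally sized square matrices."""
    n, m = len(matrices[0]), len(matrices)
    return [[sum(mat[i][j] for mat in matrices) / m for j in range(n)]
            for i in range(n)]


def average_linkage(similarity, n_clusters):
    """Merge the two most similar clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(similarity))]

    def sim(a, b):
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, p, q = max(((sim(clusters[p], clusters[q]), p, q)
                       for p in range(len(clusters))
                       for q in range(p + 1, len(clusters))),
                      key=lambda t: t[0])
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    labels = [0] * len(similarity)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Four clustering results over six cells (hypothetical data).
results = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition, different label names
    [0, 0, 1, 1, 1, 1],  # one cell assigned differently
    [0, 1, 0, 1, 0, 1],  # anomalous result
]
scores = [0.92, 0.90, 0.85, 0.20]  # hypothetical Spearman correlations

kept = keep_above_largest_gap(scores)
consensus = mean_matrix([equivalence_matrix(results[i]) for i in kept])
final_labels = average_linkage(consensus, n_clusters=2)
print(kept, final_labels)
```

Averaging equivalence matrices makes the fusion independent of how each result names its clusters, which is why the second result above agrees fully with the first despite swapped labels.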

Embodiment 3

[0077] Embodiment 3 of the present disclosure provides a computer-readable storage medium on which a program is stored. When executed by a processor, the program implements the steps of the clustering method for non-standardized single-cell transcriptome sequencing data described in Embodiment 1 of the present disclosure.

Abstract

The present disclosure provides a clustering method and system for non-standardized single-cell transcriptome sequencing data. The method acquires single-cell transcriptome sequencing data; preprocesses the acquired sequencing data; performs dimensionality reduction and clustering on the preprocessed sequencing data to obtain clustering results; sorts the clustering results by Spearman correlation in ascending or descending order and, at the gap where the Spearman correlation changes the most, deletes the clustering results on the low-correlation side; and performs hierarchical clustering on the average of the equivalence relation matrices of the remaining clustering results to obtain the final clustering result. Because anomalous clustering results that differ greatly from the other results are eliminated before the cluster fusion stage, the performance of cluster fusion improves.

Description

Technical field

[0001] The present disclosure relates to the technical field of biological cell processing, and in particular, to a clustering method and system for non-standardized single-cell transcriptome sequencing data.

Background

[0002] The statements in this section merely provide background related to the present disclosure and do not necessarily constitute prior art.

[0003] Single-cell sequencing technologies are widely used in practical studies such as discovering differentiation relationships between cells and differences in gene expression among different types of cells. Downstream analysis of these single-cell sequencing technologies is often based on unsupervised clustering of cells.

[0004] The inventors found that the traditional cluster fusion-based single-cell transcriptome sequencing data clustering method introduces many poor clustering results when generating a clustering result set participating in clustering fusion, making the final clus...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B50/30, G06K9/62
CPC: G16B50/30, G06F18/231, G06F18/23213, G06F18/22
Inventor: 刘健, 潘逸辰, 陈娇
Owner: NANKAI UNIV