Liver lobule partition identification method and device based on spatial transcriptome sequencing data, computer readable storage medium and product

By applying normalization, regression analysis, principal component analysis, and Bayesian clustering to liver spatial transcriptome data, the method addresses the problem of liver lobule partitioning and achieves rapid and accurate partitioning. The results were consistent with the H&E staining results, and the distribution of expression abundance of specific markers was as expected.

CN118016168B · Active · Publication Date: 2026-06-30 · INNOVATION CENTER OF YANGTZE RIVER DELTA ZHEJIANG UNIVERSITY

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: INNOVATION CENTER OF YANGTZE RIVER DELTA ZHEJIANG UNIVERSITY
Filing Date: 2024-02-02
Publication Date: 2026-06-30

IPC: G16B40/30; G16B30/00; G06F18/2135; G06F18/27; G06F18/23

AI Tagging

Technology Topics

Principal component analysis; Transcriptome sequencing

AI Technical Summary

Technical Problem

Current technologies struggle to effectively identify partitions in spatial transcriptome sequencing data of liver lobules.

Method used

By normalizing the spatial transcriptome data matrix of the liver, regression analysis was performed to identify highly variable genes. Principal component analysis was then conducted, combined with Bayesian analysis and unsupervised clustering, to identify the partitions of liver lobules.

Benefits of technology

Rapid identification of liver lobule regions based on spatial transcriptome sequencing data was achieved, filling a gap in this field. The identification results were consistent with H&E staining results, and the distribution of expression abundance of specific biomarkers met expectations.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a method, apparatus, computer-readable storage medium, and product for identifying liver lobule regions based on spatial transcriptome sequencing data, relating to the field of liver lobule region technology. The method includes: normalizing the spatial transcriptome data matrix of the liver and performing regression analysis to identify highly variable genes; performing principal component analysis on the highly variable genes to determine principal components; performing Bayesian analysis on the principal components to perform unsupervised clustering of cells / regions in the spatial transcriptome data matrix to obtain cell / region cluster information; and identifying liver lobule regions based on the cluster information. This invention achieves rapid identification of liver lobule regions based on spatial transcriptome sequencing data by spatial clustering, limiting the number of clusters, and combining this with the expression abundance distribution of specific biomarkers.

Description

Technical Field

[0001] This invention relates to the field of liver lobule partitioning technology, and in particular to a method, apparatus, computer-readable storage medium, and product for liver lobule partitioning identification based on spatial transcriptome sequencing data.

Background Technology

[0002] The liver is composed of repetitive anatomical units called hexagonal lobules. Blood flows radially inward from the portal vein (PV) at the corner of the lobule through sinusoidal vessels to the central vein (CV) in the middle of the lobule. Hepatocytes are arranged concentrically radially within the lobule, forming three distinct zones: the PV zone, the intermediate zone (IZ), and the CV zone. Numerous studies have reported that the liver exhibits high spatial heterogeneity, with hepatocytes in different spatial zones possessing different biological functions. For example, PV zone hepatocytes primarily perform oxidative metabolism, gluconeogenesis, fatty acid β-oxidation, and cholesterol biosynthesis, while CV zone hepatocytes are mainly responsible for glycolysis, lipogenesis, glutamine synthesis, xenobiotic metabolism, and bile acid synthesis. Therefore, rapid identification of lobular zones is of great significance for further exploring the biological functions of parenchymal and non-parenchymal cells in different liver zones and their relationship with the development and progression of clinical liver diseases.

[0003] In recent years, spatially resolved transcriptomics (SRM) sequencing technology has developed rapidly and has been widely used to explore cellular spatial heterogeneity, because it performs transcriptome sequencing while preserving the spatial location of cells and their genes. However, rapidly identifying liver lobule regions from SRM sequencing data remains a challenge.

Summary of the Invention

[0004] Based on this, the purpose of the present invention is to provide a method, apparatus, computer-readable storage medium, and product for identifying liver lobule regions based on spatial transcriptome sequencing data.

[0005] To achieve the above objectives, the present invention provides a method for identifying liver lobule regions based on spatial transcriptome sequencing data, comprising: normalizing the spatial transcriptome data matrix of the liver and performing regression analysis to identify highly variable genes; performing principal component analysis on the highly variable genes to determine principal components; performing Bayesian analysis on the principal components and performing unsupervised clustering on cells / regions in the spatial transcriptome data matrix to obtain cell / region cluster information; and identifying liver lobule regions based on the cluster information.

[0006] To achieve the above objectives, the present invention also provides a computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the above-described method for identifying liver lobule regions based on spatial transcriptome sequencing data.

[0007] To achieve the above objectives, the present invention also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the above-described liver lobule partitioning identification method based on spatial transcriptome sequencing data.

[0008] To achieve the above objectives, the present invention also provides a computer program product, including a computer program / instruction that, when executed by a processor, implements the steps of the above-described liver lobule partitioning identification method based on spatial transcriptome sequencing data.

[0009] According to specific embodiments provided by the present invention, the present invention discloses the following technical effects:

[0010] This invention achieves rapid identification of liver lobule regions based on spatial transcriptome sequencing data by spatial clustering, limiting the number of clusters, and combining the expression abundance distribution of specific markers, filling a gap in this field.

Brief Description of the Drawings

[0011] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0012] Figure 1 is a flowchart of the liver lobule partitioning identification method based on spatial transcriptome sequencing data provided by this invention;

[0013] Figure 2 is a schematic diagram of the spatial transcriptome data partitioning results for normal mouse liver tissue;

[0014] Figure 3 is a schematic diagram of the spatial transcriptome data partitioning results for regenerated normal mouse liver tissue;

[0015] Figure 4 is a schematic diagram of the spatial transcriptome data partitioning results for fibrotic mouse liver tissue;

[0016] Figure 5 is a schematic diagram of the spatial transcriptome data partitioning results for regenerated fibrotic mouse liver tissue.

Detailed Implementation

[0017] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0018] The liver lobule partitioning identification method based on spatial transcriptome sequencing data provided by this invention uses R language (Version: 4.1.3) and consists of two steps. The first step is to perform unsupervised clustering of cells / regions in the spatial transcriptome sequencing data. The second step is to determine the liver lobule partition (portal vein region, i.e., PV region; intermediate region, i.e., IZ region; central vein region, i.e., CV region; unknown region) to which each cell / region belongs through conditional statements, thereby achieving rapid identification of liver lobule partitions.

[0019] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0020] Example 1

[0021] As shown in Figure 1, the liver lobule partitioning identification method based on spatial transcriptome sequencing data in this embodiment includes:

[0022] S1: The spatial transcriptome data matrix of the liver is normalized and regression analysis is performed to identify highly variable genes; the columns of the spatial transcriptome data matrix are cells / regions and the rows are genes.

[0023] S2: Perform principal component analysis on the highly variable genes to determine the principal components.

[0024] S3: Perform Bayesian analysis on the principal components, and perform unsupervised clustering on the cells / regions in the spatial transcriptome data matrix to obtain cluster information of cells / regions; the cluster information includes first cluster information and second cluster information; both the first cluster information and the second cluster information include multiple cluster names.

[0025] S4: Identify liver lobule regions based on the cluster information.

[0026] Furthermore, step S1 specifically includes:

[0027] S11: Obtain the spatial transcriptome data matrix of the liver, with columns representing cells / regions, rows representing genes, and values representing gene expression values.

[0028] Assume that the spatial transcriptome sequencing data matrix st_data is a matrix containing S columns of cells / regions and G rows of genes, where the values in st_data represent the gene expression values of each cell / region; st_meta is a data frame containing S rows of cell / region metadata, with three columns named "spot", "x", and "y", recording the names, x-coordinates, and y-coordinates of the S cells / regions; and st_gene is a data frame containing G rows of gene metadata, with one column named "Gene", recording the names of the G genes.
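As a minimal sketch of the input layout described above, the following base-R snippet builds toy versions of st_data, st_meta, and st_gene and checks that their dimensions agree. The spot names, gene names, coordinates, and counts are invented for illustration and are not taken from the patent.

```r
# Toy dimensions (illustrative assumptions, not values from the patent)
S <- 6  # number of cells / regions (spots)
G <- 4  # number of genes

spots <- paste0("spot", seq_len(S))
genes <- c("Glul", "Alb", "Cyp2e1", "Cyp2f2")  # example mouse gene symbols

# st_data: G rows of genes x S columns of cells / regions
set.seed(1)
st_data <- matrix(rpois(G * S, lambda = 5), nrow = G, ncol = S,
                  dimnames = list(genes, spots))

# st_meta: one row per cell / region with columns "spot", "x", "y"
st_meta <- data.frame(spot = spots,
                      x = rep(1:3, times = 2),
                      y = rep(1:2, each = 3),
                      row.names = spots,
                      stringsAsFactors = FALSE)

# st_gene: one row per gene with a single column "Gene"
st_gene <- data.frame(Gene = genes, row.names = genes,
                      stringsAsFactors = FALSE)

# Consistency checks between the matrix and its two metadata tables
stopifnot(ncol(st_data) == nrow(st_meta),
          nrow(st_data) == nrow(st_gene),
          identical(colnames(st_data), st_meta$spot),
          identical(rownames(st_data), st_gene$Gene))
```

Keeping the column names of st_data identical to st_meta$spot makes later per-spot lookups by name unambiguous.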

[0029] The `SingleCellExperiment` function in the `SingleCellExperiment` package (Version: 1.16.0) of the R language is used to convert `st_data`, `st_gene`, and `st_meta` into a `SingleCellExperiment` object, named `st_sce`. The corresponding function and parameters are as follows:

[0030]-[0032] st_sce <- SingleCellExperiment(assays = list(counts = st_data),
    rowData = st_gene,
    colData = st_meta)

[0033] S12: Perform logarithmic normalization on the data matrix obtained in step S11.

[0034] Log normalization of st_sce is performed using the logNormCounts function from the R language's scuttle package (Version: 1.4.0). The corresponding function and parameters are as follows:

[0035] st_sce <- logNormCounts(st_sce)
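To make the normalization step concrete, here is a minimal base-R sketch of what log normalization of this kind computes, assuming the commonly documented defaults (library-size factors rescaled to a mean of 1, a pseudo-count of 1, and a base-2 logarithm). The toy counts are invented, and this is an illustration rather than a replacement for the package function.

```r
# Toy count matrix: 3 genes x 4 cells / regions (invented values)
counts <- matrix(c(10, 0, 5,
                   20, 4, 6,
                    0, 2, 8,
                   30, 6, 4),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))

# Library size per cell / region, rescaled so the size factors average to 1
lib_size <- colSums(counts)
size_factor <- lib_size / mean(lib_size)

# Divide each column by its size factor, then apply log2(x + 1)
logcounts <- log2(sweep(counts, 2, size_factor, "/") + 1)
```

A zero count stays at zero after the transformation, and spots with larger library sizes are scaled down before the logarithm is taken.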

[0036] The variance of each gene in `st_sce` is modeled using the `modelGeneVar` function from the `scran` package (Version: 1.22.1) in R, resulting in a data frame `st_gene_info` containing G rows of genes. The corresponding function and parameters used are as follows:

[0037] st_gene_info <- modelGeneVar(st_sce, assay.type = "logcounts")

[0038] S13: Perform regression analysis on the normalized data matrix from step S12 to identify the top 2000 highly variable genes.

[0039] The `getTopHVGs` function from the R language's `scran` package (Version: 1.22.1) is used to select, from `st_gene_info`, the names of the top 2000 highly variable genes (HVGs), which are stored in `st_gene_top`; these 2000 HVGs are then labeled in `st_sce`. The corresponding function and parameters used are as follows:

[0040] st_gene_top <- getTopHVGs(st_gene_info, n = 2000)

[0041] rowData(st_sce)[["is.HVG"]] <- (rownames(st_sce) %in% st_gene_top)
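The idea of keeping only the most variable genes can be illustrated with a deliberately simplified stand-in: ranking genes by the raw variance of their log-expression. This is not what `modelGeneVar` and `getTopHVGs` do internally (they fit a mean-variance trend and rank by the variance above that trend), so the selected genes can differ. The helper name `top_variable_genes` and the toy values are hypothetical.

```r
# Hypothetical helper: rank genes by the variance of their log-expression
# across cells / regions and return the names of the top n.
top_variable_genes <- function(logcounts, n = 2000) {
  gene_var <- apply(logcounts, 1, var)
  ranked <- names(sort(gene_var, decreasing = TRUE))
  head(ranked, n)
}

# Toy log-expression matrix: 3 genes x 4 cells / regions (invented values)
logcounts <- rbind(flat   = c(1.0, 1.0, 1.0, 1.0),
                   mild   = c(1.0, 1.2, 0.8, 1.0),
                   strong = c(0.0, 3.0, 0.5, 2.5))
colnames(logcounts) <- paste0("s", 1:4)

hvg <- top_variable_genes(logcounts, n = 2)

# Logical flag per gene, the same labelling idea as the "is.HVG" column
is_hvg <- rownames(logcounts) %in% hvg
```

Genes with no variation across spots carry no information for clustering, which is why they are dropped before principal component analysis.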

[0042] Furthermore, step S2 specifically includes:

[0043] Principal component analysis (PCA) is performed on the cells / regions in st_sce using the 2000 HVGs, the runPCA function of the scater package (Version: 1.22.0), and the ExactParam function of the BiocSingular package (Version: 1.10.0). This yields a low-dimensional representation of each cell / region with 15 principal components (PCs). The corresponding functions and parameters used are as follows:

[0044]-[0048] st_sce <- runPCA(st_sce,
    subset_row = st_gene_top,
    ncomponents = 15,
    exprs_values = "logcounts",
    BSPARAM = ExactParam())
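The same reduction can be sketched with base R's `prcomp`, assuming centering without scaling and an exact decomposition; the random matrix and the choice of which genes count as HVGs are invented for illustration only.

```r
set.seed(42)
# Toy log-expression matrix: 50 genes x 30 cells / regions (random values)
logcounts <- matrix(rnorm(50 * 30), nrow = 50,
                    dimnames = list(paste0("g", 1:50), paste0("s", 1:30)))
hvg <- paste0("g", 1:20)  # pretend these are the selected HVGs

# prcomp expects observations (cells / regions) in rows, so transpose
pca <- prcomp(t(logcounts[hvg, ]), center = TRUE, scale. = FALSE, rank. = 15)

# One row per cell / region, one column per principal component
pcs <- pca$x
```

Each cell / region is then described by 15 coordinates instead of thousands of gene values, which is the representation passed on to clustering.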

[0049] Furthermore, step S3 specifically includes:

[0050] Based on the 15 principal components in step S2, Bayesian analysis is performed to conduct unsupervised clustering of cells / regions in the spatial transcriptome data. Two types of cluster information for cells / regions are calculated and obtained, defined as $CI_\beta$ and $CI_\alpha$. $CI_\beta$ has three cluster names, "1", "2", and "3", so define the set $C_\beta = \{1, 2, 3\}$; $CI_\alpha$ has four cluster names, "1", "2", "3", and "4", so define the set $C_\alpha = \{1, 2, 3, 4\}$.

[0051] Unsupervised clustering of the S cells / regions is performed using `st_sce` and the `spatialCluster` function from the R language's `BayesSpace` package (Version: 1.4.1). The number of clusters is set to 3 and 4, respectively. Cluster information for the S cells / regions is calculated and stored in `st_meta`, named `bayesspace_clu3` and `bayesspace_clu4`. `bayesspace_clu3` contains three cluster names: "1", "2", and "3"; `bayesspace_clu4` contains four cluster names: "1", "2", "3", and "4". Unless otherwise specified, all functions mentioned above use default parameters. The corresponding functions and their parameters are as follows:

[0052]-[0061] st_sce3 <- spatialCluster(st_sce,
    q = 3,
    platform = "ST",
    d = 7,
    init.method = "mclust",
    model = "t",
    gamma = 2,
    nrep = 10000,
    burn.in = 100,
    save.chain = TRUE)

[0062] sce_meta3 <- as.data.frame(colData(st_sce3))

[0063] st_meta$bayesspace_clu3 <- as.character(sce_meta3$spatial.cluster)

[0064]-[0073] st_sce4 <- spatialCluster(st_sce,
    q = 4,
    platform = "ST",
    d = 7,
    init.method = "mclust",
    model = "t",
    gamma = 2,
    nrep = 10000,
    burn.in = 100,
    save.chain = TRUE)

[0074] sce_meta4 <- as.data.frame(colData(st_sce4))

[0075] st_meta$bayesspace_clu4 <- as.character(sce_meta4$spatial.cluster)

[0076] Furthermore, step S4 specifically includes:

[0077] S41: Define sets $P_\beta$ and $P_\alpha$ as the percentages of each cluster name in $CI_\beta$ and $CI_\alpha$, respectively; define sets $E_{\beta G}$ and $E_{\alpha G}$ as the median GLUL gene expression values in the cells / regions corresponding to each cluster name in $CI_\beta$ and $CI_\alpha$, respectively; and define sets $E_{\beta A}$ and $E_{\alpha A}$ as the median ALB gene expression values in the cells / regions corresponding to each cluster name in $CI_\beta$ and $CI_\alpha$, respectively.

[0078] S42: When the minimum value in $P_\alpha$ is less than or equal to 10%, that is, $\min\{a \mid a \in P_\alpha\} \le 10\%$, $CI_\alpha$ is used as the cluster information of cells / regions, and conditional functions are applied to the cluster names in $C_\alpha$ to identify the liver lobule regions (portal vein region, i.e., PV region; intermediate region, i.e., IZ region; central vein region, i.e., CV region; unknown region).

[0079] S421: For any cell cluster name $X$ in $C_\alpha$, if its median GLUL gene expression value equals the maximum value in $E_{\alpha G}$, its median ALB gene expression value does not equal the maximum value in $E_{\alpha A}$, and its percentage does not equal the minimum value in $P_\alpha$, then the cell cluster name $X$ corresponds to the CV region. The specific function is as follows:

[0080] $X \mid X \in C_\alpha = \mathrm{CV}$

[0081]

[0082] S422: For any cell cluster name $X$ in $C_\alpha$, if its median ALB gene expression value equals the maximum value in $E_{\alpha A}$, its median GLUL gene expression value does not equal the maximum value in $E_{\alpha G}$, and its percentage does not equal the minimum value in $P_\alpha$, then the cell cluster name $X$ corresponds to the PV region. The specific function is as follows:

[0083] $X \mid X \in C_\alpha = \mathrm{PV}$

[0084]

[0085] S423: For any cell cluster name $X$ in $C_\alpha$, if its percentage equals the minimum value in $P_\alpha$, then the cell cluster name $X$ corresponds to the Unknown region. The specific function is as follows:

[0086] $X \mid X \in C_\alpha = \mathrm{Unknown}$

[0087]

[0088] S424: For the cell cluster names in $C_\alpha$, after the CV region, PV region, and Unknown region have been determined, the last remaining cluster name corresponds to the IZ region.

[0089] S43: When the minimum value in $P_\alpha$ is greater than 10%, that is, $\min\{a \mid a \in P_\alpha\} > 10\%$, $CI_\beta$ is used as the cluster information of cells / regions, and conditional functions are applied to the cluster names in $C_\beta$ to identify the liver lobule regions (portal vein region, i.e., PV region; intermediate region, i.e., IZ region; central vein region, i.e., CV region).

[0090] S431: For any cell cluster name $X$ in $C_\beta$, if its median GLUL gene expression value equals the maximum value in $E_{\beta G}$ and its median ALB gene expression value does not equal the maximum value in $E_{\beta A}$, then the cell cluster name $X$ corresponds to the CV region. The specific function is as follows:

[0091] $X \mid X \in C_\beta = \mathrm{CV}$

[0092]

[0093] S432: For any cell cluster name $X$ in $C_\beta$, if its median ALB gene expression value equals the maximum value in $E_{\beta A}$ and its median GLUL gene expression value does not equal the maximum value in $E_{\beta G}$, then the cell cluster name $X$ corresponds to the PV region. The specific function is as follows:

[0094] $X \mid X \in C_\beta = \mathrm{PV}$

[0095]

[0096] S433: For the cell cluster names in $C_\beta$, after the CV and PV regions have been determined, the last remaining cluster name corresponds to the IZ region.
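The conditional rules of steps S42 to S433 can be sketched as a single base-R function. This is a minimal illustration under stated assumptions: the helper name `assign_zones`, its arguments, and the toy percentages and medians are hypothetical, and the caller is assumed to pass the statistics of the cluster information already selected by the 10% test.

```r
# Hypothetical helper implementing the conditional rules above.
# pct, med_glul, med_alb: named numeric vectors, one entry per cluster name,
# all in the same order. with_unknown = TRUE corresponds to the four-cluster
# case (S42), FALSE to the three-cluster case (S43).
assign_zones <- function(pct, med_glul, med_alb, with_unknown) {
  clusters <- names(pct)
  zone <- setNames(rep("IZ", length(clusters)), clusters)  # default: IZ
  keep <- rep(TRUE, length(clusters))
  if (with_unknown) {
    keep <- pct != min(pct)        # smallest cluster is the Unknown region
    zone[!keep] <- "Unknown"
  }
  # CV: highest median GLUL, but not highest median ALB
  is_cv <- keep & med_glul == max(med_glul) & med_alb != max(med_alb)
  # PV: highest median ALB, but not highest median GLUL
  is_pv <- keep & med_alb == max(med_alb) & med_glul != max(med_glul)
  zone[is_cv] <- "CV"
  zone[is_pv] <- "PV"
  zone
}

# Toy four-cluster statistics (invented values)
pct      <- c("1" = 0.40, "2" = 0.30, "3" = 0.25, "4" = 0.05)
med_glul <- c("1" = 3.0,  "2" = 1.5,  "3" = 0.5,  "4" = 1.0)
med_alb  <- c("1" = 5.0,  "2" = 6.0,  "3" = 7.5,  "4" = 4.0)

zones <- assign_zones(pct, med_glul, med_alb,
                      with_unknown = min(pct) <= 0.10)
```

In this toy case the smallest cluster covers 5% of the spots, so the four-cluster branch applies and that cluster is set aside before the marker-based rules label the others.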

[0097] The R language execution process corresponding to step S4 is as follows:

[0098] (1) Create a Seurat object named obj using st_data and the CreateSeuratObject function of the Seurat package in R (Version: 4.1.1). Normalize obj using the NormalizeData function of the Seurat package and extract the normalized data matrix named st_ndata. st_ndata is a matrix containing S columns of cells / regions and G rows of genes. The values in st_ndata represent the normalized gene expression values of each cell / region. The corresponding functions and parameters are as follows:

[0099] obj <- CreateSeuratObject(st_data)

[0100] obj <- NormalizeData(obj)

[0101] st_ndata <- obj[["RNA"]]@data

[0102] (2) Extract the gene expression values of GLUL and ALB from st_ndata and store them in st_meta, naming them Glul and Alb respectively. Assume that the variable species holds the species from which st_ndata originates. If the species is "Human", the gene names are "GLUL" and "ALB"; otherwise, the gene names are "Glul" and "Alb". The corresponding functions and parameters are as follows:

[0103]
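A minimal base-R sketch of step (2), assuming st_ndata has gene symbols as row names and spot names as column names; the toy matrix, the `species` value, and the helper variables `gene_glul` and `gene_alb` are invented for illustration.

```r
# Toy normalized matrix: genes in rows, cells / regions in columns (invented)
st_ndata <- matrix(c(2.1, 0.3, 1.2,
                     4.0, 5.5, 4.8),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Glul", "Alb"), c("s1", "s2", "s3")))
st_meta <- data.frame(spot = colnames(st_ndata), stringsAsFactors = FALSE)
species <- "Mouse"  # assumed value; "Human" selects upper-case symbols

# Pick gene symbols according to species, as described above
if (species == "Human") {
  gene_glul <- "GLUL"
  gene_alb  <- "ALB"
} else {
  gene_glul <- "Glul"
  gene_alb  <- "Alb"
}

# Store per-spot expression in st_meta, matched by spot name
st_meta$Glul <- as.numeric(st_ndata[gene_glul, st_meta$spot])
st_meta$Alb  <- as.numeric(st_ndata[gene_alb,  st_meta$spot])
```

Indexing the matrix by `st_meta$spot` rather than by position keeps the values aligned even if the two objects are ordered differently.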

[0104] (3) Calculate the percentage of each cluster name in column bayesspace_clu4 of the st_meta data frame, and store it in the st_zonation data frame in a column named ratio. If the minimum value in the ratio column is greater than 10%, the cluster names in bayesspace_clu3 are used as the classification information of the S cells / regions and stored in the "cluster" column of st_meta, and the unique cluster names are defined as cluster_name. If the minimum value in the ratio column is less than or equal to 10%, the cluster names in bayesspace_clu4 are used as the classification information of the S cells / regions and stored in the "cluster" column of st_meta; the cluster name with the smallest percentage is defined as st_unknown, corresponding to the unknown region, and the other unique cluster names are extracted and defined as cluster_name. The corresponding functions and parameters used are as follows:

[0105]

[0106]
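Step (3) can be sketched in base R as follows. The variable names follow the description above, while the toy cluster labels and the use of `table` to compute the ratios are assumptions made for illustration.

```r
# Toy cluster labels for 20 cells / regions (invented)
st_meta <- data.frame(
  bayesspace_clu3 = rep(c("1", "2", "3"), times = c(8, 7, 5)),
  bayesspace_clu4 = rep(c("1", "2", "3", "4"), times = c(8, 6, 5, 1)),
  stringsAsFactors = FALSE
)

# Fraction of cells / regions in each of the four clusters
tab4 <- table(st_meta$bayesspace_clu4)
st_zonation <- data.frame(cluster = names(tab4),
                          ratio = as.numeric(tab4) / nrow(st_meta),
                          stringsAsFactors = FALSE)

if (min(st_zonation$ratio) > 0.10) {
  # No rare cluster: use the three-cluster result
  st_meta$cluster <- st_meta$bayesspace_clu3
  st_unknown <- character(0)
  cluster_name <- unique(st_meta$cluster)
} else {
  # Rare cluster present: use the four-cluster result and set it aside
  st_meta$cluster <- st_meta$bayesspace_clu4
  st_unknown <- st_zonation$cluster[which.min(st_zonation$ratio)]
  cluster_name <- setdiff(unique(st_meta$cluster), st_unknown)
}
```

Here cluster "4" holds 1 of 20 spots (5%), so the four-cluster labels are kept and "4" becomes the unknown region.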

[0107] (4) Calculate the median expression values of Glul and Alb genes in cells / regions corresponding to each cluster name in cluster_name based on st_meta, and store them in exp_Glul and exp_Alb respectively. Define a new data frame res_zonation with three columns named "cluster_name", "Glul" and "Alb" to store cluster_name, exp_Glul and exp_Alb information respectively. The corresponding functions and parameters are as follows:

[0108]
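A minimal base-R sketch of step (4), assuming st_meta already carries the "cluster", "Glul", and "Alb" columns described above; the toy values are invented.

```r
# Toy metadata: 3 clusters with 3 cells / regions each (invented values)
st_meta <- data.frame(
  cluster = c("1", "1", "1", "2", "2", "2", "3", "3", "3"),
  Glul    = c(3.0, 2.8, 3.4, 1.5, 1.2, 1.9, 0.4, 0.6, 0.2),
  Alb     = c(4.0, 4.2, 3.9, 5.0, 5.3, 4.8, 6.1, 6.0, 6.4),
  stringsAsFactors = FALSE
)
cluster_name <- c("1", "2", "3")

# Median expression per cluster, in the order given by cluster_name
exp_Glul <- sapply(cluster_name,
                   function(k) median(st_meta$Glul[st_meta$cluster == k]))
exp_Alb  <- sapply(cluster_name,
                   function(k) median(st_meta$Alb[st_meta$cluster == k]))

# Summary table with one row per cluster
res_zonation <- data.frame(cluster_name = cluster_name,
                           Glul = unname(exp_Glul),
                           Alb  = unname(exp_Alb),
                           stringsAsFactors = FALSE)
```

Using the median rather than the mean keeps a few extreme spots from dominating a cluster's marker level.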

[0109] (5) Rank the gene expression values in the "Glul" and "Alb" columns of the res_zonation data frame from highest to lowest, and sum the two ranks for each cluster. Store the results in the res_zonation data frame as "order_Glul", "order_Alb", and "order_sum". If all values in "order_sum" are equal, the cluster name ranked first in "order_Glul" represents the CV region (named clu_cv), the cluster name ranked second in "order_Glul" represents the IZ region (named clu_iz), and the cluster name ranked third in "order_Glul" represents the PV region (named clu_pv). If the values in "order_sum" are not all equal and the same cluster name is ranked first in both "order_Glul" and "order_Alb", the liver lobule regions cannot be distinguished in the data. If the values in "order_sum" are not all equal and different cluster names are ranked first in "order_Glul" and "order_Alb", the cluster name ranked first in "order_Glul" represents the CV region (named clu_cv), the cluster name ranked first in "order_Alb" represents the PV region (named clu_pv), and the remaining cluster name in the "cluster_name" column of the res_zonation data frame represents the IZ region (named clu_iz). The corresponding functions and parameters used are as follows:

[0110]

[0111]
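The ranking logic of step (5) can be sketched in base R as below. The tie-breaking rule (`ties.method = "first"`) and the choice to raise an error when the regions cannot be distinguished are assumptions, since the description does not specify either; the toy medians are invented.

```r
# Toy per-cluster medians (invented values)
res_zonation <- data.frame(cluster_name = c("1", "2", "3"),
                           Glul = c(3.0, 1.5, 0.4),
                           Alb  = c(4.0, 5.0, 6.1),
                           stringsAsFactors = FALSE)

# Rank 1 = highest expression
res_zonation$order_Glul <- rank(-res_zonation$Glul, ties.method = "first")
res_zonation$order_Alb  <- rank(-res_zonation$Alb,  ties.method = "first")
res_zonation$order_sum  <- res_zonation$order_Glul + res_zonation$order_Alb

top_glul <- res_zonation$cluster_name[res_zonation$order_Glul == 1]
top_alb  <- res_zonation$cluster_name[res_zonation$order_Alb == 1]

if (length(unique(res_zonation$order_sum)) == 1) {
  # Perfectly opposed gradients: read all three zones off the Glul ranking
  clu_cv <- top_glul
  clu_iz <- res_zonation$cluster_name[res_zonation$order_Glul == 2]
  clu_pv <- res_zonation$cluster_name[res_zonation$order_Glul == 3]
} else if (top_glul == top_alb) {
  # Same cluster tops both markers: zones cannot be told apart
  stop("Liver lobule regions cannot be distinguished in these data")
} else {
  clu_cv <- top_glul
  clu_pv <- top_alb
  clu_iz <- setdiff(res_zonation$cluster_name, c(clu_cv, clu_pv))
}
```

In the toy table the Glul ranks are 1, 2, 3 and the Alb ranks are 3, 2, 1, so every rank sum is 4 and the first branch assigns all three zones from the Glul ranking.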

[0112] (6) Based on the cluster names in st_meta and the cluster names corresponding to each partition of the liver lobule, including clu_cv, clu_iz, and clu_pv, the liver lobule partitions (portal vein region, i.e., PV region; intermediate region, i.e., IZ region; central vein region, i.e., CV region; unknown region) belonging to S cells / regions can be identified and stored in the "zonation" column of the st_meta data frame. The corresponding functions and parameters used are as follows:

[0113] st_meta[st_meta$cluster == clu_cv, ]$zonation <- "CV"

[0114] st_meta[st_meta$cluster == clu_iz, ]$zonation <- "IZ"

[0115] st_meta[st_meta$cluster == clu_pv, ]$zonation <- "PV"
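An alternative way to write the same labelling, offered only as a sketch, is a named lookup vector that creates the "zonation" column in one step and also covers the unknown region. The lookup name `zone_of` and the toy cluster labels are hypothetical.

```r
# Toy cluster labels and zone assignments (invented values)
st_meta <- data.frame(cluster = c("1", "2", "3", "4", "1", "3"),
                      stringsAsFactors = FALSE)
clu_cv <- "1"
clu_iz <- "2"
clu_pv <- "3"
st_unknown <- "4"

# Named lookup vector: cluster name -> zone label
zone_of <- setNames(c("CV", "IZ", "PV", "Unknown"),
                    c(clu_cv, clu_iz, clu_pv, st_unknown))

# Translate every cell / region's cluster name into its zone
st_meta$zonation <- unname(zone_of[st_meta$cluster])
```

Because the column is built in a single assignment, it does not need to exist beforehand.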

[0116] This embodiment achieves rapid identification of liver lobule regions based on spatial transcriptome sequencing data by spatial clustering, limiting the number of clusters, and combining the expression abundance distribution of specific markers, filling a gap in this field.

[0117] The following is a specific case study to verify the liver lobule partitioning identification method based on spatial transcriptome sequencing data provided in this embodiment.

[0118] Four sets of spatial transcriptome data were downloaded from the FigShare database (https://figshare.com/s/e0592ec38062e53c2c5d). The first set was spatial transcriptome sequencing data of normal mouse liver tissue sections. The second set was spatial transcriptome sequencing data of mouse normal liver tissue sections on day 7 after regeneration following two-thirds resection. The third set was spatial transcriptome sequencing data of mouse fibrotic liver tissue sections. The fourth set was spatial transcriptome sequencing data of mouse fibrotic liver tissue sections on day 7 after regeneration following two-thirds resection.

[0119] (1) This invention was used to rapidly identify liver lobule regions in the first set of spatial transcriptome data. As shown in Figure 2, the liver lobule regions rapidly identified by this invention are consistent with the H&E staining results, and the expression level of Glul in the CV region is significantly higher than that in the IZ region and PV region, showing an increasing gradient along the PV-CV axis of the liver lobule; the expression level of Alb in the PV region is significantly higher than that in the IZ region and CV region, showing a decreasing gradient along the PV-CV axis of the liver lobule.

[0120] (2) This invention was used to rapidly identify liver lobule regions in the second set of spatial transcriptome data. As shown in Figure 3, the liver lobule regions rapidly identified by this invention are consistent with the H&E staining results, and the expression level of Glul in the CV region is significantly higher than that in the IZ region and PV region, showing an increasing gradient along the PV-CV axis of the liver lobule; the expression level of Alb in the PV region is significantly higher than that in the IZ region and CV region, showing a decreasing gradient along the PV-CV axis of the liver lobule.

[0121] (3) This invention was used to rapidly identify liver lobule regions in the third set of spatial transcriptome data. As shown in Figure 4, the liver lobule regions rapidly identified by this invention are consistent with the H&E staining results, and the expression level of Glul in the CV region is significantly higher than that in the IZ region and PV region, showing an increasing gradient along the PV-CV axis of the liver lobule; the expression level of Alb in the PV region is significantly higher than that in the IZ region and CV region, showing a decreasing gradient along the PV-CV axis of the liver lobule.

[0122] (4) This invention was used to rapidly identify liver lobule regions in the fourth set of spatial transcriptome data. As shown in Figure 5, the liver lobule regions rapidly identified by this invention are consistent with the H&E staining results, and the expression level of Glul in the CV region is significantly higher than that in the IZ region and PV region, showing an increasing gradient along the PV-CV axis of the liver lobule; the expression level of Alb in the PV region is significantly higher than that in the IZ region and CV region, showing a decreasing gradient along the PV-CV axis of the liver lobule.

[0123] The above four application scenarios demonstrate that the liver partitions automatically identified by this method are consistent with the actual results.

[0124] Example 2

[0125] A computer device includes: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the liver lobule partitioning identification method based on spatial transcriptome sequencing data in Embodiment 1.

[0126] Example 3

[0127] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the liver lobule partitioning identification method based on spatial transcriptome sequencing data in Embodiment 1.

[0128] Example 4

[0129] A computer program product includes a computer program that, when executed by a processor, implements the steps of the liver lobule partitioning method based on spatial transcriptome sequencing data in Embodiment 1.

[0130] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0131] This document uses specific examples to illustrate the principles and implementation methods of the present invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of the present invention. Furthermore, those skilled in the art will recognize that, based on the ideas of the present invention, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of the present invention.

Claims

1. A method for identifying liver lobule regions based on spatial transcriptome sequencing data, characterized in that it comprises:
normalizing the spatial transcriptome data matrix of the liver and performing regression analysis to identify highly variable genes, wherein the columns of the spatial transcriptome data matrix are cells / regions and the rows are genes;
performing principal component analysis on the highly variable genes to determine the principal components;
performing Bayesian analysis on the principal components and unsupervised clustering on the cells / regions in the spatial transcriptome data matrix to obtain cluster information of the cells / regions, wherein the cluster information includes first cluster information and second cluster information, and both the first cluster information and the second cluster information include multiple cluster names;
identifying the liver lobule regions based on the cluster information;
specifically, identifying the liver lobule regions based on the cluster information includes:
S41: defining, for the first cluster information and the second cluster information respectively, a set of the percentages of each cluster name; a set of the median GLUL gene expression values in the cells / regions corresponding to each cluster name; and a set of the median ALB gene expression values in the cells / regions corresponding to each cluster name;
S42: when the minimum value in the percentage set of the first cluster information is less than or equal to 10%, using the first cluster information as the cluster information of the cells / regions, and performing liver lobule partitioning on the cluster names in the first cluster information using a conditional function;
S421: for any cell cluster name X in the first cluster information, if its corresponding median GLUL gene expression value is equal to the maximum value in the GLUL median set, its corresponding median ALB gene expression value is not equal to the maximum value in the ALB median set, and its corresponding percentage is not equal to the minimum value in the percentage set, then the cell cluster name X corresponds to the CV region;
S422: for any cell cluster name X in the first cluster information, if its corresponding median ALB gene expression value is equal to the maximum value in the ALB median set, its corresponding median GLUL gene expression value is not equal to the maximum value in the GLUL median set, and its corresponding percentage is not equal to the minimum value in the percentage set, then the cell cluster name X corresponds to the PV region;
S423: for any cell cluster name X in the first cluster information, if its corresponding percentage is equal to the minimum value in the percentage set, then the cell cluster name X corresponds to the Unknown region;
S424: for any cell cluster name X in the first cluster information, after the CV region, PV region, and Unknown region have been determined, the last remaining cluster name corresponds to the IZ region;
S43: when the minimum value in the percentage set of the first cluster information is greater than 10%, using the second cluster information as the cluster information of the cells / regions, and performing liver lobule partitioning on the cluster names in the second cluster information using a conditional function;
S431: for any cell cluster name X in the second cluster information, if its corresponding median GLUL gene expression value is equal to the maximum value in the GLUL median set and its corresponding median ALB gene expression value is not equal to the maximum value in the ALB median set, then the cell cluster name X corresponds to the CV region;
S432: for any cell cluster name X in the second cluster information, if its corresponding median ALB gene expression value is equal to the maximum value in the ALB median set and its corresponding median GLUL gene expression value is not equal to the maximum value in the GLUL median set, then the cell cluster name X corresponds to the PV region;
S433: for any cell cluster name X in the second cluster information, after the CV and PV regions have been determined, the last remaining cluster name corresponds to the IZ region.
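
For illustration only, the preprocessing steps recited in claim 1 (normalization, regression-based selection of highly variable genes, and principal component analysis) might be sketched as below. Several choices here are assumptions that the claims do not specify: library-size scaling followed by a log transform as the normalization, a quadratic fit of log-variance on log-mean as the regression, and the hypothetical function name `select_hvg_and_pcs` with its parameters `n_hvg` and `n_pcs`. The input is taken to have genes in rows and cells / regions in columns. The Bayesian clustering step is not shown.

```python
import numpy as np


def select_hvg_and_pcs(counts, n_hvg=2000, n_pcs=15):
    """Sketch: normalize, pick highly variable genes, and compute PCs.

    counts: genes x cells/regions matrix of raw counts (assumed layout).
    Returns (indices of selected genes, cells/regions x n_pcs score matrix).
    """
    counts = np.asarray(counts, dtype=float)

    # Assumed normalization: scale each column to 10,000 counts, then log1p.
    lib = counts.sum(axis=0, keepdims=True)
    lib[lib == 0] = 1.0
    norm = np.log1p(counts / lib * 1e4)

    # Assumed regression: fit log-variance as a quadratic in log-mean and
    # rank genes by how far they sit above the fitted trend.
    mean = norm.mean(axis=1)
    var = norm.var(axis=1)
    usable = (mean > 0) & (var > 0)
    coef = np.polyfit(np.log(mean[usable]), np.log(var[usable]), deg=2)
    resid = np.full(mean.shape, -np.inf)
    resid[usable] = np.log(var[usable]) - np.polyval(coef, np.log(mean[usable]))
    hvg_idx = np.argsort(resid)[::-1][:n_hvg]

    # PCA through SVD of the centered cells/regions x genes matrix.
    x = norm[hvg_idx].T
    x = x - x.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    k = min(n_pcs, s.size)
    return hvg_idx, u[:, :k] * s[:k]
```

The returned score matrix has one row per cell / region and could then be passed to whichever Bayesian spatial clustering tool is used to produce the first and second cluster information.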

2. The liver lobule partitioning method based on spatial transcriptome sequencing data according to claim 1, characterized in that identifying the liver lobule partitions based on the cluster information specifically includes: determining the percentage of each cluster name in the first cluster information and in the second cluster information; determining the median GLUL gene expression value in the cells / regions corresponding to each cluster name in the first and second cluster information; determining the median ALB gene expression value in the cells / regions corresponding to each cluster name in the first and second cluster information; and identifying the liver lobule partitions based on the percentages, the median GLUL gene expression values, and the median ALB gene expression values.
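
As a minimal sketch of the three per-cluster quantities named in claim 2, the following computes, for one clustering result, the share of cells / regions carrying each cluster name together with the median GLUL and median ALB expression values. The function name `summarize_clusters` and the parallel-list input layout are assumptions made for illustration; shares are returned as fractions, so 0.10 stands for 10%.

```python
from collections import defaultdict
from statistics import median


def summarize_clusters(labels, glul, alb):
    """Return {cluster name: (share, median GLUL, median ALB)}.

    labels, glul, and alb are parallel sequences with one entry per
    cell / region (an assumed input layout).
    """
    groups = defaultdict(list)
    for name, g, a in zip(labels, glul, alb):
        groups[name].append((g, a))

    total = len(labels)
    summary = {}
    for name, values in groups.items():
        summary[name] = (
            len(values) / total,             # share of cells / regions
            median(g for g, _ in values),    # median GLUL expression
            median(a for _, a in values),    # median ALB expression
        )
    return summary
```

The same function would be called once on the first cluster information and once on the second.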

3. The liver lobule partitioning method based on spatial transcriptome sequencing data according to claim 2, characterized in that identifying the liver lobule partitions based on the percentages, the median GLUL gene expression values, and the median ALB gene expression values specifically includes: when the minimum percentage of the cluster names in the first cluster information is less than or equal to 10%, using the first cluster information as the cluster information of the cells / regions, and, based on the median GLUL gene expression values and the median ALB gene expression values, performing liver lobule partitioning on the cluster names in the first cluster information using a conditional function; and when the minimum percentage of the cluster names in the first cluster information is greater than 10%, using the second cluster information as the cluster information of the cells / regions, and, based on the median GLUL gene expression values and the median ALB gene expression values, performing liver lobule partitioning on the cluster names in the second cluster information using a conditional function.
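
The 10% selection rule of claim 3 can be sketched as a single comparison. The function name `choose_cluster_information` and the dictionary inputs (cluster name mapped to its share, expressed as a fraction) are hypothetical, and the example values, with four names in the first result and three in the second, are illustrative only.

```python
def choose_cluster_information(first_shares, second_shares, threshold=0.10):
    """Pick which clustering result to use under the 10% rule.

    first_shares / second_shares: {cluster name: share as a fraction}.
    Returns ("first", first_shares) when the smallest share in the first
    result is at or below the threshold, otherwise ("second", second_shares).
    """
    if min(first_shares.values()) <= threshold:
        return "first", first_shares
    return "second", second_shares


# Illustrative values: the smallest share in the first result is 5%.
first = {"1": 0.05, "2": 0.40, "3": 0.30, "4": 0.25}
second = {"1": 0.45, "2": 0.30, "3": 0.25}
which, chosen = choose_cluster_information(first, second)
```

Note that the comparison is inclusive, so a smallest share of exactly 10% still selects the first cluster information.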

4. The liver lobule partitioning method based on spatial transcriptome sequencing data according to claim 1, characterized in that performing liver lobule partitioning on the cluster names in the first cluster information using a conditional function, based on the median GLUL gene expression values and the median ALB gene expression values, specifically includes: for any cell cluster name X in the first cluster information, if the corresponding median GLUL gene expression value is equal to the maximum median GLUL gene expression value, the corresponding median ALB gene expression value is not equal to the maximum median ALB gene expression value, and the corresponding percentage is not equal to the minimum percentage, then cell cluster name X corresponds to the CV region; for any cell cluster name X in the first cluster information, if the corresponding median ALB gene expression value is equal to the maximum median ALB gene expression value, the corresponding median GLUL gene expression value is not equal to the maximum median GLUL gene expression value, and the corresponding percentage is not equal to the minimum percentage, then cell cluster name X corresponds to the PV region; for any cell cluster name X in the first cluster information, if the corresponding percentage is equal to the minimum percentage, then cell cluster name X corresponds to the Unknown region; and for any cell cluster name X in the first cluster information, after the CV region, PV region, and Unknown region have been determined, the last remaining cluster name corresponds to the IZ region.
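
One possible reading of the conditional function in claim 4 is sketched below. The function name `label_first_cluster_information` and the input layout (cluster name mapped to a tuple of share, median GLUL, and median ALB) are assumptions. The claim speaks of a single remaining cluster name for the IZ region; this sketch simply assigns IZ to every name that matches none of the other three conditions.

```python
def label_first_cluster_information(summary):
    """Assign CV / PV / Unknown / IZ to each cluster name.

    summary: {cluster name: (share, median GLUL, median ALB)} (assumed layout).
    """
    min_share = min(share for share, _, _ in summary.values())
    max_glul = max(glul for _, glul, _ in summary.values())
    max_alb = max(alb for _, _, alb in summary.values())

    zones = {}
    for name, (share, glul, alb) in summary.items():
        if glul == max_glul and alb != max_alb and share != min_share:
            zones[name] = "CV"       # highest median GLUL
        elif alb == max_alb and glul != max_glul and share != min_share:
            zones[name] = "PV"       # highest median ALB
        elif share == min_share:
            zones[name] = "Unknown"  # smallest cluster
        else:
            zones[name] = "IZ"       # whatever remains
    return zones


# Illustrative values only.
example = {
    "1": (0.05, 1.0, 2.0),
    "2": (0.30, 9.0, 3.0),
    "3": (0.35, 2.0, 8.0),
    "4": (0.30, 4.0, 5.0),
}
zones = label_first_cluster_information(example)
```

Because the conditions test exact equality with the maximum and minimum, ties between clusters would change the outcome; how ties should be resolved is not stated in the claim.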

5. The liver lobule partitioning method based on spatial transcriptome sequencing data according to claim 1, characterized in that performing liver lobule partitioning on the cluster names in the second cluster information using a conditional function, based on the median GLUL gene expression values and the median ALB gene expression values, specifically includes: for any cell cluster name X' in the second cluster information, if the corresponding median GLUL gene expression value is equal to the maximum median GLUL gene expression value and the corresponding median ALB gene expression value is not equal to the maximum median ALB gene expression value, then cell cluster name X' corresponds to the CV region; for any cell cluster name X' in the second cluster information, if the corresponding median ALB gene expression value is equal to the maximum median ALB gene expression value and the corresponding median GLUL gene expression value is not equal to the maximum median GLUL gene expression value, then cell cluster name X' corresponds to the PV region; and for any cell cluster name X' in the second cluster information, after the CV and PV regions have been determined, the last remaining cluster name corresponds to the IZ region.
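
The conditional function of claim 5 differs from the four-region case in that no percentage condition and no Unknown region are involved. A minimal sketch, assuming a hypothetical function name `label_second_cluster_information` and an input that maps each cluster name to its median GLUL and median ALB values:

```python
def label_second_cluster_information(medians):
    """Assign CV / PV / IZ to each cluster name.

    medians: {cluster name: (median GLUL, median ALB)} (assumed layout).
    """
    max_glul = max(glul for glul, _ in medians.values())
    max_alb = max(alb for _, alb in medians.values())

    zones = {}
    for name, (glul, alb) in medians.items():
        if glul == max_glul and alb != max_alb:
            zones[name] = "CV"   # highest median GLUL
        elif alb == max_alb and glul != max_glul:
            zones[name] = "PV"   # highest median ALB
        else:
            zones[name] = "IZ"   # whatever remains
    return zones


# Illustrative values only.
zones3 = label_second_cluster_information(
    {"1": (9.0, 3.0), "2": (2.0, 8.0), "3": (4.0, 5.0)}
)
```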

6. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the steps of the liver lobule partitioning identification method based on spatial transcriptome sequencing data as described in any one of claims 1-5.

7. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the steps of the liver lobule partitioning identification method based on spatial transcriptome sequencing data as described in any one of claims 1-5.

8. A computer program product, comprising a computer program, characterized in that, when executed by a processor, the computer program implements the steps of the liver lobule partitioning identification method based on spatial transcriptome sequencing data as described in any one of claims 1-5.