Intelligent agent and visual analysis method and system for spatial transcriptome data, medium and product
By employing intelligent agent and visualization analysis methods, combined with a large language model and WebGL rendering engine, the problems of complex parameter configuration, insufficient rendering performance, and difficulty in integrating multi-source data in spatial transcriptome data analysis are solved, achieving efficient and secure localized data processing, which is applicable to the biomedical field.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HEFEI BEIMING SPACE BIOTECHNOLOGY CO LTD
- Filing Date
- 2026-03-06
- Publication Date
- 2026-06-12
AI Technical Summary
Existing spatial transcriptome data analysis software suffers from problems such as complex parameter configuration, insufficient data rendering performance, difficulty in integrating multi-source heterogeneous data, and unmet data privacy and localization requirements.
By employing intelligent agent and visualization analysis methods, natural language instructions are parsed through a large language model to generate structured function call sequences. Combined with the WebGL rendering engine and disk mapping technology, high-performance and secure localized data analysis is achieved.
It significantly lowers the barrier to entry, enhances big data processing capabilities, improves the interactive analysis experience, and ensures data security, making it suitable for the data compliance requirements of the biomedical field.
Figure CN122201427A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of bioinformatics and artificial intelligence, and in particular to an intelligent agent and visualization analysis method, system, medium and product for spatial transcriptome data.
Background Technology
[0002] Biomedical research is undergoing a historic leap from "single-cell resolution" to "spatial panoramic resolution." While traditional single-cell transcriptome sequencing (scRNA-seq) reveals cellular heterogeneity within tissues and organs, it struggles to analyze intercellular communication networks and the structural features of the tissue microenvironment due to the loss of cellular spatial location information during tissue dissociation. Spatial transcriptomics (ST) technology fills this gap.
[0003] Currently, mainstream spatial transcriptomics technologies are mainly divided into two categories: next-generation sequencing (NGS) based technologies and imaging-based technologies.
[0004] NGS-based technologies, such as 10x Genomics' Visium platform, use microarrays with spatial barcodes to capture mRNA from tissue slices. This technology offers high throughput, but early resolution was limited by the size of the capture spot (55 micrometers in diameter), typically containing several to dozens of cells per spot.
[0005] Imaging and in-situ capture technologies, such as 10x Genomics' Xenium, NanoString's CosMx, BGI's Stereo-seq, and MERFISH, achieve ultra-high resolution at the sub-cellular and even single-molecule levels through in-situ hybridization or nanoball arrays. For example, Stereo-seq uses DNA nanoball (DNB) arrays with resolutions reaching 500 nanometers or 715 nanometers, and the amount of data (bins) generated by a single chip often reaches tens of millions or even hundreds of millions.
[0006] Despite rapid advancements in upstream data production technologies, downstream data analysis software has lagged behind, becoming a major bottleneck limiting the widespread application of spatial transcriptomics technology. Existing technologies suffer from several key challenges:
[0007] Pain Point 1: "Parameter anxiety" and high learning costs. The current analysis ecosystem is dominated by open-source toolkits in the Python environment (such as Scanpy, Squidpy, and Cell2location). These tools are powerful, but they rely entirely on command-line interface (CLI) operation. Users not only need to be proficient in Python or R programming, but also must have a deep understanding of statistics to comprehend and set complex algorithm parameters.
[0008] For example, when using the Leiden algorithm for clustering, the choice of the resolution parameter directly determines the granularity of cell subpopulation division; when using Cell2location for deconvolution, the hyperparameter detection_alpha needs to be adjusted according to the sensitivity of the sequencing technology. For most clinicians and wet-lab biologists, these parameters are "black boxes," and setting them blindly often leads to erroneous biological conclusions. Although commercial software such as BioTuring BBrowser exists, its level of intelligence is limited: it mainly relies on preset fixed procedures or simple natural language queries (such as "show CD4 expression levels"), and it cannot automatically recommend an analysis strategy based on the specific distribution characteristics of the data (such as sparsity and noise level).
[0009] Pain Point 2: Rendering performance bottleneck for large-scale data. With the widespread adoption of technologies such as Xenium and Stereo-seq, the data volume of a single sample has grown exponentially. Traditional bioinformatics visualization tools (such as plotting systems based on Matplotlib or Seurat) primarily generate static plots. When faced with point cloud data exceeding millions of points, these tools are extremely slow to render and cannot support smooth zooming, panning, and lasso selection.
[0010] While some existing software attempts to incorporate GPU acceleration, they often have strict requirements for graphics card drivers and lack efficient solutions for web applications. Achieving real-time, smooth interaction with tens of millions of spatial data points on a typical local workstation is a pressing technical challenge that needs to be addressed.
[0011] Pain Point 3: The "island effect" of multi-source heterogeneous data. Sequencing platforms from different vendors generate data in various formats (e.g., .h5 for 10x, .gef or .gem for BGI, and the more common .h5ad). Researchers often need to use different software to process data from different platforms, and lack a unified, standardized analysis platform to integrate this heterogeneous data. Furthermore, existing commercial software (such as Loupe Browser) is often a closed ecosystem, supporting only its own format and incompatible with third-party open-source algorithms (such as SPACEL, Tangram, etc.).
[0012] Pain Point 4: Data privacy and localization needs. Spatial transcriptome data often involves valuable clinical samples (such as tumor slides) containing sensitive human genetic information. While cloud-based analysis platforms offer powerful computing capabilities, they also pose compliance risks of data leakage or cross-border data export. Researchers urgently need an analysis system that can run locally and offline and that still offers high-performance computing capabilities.
[0013] In conclusion, developing an automated analysis system that integrates intelligent decision-making capabilities of large language models, supports high-performance WebGL rendering, is compatible with multi-source heterogeneous data, and ensures local security has extremely high scientific research value and market demand.
Summary of the Invention
[0014] Based on the technical problems existing in the background technology, this invention proposes a method, system, medium and product for intelligent agent and visualization analysis of spatial transcriptome data.
[0015] The intelligent agent and visualization analysis method for spatial transcriptome data proposed in this invention includes:
S1. Receive raw spatial transcriptome data from multiple heterogeneous sources and parse it to obtain metadata features;
S2. Input the user's natural language analysis instructions and the metadata features into the pre-trained large language model, and generate a structured function call sequence containing algorithm selection and hyperparameter configuration based on the preset bioinformatics tool description library;
S3. Parse the structured function call sequence, automatically arrange and execute the corresponding bioinformatics analysis algorithm in the asynchronous thread pool, and feed back the execution progress through the signal slot mechanism;
S4. Using a WebGL-based rendering engine, map the output of the analysis algorithm into a visual graphic, support real-time interactive operation by the user through a two-way communication channel between the browser and the local backend, and dynamically update the analysis context based on the interactive feedback;
S5. Update the analysis intent based on user feedback, and iterate to step S2 to regenerate the function call sequence, or directly proceed to step S3 to adjust the parameters and re-execute the analysis algorithm without reloading the original data.
[0016] Furthermore, the metadata features include at least cell number, sequencing depth distribution, gene detection number distribution, and spatial sparsity.
[0017] Furthermore, the parsing yields metadata features, specifically: The spatial coordinate matrix and gene expression matrix in the original spatial transcriptome data are analyzed using a unified object model, and the full loading mode or disk mapping mode is selected according to the data scale.
[0018] Furthermore, the process of parsing the spatial coordinate matrix and gene expression matrix in the original spatial transcriptome data using a unified object model, and selecting either a full loading mode or a disk mapping mode based on the data scale, specifically involves: If the raw spatial transcriptome data is identified as image-type spatial data, the coordinate systems of the high-resolution tissue image and the gene expression matrix are automatically aligned. If the size of the original spatial transcriptome data exceeds the preset memory threshold, disk mapping is enabled to keep the gene expression matrix in disk storage format, and data slices are read on demand only during computation or rendering using memory mapping technology. The AES-GCM algorithm is used to encrypt and store items marked as sensitive data in the original spatial transcriptome data, and decryption is performed in real time when the items are loaded into memory.
[0019] Furthermore, the process of generating the structured function call sequence is as follows: Construct system prompts, which include an analysis context consisting of historical interaction records and visualization status, a statistically processed metadata feature summary, and function definitions from a pre-defined bioinformatics tool description library; The system prompts and the user's natural language analysis commands are input into a pre-trained large language model; The system receives and parses JSON format instructions output by the large language model, extracts a structured function call sequence from it, and the structured function call sequence contains one or more algorithm function names to be called and corresponding parameter key-value pairs, wherein the parameter key-value pairs are adaptively filled according to the metadata features; Verify whether the type and value range of the parameter key-value pair meet the preset requirements of the corresponding algorithm function. If there are invalid parameters, trigger the automatic parameter correction logic or generate a prompt message requesting user confirmation.
[0020] Furthermore, the step of using a high-performance WebGL-based rendering engine to map the output of the analysis algorithm into visual graphics specifically involves: The parsed spatial coordinate matrix and gene expression matrix are packaged by the backend processing module into structured data suitable for graphics rendering and transmitted to the frontend rendering module through the communication channel. The front-end rendering module calls a WebGL-based rendering engine to create geometric objects based on the structured data and configure shader programs to achieve visual encoding of gene expression levels and high-performance rendering. Establish spatial indexes for geometric objects in the rendered scene to respond to user interactions with region selection on the graphics, quickly query and highlight the data points corresponding to the selected region.
[0021] Furthermore, the analysis algorithms include a cell type deconvolution algorithm based on Bayesian statistics, a spatial domain identification algorithm based on graph neural networks, and a cell communication inference algorithm based on optimal transport theory.
[0022] A computer system includes a memory, a processor, and a computer program stored in the memory, characterized in that the processor executes the computer program to implement the method described above.
[0023] A computer-readable storage medium storing a plurality of classification programs, the plurality of classification programs being invoked by a processor to execute the method described above.
[0024] A computer program product includes a computer program that is executed by a processor to implement the method described above.
[0025] Those skilled in the art will understand that all or part of the steps of the above method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disk, or optical disk.
[0026] The advantages of the intelligent agent and visualization analysis method, system, medium and product for spatial transcriptome data provided by this invention are as follows:
(1) Significantly lowering the threshold for use: Through the "intelligent agent + function call" mechanism, the complex bioinformatics parameter configuration process is automated and intelligent, enabling researchers without a coding background to perform expert-level data analysis.
(2) Enhancing big data processing capabilities: The singleton data transfer center (hereinafter referred to as DataHub) reads large-scale data on demand through disk mapping. The combination of lazy loading and the WebGL rendering engine breaks through the dual bottlenecks of memory and video memory, realizing smooth analysis and visualization support for ultra-large-scale data such as Stereo-seq.
(3) Enhancing the interactive analysis experience: Real-time GPU rendering replaces the traditional static image viewing method, greatly improving the efficiency of data exploration and the ability to discover scientific problems.
(4) Ensuring data security and privacy: The fully localized deployment architecture and AES encryption mechanism meet the stringent requirements for data compliance in the biomedical field.
Attached Figure Description
[0027] Figure 1 is a schematic diagram of the overall hardware and software architecture of the present invention.
Figure 2 is a flowchart illustrating the logic of intent recognition, parameter reasoning, and function calls for the intelligent agent control module.
Figure 3 is a diagram illustrating the data loading state transition and encryption / decryption mechanism of DataHub.
Figure 4 is a schematic diagram illustrating the communication mechanism between the high-performance visualization rendering pipeline based on WebGL and QWebChannel.
Figure 5 shows the workflow and processing results for analyzing Xenium mouse brain data using spatial transcriptomics technology.
Figure 6 is a schematic diagram illustrating the interactive visualization interface and the rendering results of Xenium mouse brain data using spatial transcriptomics technology.
Detailed Implementation
[0028] The technical solution of the present invention will now be described in detail through specific embodiments. Many specific details are set forth in the following description to provide a thorough understanding of the invention. However, the present invention can be implemented in many other ways different from those described herein, and those skilled in the art can make similar modifications without departing from the spirit of the invention. Therefore, the present invention is not limited to the specific embodiments disclosed below.
[0029] As shown in Figures 1 to 6, the intelligent agent and visualization analysis method for spatial transcriptome data proposed in this invention includes:
S1. Receive multi-source heterogeneous spatial transcriptome raw data and parse it to obtain metadata features; when the size of the raw data exceeds a preset memory threshold, enable disk mapping mode to keep the gene expression matrix in disk storage format;
S2. Input the user's natural language analysis instructions and the metadata features into the pre-trained large language model, and generate a structured function call sequence containing algorithm selection and hyperparameter configuration based on the preset bioinformatics tool description library;
S3. Parse the structured function call sequence, automatically arrange and execute the corresponding bioinformatics analysis algorithm in the asynchronous thread pool, and feed back the execution progress through the signal slot mechanism;
S4. Using a WebGL-based rendering engine, map the output of the analysis algorithm into a visual graphic, support real-time interactive operation by the user through a two-way communication channel between the browser and the local backend, and dynamically update the analysis context based on the interactive feedback;
S5. Update the analysis intent based on user feedback, and iterate to step S2 to regenerate the function call sequence, or directly proceed to step S3 to adjust the parameters and re-execute the analysis algorithm without reloading the original data.
[0030] This embodiment develops an automated analysis solution that integrates intelligent decision-making capabilities of large language models, supports high-performance WebGL rendering, is compatible with multi-source heterogeneous data, and ensures local security. It has extremely high scientific research value and market demand, and aims to solve the problems of complex parameter configuration, insufficient big data rendering performance, and difficulty in integrating multi-source data in existing technologies.
[0031] The advantages of this embodiment are: (1) Significantly lowering the barrier to entry: Through the "intelligent agent + function call" mechanism, the complex bioinformatics parameter configuration process is automated and intelligent, enabling researchers without a coding background to perform expert-level data analysis. (2) Enhancing big data processing capabilities: The combination of DataHub's disk mapping and WebGL rendering engine breaks through the dual bottlenecks of memory and video memory, achieving smooth support for ultra-large-scale data such as Stereo-seq. (3) Enhancing the interactive analysis experience: Real-time GPU rendering replaces the traditional static image viewing method, greatly improving the efficiency of data exploration and the ability to discover scientific problems. (4) Ensuring data security and privacy: The fully localized deployment architecture and AES encryption mechanism perfectly meet the stringent requirements for data compliance in the biomedical field.
[0032] The spatial multi-omics data analysis system corresponding to this spatial multi-omics data analysis method specifically includes: a data access and DataHub management module, an intelligent agent control module, an algorithm orchestration and execution module, and a high-performance visualization engine.
[0033] 1. Data access and DataHub management module. This module serves as the system's data foundation, employing a unified object model to encapsulate spatial transcriptome data from various sources (Visium, Xenium, Stereo-seq, etc.). The built-in DataHub subsystem provides the memory management strategy described below.
[0034] This module receives raw spatial transcriptome data from multiple heterogeneous sources, uses a unified object model to parse the spatial coordinate matrix and gene expression matrix in the raw data, and automatically selects between full loading mode and disk mapping mode based on the data scale. Specifically, as shown in Figure 3, the source platform of the input spatial transcriptome raw data is identified; supported sources include 10x Visium, 10x Xenium, BGI Stereo-seq, and the general H5AD format. The raw data is parsed into a standard structure containing obs (cell attributes), var (gene attributes), obsm (multidimensional matrix), and uns (unstructured data), and a spatial coordinate index is established. If the raw data is identified as image-type spatial data, the coordinate systems of the high-resolution tissue image and the gene expression matrix are automatically aligned. If the size of the raw data exceeds a preset memory threshold, disk mapping is enabled: the gene expression matrix stays in disk storage format, and data slices are read on demand only during computation or rendering using memory mapping technology. Items marked as sensitive data in the raw data are encrypted using the AES-GCM algorithm and decrypted in real time when loaded into memory.
[0035] Disk mapping mode: for ultra-large-scale data, the module uses the HDF5 file format to keep the main body of the data on disk and read it on demand through memory mapping technology, which significantly reduces the memory footprint and makes it possible to analyze massive amounts of data on ordinary PCs.
[0036] AES-GCM Encryption: Integrates the GCM mode of the Advanced Encryption Standard (AES), supporting local encrypted storage and authentication / decryption of high-value data (PremiumData), ensuring data security when stored at rest.
[0037] 2. Intelligent Agent Control Module (AI Agent Controller). This module introduces an "agentic workflow," which upgrades the large language model (LLM) from a simple "chatbot" to the "driver" of the system.
[0038] The intelligent agent control module is activated to automatically scan the unified object model and extract the metadata features of the data. The metadata features include at least the number of cells, the distribution of sequencing depth, the distribution of gene detections, and spatial sparsity.
[0039] Metadata awareness: The intelligent agent control module can automatically scan the raw spatial transcriptome data in DataHub and extract key statistical features such as total cell count, number of genes, sequencing depth (UMI counts), and mitochondrial ratio as metadata features.
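The metadata scan described above can be illustrated with a minimal, standard-library-only sketch. It is not the patented implementation: the function name `summarize_metadata`, the dictionary-per-cell input layout, the grid-occupancy definition of spatial sparsity, and the "MT-" prefix rule for mitochondrial genes are all assumptions made for illustration.

```python
import statistics
from typing import Dict, List, Tuple


def summarize_metadata(
    counts: List[Dict[str, int]],
    coords: List[Tuple[float, float]],
    grid_size: int = 10,
) -> dict:
    """Summarize a tiny expression table.

    counts[i] maps gene name -> UMI count for cell/spot i (a sparse row).
    coords[i] is the (x, y) position of the same cell/spot.
    """
    depth = [sum(row.values()) for row in counts]  # UMIs per cell
    genes_detected = [sum(1 for v in row.values() if v > 0) for row in counts]
    all_genes = {gene for row in counts for gene in row}

    # Assumed convention: mitochondrial genes start with "MT-" (case-insensitive).
    mito_umis = sum(
        v for row in counts for g, v in row.items() if g.upper().startswith("MT-")
    )
    total_umis = sum(depth)

    # Assumed sparsity definition: fraction of empty tiles in a
    # grid_size x grid_size grid laid over the bounding box of the coordinates.
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    occupied = set()
    for x, y in coords:
        gx = min(int((x - min_x) / span_x * grid_size), grid_size - 1)
        gy = min(int((y - min_y) / span_y * grid_size), grid_size - 1)
        occupied.add((gx, gy))

    return {
        "n_cells": len(counts),
        "n_genes": len(all_genes),
        "median_depth": statistics.median(depth),
        "median_genes_detected": statistics.median(genes_detected),
        "mito_ratio": mito_umis / total_umis if total_umis else 0.0,
        "spatial_sparsity": 1 - len(occupied) / grid_size**2,
    }
```

A compact summary of this kind, rather than the raw matrix, is what would be passed to the language model in step S2.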
[0040] Function call mechanism: The intelligent agent control module does not directly output natural language text, but instead outputs structured JSON instructions. For example, based on the "low sequencing depth" feature detected in the metadata, the intelligent agent control module will automatically construct instructions to call the dimensionality reduction algorithm and automatically lower the parameter n_comps (principal component count) to suppress noise.
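To make the shape of such a structured instruction concrete, the sketch below builds one JSON instruction for the low-sequencing-depth case. In the described system the large language model produces this instruction; here a simple rule stands in for the model. The function name `run_pca`, the `reason` field, and the depth thresholds (500 and 2000 UMIs) are hypothetical values chosen only for illustration.

```python
import json


def build_pca_instruction(median_umi: float, n_genes: int) -> str:
    """Return a JSON instruction that lowers n_comps for shallow data."""
    # Hypothetical thresholds: fewer principal components for noisier data.
    if median_umi < 500:
        n_comps = 15
    elif median_umi < 2000:
        n_comps = 30
    else:
        n_comps = 50

    # PCA cannot use more components than there are genes minus one.
    n_comps = min(n_comps, max(1, n_genes - 1))

    instruction = {
        "function": "run_pca",  # hypothetical tool name
        "arguments": {"n_comps": n_comps},
        "reason": f"median UMI per cell is {median_umi}",
    }
    return json.dumps(instruction)
```

The point of the structured form is that the backend can dispatch on `function` and `arguments` directly, without interpreting free text.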
[0041] That is, the user's natural language analysis instructions and the metadata features are input into a pre-trained large language model, and a structured function call sequence containing algorithm selection and hyperparameter configuration is generated based on a preset bioinformatics tool description library.
[0042] Specifically, as shown in Figure 2, the structured function call sequence is generated through steps (a1) to (a4): (a1) Construct system prompts, which include an analysis context consisting of historical interaction records and visualization status, a metadata feature summary after statistical processing, and function definitions in a preset bioinformatics tool description library.
[0043] (a2) Input the system prompts and the user’s natural language analysis instructions into the pre-trained large language model.
[0044] During the inference process of the large language model, the bioinformatics tool description library can map scattered and ambiguous user natural language commands (such as "find differentially expressed genes") into precise, executable function names, parameter structures, and legal value ranges. This is the foundation for the large language model to make reliable tool calls. Simultaneously, as part of the system prompts, the bioinformatics tool description library strictly limits the range of tools the large language model can "think" and "call," preventing invalid, fictitious, or unsupported function calls. Furthermore, the bioinformatics tool description library defines a unified application programming interface (API) for all available analysis algorithms. Regardless of whether the specific backend tool is Seurat, Scanpy, or a custom script, the exposed interface to the model and frontend is consistent. This achieves modularity, standardization, and scalability of the analysis workflow. Finally, the tool definitions in the bioinformatics tool description library can be associated with the parsed metadata features. This allows the large language model to intelligently select the most suitable tool from the bioinformatics tool description library based on the actual situation of the current data, and recommend or fill in reasonable values for parameters (such as resolution and gene number) that are appropriate for the current data scale, achieving preliminary automated parameter configuration.
[0045] (a3) Receive and parse the JSON format instructions output by the large language model, and extract the structured function call sequence from it. The structured function call sequence contains one or more algorithm function names to be called and corresponding parameter key-value pairs, wherein the parameter key-value pairs are adaptively filled according to the metadata features.
[0046] (a4) Verify whether the type and value range of the parameter key-value pair meet the preset requirements of the corresponding algorithm function. If there are invalid parameters, trigger the automatic parameter correction logic or generate a prompt message requesting user confirmation.
[0047] Through steps (a1) to (a4), the powerful semantic understanding and reasoning capabilities of the large language model are securely and effectively integrated with precise tools in the professional field, the objective characteristics of current data, and the status of historical operations. This greatly improves usability while ensuring the rigor and efficiency of the professional analysis process. This is something that traditional scripted or graphical button-based analysis platforms struggle to achieve.
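Steps (a1) to (a4) can be sketched as follows, under stated assumptions. The tool description library `TOOL_LIBRARY`, its schema, the tool names, and the parameter ranges are hypothetical; the call to the language model itself is omitted, and `parse_and_validate` simply receives the model's JSON text. Clamping to the allowed range stands in for the "automatic parameter correction logic", and the returned warnings stand in for the prompt requesting user confirmation.

```python
import json

# Hypothetical tool description library: names, types, and ranges are examples.
TOOL_LIBRARY = {
    "leiden_clustering": {
        "description": "Cluster cells with the Leiden algorithm.",
        "parameters": {
            "resolution": {"type": "float", "min": 0.1, "max": 5.0, "default": 1.0},
        },
    },
    "run_pca": {
        "description": "Reduce dimensionality with PCA.",
        "parameters": {
            "n_comps": {"type": "int", "min": 2, "max": 100, "default": 50},
        },
    },
}


def build_system_prompt(context: dict, metadata_summary: dict) -> str:
    """(a1) Combine analysis context, metadata summary, and tool definitions."""
    return "\n".join(
        [
            "Reply with JSON only: a list of objects with 'function' and 'arguments'.",
            "Analysis context: " + json.dumps(context),
            "Metadata summary: " + json.dumps(metadata_summary),
            "Available tools: " + json.dumps(TOOL_LIBRARY),
        ]
    )


def parse_and_validate(llm_output: str):
    """(a3) Parse the JSON call sequence, then (a4) check types and ranges."""
    validated, warnings = [], []
    for call in json.loads(llm_output):
        name = call.get("function")
        spec = TOOL_LIBRARY.get(name)
        if spec is None:
            # Functions outside the library are rejected, never executed.
            warnings.append(f"unknown function {name!r}: user confirmation needed")
            continue
        args = {}
        for pname, rule in spec["parameters"].items():
            value = call.get("arguments", {}).get(pname, rule["default"])
            caster = int if rule["type"] == "int" else float
            try:
                value = caster(value)
            except (TypeError, ValueError):
                warnings.append(f"{name}.{pname}: bad type, using default")
                value = rule["default"]
            clamped = min(max(value, rule["min"]), rule["max"])
            if clamped != value:
                warnings.append(f"{name}.{pname}: {value} clamped to {clamped}")
            args[pname] = clamped
        validated.append({"function": name, "arguments": args})
    return validated, warnings
```

Keeping validation in ordinary code, outside the model, means an out-of-range or invented call is caught deterministically regardless of what the model returns.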
[0048] 3. Algorithm orchestration and execution module. This module integrates spatial transcriptome analysis algorithms from the Python ecosystem (such as Squidpy, Cell2location, SPACEL, Tangram, etc.) for backend processing. It employs an asynchronous concurrency architecture based on the QThreadPool thread pool, separating time-consuming bioinformatics computation tasks (worker threads) from the main interface thread (GUI thread). Real-time progress feedback is delivered through the signal / slot mechanism, so the interface stays responsive during analysis and avoids the "freezing" during computation that is common in traditional bioinformatics software.
[0049] 4. High-performance visualization engine. This module is based on a Python + Web hybrid architecture: the Python backend handles data processing, while the frontend uses WebGL / Canvas technology for GPU-accelerated rendering. Specifically, it uses a WebGL-based rendering engine to map the output of the analysis algorithm into visual graphics, and supports real-time user interaction through a two-way communication channel between the browser and the local backend, dynamically updating the analysis context based on interactive feedback.
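The separation of worker threads from the interface thread can be sketched without Qt. The example below swaps in Python's standard `concurrent.futures.ThreadPoolExecutor` for QThreadPool and a `queue.Queue` for the signal / slot mechanism; it is a stand-in for the pattern, not the system's actual code, and `run_task` and `run_pipeline` are hypothetical names.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def run_task(name: str, n_steps: int, progress: queue.Queue) -> int:
    """Worker: do n_steps units of placeholder work and report each one."""
    total = 0
    for step in range(1, n_steps + 1):
        total += step  # placeholder for a real analysis step
        progress.put((name, step, n_steps))  # plays the role of a Qt signal
    return total


def run_pipeline(tasks: dict):
    """Main thread: submit tasks, then consume progress events as they arrive."""
    progress: queue.Queue = queue.Queue()
    events = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {
            name: pool.submit(run_task, name, n_steps, progress)
            for name, n_steps in tasks.items()
        }
        pending = set(futures.values())
        while pending:
            try:
                # A GUI would update a progress bar here instead of appending.
                events.append(progress.get(timeout=0.05))
            except queue.Empty:
                pass
            pending = {f for f in pending if not f.done()}
        results = {name: f.result() for name, f in futures.items()}
    # Collect any events emitted just before the last task finished.
    while True:
        try:
            events.append(progress.get_nowait())
        except queue.Empty:
            break
    return results, events
```

The main loop never blocks for longer than the short timeout, which is the property that keeps an interface responsive while the computation runs.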
[0050] Specifically, as shown in Figure 4, the parsed spatial coordinate matrix and gene expression matrix are packaged by the Python backend processing module into structured data suitable for graphics rendering, and transmitted to the frontend rendering module through a communication channel. The frontend rendering module calls a WebGL-based rendering engine to create geometric objects from the structured data and configures shader programs to achieve visual encoding of gene expression levels and high-performance rendering. Spatial indexes are established for the geometric objects in the rendering scene so that, when the user selects a region on the graphic, the data points in the selected region can be quickly queried and highlighted. In this way the output of the analysis algorithm is mapped into a visual graphic.
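The role of the spatial index in region selection can be illustrated with a uniform grid. The source does not say which index structure is used, and in the described system the index lives in the frontend; the Python class below, with the hypothetical name `GridIndex`, is only a sketch of the idea that a rectangle query inspects the overlapping grid cells instead of every point.

```python
from collections import defaultdict
from typing import List, Tuple


class GridIndex:
    """Uniform grid index over 2D points for fast rectangle queries."""

    def __init__(self, points: List[Tuple[float, float]], cell_size: float):
        self.points = points
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (kx, ky) -> list of point indices
        for i, (x, y) in enumerate(points):
            self.cells[self._key(x, y)].append(i)

    def _key(self, x: float, y: float) -> Tuple[int, int]:
        return (int(x // self.cell_size), int(y // self.cell_size))

    def query_rect(self, x_min, y_min, x_max, y_max) -> List[int]:
        """Return sorted indices of points inside the closed rectangle."""
        kx0, ky0 = self._key(x_min, y_min)
        kx1, ky1 = self._key(x_max, y_max)
        hits = []
        for kx in range(kx0, kx1 + 1):
            for ky in range(ky0, ky1 + 1):
                # Only candidate cells are scanned; edge cells need an exact check.
                for i in self.cells.get((kx, ky), ()):
                    x, y = self.points[i]
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(i)
        return sorted(hits)
```

A lasso selection could reuse the same index by querying the lasso's bounding rectangle first and then applying a point-in-polygon test to the candidates.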
[0051] Specifically, the rendering pipeline is optimized by using the THREE.js library to build a WebGL rendering pipeline and leveraging GPU parallel processing to accelerate rendering. Real-time color mapping and anti-aliasing of gene expression levels are achieved by dynamically adjusting the size and color of points using custom materials.
[0052] LOD (Level of Detail) technology: For point clouds with millions of points, a multi-level detail rendering strategy is adopted, and data slices are dynamically loaded according to the scaling level to optimize rendering performance.
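One simple way to realize a multi-level detail strategy is to subsample points more aggressively when zoomed out. The sketch below is an assumption-laden illustration rather than the system's actual policy: it assumes points are spread uniformly, that `zoom = 1.0` shows the whole sample, and that the hypothetical budget `max_points_on_screen` is the number of points the renderer can draw smoothly.

```python
import math
from typing import List


def lod_stride(n_points: int, zoom: float, max_points_on_screen: int = 200_000) -> int:
    """Return k such that drawing every k-th point stays within the budget."""
    # At zoom z the viewport covers 1 / z^2 of the sample area, so under the
    # uniform-density assumption about n_points / z^2 points are visible.
    visible = n_points / (zoom * zoom)
    return max(1, math.ceil(visible / max_points_on_screen))


def select_points(indices: List[int], stride: int) -> List[int]:
    """Keep every stride-th index; stride 1 keeps full detail."""
    return indices[::stride]
```

For example, 5 million points at `zoom = 1.0` give a stride of 25, while at `zoom = 10.0` the stride drops to 1 and every visible point is drawn.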
[0053] QWebChannel bidirectional communication: Establishes a mapping between Python objects and JavaScript objects, supporting millisecond-level linkage of multiple views such as spatial graphs and UMAP graphs.
[0054] Example 1: Overall system architecture and localized deployment environment.
(b1) Hardware and operating system environment. This system (SSO Studio) is designed as a desktop application that runs on a local workstation. Recommended hardware specifications are as follows.
Processor (CPU): Multi-core processor (such as Intel Core i7 / i9 or AMD Ryzen series); 8 cores or more are recommended to support QThreadPool's multi-threaded concurrent computing.
[0055] Memory (RAM): 32GB or higher is recommended. While DataHub supports disk mapping to save memory, ample memory can significantly improve speed when performing complex matrix operations (such as Bayesian inference for Cell2location).
[0056] Graphics Processing Unit (GPU): A dedicated graphics card that supports OpenGL 4.0 and above (such as the NVIDIA GeForce RTX series), with 6GB or more of video memory recommended for high-performance WebGL rendering.
[0057] Storage: It is recommended to use NVMe SSD solid-state drives to improve the read and write speed of large blocks of HDF5 files.
[0058] Operating System: Supports Windows 10 / 11 Professional.
[0059] (b2) Software technology stack. This system adopts a layered architecture design, mainly including:
GUI layer: Developed based on the PyQt5 framework. PyQt5 provides a rich set of native controls (such as menus, dialog boxes, and tree views), ensuring a desktop-level experience and stability for the application.
[0060] Visualization layer: Embedded QWebEngineView control, loading HTML5 / JavaScript-based front-end pages. The front-end uses the Three.js or Deck.gl library to call the WebGL interface.
[0061] Logical layer: Based on Python 3.9+ environment. It integrates core algorithm libraries such as Scanpy (standard library for single-cell analysis), Squidpy (spatial statistical analysis), Cell2location (spatial deconvolution), and SPACEL (spatial domain recognition).
[0062] Communication layer: QWebChannel is used to enable cross-language calls between Python objects and JavaScript objects.
[0063] Example 2: DataHub data management and security mechanism
(c1) Unified object model
DataHub is the system's data hub. Whether the user imports a 10x Visium spatial directory (containing the high-resolution tissue image tissue_hires_image.png and the coordinate file tissue_positions_list.csv) or a BGI Stereo-seq .gef file, DataHub parses the input and maps it to an in-memory AnnData object.
[0064] The fields of the AnnData object are used as follows:
adata.X: stores the gene expression matrix (sparse matrix format scipy.sparse.csr_matrix);
adata.obs: stores the metadata of cells / spots (such as barcode, UMI counts, cluster ID);
adata.var: stores the metadata of genes (such as Gene Name, Gene ID);
adata.obsm['spatial']: stores two-dimensional or three-dimensional spatial coordinates;
adata.uns: stores unstructured data (such as adjacency matrices and color maps).
[0065] (c2) Disk mapping mode and memory optimization
As shown in Figure 5, for ultra-large-scale data (e.g., 5 million cells and 30,000 genes) generated by Xenium or Stereo-seq, loading everything directly into memory would immediately exhaust RAM and cause an out-of-memory (OOM) error. The DataHub in this embodiment implements intelligent disk mapping, which relieves memory pressure through on-demand loading.
[0066] Judgment logic: During the data loading phase, the system checks the file size. If it exceeds a preset threshold (e.g., file size > 2GB or number of cells > 500,000), the disk mapping mode is automatically activated.
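As a non-limiting illustration, the judgment logic of paragraph [0066] might be sketched as follows. The names `LoadMode` and `choose_load_mode` are hypothetical, and the thresholds are simply the example values given above (2 GB is interpreted here as 2 × 1024³ bytes, which is an assumption).

```python
from enum import Enum

# Example thresholds from the text; a real deployment would presumably
# expose them as configurable presets.
FILE_SIZE_THRESHOLD_BYTES = 2 * 1024**3  # "2GB", read as 2 GiB (assumption)
CELL_COUNT_THRESHOLD = 500_000


class LoadMode(Enum):
    FULL = "full"                # load the whole matrix into RAM
    DISK_MAPPED = "disk_mapped"  # keep the matrix on disk, read on demand


def choose_load_mode(file_size_bytes: int, n_cells: int) -> LoadMode:
    """Return DISK_MAPPED if either threshold is exceeded, otherwise FULL."""
    if file_size_bytes > FILE_SIZE_THRESHOLD_BYTES or n_cells > CELL_COUNT_THRESHOLD:
        return LoadMode.DISK_MAPPED
    return LoadMode.FULL
```

Under these assumed thresholds, a 4,500-spot Visium sample of a few hundred megabytes would be loaded in full, while a 5-million-cell dataset would switch to disk mapping regardless of file size.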
[0067] Implementation mechanism: The file-backed feature of the h5py library is used so that adata.X refers to the HDF5 file on disk instead of holding the file's contents in memory.
[0068] Read on demand: When an algorithm (such as PCA) needs to compute, DataHub controls the iterator to read data in chunks (e.g., 10,000 rows at a time), and releases memory after calculating intermediate results.
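The on-demand reading described in paragraph [0068] might look like the following sketch. It uses only the Python standard library and computes per-gene means (the centering step that precedes PCA) one block at a time. The callable `read_rows` is a hypothetical stand-in for whatever reads a row range from the HDF5 file; here it is backed by a small in-memory list so that the example is runnable.

```python
from typing import Callable, Iterator, List, Sequence

Row = Sequence[float]
RowReader = Callable[[int, int], List[Row]]  # (start, stop) -> rows [start, stop)


def iter_chunks(read_rows: RowReader, n_rows: int,
                chunk_size: int = 10_000) -> Iterator[List[Row]]:
    """Yield consecutive row blocks so that only one block is held at a time."""
    for start in range(0, n_rows, chunk_size):
        yield read_rows(start, min(start + chunk_size, n_rows))


def column_means(read_rows: RowReader, n_rows: int, n_cols: int,
                 chunk_size: int = 10_000) -> List[float]:
    """Accumulate column sums block by block and return the column means."""
    totals = [0.0] * n_cols
    if n_rows == 0:
        return totals
    for block in iter_chunks(read_rows, n_rows, chunk_size):
        for row in block:
            for j, value in enumerate(row):
                totals[j] += value
        # The block is dropped on the next iteration, so its memory can be reclaimed.
    return [total / n_rows for total in totals]


# Stand-in for an on-disk matrix with 25 rows (cells) and 2 columns (genes).
_fake_disk = [[float(i), float(2 * i)] for i in range(25)]


def read_rows(start: int, stop: int) -> List[Row]:
    return _fake_disk[start:stop]


means = column_means(read_rows, n_rows=25, n_cols=2, chunk_size=10)
```

With a chunk size of 10 the 25 rows are read as blocks of 10, 10, and 5 rows; only the running totals persist between blocks.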
[0069] Visualization optimization: As shown in Figure 6, when a user views a specific area on the interface, DataHub calculates the corresponding index slice from the viewport bounding box, reads only the data for that area from disk, and renders it.
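One possible way to turn a viewport bounding box into a set of row indices is sketched below. The class `ViewportIndex` is hypothetical, and the sort-by-x strategy is an assumption chosen for brevity; the text does not specify how DataHub computes the slice.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple


class ViewportIndex:
    """Sort point ids by x once, so a viewport query can skip most points."""

    def __init__(self, coords: Sequence[Tuple[float, float]]) -> None:
        self._coords = list(coords)
        self._order = sorted(range(len(self._coords)),
                             key=lambda i: self._coords[i][0])
        self._xs = [self._coords[i][0] for i in self._order]

    def query(self, x_min: float, x_max: float,
              y_min: float, y_max: float) -> List[int]:
        """Return the sorted row indices of points inside the bounding box."""
        lo = bisect_left(self._xs, x_min)    # first point with x >= x_min
        hi = bisect_right(self._xs, x_max)   # one past the last point with x <= x_max
        hits = [i for i in self._order[lo:hi]
                if y_min <= self._coords[i][1] <= y_max]
        return sorted(hits)


index = ViewportIndex([(0.0, 0.0), (5.0, 5.0), (10.0, 10.0), (5.0, 20.0)])
visible_rows = index.query(4.0, 11.0, 0.0, 12.0)
```

The returned indices are sorted so that they can be used directly as an ordered row selection when reading the expression matrix from disk.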
[0070] (c3) Data security and encryption; To meet the confidentiality requirements of high-value data, DataHub has a built-in encryption module.
[0071] Encryption algorithm: The AES-GCM (Advanced Encryption Standard, Galois/Counter Mode) algorithm is used. It provides both confidentiality and data integrity verification.
[0072] Key management: The user's passphrase is salted and processed with the PBKDF2 algorithm to derive the decryption key.
[0073] Transparent decryption: When a user opens an encrypted project and enters the correct password, DataHub creates a virtual file handle stream in memory. Data is decrypted the moment it is read from the memory buffer and destroyed immediately after computation. Throughout the entire process, only ciphertext is stored on the disk, so physically copying the hard drive does not leak data.
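The key management step of paragraph [0072] might be sketched with the Python standard library as follows. The function name `derive_key`, the iteration count, the salt length, and the choice of SHA-256 are all assumptions, since the text names only PBKDF2. The AES-GCM encryption itself is omitted because it requires a third-party library.

```python
import hashlib
import os
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 200_000  # assumed work factor; not stated in the text
SALT_BYTES = 16              # assumed salt length
KEY_BYTES = 32               # 256-bit key, usable as an AES-256-GCM key


def derive_key(passphrase: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a key from a passphrase; a fresh random salt is made if none is given.

    The salt is not secret and would be stored next to the ciphertext so that
    the same key can be re-derived when the project is opened again.
    """
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                              salt, PBKDF2_ITERATIONS, dklen=KEY_BYTES)
    return key, salt


key, salt = derive_key("correct horse battery staple")
same_key, _ = derive_key("correct horse battery staple", salt)
other_key, _ = derive_key("wrong passphrase", salt)
```

In such a design the derived key itself is never stored. A wrong passphrase yields a different key, and AES-GCM then rejects the data because the authentication tag does not verify.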
[0074] Example 3: Automated analysis workflow based on intelligent agent control; This embodiment details how the intelligent agent achieves the transformation "from natural language to code execution." This is a key feature that distinguishes it from existing competitors (such as 10x Loupe, which only provides visualization, and BioTuring, which only provides retrieval).
[0075] (d1) Architecture of the intelligent agent control module
Context Manager: Maintains the current analysis state (loaded data, completed steps, and the user's command history).
[0076] LLM Interface (large model interface): Adapts to the OpenAI API (GPT-4) or to locally deployed DeepSeek / Llama 3 models.
[0077] Executor: Responsible for mapping JSON instructions to Python function calls.
[0078] (d2) Metadata-driven inference
The following concrete process shows how the agent automatically configures a clustering task.
Step 1: User instruction. The user types in the dialog box: "Help me cluster this sample to see how many cell types there are. It is a highly heterogeneous tumor sample."
Step 2: Metadata extraction. The agent scans DataHub and extracts the following features: n_obs (number of cells) = 4,500 (Visium data); mean_counts (average sequencing depth) = 15,000 (high); data_source = "10x Visium".
Step 3: System prompt construction. The system builds the following prompt and sends it to the LLM:
System: You are an expert bioinformatician using Scanpy.
User Context: The user wants to perform clustering. The sample is a highly heterogeneous tumor.
Data Metadata: 4500 spots, high sequencing depth (Visium).
Available Tools:
  run_pca(n_comps: int)
  run_neighbors(n_neighbors: int)
  run_leiden(resolution: float)
Task: Generate the function call sequence.
Step 4: LLM inference and JSON generation. The LLM reasons from biological knowledge as follows.
Reasoning 1: Visium data typically has fewer spots, but each spot contains multiple cells, and the user mentioned a "highly heterogeneous tumor," implying that a higher resolution is needed to distinguish subtle subpopulations.
[0079] Reasoning 2: For Visium data the clustering resolution is usually set to 0.5-0.8, but given the "high heterogeneity" it is recommended to raise it to 1.0.
[0080] Reasoning 3: High sequencing depth allows for the use of more principal components (PCs) (e.g., 30-50) to retain information.
[0081] The LLM outputs the following JSON:
[
  { "function": "run_pca", "parameters": {"n_comps": 50} },
  { "function": "run_neighbors", "parameters": {"n_neighbors": 15} },
  { "function": "run_leiden", "parameters": {"resolution": 1.0} }
]
Step 5: Automated execution. The Executor parses the JSON and executes `sc.tl.pca`, `sc.pp.neighbors`, and `sc.tl.leiden` in sequence. Before each step, the system automatically checks the validity of the parameters. After execution, the clustering result (`adata.obs['leiden']`) is written to DataHub, and the front end is notified to refresh the coloring.
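Step 5 might be implemented along the lines of the sketch below, which maps each JSON instruction to a registered Python function and validates the parameters before every call. The registry layout, the helper names, and the numeric bounds are illustrative assumptions. The tools are stubs that only record the call, standing in for wrappers around the Scanpy functions named above.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# parameter name -> (expected type, minimum, maximum); bounds are assumptions
Schema = Dict[str, Tuple[type, float, float]]
Call = Tuple[str, Dict[str, Any]]
Registry = Dict[str, Tuple[Callable[..., None], Schema]]


def make_registry(log: List[Call]) -> Registry:
    """Build stub tools that record each call; real tools would wrap Scanpy."""
    def stub(name: str) -> Callable[..., None]:
        def run(**params: Any) -> None:
            log.append((name, params))
        return run

    return {
        "run_pca": (stub("run_pca"), {"n_comps": (int, 1, 500)}),
        "run_neighbors": (stub("run_neighbors"), {"n_neighbors": (int, 2, 200)}),
        "run_leiden": (stub("run_leiden"), {"resolution": (float, 0.01, 10.0)}),
    }


def validate(name: str, params: Dict[str, Any], schema: Schema) -> None:
    """Check the name, type, and value range of every parameter."""
    for key, value in params.items():
        if key not in schema:
            raise ValueError(f"{name}: unknown parameter {key!r}")
        expected, low, high = schema[key]
        allowed = (int, float) if expected is float else (expected,)
        if isinstance(value, bool) or not isinstance(value, allowed):
            raise TypeError(f"{name}: {key} must be of type {expected.__name__}")
        if not low <= value <= high:
            raise ValueError(f"{name}: {key}={value} is outside [{low}, {high}]")


def execute(plan_json: str, registry: Registry) -> None:
    """Run the steps in order, validating each one just before it is called."""
    for step in json.loads(plan_json):
        name, params = step["function"], step.get("parameters", {})
        if name not in registry:
            raise KeyError(f"unknown tool: {name}")
        func, schema = registry[name]
        validate(name, params, schema)
        func(**params)


PLAN = """[
  {"function": "run_pca", "parameters": {"n_comps": 50}},
  {"function": "run_neighbors", "parameters": {"n_neighbors": 15}},
  {"function": "run_leiden", "parameters": {"resolution": 1.0}}
]"""

calls: List[Call] = []
execute(PLAN, make_registry(calls))
```

Restricting execution to a fixed registry means the LLM can only select among known functions. An unknown function name or an out-of-range value raises an error instead of reaching the analysis code.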
[0082] (d3) Complex scenarios: Spatial deconvolution; For more complex tasks, such as cell type mapping using Cell2location, the advantages of Agents are more pronounced.
[0083] Cell2location has two key hyperparameters: N_cells_per_location (the expected number of cells per spot) and detection_alpha (detection sensitivity).
[0084] These hyperparameters are difficult for ordinary users to configure. The intelligent agent control module automatically sets the default N_cells_per_location according to the data source (e.g., 30 for Visium and 1 for Stereo-seq bin50), and automatically calculates and adjusts detection_alpha from the sequencing depth distribution of the single-cell reference set (scRNA-seq reference) to ensure model convergence. This "codified expert experience" is the core technical barrier of this embodiment.
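The data-source defaults mentioned in paragraph [0084] could be encoded as a simple lookup table, as in the sketch below. Only the two values given in the text are included. The dictionary keys and the function name are hypothetical, and the calculation of detection_alpha is not shown because the text does not specify it.

```python
from typing import Dict

# Expected number of cells per spot, using the two examples given in the text.
N_CELLS_PER_LOCATION_DEFAULTS: Dict[str, int] = {
    "10x Visium": 30,
    "Stereo-seq bin50": 1,
}


def default_n_cells_per_location(data_source: str) -> int:
    """Return the default for a known data source, or fail loudly otherwise."""
    try:
        return N_CELLS_PER_LOCATION_DEFAULTS[data_source]
    except KeyError:
        raise ValueError(
            f"no default for data source {data_source!r}; ask the user"
        ) from None
```

Raising an error for an unknown data source, rather than guessing, leaves room for the agent to ask the user for confirmation.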
[0085] Example 4: High-performance WebGL interactive visualization engine
(e1) Rendering pipeline
To achieve real-time interaction (60 FPS) with point clouds of millions of points, this system bypasses traditional Matplotlib plotting and works with the GPU directly.
[0086] Data transmission: The Python backend packages the coordinates (x, y, z) and color values (r, g, b, a) into a Float32Array and transmits it to the frontend via QWebChannel.
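On the Python side, the packing step of paragraph [0086] might be sketched as follows. The interleaved layout of seven floats per point and the function name `pack_points` are assumptions. How the bytes are carried across QWebChannel is not specified in the text and is not modeled here.

```python
from array import array
from typing import Iterable, Sequence

FLOATS_PER_POINT = 7  # x, y, z, r, g, b, a (assumed interleaved layout)


def pack_points(points: Iterable[Sequence[float]]) -> bytes:
    """Pack (x, y, z, r, g, b, a) records into a contiguous float32 buffer.

    array('f') stores 4-byte floats in native byte order, which matches what a
    JavaScript Float32Array expects on the same (typically little-endian) machine.
    """
    buffer = array("f")
    for point in points:
        if len(point) != FLOATS_PER_POINT:
            raise ValueError(f"expected {FLOATS_PER_POINT} values, got {len(point)}")
        buffer.extend(point)
    return buffer.tobytes()


payload = pack_points([
    (1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 1.0),  # red, opaque
    (3.0, 4.0, 0.0, 0.0, 1.0, 0.0, 0.5),  # green, half transparent
])
```

On the front end such a buffer could be viewed as a Float32Array without per-point parsing, which is the reason for using a flat binary layout instead of nested lists.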
[0087] Three.js renderer: A WebGL renderer is created on the front end with the Three.js library. Points are drawn with BufferGeometry and a custom ShaderMaterial through Three.js's built-in rendering pipeline, specifically as follows: the point size is adjusted dynamically according to the viewing distance; anti-aliasing smooths the edges of the displayed points; the color mapping of the point cloud is generated dynamically from the attributes of the data points and supports both discrete and continuous modes.
[0088] (e2) Spatial Indexing & Lasso
When a user draws an irregular closed curve (lasso) or a rectangular selection box with the mouse to select an area, the system must quickly determine which of the millions of points lie inside the selection.
[0089] Traditional method: Traverse all points and apply the ray casting test to each one to decide whether it lies inside the curve. This involves a huge amount of computation and can make the interface lag.
[0090] The method in this embodiment: Spatial index optimization: A quadtree spatial index is built when the data is loaded, which significantly improves the response speed of selection operations.
[0091] Selection processing: The index quickly filters out a candidate point set, and the ray casting test is performed only on those candidates, which cuts the amount of computation and reduces the response time from seconds to milliseconds.
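The two-stage selection of paragraphs [0090] and [0091] might be sketched as follows: a quadtree returns the points inside the lasso's bounding box, and the ray casting test is applied only to those candidates. The class and function names, the node capacity, and the depth limit are assumptions. All points are assumed to lie inside the root bounding box.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max


def point_in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: toggle on each polygon edge crossed by a ray towards +x."""
    x, y = p
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        (xi, yi), (xj, yj) = polygon[i], polygon[j]
        if (yi > y) != (yj > y):  # the edge spans the ray's y level
            if x < xi + (y - yi) * (xj - xi) / (yj - yi):
                inside = not inside
        j = i
    return inside


class QuadTree:
    def __init__(self, box: Box, capacity: int = 8,
                 depth: int = 0, max_depth: int = 16) -> None:
        self.box, self.capacity = box, capacity
        self.depth, self.max_depth = depth, max_depth
        self.items: List[Tuple[int, Point]] = []
        self.children: List["QuadTree"] = []

    def _child_for(self, p: Point) -> "QuadTree":
        x0, y0, x1, y1 = self.box
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if p[0] >= mx else 0) + (2 if p[1] >= my else 0)]

    def insert(self, idx: int, p: Point) -> None:
        if self.children:
            self._child_for(p).insert(idx, p)
            return
        self.items.append((idx, p))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            x0, y0, x1, y1 = self.box
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                         (x0, my, mx, y1), (mx, my, x1, y1)]
            self.children = [QuadTree(q, self.capacity, self.depth + 1,
                                      self.max_depth) for q in quadrants]
            items, self.items = self.items, []
            for item_idx, item_p in items:  # push stored points down one level
                self._child_for(item_p).insert(item_idx, item_p)

    def query(self, box: Box, out: List[Tuple[int, Point]]) -> None:
        x0, y0, x1, y1 = self.box
        qx0, qy0, qx1, qy1 = box
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return  # no overlap: prune this whole subtree
        out.extend((i, p) for i, p in self.items
                   if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1)
        for child in self.children:
            child.query(box, out)


def lasso_select(tree: QuadTree, polygon: Sequence[Point]) -> List[int]:
    xs, ys = [p[0] for p in polygon], [p[1] for p in polygon]
    candidates: List[Tuple[int, Point]] = []
    tree.query((min(xs), min(ys), max(xs), max(ys)), candidates)
    return sorted(i for i, p in candidates if point_in_polygon(p, polygon))


points = [(float(x), float(y)) for y in range(10) for x in range(10)]
tree = QuadTree((0.0, 0.0, 9.0, 9.0))
for idx, pt in enumerate(points):
    tree.insert(idx, pt)
lasso = [(1.5, 1.5), (3.5, 1.5), (3.5, 3.5), (1.5, 3.5)]
selected = lasso_select(tree, lasso)
```

The quadtree query visits only nodes whose boxes overlap the lasso's bounding box, so the exact polygon test runs on a small fraction of the points when the selection covers a small part of the view.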
[0092] Example 5: Asynchronous task scheduling mechanism
To prevent the UI from freezing while time-consuming bioinformatics algorithms run (such as training deep learning models), this system uses an asynchronous architecture based on QThreadPool.
[0093] Main Thread: Responsible only for UI rendering, event handling, and simple logic checks.
[0094] Worker Threads: All computational tasks involving Scanpy or Squidpy are encapsulated as QRunnable objects and submitted to the thread pool.
[0095] Communication mechanism: The Worker object defines PyQt signals: signals.progress (progress), signals.finished (complete), and signals.error (error).
[0096] During algorithm execution (e.g., Epoch iteration of Cell2location), the Worker emits a progress signal through a callback function.
[0097] The slot function in the main thread receives the signal and updates the progress bar at the bottom of the screen.
[0098] Resource Management: When the user clicks "Cancel", the main thread sets a flag. The worker thread checks the flag in the next iteration and exits safely, releasing the GPU memory.
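The progress and cancellation pattern of paragraphs [0095] to [0098] is sketched below using only the Python standard library: `ThreadPoolExecutor` stands in for QThreadPool, a plain callback stands in for the PyQt progress signal, and `threading.Event` is the cancel flag. The class name `CancellableTask` is hypothetical, and GPU memory release is not modeled.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class CancellableTask:
    """A long-running job that reports progress and checks a cancel flag."""

    def __init__(self, n_epochs: int,
                 on_progress: Callable[[int, int], None]) -> None:
        self.n_epochs = n_epochs
        self.on_progress = on_progress
        self._cancel = threading.Event()

    def cancel(self) -> None:
        """Called from the UI side; the worker notices at its next iteration."""
        self._cancel.set()

    def run(self) -> str:
        for epoch in range(1, self.n_epochs + 1):
            if self._cancel.is_set():
                return "cancelled"  # exit cleanly between iterations
            # ... one epoch of real work would run here ...
            self.on_progress(epoch, self.n_epochs)
        return "finished"


progress: List[int] = []


def on_progress(done: int, total: int) -> None:
    progress.append(done)
    if done == 3:
        task.cancel()  # simulates the user clicking "Cancel" after epoch 3


task = CancellableTask(n_epochs=10, on_progress=on_progress)
with ThreadPoolExecutor(max_workers=2) as pool:
    outcome = pool.submit(task.run).result()
```

In this stand-in the callback runs on the worker thread. In the PyQt design described above, the signal and slot mechanism would deliver the progress update to the main thread instead, which is what keeps UI updates thread-safe.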
[0099] This embodiment successfully integrates the cognitive capabilities of large language models with the rendering capabilities of high-performance graphics engines into the field of spatial transcriptomics analysis through innovative system architecture design. The DataHub design solves the fundamental challenges of data scale and heterogeneity, while the introduction of the intelligent agent control module completely transforms human-computer interaction, making complex bioinformatics analysis readily accessible. This not only significantly improves research efficiency but also lays a solid technical foundation for the widespread application of spatial transcriptomics technology in clinical pathology diagnosis.
[0100] Based on the above description of the embodiments, those skilled in the art will understand that the intelligent agent and visualization analysis method, system, medium, and product for spatial transcriptome data described in this embodiment can be implemented in pure software or deployed and run on general-purpose or dedicated computing hardware platforms. Based on this essence, the technical solution of this embodiment can be specifically implemented in the form of a software product containing program instructions. This software product can be stored on various non-volatile storage media or directly deployed as a local or cloud service. The program instructions are used to cause computer devices with processing capabilities—including but not limited to personal computers, server clusters, mobile terminals, or other network devices—to execute the steps described in this embodiment.
[0101] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. An intelligent agent and visualization analysis method for spatial transcriptome data, characterized in that it comprises:
S1. Receive raw spatial transcriptome data from multiple heterogeneous sources and parse it to obtain metadata features;
S2. Input the user's natural language analysis instructions and the metadata features into the pre-trained large language model, and generate a structured function call sequence containing algorithm selection and hyperparameter configuration based on the preset bioinformatics tool description library;
S3. Parse the structured function call sequence, automatically arrange and execute the corresponding bioinformatics analysis algorithm in the asynchronous thread pool, and feed back the execution progress through the signal slot mechanism;
S4. Using a WebGL-based rendering engine, map the output of the analysis algorithm into a visual graphic, and support real-time interactive operation by the user through a two-way communication channel between the browser and the local backend, the analysis context being dynamically updated based on the interactive feedback;
S5. Update the analysis intent based on user feedback, and either iterate to step S2 to regenerate the function call sequence or proceed directly to step S3 to adjust parameters and re-execute the analysis algorithm without reloading the original data.
2. The method according to claim 1, characterized in that the metadata features include at least cell number, sequencing depth distribution, gene detection number distribution, and spatial sparsity.
3. The method according to claim 1, characterized in that the metadata features obtained from the parsing are specifically as follows: the spatial coordinate matrix and gene expression matrix in the original spatial transcriptome data are analyzed using a unified object model, and the full loading mode or disk mapping mode is selected according to the data scale.
4. The method according to claim 3, characterized in that the process of parsing the spatial coordinate matrix and gene expression matrix in the raw spatial transcriptome data using a unified object model, and selecting either a full loading mode or a disk mapping mode based on the data size, is as follows: if the raw spatial transcriptome data is identified as image-type spatial data, the coordinate system of the high-resolution tissue image and the gene expression matrix will be automatically aligned; if the size of the original spatial transcriptome data exceeds the preset memory threshold, disk mapping is enabled to keep the gene expression matrix in disk storage format and read the data on demand only during computation or rendering using memory mapping technology; the AES-GCM algorithm is used to encrypt and store items marked as sensitive data in the original spatial transcriptome data, and decryption is performed in memory in real time when loading.
5. The method according to claim 1, characterized in that the process of generating the structured function call sequence is as follows: construct system prompts, which include an analysis context consisting of historical interaction records and visualization status, a statistically processed metadata feature summary, and function definitions from a pre-defined bioinformatics tool description library; input the system prompts and the user's natural language analysis commands into a pre-trained large language model; receive and parse the JSON format instructions output by the large language model and extract a structured function call sequence from them, the structured function call sequence containing one or more algorithm function names to be called and corresponding parameter key-value pairs, wherein the parameter key-value pairs are adaptively filled according to the metadata features; verify whether the type and value range of the parameter key-value pairs meet the preset requirements of the corresponding algorithm function, and if there are invalid parameters, trigger the automatic parameter correction logic or generate a prompt message requesting user confirmation.
6. The method according to claim 3, characterized in that the step of using a high-performance WebGL-based rendering engine to map the output of the analysis algorithm into visual graphics specifically involves: the parsed spatial coordinate matrix and gene expression matrix are packaged by the backend processing module into structured data suitable for graphics rendering and transmitted to the frontend rendering module through the communication channel; the frontend rendering module calls a WebGL-based rendering engine to create geometric objects based on the structured data and configures shader programs to achieve visual encoding of gene expression levels and high-performance rendering; spatial indexes are established for geometric objects in the rendered scene so as to respond to the user's region selection on the graphics and to quickly query and highlight the data points corresponding to the selected region.
7. The method according to claim 1, characterized in that the analysis algorithms include a cell type deconvolution algorithm based on Bayesian statistics, a spatial domain identification algorithm based on graph neural networks, and a cell communication inference algorithm based on optimal transport theory.
8. A computer system comprising a memory, a processor, and a computer program stored in the memory, characterized in that the processor executes the computer program to implement the method according to any one of claims 1-7.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a plurality of classification programs, which are used by a processor to execute the method as described in any one of claims 1-7.
10. A computer program product, comprising a computer program, characterized in that the computer program is executed by a processor to implement the method described in any one of claims 1-7.