System and method for spatial omics discovery using multi-agent large language models
A system using a large language model automates spatial omics data analysis and visualization through natural language interaction, addressing inefficiencies and accessibility issues, enabling researchers to independently analyze spatial data without coding skills.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- UNIV OF PITTSBURGH OF THE COMMONWEALTH SYST OF HIGHER EDUCATION
- Filing Date
- 2025-11-17
- Publication Date
- 2026-06-25
AI Technical Summary
There is great complexity and inefficiency involved in analyzing spatial omics data, particularly for researchers who lack programming skills or access to bioinformatics resources, leading to time-consuming analyses and a dependency on specialized knowledge and collaboration with bioinformaticians.
A system utilizing a large language model (LLM) to analyze and visualize single cell or spot-based spatial omics data, allowing users to interact with spatial data using natural language queries, automate the analysis process, and provide real-time visualizations without requiring coding skills, thereby democratizing advanced spatial transcriptomics analysis.
Empowers researchers to independently explore spatial data and gain insights, reducing dependency on coding skills and enabling faster hypothesis generation and validation, making advanced spatial transcriptomics analysis more accessible to non-experts.
Abstract
Description
[0001] 072396.1115
[0002] SYSTEM AND METHOD FOR SPATIAL OMICS DISCOVERY USING MULTI-AGENT LARGE LANGUAGE MODELS
[0004] CROSS-REFERENCE TO RELATED APPLICATIONS
[0005] This application claims the benefit of priority of U.S. Provisional Patent Application No. 63/735,697, filed December 18, 2024, the content of which is incorporated herein by reference in its entirety, and to which priority is claimed.
[0006] TECHNICAL FIELD
[0007] This disclosure generally relates to spatial omics discovery.
[0008] BACKGROUND
[0009] Spatial omics is a group of technologies, including spatial transcriptomics and spatial proteomics, that map the gene or protein expression of identified cells and their locations within tissue sections, while preserving the spatial origin of the data. These technologies can be used to study the transcriptional activity of mRNA molecules at the cellular level and can even be used to study the subcellular localization of mRNA molecules. Spatial omics can be used in a variety of biological contexts, including embryo development, immune-cell responses to antigens, and different types of cancers. Spatial transcriptomics can use a variety of methods, including hybridization techniques, RNA sequencing (RNA-Seq) technology, barcoded DNA arrays, barcoded oligonucleotide-conjugated probes, and in situ hybridization. Spatial proteomics can be performed using diverse approaches, including imaging mass spectrometry, antibody-based imaging platforms, multiplexed immunofluorescence, mass-tag barcoded antibodies, and in situ protein labeling methods. The position of a cell relative to its neighbors and non-cellular structures can provide important information about the cell's phenotype, state, and function.
[0010] A large language model (LLM) is a computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Based on language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative artificial intelligence (AI), by taking an input text and repeatedly predicting the next token or word.
[0012] ACTIVE 126368586.1 072396.1115
[0013] SUMMARY OF PARTICULAR EMBODIMENTS
[0014] The purpose and advantages of the disclosed subject matter will be set forth in and apparent from the description that follows, as well as will be learned by practice of the disclosed subject matter. Additional advantages of the disclosed subject matter will be realized and attained by the methods and systems particularly pointed out in the written description and claims hereof, as well as from the appended drawings.
[0015] To achieve these and other advantages, and in accordance with the purpose of the disclosed subject matter, as embodied and broadly described, the disclosed subject matter presents systems, methods, and apparatuses that can be used to generate analytic results for user queries. For example, certain non-limiting embodiments can be used to analyze and visualize single cell or spot-based spatial transcriptomics data responsive to users’ natural-language queries.
[0016] In certain non-limiting embodiments, the analytic system can retrieve curated question-and-code pairs from a knowledgebase containing entries deemed most similar to the query submitted by the user. The analytic system can subsequently generate executable code in response to the user query, utilizing contextual information derived from the knowledgebase in conjunction with its pre-existing knowledge. The analytic system can execute the generated code to produce an output responsive to the query. In the event of an error during the execution of the generated code, the analytic system can utilize both the generated code and the resulting error information as inputs to an automatic error correction module, which can, in certain embodiments, modify or regenerate the code to facilitate successful execution.
[0017] In certain non-limiting embodiments, one or more computing systems can receive, from a client system, a user selection of a spatial omics dataset. The computing systems can then determine metadata associated with the spatial omics dataset. The computing systems can then receive, from the client system, a user query regarding the spatial omics dataset. The computing systems can then generate one or more prompts based on the user query. In one feature, the prompts include the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM. The computing systems can then generate, by the LLM based on the prompts, a code executable on the spatial omics dataset. The computing systems can then execute the code on the spatial omics dataset. The computing systems can then generate, by the LLM based on the prompts and the execution result, a response to the user query. The computing systems can further send, to the client system, instructions for presenting the response.
[0020] In certain non-limiting embodiments, one or more computer-readable non-transitory storage media embodying software is operable when executed to receive, from a client system, a user selection of a spatial omics dataset. The computer-readable non-transitory storage media embodying software is further operable when executed to determine metadata associated with the spatial omics dataset. The computer-readable non-transitory storage media embodying software is further operable when executed to receive, from the client system, a user query regarding the spatial omics dataset. The computer-readable non-transitory storage media embodying software is further operable when executed to generate one or more prompts based on the user query. In one feature, the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM. The computer-readable non-transitory storage media embodying software is further operable when executed to generate, by the LLM based on the prompts, a code executable on the spatial omics dataset. The computer-readable non-transitory storage media embodying software is further operable when executed to execute the code on the spatial omics dataset. The computer-readable non-transitory storage media embodying software is further operable when executed to generate, by the LLM based on the prompts and the execution result, a response to the user query. The computer-readable non-transitory storage media embodying software is further operable when executed to send, to the client system, instructions for presenting the response.
[0021] In certain non-limiting embodiments, a system can comprise one or more processors and a non-transitory memory coupled to the processors comprising instructions executable by the processors. The processors are operable when executing the instructions to receive, from a client system, a user selection of a spatial omics dataset. The processors are further operable when executing the instructions to determine metadata associated with the spatial omics dataset. The processors are further operable when executing the instructions to receive, from the client system, a user query regarding the spatial omics dataset. The processors are further operable when executing the instructions to generate one or more prompts based on the user query. In one feature, the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM. The processors are further operable when executing the instructions to generate, by the LLM based on the prompts, a code executable on the spatial omics dataset. The processors are further operable when executing the instructions to execute the code on the spatial omics dataset. The processors are further operable when executing the instructions to generate, by the LLM based on the prompts and the execution result, a response to the user query. The processors are further operable when executing the instructions to send, to the client system, instructions for presenting the response.
[0024] Furthermore, the disclosed embodiments of the methods, computer readable non-transitory storage media, and systems can have further non-limiting features as described below.
[0025] In certain non-limiting embodiments, the computing systems can further generate one or more visualizations associated with the spatial omics dataset. The computing systems can then send, to the client system, instructions for presenting the visualizations.
[0026] In certain non-limiting embodiments, the computing systems can send, via an application-programming-interface (API) to an external system, a request for the visualizations. The computing systems can then receive, via the API from the external system, the visualizations.
[0027] In certain non-limiting embodiments, the metadata comprises spatial coordinates of each cell and their corresponding cell types.
[0028] In certain non-limiting embodiments, the computing systems can send, via an application-programming-interface (API) to the LLM, a request for a response to the user query. The computing systems can further receive, via the API from the LLM, the response.
[0029] In certain non-limiting embodiments, the computing systems can further access, by a first agent, a software tool from a plurality of software tools for generating the code. In one feature, the plurality of software tools comprises one or more of a data analysis tool, a plotting tool, or a text response tool.
[0030] In certain non-limiting embodiments, the computing systems can further select, by the LLM based on the execution result and the prompt, a second agent among a plurality of agents to generate the response. The computing systems can then access, by the second agent, a software tool from a plurality of software tools for generating the response. In one feature, the plurality of software tools comprises one or more of a data analysis tool, a plotting tool, or a text response tool.
[0031] In certain non-limiting embodiments, the prompts comprise at least a generic prompt and a customizable prompt. The generic prompt is applicable to a plurality of spatial omics datasets. The computing systems can update the customizable prompt by adding the metadata.
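As an example and not by way of limitation, the combination of a generic prompt and a customizable prompt updated with dataset metadata might be sketched as follows in Python. The prompt wording, the metadata keys, and the function name `build_prompts` are illustrative assumptions and are not taken from this disclosure.

```python
from string import Template

# Hypothetical generic prompt, applicable to any spatial omics dataset.
GENERIC_PROMPT = (
    "You are an assistant that writes Python code to analyze spatial omics data."
)

# Hypothetical customizable prompt; placeholders are filled from dataset metadata.
CUSTOM_TEMPLATE = Template(
    "Cell positions are in columns '$x_col' and '$y_col'; "
    "cell types are in column '$type_col'. Known cell types: $cell_types."
)


def build_prompts(metadata, user_query):
    """Return the generic prompt, the metadata-specific prompt, and the query."""
    custom = CUSTOM_TEMPLATE.substitute(
        x_col=metadata["x_col"],
        y_col=metadata["y_col"],
        type_col=metadata["type_col"],
        cell_types=", ".join(sorted(metadata["cell_types"])),
    )
    return [GENERIC_PROMPT, custom, f"User question: {user_query}"]


example_metadata = {
    "x_col": "X",
    "y_col": "Y",
    "type_col": "cell_type",
    "cell_types": {"Tumor", "T cell"},
}
prompts = build_prompts(example_metadata, "How many tumor cells are there?")
```

In this sketch only the customizable prompt changes when a different dataset is loaded; the generic prompt is reused unchanged.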
[0032] In certain non-limiting embodiments, the computing systems can further detect one or more errors in the code. The computing systems can then correct the errors in the code.
[0035] In certain non-limiting embodiments, the user query includes a multi-step analytical query associated with a plurality of actions. The computing systems can decompose the user query into a plurality of sub-tasks. The computing systems can further map each sub-task to a particular software tool among a plurality of software tools. Generating the code or executing the code is further based on executions of the plurality of software tools associated with the plurality of sub-tasks.
[0036] In certain non-limiting embodiments, decomposing the user query into the sub-tasks is performed by the LLM using a particular prompt comprising instructions for generating execution plans and access to conversation history. The sub-tasks include at least a first sub-task and a second sub-task. The computing systems can execute the first sub-task. The computing systems can further, subsequent to executing the first sub-task, execute the second sub-task based on execution results associated with the first sub-task.
[0037] In certain non-limiting embodiments, the response includes a combination of intermediate outputs generated from executions of the plurality of sub-tasks.
[0038] In certain non-limiting embodiments, the computing systems can access, from a storage, a conversation history comprising textual records of prior user inputs and system responses. The computing systems can further detect whether the user query is a follow-up to a previous query based on the conversation history.
[0039] In certain non-limiting embodiments, the computing systems can access, from the storage, prior results generated during prior interactions with one or more users. The prior results include a plurality of output objects stored in a structured format separate from the conversation history. The computing systems can further retrieve one or more relevant prior results associated with the user query. The response is generated further based on the relevant prior results.
[0040] In certain non-limiting embodiments, the computing systems can generate a summary of the conversation history when a number of exchanges exceeds a threshold. The computing systems can further provide the summary to the LLM. The response is generated further based on the summary.
[0041] In certain non-limiting embodiments, the computing systems can receive, from the client system, a request to reset conversational context. The computing systems can further, in response to the request, clear the conversation history. The computing systems can further reset internal memory configured to cache the conversation history.
[0042] In certain non-limiting embodiments, the computing systems can further update the one or more prompts by injecting one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs to carry out spatial omics or bioinformatics workloads. As an example and not by way of limitation, the injected question-code pair examples can be the most similar question-code pair examples.
[0045] In certain non-limiting embodiments, the computing systems can further inject one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs.
[0046] It is to be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the disclosed subject matter claimed. These and other features, aspects, and advantages of the disclosure will be apparent from a reading of the following detailed description together with the accompanying drawings, which are briefly described below. The invention includes any combination of two, three, four, or more of the above-noted embodiments as well as combinations of any two, three, four, or more features or elements set forth in this disclosure, regardless of whether such features or elements are expressly combined in a specific embodiment description herein. This disclosure is intended to be read holistically such that any separable features or elements of the disclosed invention, in any of its various aspects and embodiments, should be viewed as intended to be combinable unless the context clearly dictates otherwise.
[0047] BRIEF DESCRIPTION OF THE DRAWINGS
[0048] FIG. 1 illustrates an example flow diagram for analyzing and visualizing spatial omics data responsive to a user query.
[0049] FIGS. 2A-2E illustrate example user interfaces for spatial omics discovery using multi-agent large language models.
[0050] FIGS. 3A-3C illustrate other example user interfaces for spatial omics discovery using multi-agent large language models.
[0051] FIG. 4 illustrates an example method for spatial omics discovery using multi-agent large language models.
[0052] FIG. 5 illustrates an example computer system.
[0053] DETAILED DESCRIPTION
[0054] There is great complexity and inefficiency involved in analyzing spatial omics data, particularly for researchers who lack programming skills or access to bioinformatics resources. Conventional spatial biology analysis can be time-consuming, requiring specialized knowledge in both computational tools and biology, often necessitating collaboration with bioinformaticians, which creates a bottleneck for researchers who need timely insights from their data.
[0057] The embodiments disclosed herein can eliminate the aforementioned bottleneck by providing a system where biologists can directly interact with spatial data using natural-language queries. The system can automate the analysis process, reducing dependency on coding skills, which makes advanced spatial transcriptomics analysis more accessible to non-experts. The democratization of advanced analysis tools can help biologists independently explore spatial data and gain insights into their research. The embodiments disclosed herein can also empower researchers to leverage a vast collection of existing open datasets from different biospecimens, enhancing the system’s utility and scope, which biologists can explore at their own pace. The embodiments disclosed herein can further help smaller labs with fewer bioinformaticians compete with well-funded institutions through faster hypothesis generation and validation.
[0058] In particular embodiments, an artificial-intelligence (AI) powered analytic system can seamlessly analyze and visualize single cell or spot-based spatial omics data. The analytic system can include a large language model (LLM) working on the backend with a visualization frontend. The analytic system can allow users such as biologists to interact with spatial omics data by asking questions or giving instructions in over 80 spoken languages. The analytic system can translate these queries into code, perform the necessary computations, and visualize the results in real-time, all without requiring a user to write any code. The analytic system can transform complex bioinformatics workflows into an accessible, code-free experience, empowering users to independently analyze their data, regardless of computational expertise. By leveraging LLMs and integrating them with spatial biology tools, the analytic system can function as an AI-driven assistant that simplifies spatial omics data exploration, interpretation, and discovery. The analytic system can further provide a user-friendly interface designed for users without coding backgrounds, enabling intuitive data exploration and analysis. Although this disclosure describes analyzing and visualizing particular data by particular systems in particular manners, this disclosure contemplates analyzing and visualizing any suitable data by any suitable system in any suitable manner.
[0059] To begin with, a user can load any single cell or spot-based spatial omics data via the visualization frontend of the analytic system. The visualization frontend can provide an interactive interface for data visualization, allowing users to explore spatial data intuitively. The analytic system can support various widely used spatial omics data formats, including AnnData (H5AD) and comma-separated values (CSV) files, containing spatial coordinates, gene expression data, and cellular metadata such as cell types, cell area, density, etc.
[0063] After the dataset is loaded, the analytic system can automatically interpret key metadata like spatial locations and cell types and initialize the system based on the main metadata such as spatial coordinates of each cell and their corresponding cell types without the user specifying the exact names of these columns in the data table. In certain non-limiting embodiments, the analytic system can identify standard column names for the “X” and “Y” locations of the cells and the cell type information (which is the default column by which the cells are grouped). In certain non-limiting embodiments, the analytic system can identify standard column names for the “conditions” with which the data should be grouped. These include, but are not limited to, sample identifiers, Tumor Micro Array (TMA) core identifiers, patient identifiers, etc. that are available in a multi-sample dataset. Hence, the initialization here means that the user does not have to select these columns if they are found automatically. If the analytic system cannot identify the columns, the user can still select them manually, e.g., by clicking on certain buttons on the user interface (UI). Users can also change any of the automatically identified fields using manual adjustment options that the analytic system provides. For multi-sample datasets, the column information can be used to automatically render all grouped samples in the same UI in a grid pattern by dynamically updating the “X” and “Y” locations of the cells across each of the “conditions”.
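As an example and not by way of limitation, the automatic column identification and the grid arrangement of multi-sample data described above might be sketched as follows in Python. The alias lists, the function names, and the near-square grid layout are illustrative assumptions; this disclosure does not specify which column names are treated as standard or how the grid is computed.

```python
import math

# Hypothetical alias lists; real datasets may use other column names.
ALIASES = {
    "x": ("x", "x_centroid", "centroid_x"),
    "y": ("y", "y_centroid", "centroid_y"),
    "cell_type": ("cell_type", "celltype", "cell type"),
    "condition": ("sample_id", "sample", "tma_core", "patient_id"),
}


def detect_columns(column_names):
    """Map each role to the first column whose lowercased name matches an alias.

    A role maps to None when no alias matches, signalling that the user
    should pick the column manually.
    """
    lowered = {name.lower(): name for name in column_names}
    return {
        role: next((lowered[a] for a in aliases if a in lowered), None)
        for role, aliases in ALIASES.items()
    }


def grid_offsets(conditions, tile_size):
    """Give each condition an (x, y) offset so samples tile a near-square grid."""
    ordered = sorted(set(conditions))
    n_cols = math.ceil(math.sqrt(len(ordered)))
    return {
        cond: ((i % n_cols) * tile_size, (i // n_cols) * tile_size)
        for i, cond in enumerate(ordered)
    }


columns = detect_columns(["X", "Y", "CellType", "Sample_ID", "area"])
offsets = grid_offsets(["s1", "s2", "s3", "s1"], tile_size=100)
```

Adding a sample's offset to the “X” and “Y” values of its cells places each sample in its own grid tile, which is one way to realize the dynamic coordinate update described above.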
[0064] As the user interacts with the data, the analytic system can provide real-time visualization that is updated dynamically, allowing for immediate insights and hypothesis testing. In certain non-limiting embodiments, the analytic system can send an API (application programming interface) call to an external system for visualizations. The external system can allow the analytic system to load the cells as a scatter plot colored with unique colors based on the group (cell types). In one embodiment, each cell type can have a unique color, and this plot can be zoomed in or rotated. Any new spatial plot can be added as a new layer so that users can view or hide the layers at will. The analytic system can allow the users to mark or annotate regions using annotation tools available in the UI. The user can then ask complex biological questions regarding the loaded data in natural language, with the system generating and executing necessary analysis code to provide answers.
[0065] FIG. 1 illustrates an example flow diagram 100 for analyzing and visualizing spatial transcriptomics data responsive to a user query. To begin with, a user can ask a question (user query 102) such as “can you make a spatial plot of the tumor cells and 10 neighbors?”. The user query 102 can be received by the analytic system 104. In certain non-limiting embodiments, the analytic system 104 includes an agent selection 106 module, a tool selection 108 module, a context management 110 module, a multi-hop reasoning 112 module, and a task planning 114 module. The agent selection 106 module can decide which agent to call based on the question’s context or in which stage of data analysis and visualization the analytic system 104 is calling an agent. The tool selection 108 module can select the appropriate tool (e.g., data analysis tool, plotting tool, or text response tool) based on the user’s query 102.
[0068] The task planning 114 module can support multi-step analytical queries. Complex queries 102 with multiple actions (e.g., “What are the cell types? Can you plot a bar chart?”) require sequential tool execution. The task planning 114 module enables the analytic system 104 to decompose such queries 102 into multiple sub-tasks, each mapped to a specific tool call. Task planning 114 can be initiated by the LLM, which uses a prompt that includes instructions to generate execution plans and that provides access to conversation history. The loop logic and execution orchestration can be handled by supporting software modules. The output of each tool is fed back into the analytic system 104, enabling the next tool to execute with updated context. The final result is a combination of all intermediate outputs. For example, the analytic system 104 first retrieves cell types, then uses that output to generate a bar plot. This dynamic chaining of tools allows for more robust and context-aware responses.
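As an example and not by way of limitation, the sequential chaining of tools described above might be sketched as follows in Python. The plan is hard-coded here, whereas in the described system it would be produced by the LLM; the tool names, the context dictionary, and the text-based bar chart are illustrative assumptions.

```python
from collections import Counter


# Hypothetical tools; each receives a context dict and returns an output.
def list_cell_types(context):
    """Count how many cells belong to each cell type."""
    return dict(Counter(context["cell_types"]))


def bar_chart(context):
    """Render the previous tool's counts as a simple text bar chart."""
    counts = context["previous_output"]
    return [f"{name}: {'#' * n}" for name, n in sorted(counts.items())]


TOOLS = {"list_cell_types": list_cell_types, "bar_chart": bar_chart}


def run_plan(plan, context):
    """Run sub-tasks in order, feeding each tool's output to the next tool."""
    outputs = []
    for tool_name in plan:
        result = TOOLS[tool_name](context)
        outputs.append(result)
        # Updated context for the next sub-task; the caller's dict is not mutated.
        context = {**context, "previous_output": result}
    return outputs  # the final result combines all intermediate outputs


plan = ["list_cell_types", "bar_chart"]  # stand-in for an LLM-generated plan
outputs = run_plan(plan, {"cell_types": ["Tumor", "T cell", "Tumor"]})
```

Here the second sub-task can only run after the first, because the bar chart is built from the cell-type counts that the first tool returns.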
[0069] The multi-hop reasoning 112 module is integrated into the task planning process. The multi-hop reasoning 112 module enables more effective task decomposition and execution flow, which can be considered a component of task planning.
[0070] The context management 110 module involves handling conversation history, follow-up question detection, and context summarization. While LLMs have their own internal context mechanisms, the context management 110 adds an external layer of context management through memory integration. The analytic system 104 works with a memory management system 116 to enable enhanced functionality.
[0071] In certain non-limiting embodiments, the analytic system 104 can fetch, from a knowledge base 130, the curated question and code pairs that are most similar to the question asked by the user. The analytic system 104 can then generate code 132 to answer the question based on the rich context available from the knowledge base 130 as well as its existing knowledge.
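As an example and not by way of limitation, retrieving the most similar question and code pairs might be sketched as follows in Python. This disclosure does not state how similarity is measured; the sketch assumes a simple word-overlap (Jaccard) score, and the knowledge base entries and the code strings inside them are hypothetical placeholders.

```python
import re


def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve_examples(query, knowledge_base, k=2):
    """Return the k question-code pairs whose questions best match the query."""
    ranked = sorted(
        knowledge_base,
        key=lambda pair: jaccard(query, pair["question"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical curated entries; the code strings are placeholders only.
knowledge_base = [
    {"question": "Plot the spatial distribution of tumor cells",
     "code": "plot_cells('Tumor')"},
    {"question": "How many cell types are there?",
     "code": "count_types()"},
    {"question": "Show a bar chart of cell type counts",
     "code": "bar_counts()"},
]
best = retrieve_examples("Make a spatial plot of the tumor cells", knowledge_base, k=1)
```

The retrieved pairs would then be placed in the prompt as worked examples before the LLM is asked to generate code for the new question. An embedding-based similarity could replace the word-overlap score without changing the rest of the sketch.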
[0072] In an isolated coding environment 134, the analytic system 104 can execute the code 136. In the event of an error during code execution, the generated code and the error can be used by the automatic error correction module of the analytic system 104 to correct the code for execution. The analytic system 104 can then extract the code output 138.
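As an example and not by way of limitation, the execute-and-correct loop described above might be sketched as follows in Python. The `correct` callback stands in for the error correction module (which in the described system would involve the LLM), and the convention that generated code stores its answer in a variable named `result` is an assumption. The fresh namespace used here keeps runs from sharing variables but is not a security sandbox; an actual isolated environment would typically use a separate process or container.

```python
import traceback


def run_with_correction(code, correct, max_attempts=3):
    """Run code; on failure, pass the code and error text to `correct` and retry."""
    for attempt in range(1, max_attempts + 1):
        namespace = {}  # fresh namespace per attempt; NOT a security boundary
        try:
            exec(code, namespace)
            return {
                "ok": True,
                "output": namespace.get("result"),
                "attempts": attempt,
                "code": code,
            }
        except Exception:
            error = traceback.format_exc()
            code = correct(code, error)  # corrected code is tried on the next pass
    return {"ok": False, "output": None, "attempts": max_attempts, "code": code}


def toy_corrector(code, error):
    """Stand-in for an LLM-based corrector: repairs one known typo."""
    return code.replace("lenn(", "len(")


outcome = run_with_correction("result = lenn([1, 2, 3])", toy_corrector)
```

Capping the number of attempts keeps a persistently failing query from looping indefinitely; the cap of three is an arbitrary choice for this sketch.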
[0075] The analytic system 104 can then make the output conversational 140. The analytic system 104 then visualizes the output 142. The visualizations can be rendered via a user interface 144.
[0076] In certain non-limiting embodiments, the analytic system 104 can save history and user feedback 146. The saved history and user feedback can include conversations 148, code and code outputs 150, and user feedback 152. The analytic system 104 then waits for the next question 154.
[0077] In certain non-limiting embodiments, the memory management system 116 enables context-aware responses in a conversational AI agent by storing conversation history 118 (past interactions) and prior results 120. Conversation history 118 refers to the textual record of user inputs and system responses. However, conversation history 118 differs from prior results 120, which include actual output objects such as images, tables, or CSV files generated during the conversation. These prior results 120 are stored separately in a structured format (e.g., a dictionary or database).
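As an example and not by way of limitation, keeping the textual conversation history separate from the structured prior results might be sketched as follows in Python. The class name `MemoryStore`, its fields, and the choice to key prior results by the question text are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    # Textual record of user inputs and system responses.
    history: list = field(default_factory=list)
    # Output objects (tables, image paths, etc.), kept apart from the text.
    prior_results: dict = field(default_factory=dict)

    def record(self, question, answer, result=None):
        """Log one exchange; store its output object only if one was produced."""
        self.history.append({"user": question, "system": answer})
        if result is not None:
            self.prior_results[question] = result

    def lookup(self, question):
        """Return the stored output for a past question, or None if absent."""
        return self.prior_results.get(question)


store = MemoryStore()
store.record(
    "What are the cell types?",
    "There are 2 cell types.",
    result={"Tumor": 2, "T cell": 1},
)
store.record("Thanks", "You're welcome.")
```

Because the output objects live in their own dictionary, a later sub-task can reuse a table or plot directly instead of re-parsing it from the chat text.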
[0078] A follow-up question detection 122 module analyzes conversation history 118 to determine whether a new query 102 is a continuation of a previous one. If so, the memory management system 116 retrieves relevant prior data to generate more accurate and coherent answers, improving the continuity and relevance of multi-turn dialogue. The memory management system 116 includes a lightweight language model designed to determine whether a user’s current question is a follow-up to previous interactions. This lightweight language model operates using a prompt that instructs the model to perform follow-up detection, incorporating the current question 102 and relevant conversation history 118. Most of the conversation memory can be cached in RAM using a structured database format, while long-term logs are stored in a long-term persistent storage 126 (e.g., disk) for archival and analysis. When a new question 102 is asked, the lightweight language model automatically accesses the relevant cached history to make its determination. Additionally, prior results 120 such as the output of each question 102 are stored alongside the question 102 itself, enabling efficient retrieval and reuse in future steps of the conversational pipeline.
[0079] The context summarization 124 module can be used to manage long conversation histories 118 within memory constraints. As a dialogue grows, it becomes inefficient to send the entire history 118 to the language model due to context length limits. To solve this, a lightweight language model generates a compact summary that captures the key questions and answers, allowing only the essential information to be retained. The memory management system 116 maintains two components: a condensed summary of the overall conversation and a short window of the most recent exchanges. When the conversation exceeds a certain threshold (e.g., around 55 exchanges), the full history is replaced with the summary and latest interactions. This keeps the context clean and efficient, ensuring the main LLM can respond accurately without exceeding memory limits. This threshold is configurable, and users could customize it based on their needs, e.g., opting to retain longer histories at the cost of performance.
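As an example and not by way of limitation, replacing a long history with a summary plus a short window of recent exchanges might be sketched as follows in Python. The `summarize` callback stands in for the lightweight language model; the window size of five and the choice to summarize only the exchanges that fall outside the window are illustrative assumptions.

```python
def compact_history(history, summarize, threshold=55, window=5):
    """Return the context to send to the main LLM.

    Below the threshold the full history is kept. Above it, older exchanges
    are condensed by `summarize` and only the most recent ones are kept as-is.
    """
    if len(history) <= threshold:
        return {"summary": None, "recent": list(history)}
    split = max(len(history) - window, 0)
    older, recent = history[:split], history[split:]
    return {"summary": summarize(older), "recent": recent}


def toy_summarizer(exchanges):
    """Stand-in for a lightweight language model that writes a summary."""
    return f"{len(exchanges)} earlier exchanges"


history = [f"q{i}" for i in range(60)]
context = compact_history(history, toy_summarizer, threshold=55, window=5)
```

Raising `threshold` retains longer verbatim histories at the cost of larger prompts, which corresponds to the configurable trade-off described above.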
[0082] In certain non-limiting embodiments, the full conversation — including user inputs, system responses, code, and outputs — can be stored on disk as a text file for long-term archival. This persistent storage 126 is distinct from the in-memory cache used during active sessions and serves several purposes beyond immediate interaction. The persistent storage 126 allows users to revisit past exchanges, refine prompt strategies, and build datasets for training custom language models for local deployment. By supporting both retrospective analysis and future development, this storage capability enhances usability and enables more advanced applications of the analytic system 104. In certain non-limiting embodiments, code outputs are not automatically saved in prior results 120. However, users can manually save the entire conversation including code, code outputs, and feedback via dedicated buttons on the user interface that copy or export the session to disk. This disk-based storage 126 is intended for user reference, while cached memory is used to support the LLM’s decision-making during active sessions.
[0083] The memory management system 116 further includes a module for clearing chat and memory 128. After a conversation ends, the analytic system 104 offers two ways to reset context and memory. One option clears the visible chat from the interface, while the other resets the internal memory used by the language model, including cached summaries and recent exchanges. These functions are connected, since the memory management system 116 stores a version of the chat history. Clearing both ensures a clean slate without restarting the software, which is useful when switching to a new dataset or task to prevent prior context from influencing new interactions.
[0084] The memory management system 116 can support recent advancements in agent design, specifically addressing challenges in maintaining efficient and relevant context. Instead of sending the entire conversation history to the language model, the memory management system 116 selectively retains only the most relevant information, improving context engineering. This approach also enhances system efficiency by reducing unnecessary data transmission and lowering API costs. By ensuring the language model receives concise and focused input, the system improves the agent’s response quality. Altogether, this forms part of a state-of-the-art memory module designed to enable scalable, cost-effective, and high-performing conversational agents.
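The selective retention and the two reset options described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class name `ConversationMemory`, the window size, and the string-based summary are all hypothetical, and in practice the summary would more plausibly be produced by the language model itself.

```python
from collections import deque


class ConversationMemory:
    """Hypothetical sketch: a running summary plus the last few exchanges."""

    def __init__(self, max_recent=3):
        self.summary = ""                       # cached summary of older turns
        self.recent = deque(maxlen=max_recent)  # most recent (question, answer) pairs
        self.chat_log = []                      # what the chat window displays

    def add_exchange(self, question, answer):
        # When the window is full, fold the oldest exchange into the summary
        # before the deque drops it (placeholder for an LLM-written summary).
        if len(self.recent) == self.recent.maxlen:
            old_question, _ = self.recent[0]
            self.summary = f"{self.summary} User asked: {old_question}.".strip()
        self.recent.append((question, answer))
        self.chat_log.append((question, answer))

    def build_context(self):
        # Only the summary and the recent window are sent to the model,
        # never the full conversation history.
        lines = [f"Summary: {self.summary}"] if self.summary else []
        lines += [f"Q: {q}\nA: {a}" for q, a in self.recent]
        return "\n".join(lines)

    def clear_chat(self):
        # Reset option 1: clear the visible chat only.
        self.chat_log.clear()

    def clear_memory(self):
        # Reset option 2: clear the context used by the language model.
        self.summary = ""
        self.recent.clear()
```

Calling both `clear_chat()` and `clear_memory()` corresponds to the clean slate described for switching datasets or tasks.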
[0087] Details of each operation and component illustrated in FIG. 1 are described as follows.
[0088] In certain non-limiting embodiments, the user can have access to a chat window where they can ask questions. Once the question is asked, the analytic system 104 can call the LLM backend, e.g., via an API. The LLM backend can handle natural language processing, query interpretation, example curation from the knowledgebase, code generation, error correction, and response formatting through specialized agents.
[0089] In certain non-limiting embodiments, the LLM can have multiple agents, each of which can perform a certain task when instructed to do so. With the agent selection 106 module, the LLM can decide which agent to call based on the question’s context or on the stage of data analysis and visualization at which the analytic system is calling an agent. For example, if the code has not yet been generated, the LLM knows that it cannot use an agent that relies on the output of the code. Systems of this type, which coordinate multiple agents to carry out a variety of tasks, can be called intelligent “agentic” systems. The analytic system can be considered such an agentic system, specifically designed to help users interact with spatial omics data without any coding knowledge. This is why the analytic system is especially useful for clinicians and biologists: the users do not have to write code to get answers to their questions. The LLM can generate the code and answer those questions for them.
[0090] In certain non-limiting embodiments, the LLM can comprise a code agent, a conversation agent, and an error correction agent. The code agent can interpret user queries and decide how to generate answers by selecting appropriate tools from a repository of tools. The code agent can be responsible for generating code snippets that perform data analysis or visualization tasks.
[0091] The functionality of the code agent can include tool selection, where the code agent can use the tool-calling capabilities of the LLM to select the appropriate tool (e.g., data analysis tool, plotting tool, or text response tool) based on the user’s query. In certain non-limiting embodiments, the analytic system can provide a data analysis tool, a plotting tool, and a text response tool. The data analysis tool can generate code for questions that require structured outputs like tables, lists, or statistical summaries. The data analysis tool can be selected when the user asks for computations, statistical analysis, or data retrieval. For example, a user query can ask to calculate the percentage of different cell types or to list the available genes in the dataset.
[0094] The plotting tool can generate code for creating visualizations such as bar plots, violin plots, UMAPs, and spatial plots of cells colored based on metadata. The plotting tool can be selected when the user requests visual representations of the data. For example, a user request can be plotting the spatial distribution of a specific cell type or creating a violin plot of gene expression levels.
[0095] The text response tool can provide direct text answers without the need for code execution. The text response tool can be selected when the answer can be provided directly from existing knowledge or the dataset’s metadata. For example, a user query can ask what a specific column represents, request definitions or general explanations, pose general questions about the pathology related to the tissue type, or repeat questions that have already been answered and do not require further code generation.
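As an illustration of the kind of structured output the data analysis tool is described as producing, the following sketch computes the percentage of each cell type. It is a hedged example only: it uses a plain list of labels instead of an AnnData object, and the function name `cell_type_percentages` and the sample labels are made up for illustration.

```python
from collections import Counter


def cell_type_percentages(cell_types):
    """Return the percentage of cells per cell type, most frequent first."""
    counts = Counter(cell_types)
    total = sum(counts.values())
    return {
        name: round(100.0 * n / total, 2)
        for name, n in counts.most_common()
    }


# Made-up labels standing in for a cell type column of the dataset
labels = ["T-cell", "B-cell", "T-cell", "Macrophage"]
output = str(cell_type_percentages(labels))
```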
[0096] The code agent can determine which tool to choose because the analytic system generates the prompt and defines the code agent in the prompt in such a way that the code agent can perform tool calling. Tool calling is a functionality of the LLM: once given a list of tools, their descriptions, and how to call them, the agent can call the correct tool to answer the question.
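A minimal sketch of how tool descriptions and a tool call could be handled is given here. The registry `TOOLS`, the dict-shaped tool call, and the function names are assumptions made for illustration; the actual format of a tool call depends on the LLM provider and is not specified in this description.

```python
# Hypothetical tool registry: name -> description shown to the agent
TOOLS = {
    "data_analysis": "Generate code for tables, lists, or statistical summaries.",
    "plotting": "Generate code for visualizations such as violin or spatial plots.",
    "text_response": "Answer directly in text without running code.",
}


def describe_tools(tools):
    """Render the tool list as it might appear inside the agent's prompt."""
    return "\n".join(f"- {name}: {desc}" for name, desc in tools.items())


def dispatch(tool_call):
    """Route a structured tool call (as an LLM might return it) to a result."""
    name = tool_call.get("name")
    args = tool_call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name!r}")
    if name == "text_response":
        return {"kind": "text", "text": args["text"]}
    # The two remaining tools carry generated code that still has to be executed
    return {"kind": name, "code": args["code"]}
```

For instance, `dispatch({"name": "plotting", "arguments": {"code": "..."}})` would hand the generated plotting code on to the execution step.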
[0097] The functionality of the code agent can also include code generation, where, once the tool is decided, the code agent can generate code snippets (e.g., in Python) that can be executed to produce the desired output. The code generation itself can be done by the LLM. In certain non-limiting embodiments, the generated code can be executed on the host machine in a sandboxed environment to perform data analysis. This improves security since the generated code is isolated in the sandbox during execution. The host machine can be a user computing device if the data is loaded from the user computing device; a remote server if the data is uploaded to the server and loaded from the server; a cloud instance if the data is uploaded to the cloud instance and loaded from the cloud instance; or any other suitable host machine.
[0098] The functionality of the code agent can additionally include prompt engineering, where the code agent’s prompt includes instructions on the dataset’s structure, available tools, and how to format the code. For example, when receiving a user query “what are the available cell types,” the action of the code agent can be selecting the data analysis tool and generating code to list cell types.
[0099] The code agent’s prompt can be significantly enhanced by incorporating relevant question-code pairs that closely align with the user query. This is achieved using Retrieval-Augmented Generation (RAG), a framework that dynamically retrieves authoritative, curated examples from a specialized knowledge base. This knowledge base contains spatial omics and bioinformatics-related question-code pairs sourced from code tutorials, scientific publications, and custom examples. These examples supplement gaps in the LLM’s training data, providing domain-specific insights that are tailored to user needs.
[0102] By injecting these curated examples into the prompt, RAG enhances the code agent’s ability to generate accurate and contextually appropriate responses. This process ensures that the code agent can reference concrete, human-curated examples, enabling it to address similar queries with greater precision and relevance. In cases where no sufficiently similar question exists in the knowledge base, the code agent’s prompt remains unaffected, relying solely on the LLM’s inherent capabilities. This dynamic and context-aware augmentation greatly improves the system’s ability to provide high-quality, domain-specific code generation.
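The retrieval step described above can be sketched as follows. This is an assumption-laden illustration: the similarity measure here is a simple token-overlap (Jaccard) score used as a stand-in, since the description does not specify how similarity is computed, and the threshold, `top_k`, and all function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word tokens of a question."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_examples(query, knowledge_base, threshold=0.3, top_k=2):
    """Return question-code pairs whose question is similar enough to the query."""
    q = tokens(query)
    scored = []
    for question, code in knowledge_base:
        k = tokens(question)
        union = q | k
        score = len(q & k) / len(union) if union else 0.0  # Jaccard similarity
        if score >= threshold:
            scored.append((score, question, code))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(question, code) for _, question, code in scored[:top_k]]


def augment_prompt(base_prompt, query, knowledge_base):
    """Inject retrieved examples; leave the prompt unchanged if none qualify."""
    examples = retrieve_examples(query, knowledge_base)
    if not examples:
        return base_prompt
    blocks = [f"Question: {q}\nCode:\n{c}" for q, c in examples]
    return base_prompt + "\n\nRelevant examples:\n" + "\n\n".join(blocks)
```

The early return in `augment_prompt` mirrors the case where no sufficiently similar question exists and the prompt remains unaffected.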
[0103] The conversation agent can extract the code output, or the answer generated by the code agent. The conversation agent can make the output conversational by converting it into a human-friendly format and sharing it as a reply to the conversation. The conversation agent can ensure that users receive clear and concise answers to their questions. The functionality of the conversation agent can include output interpretation, where the conversation agent can interpret code outputs, including textual results or confirmation messages. The functionality of the conversation agent can also include response formatting, where the conversation agent can format the responses using Markdown or Hypertext Markup Language (HTML) to enhance readability, including summaries, lists, or explanations. The functionality of the conversation agent can additionally include prompt engineering, where the conversation agent’s prompt guides it to produce responses that are informative and tailored to the user’s query. For example, with code output like “['B-cell', 'T-cell', 'Macrophage', 'Fibroblast'],” the response of the conversation agent can be presenting the list in a readable format with additional context.
[0104] The error correction agent can detect and correct any errors in the code generated by the code agent. The error correction agent can ensure that the code runs smoothly and provides accurate results. The functionality of the error correction agent can include error detection, where the error correction agent can monitor code execution for exceptions or errors. The functionality of the error correction agent can also include code refinement, where the error correction agent can adjust the code to fix issues, such as syntax errors or incorrect function usage. The functionality of the error correction agent can additionally include iterative improvement, where the error correction agent can loop through the correction process until the code executes successfully. The functionality of the error correction agent can further include prompt engineering, where the error correction agent’s prompt instructs it to focus on debugging and code correction strategies. For example, for an error encountered as “AttributeError: 'DataFrame' object has no attribute 'var_names'” the action of the error correction agent can include modifying the code to correct the attribute access.
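The detect, refine, and retry cycle described above can be sketched as a small loop. The sketch is hypothetical: `execute` and `fix` are caller-supplied callables (with `fix` standing in for a call to the error correction agent), and the `max_attempts` cap is an added assumption, since the description speaks of looping until the code executes successfully.

```python
def run_with_correction(code, execute, fix, max_attempts=3):
    """Execute code; on failure, ask `fix` for a corrected version and retry.

    Returns (result, final_code, attempts_used).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return execute(code), code, attempt
        except Exception as exc:  # error detection
            last_error = exc
            if attempt < max_attempts:
                # code refinement: pass the error message back for a fix
                code = fix(code, f"{type(exc).__name__}: {exc}")
    raise RuntimeError(
        f"Code still failing after {max_attempts} attempts"
    ) from last_error
```

Passing the exception type and message to `fix` reflects the idea that the error text itself guides the correction.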
[0107] Each agent operates based on specific prompts, which include detailed instructions that guide their behavior and output formats. A prompt can be thought of as an instruction sheet that the worker needs to adhere to. The exact steps to take or the exact format in which the answers are expected can be designed in the prompt. This process is called “prompt engineering”, enabling each agent to perform its designated tasks effectively. Hence, each of the agents knows what it needs to do since the analytic system is prompting them specifically for a corresponding task.
[0108] In certain non-limiting embodiments, prompt engineering can involve user input, code agent prompting, tool calling, error handling, conversation agent prompting, and response delivery. For user input, the user can pose a question in natural language. For code agent prompting, the code agent can receive the question along with its specialized prompt, which can include dataset information (e.g., metadata, available columns, etc.), descriptions of available tools and how to call them, and instructions on code formatting and execution. For tool calling, based on the prompt and the user’s question, the code agent can select the appropriate tool and generate the corresponding code. For error handling, if code execution results in errors, the error correction agent can be activated and can use its prompt to detect and correct errors in the code. For conversation agent prompting, once the code executes successfully, the conversation agent can use its prompt to format the output into a user-friendly response. For response delivery, the analytic system can return the formatted response to the user, which can include text, visualizations, or both.
[0109] The detailed workflow of prompt engineering is illustrated as follows. Step 1 can be user input. The action can be that the user asks a question via the chat interface, e.g., “what are the available cell types in the dataset?” Step 2 can be code agent processing. The prompt content can include a system description with information about the dataset’s structure (e.g., .obs, .var, .X, .obsm); available tools, describing the data analysis tool, plotting tool, and text response tool; code examples, including sample code snippets demonstrating how to answer different types of questions; and instructions, as guidelines on how to select tools and format code.
[0110] An example prompt is illustrated below.
[0113] You are a world-class programmer and an expert bioinformatician that can complete any goal by executing python code. You are specialized in single-cell spatial transcriptomics research and bioinformatics. You will be working with an 'AnnData' object that contains the dataset. You will be given questions to answer. There are three ways you can answer a question based on the agents and tools available. Here are some example questions and how you can answer them. The code formatting might be incorrect in the examples below. Please make sure you generate proper code formatting for Python code:
[0114] User: What does the column 'cell type' represent?
[0115] Answer: functions.TextOutputAction( thoughts="According to the column name and its values in the first few rows, it seems like the column 'cell type' represents the type of cell.", text="The column 'cell type' represents the type of cell." )
[0116] User: Plot the violin plot of ACE2.
[0117] Answer: functions.PlotOutputAction( thoughts="I will plot the violin plot of ACE2 expression.", plotcode="import seaborn as sns\\nimport pandas as pd\\nimport matplotlib.pyplot as plt\\n\\ndef plot_violinplot(self, ax, data, gene_name):\\n    # Check if the gene exists in the AnnData object\\n    if gene_name in self.data.var_names:\\n        # Fetch gene expression data\\n        expression_data = self.data[:, gene_name].X.toarray().flatten() if hasattr(self.data.X, 'toarray') else self.data[:, gene_name].X.flatten()\\n        # Create a DataFrame with expression data\\n        df = pd.DataFrame({gene_name: expression_data})\\n        # Plot the violin plot\\n        sns.violinplot(x=gene_name, data=df, ax=ax)\\n        ax.set_title(f'Violin plot of ACE2')\\n        ax.set_xlabel(gene_name)\\n    else:\\n        print(f'Gene ACE2 not found in the dataset.')\\n\\n# Call the plot function with the ACE2 gene\\nself.plot_in_matplotlib_widget(plot_violinplot, self.data, 'ACE2')" )
[0121] Based on prompt processing, the code agent can interpret the question, decide that the data analysis tool is appropriate, and generate code to retrieve and output the list of cell types. For example, the generated Python code can be:
    output = str(self.data.obs['cell_type'].unique().tolist())
[0122] In certain non-limiting embodiments, a tool-calling mechanism can be used to select the right tool for a given task. The tool-calling mechanism can be informed by the agent’s prompt. The prompt can include tool descriptions, including detailed explanations of each tool’s purpose and when to use it; usage examples, including sample code demonstrating tool invocation; and selection criteria, as guidelines on choosing the appropriate tool based on the query’s context. For example, a user query can be “plot the spatial expression of the gene MALAT1.” The decision of the code agent can be selecting the plotting tool. The generated Python code can be as below.
    import numpy as np
    import matplotlib.pyplot as plt

    colormap = plt.get_cmap('viridis')
    gene_name = 'MALAT1'
    if gene_name in self.data.var_names:
        # Densify the expression vector if X is stored as a sparse matrix
        expression_data = self.data[:, gene_name].X.toarray().flatten() if hasattr(self.data.X, 'toarray') else self.data[:, gene_name].X.flatten()
        # Map expression values to colors
        colors = colormap(expression_data)
        x_coords = self.data.obs[self.x_column_dropdown.currentText()].values
        y_coords = self.data.obs[self.y_column_dropdown.currentText()].values
        # Add the cells as a new points layer, colored by expression (RGB only)
        self.viewer.add_points(np.column_stack([x_coords, y_coords]), name=f'{gene_name} expression', size=50, face_color=colors[:, :3])
    else:
        print(f'Gene {gene_name} not found in the dataset.')
[0123] Step 3 can be code execution. The generated code can be executed in the analytic system. If the code is successfully executed, an output can be produced.
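The execution step can be sketched as below. This is a simplified illustration, not the disclosed sandbox: Python's `exec` with a restricted namespace does not provide real security isolation, and the function name, the `data` variable, and the convention that generated code assigns its result to `output` are assumptions (the last one borrowed from the example code in this description).

```python
import contextlib
import io


def execute_generated_code(code, data):
    """Run generated code against `data`; return its `output` value and stdout."""
    namespace = {"data": data}  # the only name exposed to the generated code
    stdout = io.StringIO()
    with contextlib.redirect_stdout(stdout):
        # NOTE: exec alone is NOT a security sandbox; a real deployment
        # would need process- or container-level isolation.
        exec(code, namespace)
    return namespace.get("output"), stdout.getvalue()
```

If the generated code raises an exception, it propagates to the caller, which is the point at which an error correction step could be triggered.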
[0124] Step 4 can be error correction, if necessary. If code execution fails, the error correction agent can be activated. The prompt content for error correction can include instructions focused on identifying and fixing code errors, and debugging strategies specifying guidelines on common error types and correction methods. The error correction agent can analyze the error, adjust the code, and retry the execution.
[0127] Step 5 can be conversation agent processing. The prompt content for the conversation agent can include instructions that emphasize generating clear, concise, and informative responses, formatting guidelines specifying use of Markdown or HTML to enhance readability, and sample responses to guide style and tone. The conversation agent can take the code output and format it into a human-readable response. An example formatted response, written in Markdown, is illustrated below.
[0128] ### Available Cell Types
[0129] The dataset includes the following cell types:
[0130] - B-cell
[0131] - T-cell
[0132] - Macrophage
[0133] - Fibroblast
[0134] - Endothelial cells
[0135] - [Additional cell types...]
[0136] In total, there are X unique cell types.
[0137] Step 6 can be response delivery. The formatted response can be presented to the user in the chat interface. If applicable, visualizations can be rendered in the user interface.
[0138] In certain embodiments, the analytic system can generate the prompts before analyzing and visualizing a specific dataset. The prompts can include a generic portion that can be applied to any spatial omics data. The prompts can additionally include a portion that can be updated to include metadata of the specific dataset loaded to the analytic system. For example, the metadata can include the columns that are available in the dataset, or what each of the columns means, etc. The analytic system can add the metadata to one or more prompts and provide the one or more prompts to the LLM.
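The combination of a generic prompt portion with dataset-specific metadata can be sketched as follows. The prompt wording, the constant `GENERIC_PROMPT`, and the function `build_prompt` are illustrative assumptions rather than the prompts actually used.

```python
# Hypothetical generic portion, applicable to any spatial omics dataset
GENERIC_PROMPT = (
    "You are an expert bioinformatician working with an 'AnnData' object.\n"
    "Answer questions by selecting a tool and generating Python code."
)


def build_prompt(generic, metadata):
    """Append dataset-specific column metadata to the generic prompt portion."""
    lines = [generic, "", "Dataset metadata:"]
    for column, description in metadata.items():
        lines.append(f"- {column}: {description}")
    return "\n".join(lines)


prompt = build_prompt(
    GENERIC_PROMPT,
    {"cell_type": "annotated cell type", "X": "x coordinate of the cell"},
)
```

Keeping the two portions separate means only the metadata section has to be regenerated when a different dataset is loaded.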
[0139] Upon loading a dataset, the analytic system can automatically initialize by analyzing the dataset’s metadata. The analytic system can identify standard column names for spatial coordinates (e.g., “X” and “Y” locations of cells) and grouping variables (e.g., “cell type”). If the analytic system cannot automatically identify these columns, users can manually select or modify them through the intuitive user interface. This feature can streamline the setup process, allowing users to focus on data analysis rather than configuration.
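A hedged sketch of the automatic column identification follows. The candidate name lists are assumptions chosen for illustration; the description only states that standard column names for coordinates and grouping variables are recognized, and that users can choose manually when detection fails.

```python
def detect_column(columns, candidates):
    """Return the first column matching a candidate name (case-insensitive)."""
    lookup = {name.lower(): name for name in columns}
    for candidate in candidates:
        if candidate in lookup:
            return lookup[candidate]
    return None  # caller can fall back to manual selection in the UI


def detect_layout(columns):
    """Guess the spatial coordinate and grouping columns of a dataset."""
    return {
        "x": detect_column(columns, ["x", "x_centroid", "x_coord"]),
        "y": detect_column(columns, ["y", "y_centroid", "y_coord"]),
        "group": detect_column(columns, ["cell_type", "cell type", "celltype"]),
    }
```

A `None` entry in the returned dict corresponds to the case where the user is asked to select or modify the column through the user interface.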
[0142] In certain non-limiting embodiments, the external system described above can serve as the visualization frontend for the analytic system, offering a powerful and flexible environment for exploring spatial data. The integration with the external system can include interactive scatter plots. For example, cells can be displayed as scatter plots, colored based on grouping variables such as cell types. Users can zoom, pan, and rotate the plots for detailed examination.
[0143] The integration can also include layer management. New visualizations can be added as separate layers, enabling users to toggle visibility, adjust settings, and compare different datasets or analysis results.
[0144] The integration can also include annotation tools. The annotation tools can allow users to mark regions of interest or specific cells, which can be leveraged by the LLM to answer targeted questions about those areas.
[0145] The integration can also include API integration. The analytic system can leverage APIs provided by the external system to seamlessly render visualizations generated by the code agent, which include both spatial plots and standard statistical graphs. The visualization can be displayed in additional windows within the user interface. The APIs can allow the analytic system to connect the LLM backend to the external system. The prompts can be designed to guide the LLM to use the APIs of the external system as needed. The conversation agent can convert the code output to a human readable format and show it to users in a UI window by calling the correct APIs required to do so. The code agent can call the plotting tool to visualize the spatial plot in the external system or plots such as bar plots, violin plots, and other standard plots using an additional UI window. All of the above actions by the agents can be enabled using the different LLM prompts. The prompts can instruct specific agents to use specific tools and show them in the correct visualization windows added by the analytic system using the correct APIs of the external system.
[0146] FIGS. 2A-2E illustrate example user interfaces for spatial transcriptomic discovery using multi-agent large language models. FIG. 2A illustrates a user loading a spatial transcriptomic dataset. For example, the user can drag 210 image(s) to open or use the menu shortcuts 220 provided by the user interface. FIG. 2B illustrates the visualizations 230 of the loaded dataset. FIG. 2C illustrates a chat interface 240 where the user can ask a question. For example, the question 242 can be “what are the available genes in this data?” FIG. 2D illustrates the chat interface 240 where the user can ask another question 242. For example, the question 242 can be “can you plot a violin plot of the KSHV.K2 gene?” FIG. 2E illustrates the chat interface 240 where the user receives the response 244. In FIG. 2E, the user question 242 can be “can you plot a bar plot of the percentage of cell types?” The analytic system 104 can analyze the question 242, generate and execute code on the loaded dataset, and generate the bar plot 246 from the execution results. As can be seen, the bar plot 246 is on the bottom part of the chat interface 240.
[0149] FIGS. 3A-3C illustrate other example user interfaces for spatial transcriptomic discovery using multi-agent large language models. In certain non-limiting embodiments, the visualization 142 interacts directly with the conversational agent. This integration has made the visualization experience smoother and more robust. Additionally, features are implemented for the visualization to improve flexibility and expand its functionality for users.
[0150] The user interfaces illustrated in FIGS. 3A-3C introduce a clean, tab-based layout with dedicated sections for configuration 302, chat 304, visualization 306, and user feedback 308. Scientific plots appear in a separate visualization tab 306, where a color legend is displayed for clarity and separates multiple tissue samples (or “cores”) by cancer type and stage. Users can browse datasets 314 via multiple entry points. Upon loading a dataset, the analytic system 104 automatically generates a summary, including cell count, gene count, and feature types, within seconds. The user interface improves usability, organization, and the overall interaction between users, the agent, and the visualizations.
[0151] The user interface introduces a color map 312 where each color represents a specific cell type. For example, green can be used for pericytes, and violet can be used for lymphatic endothelial cells. Users can click on any color to dynamically change it through the user interface. The user interface also supports zooming, annotations, and improved layout organization. The user interface further comprises an interactive Q&A toggle 316, which enables users to interact with the data visually and contextually while asking questions — an important feature for spatial transcriptomics.
[0152] In certain non-limiting embodiments, the analytic system 104 uses specific functions to extract color information from each visualization layer. For example, in the cell type layer 318, the function identifies and displays the corresponding cell type colors.
[0153] In certain non-limiting embodiments, each point 320 in the visualization represents a single cell, and its color indicates the cell type. However, the underlying data is much richer: each cell contains measurements for approximately 5,000 genes, captured through gene expression levels. The real analytical power lies in querying and visualizing these gene-level features. For example, the analytic system 104 can display the expression of one or multiple genes across cells via the user interface. This enables users to explore complex biological phenomena, such as molecular mechanisms in cancer and interactions between immune cells and tumors. The dataset is highly sophisticated and difficult to analyze manually, and the analytic system 104 combined with the user interface allows biologists to extract meaningful insights that would otherwise be inaccessible due to the data’s complexity.
[0156] In the user interface, the toggle 316 enables interactive question answering, which scopes the query 322 to only the cells currently visible on the screen. This gives users more precise control, allowing them to focus on specific regions of interest in the data. For example, if a user zooms into a cluster and asks about cell types, the agent will return results only for that subset. This feature enhances the utility of the analytic system 104 by enabling targeted exploration of localized patterns within complex datasets.
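Scoping a query to the cells currently visible on screen can be sketched as a simple bounding-box filter. The representation of cells as dicts with `x` and `y` keys and the `(x_min, x_max, y_min, y_max)` viewport tuple are assumptions made for this illustration; how the viewport is actually obtained from the visualization frontend is not shown.

```python
def visible_cells(cells, viewport):
    """Keep only cells whose (x, y) position lies inside the viewport rectangle."""
    x_min, x_max, y_min, y_max = viewport
    return [
        cell for cell in cells
        if x_min <= cell["x"] <= x_max and y_min <= cell["y"] <= y_max
    ]


# Made-up cells; with the toggle on, only the zoomed-in subset is queried
cells = [
    {"x": 1.0, "y": 1.0, "cell_type": "T-cell"},
    {"x": 9.0, "y": 9.0, "cell_type": "Fibroblast"},
]
subset = visible_cells(cells, viewport=(0.0, 5.0, 0.0, 5.0))
```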
[0157] The active layer 324 feature allows users to focus their queries on specific subsets of data within the visualization. As users interact with the agent and generate plots — such as spatial plots of tumor stages — multiple layers are added, each representing different conditions or attributes (e.g., cell types, tumor stages). Each layer has its own color scheme and can be toggled on or off for clarity. When the active layer toggle 324 is enabled, any question 322 asked in the chat (e.g., “Show only the nodular stage”) is applied only to the currently selected layer. For example, if the nodular stage layer is active, the agent will respond using only the data from that layer. If the toggle 324 is off, the analytic system 104 defaults to using either the first layer or the one with the most cells. This feature gives users fine-grained control over which subset of data their queries apply to, enabling targeted analysis — such as asking for spatial plots of T cells within a specific tumor stage — without affecting unrelated layers.
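The layer selection logic described above can be sketched as follows. The description states that, with the toggle off, the system uses "either the first layer or the one with the most cells"; this sketch arbitrarily picks the layer with the most cells and resolves ties in favor of the earliest layer, which is an assumption. The function name and the dict-based layer representation are also hypothetical.

```python
def select_layer(layers, active_name=None, use_active=False):
    """Pick the name of the layer a query should apply to.

    layers: dict mapping layer name -> list of cells, in creation order.
    """
    if not layers:
        return None
    if use_active and active_name in layers:
        return active_name  # toggle on: use the currently selected layer
    # Toggle off: choose the layer with the most cells; max() returns the
    # first maximal item, so ties go to the earliest layer.
    return max(layers, key=lambda name: len(layers[name]))
```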
[0158] In the dataset used for illustrating the user interfaces in FIGS. 3A-3C, each point represents a cell, with color indicating cell type. For example, red points can represent T cells (immune cells) and green points can represent lymphatic endothelial cells, which in this case are tumor cells. The underlying data is high-dimensional, with each cell containing expression levels for around 5,000 genes. The analytic system 104 together with the user interface allows users to explore spatial relationships between immune cells and tumors across different cancer stages 332. For example, in the early “patch” stage, T cells are densely clustered around tumor cells, suggesting active immune targeting. In contrast, in the later “nodular” stage 334, T cells are largely excluded from tumor regions, indicating potential immune evasion by the tumor.
[0159] To investigate functional differences between T cells in these stages, the analytic system 104 performs differential gene expression analysis. This reveals which genes are more or less active in each stage, helping to determine whether T cells have become dysfunctional. The LLM can then interpret these results, identifying reduced immune activity in late-stage T cells, insights that previously required weeks of collaboration between bioinformaticians and biologists. Furthermore, users can zoom in to regions in the spatial map to study the morphology of cells and their proximity to other cells. This capability dramatically accelerates discovery, enabling biologists without computational expertise to visualize, interact with, analyze, and interpret complex molecular data. The user interface also supports multi-layered comparisons, such as identifying virus-infected cells in Kaposi sarcoma, a tumor caused by KSHV. Infected cells can be easily highlighted by adding specific data layers.
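As a rough illustration of comparing gene activity between two groups of cells, the following sketch computes a per-gene log2 fold change of mean expression. This is only an effect-size calculation used as a simplified stand-in; a full differential expression analysis would also involve a statistical test and multiple-testing correction, and the description does not specify which method is used. The function name, the pseudocount, and the sample values are assumptions.

```python
import math
from statistics import mean


def log2_fold_changes(expr_a, expr_b, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression in group A over group B.

    expr_a, expr_b: dicts mapping gene name -> list of expression values.
    """
    result = {}
    for gene in expr_a.keys() & expr_b.keys():  # genes present in both groups
        a = mean(expr_a[gene]) + pseudocount    # pseudocount avoids log(0)
        b = mean(expr_b[gene]) + pseudocount
        result[gene] = math.log2(a / b)
    return result


# Made-up expression values for one gene in two groups of T cells
early = {"GZMB": [3.0, 3.0]}
late = {"GZMB": [1.0, 1.0]}
changes = log2_fold_changes(early, late)
```

A positive value indicates higher mean expression in the first group, a negative value higher expression in the second.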
[0162] A user can define what qualifies as an infected cell and ask the agent to create a new column in the dataset to store that information. Once the definition is applied, the agent can run an interactive query on the visible cells and update the dataset accordingly. This allows users to immediately add a new visualization layer showing only the infected cells.
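The creation of a user-defined column can be sketched as follows. The rule used here (expression of a viral gene above a threshold) is only one possible definition of an infected cell and is an assumption, as are the function name and the dict-based cell representation; in the description, the definition is supplied by the user.

```python
def add_infected_column(cells, gene="KSHV.K2", threshold=0.0, column="infected"):
    """Flag each cell as infected when its expression of `gene` exceeds `threshold`.

    Updates the cells in place and returns the infected subset, which could
    back a new visualization layer.
    """
    for cell in cells:
        cell[column] = cell["expression"].get(gene, 0.0) > threshold
    return [cell for cell in cells if cell[column]]


# Made-up cells with per-gene expression values
cells = [
    {"cell_type": "T-cell", "expression": {"KSHV.K2": 0.0}},
    {"cell_type": "Lymphatic endothelial", "expression": {"KSHV.K2": 4.2}},
]
infected = add_infected_column(cells)
```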
[0163] As illustrated in FIGS. 3A-3C, some dots represent virus-infected cells 326. To better understand which cell types are infected, the user interface can overlay two data layers and adjust the dot size for infected cells. This allows users to visually identify infected cells within specific types, such as the dots for T cells and lymphatic endothelial (tumor) cells 328. Most tumor cells are infected, while only a few T cells show infection.
[0164] By switching to earlier tumor stages, users can observe different infection patterns. Additionally, the user interface allows users to explore hidden features like gene expression. For example, the user interface can display the expression 330 of a viral gene such as KSHV.K2. The gradient color legend shows expression levels: brighter colors indicate higher activity. Overlaying this with cell types reveals that KSHV.K2 is highly expressed in infected tumor cells. This enables detailed analysis of how viral genes behave across different cell types, helping researchers study the impact of viral infection on immune cells versus tumor cells. Such insights are typically difficult to obtain with conventional tools, but the embodiments disclosed herein make them accessible and intuitive for biologists.
[0165] FIG. 4 illustrates an example method 400 for spatial omics discovery using multi-agent large language models. The method can begin at step 410, where the computing system can receive, from a client system, a user selection of a spatial omics dataset. At step 420, the computing system can determine metadata associated with the spatial omics dataset. At step 430, the computing system can receive, from the client system, a user query regarding the spatial omics dataset. At step 440, the computing system can generate one or more prompts based on the user query, wherein the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM. At step 450, the computing system can inject one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs. At step 460, the computing system can generate, by the LLM based on the prompts, a code executable on the spatial omics dataset. At step 470, the computing system can execute the code on the spatial omics dataset. At step 480, the computing system can generate, by the LLM based on the prompts and the execution result, a response to the user query. At step 490, the computing system can send, to the client system, instructions for presenting the response. Particular embodiments can repeat one or more steps of the method of FIG. 4, where appropriate. Although this disclosure describes and illustrates particular steps of the method of FIG. 4 as occurring in a particular order, this disclosure contemplates any suitable steps of the method of FIG. 4 occurring in any suitable order. Moreover, although this disclosure describes and illustrates an example method for spatial omics discovery using multi-agent large language models including the particular steps of the method of FIG. 4, this disclosure contemplates any suitable method for spatial omics discovery using multi-agent large language models including any suitable steps, which can include all, some, or none of the steps of the method of FIG. 4, where appropriate. Furthermore, although this disclosure describes and illustrates particular components, devices, or systems carrying out particular steps of the method of FIG. 4, this disclosure contemplates any suitable combination of any suitable components, devices, or systems carrying out any suitable steps of the method of FIG. 4.
[0168] FIG. 5 illustrates an example computer system 500. In particular embodiments, one or more computer systems 500 perform one or more steps of one or more methods described or illustrated herein. In particular embodiments, one or more computer systems 500 provide functionality described or illustrated herein. In particular embodiments, software running on one or more computer systems 500 performs one or more steps of one or more methods described or illustrated herein or provides functionality described or illustrated herein. Particular embodiments include one or more portions of one or more computer systems 500. Herein, reference to a computer system can encompass a computing device, and vice versa, where appropriate. Moreover, reference to a computer system can encompass one or more computer systems, where appropriate.
[0169] This disclosure contemplates any suitable number of computer systems 500. This disclosure contemplates computer system 500 taking any suitable physical form. As an example and not by way of limitation, computer system 500 can be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (such as, for example, a computer-on-module (COM) or system-on-module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a mesh of computer systems, a mobile telephone, a personal digital assistant (PDA), a server, a tablet computer system, or a combination of two or more of these. Where appropriate, computer system 500 can include one or more computer systems 500; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centers; or reside in a cloud, which can include one or more cloud components in one or more networks. Where appropriate, one or more computer systems 500 can perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein. As an example and not by way of limitation, one or more computer systems 500 can perform in real time or in batch mode one or more steps of one or more methods described or illustrated herein. One or more computer systems 500 can perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.
[0172] In particular embodiments, computer system 500 includes a processor 502, memory 504, storage 506, an input/output (I/O) interface 508, a communication interface 510, and a bus 512. Although this disclosure describes and illustrates a particular computer system having a particular number of particular components in a particular arrangement, this disclosure contemplates any suitable computer system having any suitable number of any suitable components in any suitable arrangement.
[0173] In particular embodiments, processor 502 includes hardware for executing instructions, such as those making up a computer program. As an example and not by way of limitation, to execute instructions, processor 502 can retrieve (or fetch) the instructions from an internal register, an internal cache, memory 504, or storage 506; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 504, or storage 506. In particular embodiments, processor 502 can include one or more internal caches for data, instructions, or addresses. This disclosure contemplates processor 502 including any suitable number of any suitable internal caches, where appropriate. As an example and not by way of limitation, processor 502 can include one or more instruction caches, one or more data caches, and one or more translation lookaside buffers (TLBs). Instructions in the instruction caches can be copies of instructions in memory 504 or storage 506, and the instruction caches can speed up retrieval of those instructions by processor 502. Data in the data caches can be copies of data in memory 504 or storage 506 for instructions executing at processor 502 to operate on; the results of previous instructions executed at processor 502 for access by subsequent instructions executing at processor 502 or for writing to memory 504 or storage 506; or other suitable data. The data caches can speed up read or write operations by processor 502. The TLBs can speed up virtual-address translation for processor 502. In particular embodiments, processor 502 can include one or more internal registers for data, instructions, or addresses.
[0176] This disclosure contemplates processor 502 including any suitable number of any suitable internal registers, where appropriate. Where appropriate, processor 502 can include one or more arithmetic logic units (ALUs); be a multi-core processor; or include one or more processors 502. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.
[0177] In particular embodiments, memory 504 includes main memory for storing instructions for processor 502 to execute or data for processor 502 to operate on. As an example and not by way of limitation, computer system 500 can load instructions from storage 506 or another source (such as, for example, another computer system 500) to memory 504. Processor 502 can then load the instructions from memory 504 to an internal register or internal cache. To execute the instructions, processor 502 can retrieve the instructions from the internal register or internal cache and decode them. During or after execution of the instructions, processor 502 can write one or more results (which can be intermediate or final results) to the internal register or internal cache. Processor 502 can then write one or more of those results to memory 504. In particular embodiments, processor 502 executes only instructions in one or more internal registers or internal caches or in memory 504 (as opposed to storage 506 or elsewhere) and operates only on data in one or more internal registers or internal caches or in memory 504 (as opposed to storage 506 or elsewhere). One or more memory buses (which can each include an address bus and a data bus) can couple processor 502 to memory 504. Bus 512 can include one or more memory buses, as described below. In particular embodiments, one or more memory management units (MMUs) reside between processor 502 and memory 504 and facilitate accesses to memory 504 requested by processor 502. In particular embodiments, memory 504 includes random access memory (RAM). This RAM can be volatile memory, where appropriate. Where appropriate, this RAM can be dynamic RAM (DRAM) or static RAM (SRAM). Moreover, where appropriate, this RAM can be single-ported or multi-ported RAM. This disclosure contemplates any suitable RAM. Memory 504 can include one or more memories 504, where appropriate. 
Although this disclosure describes and illustrates particular memory, this disclosure contemplates any suitable memory.
[0178] In particular embodiments, storage 506 includes mass storage for data or instructions. As an example and not by way of limitation, storage 506 can include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive or a combination of two or more of these. Storage 506 can include removable or non-removable (or fixed) media, where appropriate. Storage 506 can be internal or external to computer system 500, where appropriate. In particular embodiments, storage 506 is non-volatile, solid-state memory. In particular embodiments, storage 506 includes read-only memory (ROM). Where appropriate, this ROM can be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory or a combination of two or more of these. This disclosure contemplates mass storage 506 taking any suitable physical form. Storage 506 can include one or more storage control units facilitating communication between processor 502 and storage 506, where appropriate. Where appropriate, storage 506 can include one or more storages 506. Although this disclosure describes and illustrates particular storage, this disclosure contemplates any suitable storage.
[0181] In particular embodiments, I/O interface 508 includes hardware, software, or both, providing one or more interfaces for communication between computer system 500 and one or more I/O devices. Computer system 500 can include one or more of these I/O devices, where appropriate. One or more of these I/O devices can enable communication between a person and computer system 500. As an example and not by way of limitation, an I/O device can include a keyboard, keypad, microphone, monitor, mouse, printer, scanner, speaker, still camera, stylus, tablet, touch screen, trackball, video camera, another suitable I/O device or a combination of two or more of these. An I/O device can include one or more sensors. This disclosure contemplates any suitable I/O devices and any suitable I/O interfaces 508 for them. Where appropriate, I/O interface 508 can include one or more device or software drivers enabling processor 502 to drive one or more of these I/O devices. I/O interface 508 can include one or more I/O interfaces 508, where appropriate. Although this disclosure describes and illustrates a particular I/O interface, this disclosure contemplates any suitable I/O interface.
[0182] In particular embodiments, communication interface 510 includes hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 500 and one or more other computer systems 500 or one or more networks. As an example and not by way of limitation, communication interface 510 can include a network interface controller (NIC) or network adapter for communicating with an Ethernet or other wire-based network or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network. This disclosure contemplates any suitable network and any suitable communication interface 510 for it. As an example and not by way of limitation, computer system 500 can communicate with an ad hoc network, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), or one or more portions of the Internet or a combination of two or more of these. One or more portions of one or more of these networks can be wired or wireless. As an example, computer system 500 can communicate with a wireless PAN (WPAN) (such as, for example, a BLUETOOTH WPAN), a WI-FI network, a WI-MAX network, a cellular telephone network (such as, for example, a Global System for Mobile Communications (GSM) network), or other suitable wireless network or a combination of two or more of these. Computer system 500 can include any suitable communication interface 510 for any of these networks, where appropriate. Communication interface 510 can include one or more communication interfaces 510, where appropriate. Although this disclosure describes and illustrates a particular communication interface, this disclosure contemplates any suitable communication interface.
[0185] In particular embodiments, bus 512 includes hardware, software, or both coupling components of computer system 500 to each other. As an example and not by way of limitation, bus 512 can include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HYPERTRANSPORT (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus or a combination of two or more of these. Bus 512 can include one or more buses 512, where appropriate. Although this disclosure describes and illustrates a particular bus, this disclosure contemplates any suitable bus or interconnect.
[0186] Herein, a computer-readable non-transitory storage medium or media can include one or more semiconductor-based or other integrated circuits (ICs) (such as, for example, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM-drives, SECURE DIGITAL cards or drives, any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more of these, where appropriate. A computer-readable non-transitory storage medium can be volatile, non-volatile, or a combination of volatile and non-volatile, where appropriate.
[0187] Herein, “or” is inclusive and not exclusive, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, “A or B” means “A, B, or both,” unless expressly indicated otherwise or indicated otherwise by context. Moreover, “and” is both joint and several, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, “A and B” means “A and B, jointly or severally,” unless expressly indicated otherwise or indicated otherwise by context.
[0190] The scope of this disclosure encompasses all changes, substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend. The scope of this disclosure is not limited to the example embodiments described or illustrated herein. Moreover, although this disclosure describes and illustrates respective embodiments herein as including particular components, elements, features, functions, operations, or steps, any of these embodiments can include any combination or permutation of any of the components, elements, features, functions, operations, or steps described or illustrated anywhere herein that a person having ordinary skill in the art would comprehend. Furthermore, reference in the appended claims to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, or component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative. Additionally, although this disclosure describes or illustrates particular embodiments as providing particular advantages, particular embodiments can provide none, some, or all of these advantages.
Claims
What is claimed is:
1. A method comprising, by one or more computing systems: receiving, from a client system, a user selection of a spatial omics dataset; determining metadata associated with the spatial omics dataset; receiving, from the client system, a user query regarding the spatial omics dataset; generating one or more prompts based on the user query, wherein the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM; generating, by the LLM based on the prompts, a code executable on the spatial omics dataset; executing the code on the spatial omics dataset; generating, by the LLM based on the prompts and the execution result, a response to the user query; and sending, to the client system, instructions for presenting the response.
2. The method of Claim 1, further comprising: generating one or more visualizations associated with the spatial omics dataset; and sending, to the client system, instructions for presenting the visualizations.
3. The method of Claim 2, wherein generating the visualizations comprises: sending, via an application-programming-interface (API) to an external system, a request for the visualizations; and receiving, via the API from the external system, the visualizations.
4. The method of any one of Claims 1-3, wherein the metadata comprises spatial coordinates of each cell and their corresponding cell types.
5. The method of any one of Claims 1-4, further comprising: sending, via an application-programming-interface (API) to the LLM, a request for a response to the user query; and receiving, via the API from the LLM, the response.
6. The method of any one of Claims 1-5, further comprising: selecting, by the LLM based on the user query and the prompt, a first agent among a plurality of agents to generate the code.
7. The method of Claim 6, further comprising: accessing, by the first agent, a software tool from a plurality of software tools for generating the code, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
8. The method of any one of Claims 1-7, further comprising: selecting, by the LLM based on the execution result and the prompt, a second agent among a plurality of agents to generate the response.
9. The method of Claim 8, further comprising: accessing, by the second agent, a software tool from a plurality of software tools for generating the response, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
10. The method of any one of Claims 1-9, wherein the prompts comprise at least a generic prompt and a customizable prompt, wherein the generic prompt is applicable to a plurality of spatial omics datasets.
11. The method of Claim 10, further comprising: updating the customizable prompt by adding the metadata.
12. The method of any one of Claims 1-11, further comprising: detecting one or more errors in the code; and correcting the errors in the code.
13. The method of any one of Claims 1-12, wherein the user query comprises a multi-step analytical query associated with a plurality of actions, the method further comprising: decomposing the user query into a plurality of sub-tasks; and mapping each sub-task to a particular software tool among a plurality of software tools; wherein generating the code or executing the code is further based on executions of the plurality of software tools associated with the plurality of sub-tasks.
14. The method of Claim 13, wherein decomposing the user query into the sub-tasks is performed by the LLM using a particular prompt comprising instructions for generating execution plans and access to conversation history.
15. The method of Claim 13, wherein the sub-tasks comprise at least a first sub-task and a second sub-task, the method further comprising: executing the first sub-task; and subsequent to executing the first sub-task, executing the second sub-task based on execution results associated with the first sub-task.
16. The method of Claim 13, wherein the response comprises a combination of intermediate outputs generated from executions of the plurality of sub-tasks.
17. The method of any one of Claims 1-16, further comprising: accessing, from a storage, a conversation history comprising textual records of prior user inputs and system responses; and detecting whether the user query is a follow-up to a previous query based on the conversation history.
18. The method of Claim 17, further comprising: accessing, from the storage, prior results generated during prior interactions with one or more users, wherein the prior results comprise a plurality of output objects stored in a structured format separate from the conversation history; and retrieving one or more relevant prior results associated with the user query; wherein the response is generated further based on the relevant prior results.
19. The method of Claim 17, further comprising: generating a summary of the conversation history when a number of exchanges exceeds a threshold; and providing the summary to the LLM, wherein the response is generated further based on the summary.
20. The method of Claim 17, further comprising: receiving, from the client system, a request to reset conversational context; in response to the request, clearing the conversation history; and resetting internal memory configured to cache the conversation history.
21. The method of any one of Claims 1-20, further comprising: updating the one or more prompts by injecting one or more question-code pair examples to the user query by referencing a curated knowledge-base of question-code pairs to carry out spatial omics or bioinformatics workloads.
22. The method of any one of Claims 1-21, further comprising: injecting one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs.
23. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: receive, from a client system, a user selection of a spatial omics dataset; determine metadata associated with the spatial omics dataset; receive, from the client system, a user query regarding the spatial omics dataset; generate one or more prompts based on the user query, wherein the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM; generate, by the LLM based on the prompts, a code executable on the spatial omics dataset; execute the code on the spatial omics dataset; generate, by the LLM based on the prompts and the execution result, a response to the user query; and send, to the client system, instructions for presenting the response.
24. The media of Claim 23, wherein the software is further operable when executed to: generate one or more visualizations associated with the spatial omics dataset; and send, to the client system, instructions for presenting the visualizations.
25. The media of Claim 24, wherein generating the visualizations comprises: send, via an application-programming-interface (API) to an external system, a request for the visualizations; and receive, via the API from the external system, the visualizations.
26. The media of any one of Claims 23-25, wherein the metadata comprises spatial coordinates of each cell and their corresponding cell types.
27. The media of any one of Claims 23-26, wherein the software is further operable when executed to: send, via an application-programming-interface (API) to the LLM, a request for a response to the user query; and receive, via the API from the LLM, the response.
28. The media of any one of Claims 23-27, wherein the software is further operable when executed to: select, by the LLM based on the user query and the prompt, a first agent among a plurality of agents to generate the code.
29. The media of Claim 28, wherein the software is further operable when executed to: access, by the first agent, a software tool from a plurality of software tools for generating the code, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
30. The media of any one of Claims 23-29, wherein the software is further operable when executed to: select, by the LLM based on the execution result and the prompt, a second agent among a plurality of agents to generate the response.
31. The media of Claim 30, wherein the software is further operable when executed to: access, by the second agent, a software tool from a plurality of software tools for generating the response, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
32. The media of any one of Claims 23-31, wherein the prompts comprise at least a generic prompt and a customizable prompt, wherein the generic prompt is applicable to a plurality of spatial omics datasets.
33. The media of Claim 32, wherein the software is further operable when executed to: update the customizable prompt by adding the metadata.
34. The media of any one of Claims 23-33, wherein the software is further operable when executed to: detect one or more errors in the code; and correct the errors in the code.
35. The media of any one of Claims 23-34, wherein the user query comprises a multi-step analytical query associated with a plurality of actions, wherein the software is further operable when executed to: decompose the user query into a plurality of sub-tasks; and map each sub-task to a particular software tool among a plurality of software tools; wherein generating the code or executing the code is further based on executions of the plurality of software tools associated with the plurality of sub-tasks.
36. The media of Claim 35, wherein decomposing the user query into the sub-tasks is performed by the LLM using a particular prompt comprising instructions for generating execution plans and access to conversation history.
37. The media of Claim 35, wherein the sub-tasks comprise at least a first sub-task and a second sub-task, wherein the software is further operable when executed to: execute the first sub-task; and subsequent to executing the first sub-task, execute the second sub-task based on execution results associated with the first sub-task.
38. The media of Claim 35, wherein the response comprises a combination of intermediate outputs generated from executions of the plurality of sub-tasks.
39. The media of any one of Claims 23-38, wherein the software is further operable when executed to: access, from a storage, a conversation history comprising textual records of prior user inputs and system responses; and detect whether the user query is a follow-up to a previous query based on the conversation history.
40. The media of Claim 39, wherein the software is further operable when executed to: access, from the storage, prior results generated during prior interactions with one or more users, wherein the prior results comprise a plurality of output objects stored in a structured format separate from the conversation history; and retrieve one or more relevant prior results associated with the user query; wherein the response is generated further based on the relevant prior results.
41. The media of Claim 39, wherein the software is further operable when executed to: generate a summary of the conversation history when a number of exchanges exceeds a threshold; and provide the summary to the LLM, wherein the response is generated further based on the summary.
42. The media of Claim 39, wherein the software is further operable when executed to: receive, from the client system, a request to reset conversational context; in response to the request, clear the conversation history; and reset internal memory configured to cache the conversation history.
43. The media of any one of Claims 23-42, wherein the software is further operable when executed to: update the one or more prompts by injecting one or more question-code pair examples to the user query by referencing a curated knowledge-base of question-code pairs to carry out spatial omics or bioinformatics workloads.
44. The media of any one of Claims 23-43, wherein the software is further operable when executed to: inject one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs.
45. A system comprising: one or more processors; and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to: receive, from a client system, a user selection of a spatial omics dataset; determine metadata associated with the spatial omics dataset; receive, from the client system, a user query regarding the spatial omics dataset; generate one or more prompts based on the user query, wherein the prompts comprise the metadata and are configured to be inputted into a large language model (LLM) to elicit a response from the LLM; generate, by the LLM based on the prompts, a code executable on the spatial omics dataset; execute the code on the spatial omics dataset; generate, by the LLM based on the prompts and the execution result, a response to the user query; and send, to the client system, instructions for presenting the response.
46. The system of Claim 45, wherein the processors are further operable when executing the instructions to: generate one or more visualizations associated with the spatial omics dataset; and send, to the client system, instructions for presenting the visualizations.
47. The system of Claim 46, wherein generating the visualizations comprises: send, via an application-programming-interface (API) to an external system, a request for the visualizations; and receive, via the API from the external system, the visualizations.
48. The system of any one of Claims 45-47, wherein the metadata comprises spatial coordinates of each cell and their corresponding cell types.
49. The system of any one of Claims 45-48, wherein the processors are further operable when executing the instructions to: send, via an application-programming-interface (API) to the LLM, a request for a response to the user query; and receive, via the API from the LLM, the response.
50. The system of any one of Claims 45-49, wherein the processors are further operable when executing the instructions to: select, by the LLM based on the user query and the prompt, a first agent among a plurality of agents to generate the code.
51. The system of Claim 50, wherein the processors are further operable when executing the instructions to: access, by the first agent, a software tool from a plurality of software tools for generating the code, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
52. The system of any one of Claims 45-51, wherein the processors are further operable when executing the instructions to: select, by the LLM based on the execution result and the prompt, a second agent among a plurality of agents to generate the response.
53. The system of Claim 52, wherein the processors are further operable when executing the instructions to: access, by the second agent, a software tool from a plurality of software tools for generating the response, wherein the plurality of software tools comprise one or more of a data analysis tool, a plotting tool, or a text response tool.
54. The system of any one of Claims 45-53, wherein the prompts comprise at least a generic prompt and a customizable prompt, wherein the generic prompt is applicable to a plurality of spatial omics datasets.
55. The system of Claim 54, wherein the processors are further operable when executing the instructions to: update the customizable prompt by adding the metadata.
56. The system of any one of Claims 45-55, wherein the processors are further operable when executing the instructions to: detect one or more errors in the code; and correct the errors in the code.
57. The system of any one of Claims 45-56, wherein the user query comprises a multi-step analytical query associated with a plurality of actions, wherein the processors are further operable when executing the instructions to: decompose the user query into a plurality of sub-tasks; and map each sub-task to a particular software tool among a plurality of software tools; wherein generating the code or executing the code is further based on executions of the plurality of software tools associated with the plurality of sub-tasks.
58. The system of Claim 57, wherein decomposing the user query into the sub-tasks is performed by the LLM using a particular prompt comprising instructions for generating execution plans and access to conversation history.
59. The system of Claim 57, wherein the sub-tasks comprise at least a first sub-task and a second sub-task, wherein the processors are further operable when executing the instructions to: execute the first sub-task; and subsequent to executing the first sub-task, execute the second sub-task based on execution results associated with the first sub-task.
60. The system of Claim 57, wherein the response comprises a combination of intermediate outputs generated from executions of the plurality of sub-tasks.
61. The system of any one of Claims 45-60, wherein the processors are further operable when executing the instructions to: access, from a storage, a conversation history comprising textual records of prior user inputs and system responses; and detect whether the user query is a follow-up to a previous query based on the conversation history.
62. The system of Claim 61, wherein the processors are further operable when executing the instructions to: access, from the storage, prior results generated during prior interactions with one or more users, wherein the prior results comprise a plurality of output objects stored in a structured format separate from the conversation history; and retrieve one or more relevant prior results associated with the user query; wherein the response is generated further based on the relevant prior results.
63. The system of Claim 61, wherein the processors are further operable when executing the instructions to: generate a summary of the conversation history when a number of exchanges exceeds a threshold; and provide the summary to the LLM, wherein the response is generated further based on the summary.
64. The system of Claim 61, wherein the processors are further operable when executing the instructions to: receive, from the client system, a request to reset conversational context; in response to the request, clear the conversation history; and reset internal memory configured to cache the conversation history.
65. The system of any one of Claims 45-64, wherein the processors are further operable when executing the instructions to: update the one or more prompts by injecting one or more question-code pair examples to the user query by referencing a curated knowledge-base of question-code pairs to carry out spatial omics or bioinformatics workloads.
66. The system of any one of Claims 45-65, wherein the processors are further operable when executing the instructions to: inject one or more question-code pair examples to the user query by referencing a curated knowledge base of question-code pairs.
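By way of non-limiting illustration only, the multi-step handling recited in Claims 57, 59, and 60 (decomposing a query into sub-tasks, mapping each sub-task to a software tool, executing a later sub-task based on the results of an earlier one, and combining the intermediate outputs) may be sketched as follows. The names `SubTask`, `plan`, `run_plan`, and the three tool functions are hypothetical and do not appear in the disclosure; in particular, `plan` uses a simple keyword rule as a stand-in for the LLM-generated execution plan of Claim 58, and the tools return placeholder values rather than performing real spatial omics analysis.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class SubTask:
    tool: str      # name of the software tool mapped to this sub-task
    argument: str  # hypothetical free-text instruction passed to the tool


# Hypothetical tools standing in for a data analysis tool, a plotting tool,
# and a text response tool. Each receives the previous sub-task's result.
def data_analysis_tool(argument: str, previous: Optional[Any]) -> Any:
    # Placeholder "analysis": count the words in the instruction.
    return {"n_words": len(argument.split())}


def plotting_tool(argument: str, previous: Optional[Any]) -> Any:
    # Placeholder "plot": describe what would be drawn from the prior result.
    return f"plot({argument}) using {previous}"


def text_response_tool(argument: str, previous: Optional[Any]) -> Any:
    return f"{argument}: {previous}"


TOOLS: Dict[str, Callable[[str, Optional[Any]], Any]] = {
    "data_analysis": data_analysis_tool,
    "plotting": plotting_tool,
    "text_response": text_response_tool,
}


def plan(query: str) -> List[SubTask]:
    """Assumed stand-in for LLM planning: split on ' then ', map by keyword."""
    tasks = []
    for step in (s.strip() for s in query.split(" then ")):
        if "plot" in step:
            tool = "plotting"
        elif "summar" in step or "explain" in step:
            tool = "text_response"
        else:
            tool = "data_analysis"
        tasks.append(SubTask(tool=tool, argument=step))
    return tasks


def run_plan(tasks: List[SubTask]) -> List[Any]:
    """Execute sub-tasks in order, feeding each result to the next one."""
    outputs: List[Any] = []
    previous: Optional[Any] = None
    for task in tasks:
        result = TOOLS[task.tool](task.argument, previous)
        outputs.append(result)  # intermediate outputs are kept for the response
        previous = result
    return outputs


if __name__ == "__main__":
    steps = plan("count cell types then plot the counts")
    for step, output in zip(steps, run_plan(steps)):
        print(step.tool, "->", output)
```

In this sketch the combined response is simply the ordered list of intermediate outputs; an actual embodiment may merge tables, figures, and text in any suitable manner.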