Business interaction method, system, apparatus, storage medium, and computer program product
By leveraging LLM understanding and multi-round clarification through structured intent matching and card-based interaction, the problems of fragmented entry points, high learning costs, and state consistency in traditional business interaction systems have been solved. This has enabled efficient and flexible multi-business domain interaction, improving user experience and system stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WEIWEI TECHNOLOGY (HONG KONG) CO LTD
- Filing Date
- 2026-02-14
- Publication Date
- 2026-06-19
Smart Images

Figure CN122245308A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the fields of artificial intelligence and conversational interaction technology, and in particular to a business interaction method, system, device and storage medium, and computer program product. Background Technology
[0002] As digital transformation deepens, the functions of enterprise-level applications (CRM, ERP, OA, BI, ticketing systems, marketing operations platforms, knowledge bases, etc.) and consumer-level applications (content services, e-commerce platforms, lifestyle service tools, educational products, health management applications, etc.) continue to overlap, gradually forming a complex structure with multiple business domains, roles, entry points, and terminals. Users need to address the rigid requirements of enterprise-level scenarios such as access control, process compliance, and data auditing, while also seeking a consumer-level experience that prioritizes low learning costs, efficient operation, and natural interaction. This places higher demands on the versatility and adaptability of business interaction systems.
[0003] Traditional business interactions are mainly carried out through GUI (Graphical User Interface) with "menu-page-form" as the core. Its design logic is significantly out of touch with current user needs and system form. At the same time, although existing AI dialogue assistants have made breakthroughs in natural language understanding, they have not yet solved the core engineering problems in business execution, resulting in many interaction pain points for both B-end enterprises and C-end users. Summary of the Invention
[0004] This application provides a business interaction method, system, device, storage medium, and computer program product to achieve business interaction through LLM understanding, controllable execution, and state consistency.
[0005] The embodiments of this application adopt the following technical solutions:
[0006] In a first aspect, embodiments of this application provide a business interaction method applied to a business server, the method comprising:
[0007] In response to user interaction input, output structured intent and match the structured intent to the corresponding business domain;
[0008] Based on the structured intent, editable cards are generated for distribution to the business domain and rendered to the client, thereby generating and presenting a structured intent trigger object on the client; and
[0009] In response to the structured intent triggering object, complete the business interaction or trigger a new round of business interaction.
[0010] Secondly, embodiments of this application also provide a business interaction system applied to a business server, the system comprising:
[0011] The input access module is configured to perform the following operations: receive interactive input from users in the form of text, voice, or clicks; the input access module supports in-application access points and external IM platform access points.
[0012] The session context and memory module is configured to perform the following operations: maintain the session history messageList, the current session identifier conversationId, the business basic information businessBasicInfo, and the session configuration config;
[0013] The LLM semantic understanding and multi-turn clarification module is configured to perform the following operations: semantic parsing, entity recognition, range recognition and multi-turn clarification of user interaction input, and output structured intent;
[0014] The central routing and capability registration module is configured to perform the following operations: maintain a business domain capability registry, which includes assistant_id, app information (appCode and branchCode), business type (businessType), and authorization requirements (Authorization), and complete the matching of structured intents to business domain capabilities.
[0015] The plan generation and orchestration module is configured to perform the following operations: break down structured intents into multi-step execution plans, and generate editable cards for each business domain step by step and distribute them to the corresponding business domains;
[0016] The card protocol generation and rendering module is configured to perform the following operations: generate editable or displayable cards according to the semantic card protocol and render them to the client;
[0017] The state consistency control module is configured to perform the following operations: maintain session state consistency, perform state synchronization through the event bus, and query the state of asynchronous operations through the retrieve interface;
[0018] The Structured Intent Trigger Object module is configured to perform the following operations: generate and present structured intent trigger objects on the client side, and translate user click operations into messages and send them to the LLM model through methods such as handleSendMessage and handleChatAgentBranchClick.
[0019] Thirdly, embodiments of this application also provide a computer device, including a memory, a processor, and a computer program stored in the memory, wherein when the processor executes the computer program, it implements the business interaction method described in the first aspect.
[0020] Fourthly, embodiments of this application also provide a computer-readable storage medium storing computer-executable instructions thereon, which, when executed by a processor, implement the business interaction method described in the first aspect.
[0021] Fifthly, embodiments of this application also provide a computer program product, including computer-executable program instructions, which, when executed by a processor, implement the business interaction method described in the first aspect.
[0022] The at least one technical solution adopted in this application embodiment can achieve the following beneficial effects: responding to user interaction input, outputting a structured intent and matching the structured intent to the corresponding business domain; generating an editable card for distribution to the business domain based on the structured intent and rendering it to the client, so as to generate and present a structured intent trigger object on the client; and responding to the structured intent trigger object to complete a business interaction or trigger a new round of business interaction. Through the above method, the following technical effects are achieved:
[0023] 1. Supports multiple input methods such as text, voice, and clicks, and supports multi-platform access, offering strong interactive compatibility and low access costs.
[0024] 2. By maintaining the session context globally, the continuity and consistency of multi-turn interactions are ensured, thereby improving the reliability of the interaction.
[0025] 3. By combining LLM semantic understanding with multi-turn clarification, natural language input is transformed into structured intent, thereby improving the accuracy of intent recognition.
[0026] 4. Based on capability registration and central routing, business domains are decoupled, supporting dynamic expansion of business capabilities, and the system architecture is flexible and scalable.
[0027] 5. Break down business processes into programmable execution plans, and reduce user operation costs and improve business execution efficiency through card-based interaction.
[0028] 6. By using an event bus and state consistency control, real-time synchronization of front-end and back-end states is achieved, significantly improving system stability and smoothness.
[0029] 7. Structured intent trigger objects enable closed-loop interaction, supporting multi-round iterative execution and enhancing the system's intelligence and automation level.
[0030] Furthermore, the business interaction system in this application, through its modular, standardized, and programmable design, achieves structured parsing of user intent, intelligent routing, process orchestration, and card-based interaction. It is suitable for complex business scenarios involving multiple business domains, multiple channels, and multiple rounds, and features high scalability, high stability, and a superior interactive experience. Attached Figure Description
[0031] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:
[0032] Figure 1 This is a schematic diagram illustrating the use scenario of the business interaction method in the embodiments of this application;
[0033] Figure 2 This is a flowchart illustrating the business interaction method in an embodiment of this application;
[0034] Figure 3 This is a schematic diagram of the structure of the service interaction device in the embodiments of this application;
[0035] Figure 4 This is a schematic diagram of the structure of an electronic device according to an embodiment of this application. Detailed Implementation
[0036] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0037] Currently, the technical solutions for business interactions in the industry are mainly divided into four categories. The core implementation logic and application scenarios of each solution are as follows:
[0038] (a) Traditional GUI (page / form) solution
[0039] This solution represents a classic interaction pattern for business systems, embedding business processes within a hierarchical page structure: users must first locate the operation entry point through multi-level menus, then fill out forms and upload materials on designated pages, and finally submit to complete the operation. For complex business processes, functional coverage is typically achieved by increasing the number of pages and operation steps. The messaging system only serves as a notification reminder and does not support direct business operations within messages, creating a fragmented chain of "notification reception - page navigation - business execution - historical review".
[0040] (ii) Rule / Intent Library-Based Dialogue Robot Solution
[0041] Chatbot solutions based on keyword matching or pre-defined intent databases rely on manually configured rule sets to recognize user intent. They can only respond to explicit requests within a closed set of instructions, typically completing interactions by returning text replies, redirecting to links, or guiding users to designated pages, thus failing to form a complete business operation loop.
[0042] (III) Pure LLM Chat Assistant Solution
[0043] Leveraging the natural language understanding and generation capabilities of Large Language Models (LLM), this solution addresses user requests in a purely dialogue-based format. While it excels at understanding ambiguous expressions and handling multi-turn contextual relationships, and can output responses or preliminary operational suggestions in natural language, it lacks the robust constraint mechanisms and engineering support required for business execution.
[0044] (iv) Existing "chat + card" hybrid solution
[0045] Building upon dialogue interaction, cards are introduced as information carriers, encapsulating business data, operation buttons, and other elements within them, achieving a preliminary combination of "dialogue understanding + card operation." Cards are primarily used to display key information or provide quick access points, but a unified interaction protocol and lifecycle management mechanism have not been established; they exist only as supplementary components to the UI layer.
[0046] In view of the inherent defects of the traditional GUI solution mentioned above (I)
[0047] 1. Dispersed entry points and lengthy processes: Functions of multiple business domains are scattered across different menus and pages. Users need to switch back and forth between multiple pages to complete a single task, resulting in long operation paths and low efficiency.
[0048] 2. High learning cost: Users need to be familiar with the system's menu structure and operation logic in order to accurately locate the required functions, resulting in a long learning curve for new users;
[0049] 3. Data fragmentation and context breakage: Data on different pages cannot be synchronized in real time, and users are prone to losing context during operation, requiring them to repeatedly enter information;
[0050] 4. Inability to adapt to natural language requirements: Vague, ellipsis, and referential requests expressed by users in natural language are difficult to be directly accommodated by the fixed structure of "menu-page-form".
[0051] Addressing the key shortcomings of existing AI conversational assistant solutions mentioned in (II) above.
[0052] 1. Lack of state consistency: This is the most critical flaw in the existing solution. During multi-turn dialogues, the system generates multiple operable components (such as cards and buttons). When users revisit historical messages and click on older components, it can easily trigger errors, duplicate submissions, or new data being overwritten, leading to chaotic business execution.
[0053] 2. Insufficient controllability of business execution: Pure LLM solutions lack strong constraint mechanisms, have problems such as missing fields and incorrect parameters, and do not integrate permission verification and auditing processes, which cannot support the reliable execution of enterprise critical business.
[0054] 3. Weak cross-domain collaboration capabilities: Rule / intent library-based solutions struggle to handle complex demands across business domains, while pure LLM solutions have poor compatibility with existing business systems and cannot break down ambiguous intents into structured operations executable across multiple business domains.
[0055] 4. Insufficient adaptability to B+C scenarios: Existing solutions either focus on enterprise-level process compliance but sacrifice user experience, or focus on C-end convenience but lack enterprise-level capabilities such as permissions and auditing, making it difficult to meet the core requirements of both types of scenarios at the same time.
[0056] 5. Lack of uniformity in interaction carriers: The existing "chat + card" solution has not established a unified card protocol and business mapping mechanism. Cards are only used as independent display or operation units, which cannot be deeply integrated with the dialogue context and business process, and are difficult to support the state management and operation execution of complex business objects.
[0057] The embodiments of this application design an interactive architecture that takes into account both the rigid requirements of enterprises and the experience demands of consumers. It not only meets the key requirements of B-end multi-tenant isolation, permission verification, operation auditing, and process compliance, but also adapts to the experience needs of C-end users for high frequency, low learning cost, and natural interaction, so as to achieve full coverage of B+C dual scenarios with one system.
[0058] The technical solutions provided by the various embodiments of this application are described in detail below with reference to the accompanying drawings.
[0059] Figure 1 This is a schematic diagram illustrating a usage scenario of the business interaction method in an embodiment of this application. For example... Figure 1 As shown, it includes:
[0060] User: As the initiator of business interactions, the user inputs through the client, such as voice, text, or clicks. Client: As the entry point and result display platform for user interactions, it is responsible for receiving user input, rendering the interface, and providing feedback on business results. Editable Card: As the core component carrying the business interaction logic, it is generated by the system based on structured intents and used to guide the user to complete specific business operations. Further, the information flow between the components is as follows:
[0061] First, there is a two-way interaction arrow between the user and the client, indicating that the user inputs interactive commands to the client, while the client provides feedback on the business processing results to the user. Then, there is a two-way interaction arrow between the client and the editable card, indicating that the editable card is rendered and displayed on the client, while the user initiates business commands to the client by manipulating the structured intent trigger object on the editable card. Finally, there is a one-way arrow between the user and the editable card, indicating that the user's interactive input is parsed by the system into a structured intent, which further drives the generation and distribution of the editable card.
[0062] Specifically, the complete process of the business interaction method includes: 1. Intent parsing and matching: In response to user input on the client, the system outputs a structured intent and matches it to the corresponding business domain. 2. Card generation and rendering: Based on the structured intent, an editable card for distribution to the corresponding business domain is generated and rendered to the client. The client generates and presents the structured intent triggering object, such as operation buttons or form submission items on the card. 3. Business interaction loop: In response to the user's operation on the structured intent triggering object, the client completes the corresponding business interaction, such as submitting a query, completing a reservation, or triggering a new round of business interaction, such as further selecting operations on the result card.
[0063] This application provides a business interaction method applied to a business server, such as... Figure 2 The diagram illustrates a business interaction method flow in this application embodiment, used to realize intelligent business interaction based on structured intents and editable cards. It can be widely applied to scenarios such as intelligent customer service, intelligent assistants, and enterprise business platforms, improving the structuring level, execution efficiency, and user experience of business interactions. The method includes at least the following steps S210 to S230:
[0064] Step S210: In response to the user's interactive input, output a structured intent and match the structured intent to the corresponding business domain.
[0065] User interaction input includes, but is not limited to, text, voice, and click input. Furthermore, it can maintain in-app entry points and external IM (Instant Messaging) entry points. In-app entry points can be understood as providing three forms: a floating ball, a sidebar, and a full-screen chat page, integrated into business applications via an SDK, using a unified input protocol. External IM entry points include, but are not limited to, integration with IM platforms such as WeChat Work, DingTalk, and Lark, receiving messages through open platform interfaces, returning card messages and conversation links, and achieving cross-platform conversation synchronization.
[0066] The system employs various methods to maintain session history. For example, short-term memory can be used to cache the most recent 10 interactions, the ID of the focus business object, and the current topic, stored in an in-memory database (Redis) with an expiration time. Alternatively, long-term memory can be used to persist the entire session history, user preferences, and historical business operation records to a relational database, indexed by user ID and session ID. Layered session data storage also includes global / domain sessions: global sessions store common parameters across business domains (such as user identity and time range), while domain sessions store contexts specific to a single business domain (such as order numbers for order domains and waybill numbers for logistics domains).
[0067] Session thread (Topic) management: For example, automatic topic identification: LLM divides topics based on input semantics (such as "check order", "change address", "expedit shipment"), and assigns a unique identifier to each topic. Another example is multi-topic parallel processing: It supports users switching between multiple topics within a single session, distinguishing them by tags, preserving the context state of each topic, and avoiding context confusion. It also includes topic merging / termination: When a user enters a cross-topic command, related topics are automatically merged; when there is no interaction timeout, idle topics are terminated and their context is archived.
[0068] Business object maintenance: For example, automatic binding: LLM recognizes business entities (such as order number, customer ID) in the input and binds them as the current focus object, and subsequent interactions are based on this object by default; another example is manual switching: users can switch focus objects by command / clicking cards, update the context pointer and refresh associated cards.
[0069] Output a structured intent based on the input content and match the structured intent to the corresponding business domain:
[0070] First, semantic parsing and structured output: Specifically, a fine-tuned large language model is used, combined with a business domain Prompt template, to classify the input intent (query / operation / configuration / consultation), extract entities (business ID, parameters, time), and identify the scope (global / single domain / single object); the output is a standardized structured intent JSON, including but not limited to fields such as intent_type, entity_list, scope, confidence_score, etc. If the confidence score is lower than the threshold, it enters the clarification process.
[0071] Secondly, a multi-round clarification mechanism is implemented, including missing parameter clarification: when the intent lacks necessary parameters (such as "checking orders" without an order number), a clarification card is generated to guide the user to input / select parameters; ambiguity resolution clarification: for vague expressions (such as "recent orders"), a candidate list is displayed through cards, allowing the user to click to confirm; clarification loop: the user's clarification results are fed back into the context, and the intent is re-parsed until a complete structured intent is generated.
[0072] Preferably, domain adaptation optimization is adopted: a built-in business domain dictionary and rule base are used to standardize the mapping of industry terms and business abbreviations; it supports few-shot learning in low-resource domains and improves the understanding accuracy of specific business scenarios through Few-Shot Prompt.
[0073] For example,
[0074] Step 1: Respond to interactive input, output structured intent, and match the business domain.
[0075] The business server responds to user interaction input (including text, speech transcription, commands, etc.) sent by the client, performs structured parsing of the input content, and finally outputs standardized structured intents, which are then matched to the corresponding business domain.
[0076] Step S1. Input preprocessing and semantic parsing.
[0077] After preprocessing user input through cleaning, word segmentation, and noise reduction, the business server performs one or more of the following processes: semantic parsing, entity recognition, and scope recognition. Semantic parsing identifies the core intent of the user input (such as query, process, modify, cancel, etc.) based on a preset intent template and semantic model. Entity recognition extracts key business entity information, such as order number, account number, amount, time, and location. Scope recognition determines the business scope to which the interaction belongs, such as order business, payment business, logistics business, account business, etc.
[0078] Step S2. Multiple rounds of clarification treatment.
[0079] If the parsing results are ambiguous, lack key information, or have unclear intent, the business server initiates multiple rounds of clarification interactions with the user through the client, such as asking "Please provide your order number" or "Do you need a refund or an exchange?", until complete and clear information is obtained, and then outputs the structured intent.
[0080] Step S3. Structured intent output.
[0081] Structured intents are represented by standardized data structures, including fields such as intent type, key entities, business parameters, and interaction context, ensuring that intents can be uniformly identified and processed by business systems.
[0082] Step S4. Business domain matching.
[0083] The business server uses a pre-configured business domain capability registry to match structured intents to the corresponding business domains and capabilities. The business domain capability registry records the capability scope, processable intent types, associated business interfaces, and permission information of each business domain. By matching intent types with entity characteristics, it achieves precise routing of intents to business domains.
[0084] Step S220: Generate editable cards for distribution to the business domain based on the structured intent and render them to the client, so as to generate and present structured intent trigger objects on the client.
[0085] Based on semantic card protocol definition: Unified Card Schema: Defines the standard structure of edit / display state cards, including fields such as card_id, card_type, data, style, and action; Card Type Adaptation: Supports common types such as form cards, list cards, detail cards, confirmation cards, and clarification cards, covering all business scenarios; Protocol Extension: Supports business parties to customize card components and achieve personalized rendering through protocol extension fields.
[0086] Card rendering and adaptation: Client-side rendering: The card protocol is sent to the client SDK, and the client renders it into a native UI according to the protocol, adapting to multiple platforms such as iOS / Android / PC; Responsive layout: The card style is automatically adjusted according to the device screen size to ensure a consistent interactive experience; Offline rendering: Commonly used card templates are cached, and cached cards are displayed first in weak network environments to improve response speed.
[0087] The structured intent is generated and distributed to the editable cards of the business domain, which are simultaneously rendered to the client, where a structured intent trigger object is generated and presented.
[0088] For example,
[0089] Step 2: Generate editable cards and render them to the client.
[0090] Based on the structured intent, the business server generates editable cards according to the semantic card protocol, distributes them to the corresponding business domain, and then sends them to the client for rendering, enabling the client to generate and present the structured intent trigger object.
[0091] Step S1. Semantic Card Protocol Definition
[0092] In this embodiment, editable cards are generated based on a unified semantic card protocol, which includes at least two parts: rendering data and metadata.
[0093] ◦ Rendering data: Used for rendering the visual style, interactive controls, and display content of cards on the client side;
[0094] ◦ Metadata: Used to enable the pre-defined LLM model to understand the session context and enforce constraints, including at least the following fields:
[0095] Among them, cardId: a unique identifier for the card, used to distinguish different card instances, and supports globally unique indexes;
[0096] Wherein, sourceType: Card source type, which identifies whether the card is generated by the business server, LLM model, or client;
[0097] Among them, runtimeData: business runtime data, including business context information such as payload, subject, submitData, and methodData, which carries the core parameters of business interaction;
[0098] Among them, sourceData: the original data source of the card, which records the structured intent and parsing results of generating the card;
[0099] Among them, role: an interaction role identifier that distinguishes the interaction subject from the user side, system side, and business domain side;
[0100] Among them, content: the card display content, which supports formats such as text, rich text, and form controls;
[0101] Where, type: message type, including text message 1001, card message 1002, system message 1003, and synchronization context message MessageType.SyncContext;
[0102] Among them, attach: additional information, used to store extended fields, attachment links, business tags, etc.
[0103] Among them, isSyncLlm: a boolean field that indicates whether card data is synchronized to the LLM model to update the session context.
[0104] Step S2. Generation and Distribution of Edit-State Cards
[0105] The business server automatically fills in card fields based on the structured intent and semantic card protocol, generating an editable card containing a business form, confirmation control, and trigger button. The card is then distributed to the matching business domain, which performs data validation and parameter supplementation before sending it to the client.
[0106] Step S3. Client-side rendering and triggering object presentation
[0107] After receiving the editable card, the client parses and renders the data according to the protocol and presents it to the user in the form of a visual card. At the same time, the card contains a built-in structured intent trigger object, which contains at least the fields triggerMsg, callbackData, messageType, and subject, which are used to record the trigger command, callback data, message type, and business topic, respectively, to provide data support for subsequent interaction triggers.
[0108] Step S230: In response to the structured intent triggering object, complete the business interaction or trigger a new round of business interaction.
[0109] The structured intent triggers the object to complete a business interaction or to initiate a new round of business interactions.
[0110] For example,
[0111] Step 3: Respond to the triggering object to complete the business interaction or trigger a new round of interaction.
[0112] After the client presents the structured intent trigger object, the user can perform the trigger operation or edit and confirm the operation. The business server responds to the operation and completes the corresponding business interaction.
[0113] Step S1. Respond to the trigger operation to initiate a new round of interaction.
[0114] When a user clicks on a structured intent trigger object (such as controls for "Continue Processing", "Modify Information", "Query More"), the client encapsulates the trigger operation into a standard message and sends it back to the business server. The business server feeds this message back into the preset LLM model. Based on the updated session context and card data, the LLM model regenerates the execution plan, thereby triggering a new round of structured intent parsing, edit-state card generation and distribution process, realizing multi-round structured interaction.
[0115] Step S2. Respond to the confirmation operation and execute the business interaction.
[0116] When a user completes the information entry and confirmation operation of the editable card (such as "submit" or "confirm processing"), the business server verifies the runtimeData and business parameters in the card, calls the business API interface of the corresponding business domain, and executes the business operation (such as order creation, payment submission, information modification, etc.). After the business is executed, the business server encapsulates the execution result into a displayable card, sends it to the client for rendering and display, provides feedback on the business result to the user, and completes the business interaction.
[0117] Using the above method, after receiving user interaction input from the client, the business server generates an editable card that conforms to the semantic card protocol by matching intent parsing with the business domain and sends it to the client for rendering. The client presents a structured intent trigger object. After the user triggers it, the business server completes the business execution or triggers a new round of interaction, forming a closed-loop structured business interaction link.
[0118] In one embodiment of this application, the step of generating editable cards for distribution to the business domain and rendering them to the client according to the structured intent includes: generating editable cards for distribution to the business domain according to a semantic card protocol and rendering them to the client; the semantic card protocol includes at least rendering data and metadata for enabling a preset LLM model to understand the session context and execute constraints, wherein the metadata of the semantic card protocol includes at least: cardId field, sourceType field, runtimeData field, sourceData field, role field, content field, type field, attach field, and isSyncLlm field; wherein, the cardId field is used to represent the unique identifier of the card; the sourceType field is used to represent the source type of the card; the runtimeData field is used to represent business runtime data, including business context information such as payload, subject, submitData, and methodData; and the type field is used to represent the message type, including text message 1001, card message 1002, system message 1003, and synchronization context message MessageType.SyncContext.
[0119] The cardId field is a unique identifier for a card. It is globally unique and used for card creation, updating, deletion, status synchronization, and historical tracking, ensuring that the system can accurately locate and operate on individual cards.
[0120] The sourceType field is used to characterize the source type of the card, such as generated by LLM, issued by the business domain, automatically created by the system, or triggered by the user. It is used to distinguish the card generation subject and facilitate subsequent log tracking and troubleshooting.
[0121] The `runtimeData` field represents runtime data for the business logic, including business context information such as payload, subject, submitData, and methodData. Specifically, payload stores the core parameters of card interactions; subject identifies the business topic; submitData records the form data submitted by the user; and methodData records information about the business execution method, providing a complete context for business API calls.
[0122] The sourceData field stores the original data source for card generation, such as structured intent, user input, and historical context, which is used for card updates, backtracking, and validation.
[0123] The role field is used to identify the message role, such as user, assistant, system, card, etc., which is used by the LLM model to distinguish the message source and ensure the accuracy of context understanding.
[0124] The `content` field stores the core content of the card, including structured data, text, and prompts required for rendering, and is the main basis for client-side rendering.
[0125] The `type` field is used to identify the message type, including but not limited to: text message 1001, card message 1002, system message 1003, and synchronization context message `MessageType.SyncContext`. The system executes different message processing logic based on the `type` field, such as rendering, state synchronization, and event triggering.
[0126] The attach field is used to store additional information, such as attachments, extended parameters, extended configurations, and extended events, supporting flexible expansion of card functionality.
[0127] The `isSyncLlm` field is a boolean field that indicates whether the card message needs to be synchronized to the LLM model context. If true, the card message will be added to the session context so that the LLM model can understand the current interaction state; if false, it is only used for client display and does not participate in LLM context calculations.
[0128] After generating editable cards according to the aforementioned semantic card protocol, the cards are distributed to the corresponding business domains for verification and processing, and then sent to the client. The client, based on the rendering data and metadata in the protocol, renders the card interface, presenting interactive editable cards for users to fill in, select, and confirm information, thereby achieving structured and visual business interaction.
[0129] In one embodiment of this application, the step of responding to the structured intent trigger object to complete a business interaction or trigger a new round of business interaction includes: the structured intent trigger object includes at least the triggerMsg field, callbackData field, messageType field, and subject field; after generating and presenting the structured intent trigger object, responding to the user's trigger operation on the structured intent trigger object, translating the trigger operation into a message and feeding it back to the preset LLM model to trigger a new round of execution plan generation and the distribution process of the editable card; responding to the user's confirmation operation on the editable card, calling the business API application programming interface corresponding to the business domain to execute the business operation, and rendering the business execution result as a displayable card.
[0130] The structured intent trigger object includes at least the `triggerMsg` field, `callbackData` field, `messageType` field, and `subject` field. The `triggerMsg` field records the original user interaction information that triggered the object; the `callbackData` field stores the business parameters and context data required for the callback; the `messageType` field identifies the message type, distinguishing between ordinary messages, card messages, trigger-type messages, etc.; and the `subject` field identifies the business topic, facilitating subsequent routing to the corresponding business domain.
[0131] After generating and presenting a structured intent trigger object, the system responds to user actions such as clicking, selecting, or confirming on that object, translating the actions into standard messages according to a preset format and feeding them back into the preset LLM model. Based on this message and the session context, the LLM model regenerates a new execution plan and triggers a new round of distribution of editable cards.
[0132] After the user completes the information filling of the editable card and performs the confirmation operation, the system calls the corresponding business API interface of the business domain to perform the business operation according to the structured intent and business domain mapping relationship. After the business operation is completed, the execution result is rendered according to the displayable card protocol and sent to the client to be presented to the user, thus completing a complete business interaction loop.
[0133] In one embodiment of this application, the step of responding to user interaction input, outputting a structured intent, and matching the structured intent to the corresponding business domain includes: sequentially performing one or more of the following processing steps on the user interaction input: semantic parsing, entity recognition, and range recognition; if the parsing result is unclear, performing multiple rounds of clarification processing to output a structured intent; and matching the structured intent to the business capabilities of the corresponding business domain based on the business domain capability registry.
[0134] After receiving text, speech-to-text, or other forms of interactive input from the user, the system sequentially performs one or more of the following processes: semantic parsing, entity recognition, and scope recognition. Semantic parsing is used to extract the core intent expressed by the user; entity recognition is used to extract key business entities, such as time, location, object, and number; and scope recognition is used to determine the scope and constraints of business operations.
[0135] If the above analysis results are ambiguous, missing, or unclear, the system will automatically enter a multi-round clarification process, gradually supplementing the necessary information by asking the user guided questions and providing optional confirmations, until a clear and complete structured intent is obtained.
[0136] After obtaining the structured intent, the system matches the intent type, entity type, business scope, and other information in the structured intent with the business capability entries in the registry based on the pre-configured business domain capability registry. This routes the structured intent to the business domain with the corresponding processing capabilities, providing a basis for subsequent execution plan generation and card issuance.
[0137] By leveraging the contextual reasoning capabilities of LLM, it addresses uncertainties in users' natural language expressions, such as omissions, pronouns, and ambiguous scope. Through multiple rounds of clarification and completion of key information, it ultimately outputs standardized and executable structured intents, avoiding deviations in business execution due to ambiguous intents.
[0138] In one embodiment of this application, the step of generating editable cards for distribution to the business domains and rendering them to the client based on the structured intent, so as to generate and present structured intent trigger objects on the client, includes: decomposing the structured intent into a multi-step execution plan, generating editable cards for each business domain according to the steps and distributing them to the corresponding business domains, and then rendering the editable cards to the client according to the semantic card protocol.
[0139] The structured intent is broken down into multi-step execution plans according to business logic, with each step corresponding to one or more business operation nodes. For each execution plan, the system generates corresponding editable cards for each business domain based on the business domain division, and distributes the cards to the corresponding business domain for processing.
[0140] Editable cards contain required fields, optional fields, default values, validation rules, and interactive controls, adhering to a unified semantic card protocol. Upon receiving an editable card, the client renders it according to this protocol, generating an interactive editing interface. This interface displays structured intent trigger objects for users to fill in, select, and confirm information, thus enabling visual and structured interaction between the user and the system.
[0141] Establish a precise mapping mechanism between structured intents and multi-business domain capabilities. Decompose structured intents converted from natural language into executable operations that conform to business rules and permission requirements, and trigger corresponding business API calls through a controllable confirmation process to achieve a closed loop of "intent-operation-execution".
[0142] In one embodiment of this application, generating and presenting a structured intent triggering object on the client includes: defining a session identifier (conversationId) for the editable card, wherein the conversationId is used to uniquely identify a session; maintaining the session context state, wherein the session context state includes the message list (messageList), session information (conversationInfo), and business basic information (businessBasicInfo) of the current session; when a new editable card is generated, creating a card message in the session through the createNotAIMessage interface; if it is a new editable card of the same business domain and the same business object, broadcasting an update notification through the event bus (eventBus) to enable the client to update the corresponding card state.
[0143] Assign a unique session identifier (conversationId) to each edit-state card. This conversationId is used to globally and uniquely identify a complete session, ensuring the consistency and traceability of messages, cards, and context states within the session.
[0144] By maintaining the session context state, the session context state includes at least the message list (messageList), session information (conversationInfo), and business basic information (businessBasicInfo) for the current session. Specifically, messageList stores all interactive messages within the session; conversationInfo records session creation time, session status, user information, etc.; and businessBasicInfo stores basic parameters and configuration information related to the business.
[0145] When a new editable card is generated, the system creates a card message in the current session through the createNotAIMessage interface, making the card part of the session message stream. If the newly generated editable card belongs to the same business domain and the same business object, the system broadcasts the card update notification through the event bus. After receiving the notification, the client updates the status, content, or interactive capabilities of the corresponding card in real time, ensuring that the interface display is synchronized with the business status.
[0146] The design incorporates standardized edit-state cards as a unified carrier for conversational interaction and the GUI. These cards can hold core GUI elements such as business object data, operation buttons, and status indicators, while also deeply integrating into the dialogue context. This addresses the problems of fragmented entry points and disconnected dialogue and operation in traditional GUIs, achieving "one-time interaction, end-to-end support." A state machine control logic based on "unique edit-state cards + failure broadcast + historical snapshots" is constructed to resolve issues such as data sequence disorder, duplicate submissions, and system errors caused by user operations on old edit-state components when multiple cards coexist in multi-turn dialogues. This ensures that editing operations on the same business object are always unique, status is synchronized in real time, and historical operations are traceable.
[0147] In one embodiment of this application, the method further includes: broadcasting a card status update instruction to the client's historical message interaction component, so that the client updates the corresponding card status in the component after receiving the instruction; wherein the update instruction is broadcast through the event bus eventBus, and the event types include SYNC_CONTEXT synchronization context event, allowSend allow sending event, and getRecoQuestionForCard get recommended question event.
[0148] The event bus broadcasts card status update instructions to the client's historical message interaction component. After receiving the instruction, the client updates the status of the corresponding card in the historical message interaction component according to the instruction content, including but not limited to the card's editable status, submittable status, completed status, error message status, etc.
[0149] The update instructions are broadcast uniformly through the event bus, and the event types include, but are not limited to: SYNC_CONTEXT (synchronization context event), used to synchronize the latest session context; allowSend (allow sending event), used to control whether the client allows the user to continue sending messages; and getRecoQuestionForCard (get recommended question event), used to provide intelligent recommended questions for card interaction and improve user interaction efficiency.
[0150] By leveraging the unified broadcast mechanism of the event bus, real-time synchronization and cross-component interaction of card states are achieved, ensuring consistency between the client interface display and the backend business state, thereby improving the smoothness of interaction and user experience.
[0151] In one embodiment of this application, the editable card is configured with a message type, which includes at least ordinary text message 1001, card message 1002, text message saved to history 1003, special processing message 1004, and synchronization context message MessageType.SyncContext. The message processing flow is as follows: when a new message is generated, the message type is determined according to the type field; when type is 1002, it indicates a card message, and the corresponding card component needs to be rendered; when type is 1003, it indicates a user-side message that needs to be saved to the history; when type is MessageType.SyncContext, it indicates a synchronization context message, and the synchronization status needs to be queried through the retrieve interface, and the allowSend event and getRecoQuestionForCard event are triggered after completion.
[0152] While maintaining autonomy (independent permissions, independent processes, and independent data) in each business domain, a central routing and planning mechanism is used to break down ambiguous user intentions across domains into multi-step, business domain-specific execution plans, thereby enabling collaborative execution across business domains and avoiding process disruptions and data inconsistencies caused by cross-domain operations.
[0153] This application also provides a business interaction system 300, such as... Figure 3 As shown, a schematic diagram of the business interaction system in this embodiment of the application is provided. The business interaction system 300 includes at least: an input access module 310, a session context and memory module 320, an LLM semantic understanding and multi-turn clarification module 330, a central routing and capability registration module 340, a plan generation and orchestration module 350, a card protocol generation and rendering module 360, a state consistency control module 370, and a structured intent triggering object module 380, wherein:
[0154] In one embodiment of this application, the input access module 310 is specifically used to: receive interactive input submitted by the user in the form of text, voice, or click operations, supporting multiple channels and multiple access methods. Text input includes natural language text directly entered by the user; voice input is converted into text by a voice recognition module before entering the system; click input includes user-triggered operations on interactive elements such as cards, buttons, and options.
[0155] The input access module supports both in-application access and external IM platform access. It can uniformly access multi-source interaction requests from embedded pages in the client, third-party instant messaging platforms, enterprise collaboration tools, etc., and uniformly convert inputs of different formats into the system's internal standard message format, providing a unified data entry point for subsequent semantic processing and ensuring the compatibility and scalability of interaction channels.
[0156] In one embodiment of this application, the session context and memory module 320 is specifically used to: maintain global context information of the session, including a session history message list (messageList), a unique identifier for the current session (conversationId), business basic information (businessBasicInfo), and session configuration (config).
[0157] Among them, messageList is used to store the historical records of all user inputs, system replies, card messages, and trigger events within a session, ensuring the contextual coherence of multi-turn interactions; conversationId is used to globally and uniquely identify a session, enabling session tracking across modules and services; businessBasicInfo stores basic parameters, user information, and business object identifiers related to the business, providing basic data for business execution; and config is used to store session-level configurations, such as interaction modes, language preferences, permission policies, and multi-turn clarification rules.
[0158] The module supports context persistence and real-time updates, ensuring that session state is not lost or corrupted during multi-round interactions, asynchronous operations, and cross-service calls.
[0159] In one embodiment of this application, the LLM semantic understanding and multi-round clarification module 330 is specifically used to: perform deep semantic processing on the user's interactive input, sequentially perform semantic parsing, entity recognition, and range recognition, and initiate multi-round clarification when the information is incomplete or ambiguous, and finally output structured intent.
[0160] Semantic parsing is used to identify the core intent type expressed by the user; entity recognition is used to extract key business entities, such as time, location, object, number, and value; scope recognition is used to determine the scope, constraints, and priorities of business operations.
[0161] If the parsing results are ambiguous, incomplete, or conflicting, the module automatically enters a multi-round clarification process, interacting with the user through guided questioning, option confirmation, and information completion until a clear, complete, and executable structured intent is obtained. The structured intent includes fields such as intent type, entity set, business scope, and constraints, providing standardized input for subsequent routing and execution.
[0162] In one embodiment of this application, the central routing and capability registration module 340 is specifically used to: maintain a business domain capability registry, which contains key information such as assistant_id, app information (appCode, branchCode), business type businessType, and authorization requirements, and is used to describe the capability scope, access identifier, branch routing, and access permissions of each business domain.
[0163] The specific implementation of the central routing and capability registration module 340 (Router) includes, but is not limited to:
[0164] 1. Building the Business Domain Capability Registry
[0165] Capability metadata: Registers API interfaces, schema structures, permission rules, preconditions, input parameters, and output formats for each business domain; Capability tagging: Tags each capability with domain tags, intent tags, and parameter tags, supporting fuzzy matching and exact matching; Dynamic registration: Provides a management backend, supporting business users to add / modify / deactivate capabilities in real time without restarting the system.
[0166] 2. Intent-Capability Matching Logic
[0167] Precise matching: Directly matches a unique capability based on the intent_type, entity, and scope of the structured intent; Candidate matching: When multiple matching capabilities exist, they are sorted by confidence level, permissions, and call frequency to generate a candidate capability list; Pre-verification of permissions: Verifies user permissions before matching, filters out capabilities without permissions, and returns an insufficient permission prompt.
[0168] 3. Routing Distribution Strategy
[0169] Single-capability routing: A single intent corresponds to a single capability and is directly distributed to the Planner module; Composite capability routing: Complex intents correspond to combinations of multiple capabilities, are marked as composite tasks, and are orchestrated by the Planner.
[0170] After receiving a structured intent, the module matches it with the capability registry based on information such as intent type, entity characteristics, and business scope, and routes the structured intent to the business domain with the corresponding processing capability. Simultaneously, the module verifies user permissions and business domain access policies to ensure that the interaction request is legitimate, secure, and executable.
[0171] Through a unified central routing mechanism, dynamic registration, expansion, and decoupling of business capabilities are achieved, supporting rapid access to new business domains and improving the overall scalability of the system.
[0172] In one embodiment of this application, the plan generation and orchestration module 350 is specifically used to: decompose the structured intent into a multi-step executable execution plan according to business logic, with each step corresponding to one or more business operation nodes. The module generates corresponding editable cards for each step based on business domain division, and distributes the cards to the corresponding business domains for processing.
[0173] The execution plan supports orchestration methods such as sequential execution, conditional branching, parallel execution, and asynchronous waiting, ensuring the controllability and stability of complex business processes. Editable cards contain required fields, optional fields, default values, validation rules, and interactive controls, providing users with a visual and structured interface, reducing user input costs, and improving interaction accuracy.
[0174] The specific implementation of the plan generation and orchestration module 350 (Planner) includes, but is not limited to:
[0175] 1. Breakdown of the Execution Plan
[0176] Atomization decomposition: Decompose complex intentions into atomic execution steps; Step dependency configuration: Define the execution order and dependencies between steps, and generate a directed acyclic graph (DAG); Exception branch planning: Preset the handling branches after a step fails.
[0177] 2. Editable Card Generation and Distribution
[0178] Step-by-step generation of editable cards: Each execution step corresponds to an editable card, which includes interactive elements such as parameter input boxes, selectors, and confirmation buttons; Card distribution: Based on the business domain to which the step belongs, the card is distributed to the card protocol module of the corresponding business domain, and synchronized to the session context at the same time; Step-by-step execution control: Supports users to confirm execution step by step, and the next card is unlocked only after the previous step is completed, avoiding operation confusion.
[0179] In one embodiment of this application, the card protocol generation and rendering module 360 is specifically used to: generate editable cards or displayable cards according to a unified semantic card protocol, and send them to the client to complete rendering.
[0180] Editable cards are used for user information entry, selection, and confirmation, and support interactive controls such as input boxes, drop-down selections, date selections, switches, and buttons; displayable cards are used to present business execution results, status information, prompts, etc., in a static or semi-interactive format.
[0181] The module ensures the standardization and compatibility of card formats, enabling cards generated by different business domains to present a consistent visual and interactive experience on the client side, while also supporting flexible configuration of card styles, layouts, and behaviors.
[0182] The specific implementation of the card protocol generation and rendering module 360 includes, but is not limited to:
[0183] 1. Semantic Card Protocol Definition
[0184] Unified Card Schema: Defines the standard structure of editable / displayable cards, including fields such as card_id, card_type, data, style, and action; Card Type Adaptation: Supports common types such as form cards, list cards, detail cards, confirmation cards, and clarification cards, covering all business scenarios; Protocol Extension: Supports business users to customize card components and achieve personalized rendering through protocol extension fields.
[0185] 2. Card rendering and adaptation
[0186] Client-side rendering: The card protocol is sent to the client SDK, and the client renders it into a native UI according to the protocol, adapting to multiple platforms such as iOS / Android / PC; Responsive layout: The card style is automatically adjusted according to the device screen size to ensure a consistent interactive experience; Offline rendering: Commonly used card templates are cached, and cached cards are displayed first in weak network environments to improve response speed.
[0187] In one embodiment of this application, the state consistency control module 370 is specifically used to: maintain the consistency of the entire session state, realize state synchronization, state update and state notification through the event bus; and query the completion status of asynchronous business operations through the retrieve interface to ensure that the backend execution result and the frontend display status are aligned in real time.
[0188] The module handles events such as card status updates, context synchronization, permission changes, and asynchronous task callbacks. It achieves cross-module and cross-service status synchronization through a unified event mechanism, avoiding issues such as inconsistent status, duplicate execution, and execution interruption, thereby improving system stability and smoothness of interaction.
[0189] In one embodiment of this application, the structured intent triggering object module 380 is specifically used to: generate and present a structured intent triggering object on the client, the object containing key fields such as triggerMsg, callbackData, messageType, and subject, which are used to identify the trigger source, callback parameters, message type, and business topic.
[0190] When a user clicks on a trigger object, the module uses methods such as handleSendMessage and handleChatAgentBranchClick to translate the user's click action into a standard message and feeds it back into the LLM model, triggering a new round of execution plan generation and card distribution process.
[0191] By using structured trigger objects, user interactions can be rapidly transformed from natural language into executable intents, supporting seamless transitions between multiple rounds of interaction and improving interaction efficiency and intelligence.
[0192] The specific implementation of the Structured Intent Trigger Object (SITO) module includes, but is not limited to, the following steps:
[0193] Step S1. SITO Generation and Presentation
[0194] Clickable chips / buttons are generated based on structured intents, including intent descriptions, associated parameters, and icons; they are displayed in the conversation interface in a semantically relevant order, and support horizontal swiping and collapsing / expanding to improve interaction efficiency.
[0195] Step S2. Click to achieve closed-loop recharge.
[0196] When a SITO is clicked, the corresponding structured intent is automatically converted into an equivalent text instruction and fed back to the input access module. The system processes the user's text input according to the user's text input flow, triggering intent parsing and card generation, forming a closed loop of "click → feedback → execution". It supports batch SITO selection, combining multiple intents to generate composite execution plans, simplifying complex operation processes.
[0197] It is understood that the above-mentioned business interaction system can implement each step of the business interaction method provided in the foregoing embodiments. The relevant explanations of the business interaction method are applicable to the business interaction system and will not be repeated here.
[0198] In addition, (h) Business execution module: After user confirmation, it calls the business domain API to execute, and handles idempotency, transactions and rollback;
[0199] The business interaction system 300 also includes a business execution module, specifically used for...
[0200] 1. API Execution and Transaction Control
[0201] Execution after confirmation: The business domain API is called only after the user clicks the card confirmation button to avoid accidental operation; Distributed transactions: For cross-domain multi-step operations, the TCC (Try-Confirm-Cancel) mode is adopted to ensure eventual consistency of transactions; Rollback mechanism: When execution fails, the card status and business data are rolled back according to the snapshot, and the failure reason card is returned.
[0202] 2. Idempotency Guarantee
[0203] Generate idempotent keys: Generate unique idempotent keys based on user ID, business object ID, and operation type, and cache them in Redis; Duplicate request interception: Requests with the same idempotent key are executed only once, and duplicate requests directly return the result of the first execution.
[0204] The business interaction system 300 also includes an audit and security module, specifically used for: permission verification, idempotent keys, traceId, operation logs, and compliance.
[0205] 1. End-to-end security verification
[0206] Real-time permission verification: User operation permissions are verified twice before execution, and unauthorized operations are directly blocked; Sensitive data anonymization: Sensitive data such as ID cards and mobile phone numbers are automatically anonymized when displayed on cards and stored in logs.
[0207] 2. Audit Logs and Tracking
[0208] Generate a global traceId: used throughout the entire process of input, understanding, routing, and execution for troubleshooting; Operation log recording: records user input, card operations, API calls, and execution results, storing them in the audit database for traceability; Compliance policy: built-in data compliance and operation compliance rules to alert and block violations.
[0209] Figure 4 This is a schematic diagram of the structure of an electronic device according to an embodiment of this application. Please refer to it. Figure 4 At the hardware level, the electronic device includes a processor, and optionally also includes an internal bus, a network interface, and memory. The memory may include main memory, such as high-speed random-access memory (RAM), or non-volatile memory, such as at least one disk drive. Of course, the electronic device may also include other hardware required for other business operations.
[0210] The processor, network interface, and memory can be interconnected via an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, etc. This bus can be divided into address bus, data bus, control bus, etc. For ease of representation, Figure 4 The symbol is represented by a single double-headed arrow, but this does not mean that there is only one bus or one type of bus.
[0211] Memory is used to store programs. Specifically, programs may include program code, which includes computer operation instructions. Memory may include main memory and non-volatile memory, and provides instructions and data to the processor.
[0212] The processor reads the corresponding computer program from non-volatile memory into main memory and then executes it, forming a business interaction system at the logical level. The processor executes the program stored in memory and specifically performs the following operations:
[0213] In response to user interaction input, output structured intent and match the structured intent to the corresponding business domain;
[0214] Based on the structured intent, editable cards are generated for distribution to the business domain and rendered to the client, thereby generating and presenting a structured intent trigger object on the client; and
[0215] In response to the structured intent triggering object, complete the business interaction or trigger a new round of business interaction.
[0216] The above is as stated in this application. Figure 2The methods for executing the business interaction system disclosed in the illustrated embodiments can be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above methods can be completed by integrated logic circuits in the processor's hardware or by instructions in software form. The processor can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly manifested as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module can reside in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.
[0217] The electronic device can also perform Figure 2 The method executed by the business interaction system, and the implementation of the business interaction system in Figure 2 The functions of the embodiments shown are not described in detail here.
[0218] This application also proposes a computer-readable storage medium that stores one or more programs, the programs including instructions that, when executed by an electronic device including multiple applications, enable the electronic device to perform... Figure 2 The method executed by the business interaction system in the illustrated embodiment is specifically used to perform:
[0219] In response to user interaction input, output structured intent and match the structured intent to the corresponding business domain;
[0220] Based on the structured intent, editable cards are generated for distribution to the business domain and rendered to the client, thereby generating and presenting a structured intent trigger object on the client; and
[0221] In response to the structured intent triggering object, complete the business interaction or trigger a new round of business interaction.
[0222] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0223] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, generate instructions for implementing the flowchart illustrations and / or block diagrams. Figure 1 One or more processes and / or boxes Figure 1 A device that provides the functions specified in one or more boxes.
[0224] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which are implemented in a process Figure 1 One or more processes and / or boxes Figure 1 The function specified in one or more boxes.
[0225] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, thereby providing instructions that execute on the computer or other programmable equipment for implementing the process. Figure 1 One or more processes and / or boxes Figure 1 The steps of the function specified in one or more boxes.
[0226] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0227] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0228] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile optical disc (DVD) or other optical storage, magnetic tape, magnetic magnetic disk storage or other magnetic storage devices, or any other non-transferable medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transient computer-readable media, such as modulated data signals and carrier waves.
[0229] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0230] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0231] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.
Claims
1. A business interaction method, applied to a business server, the method comprising: In response to user interaction input, a structured intent is output and the structured intent is matched to the corresponding business domain; Based on the structured intent, editable cards are generated for distribution to the business domain and rendered to the client, so as to generate and present structured intent trigger objects on the client. as well as In response to the structured intent triggering object, complete the business interaction or trigger a new round of business interaction.
2. The method as described in claim 1, wherein, The step of generating editable cards for distribution to the business domain and rendering them to the client based on the structured intent includes: Editable cards are generated according to the semantic card protocol for distribution to the business domain and rendered to the client; The semantic card protocol includes at least rendering data and metadata for enabling the preset LLM model to understand the session context and execute constraints. The metadata of the semantic card protocol includes at least the following fields: cardId, sourceType, runtimeData, sourceData, role, content, type, attach, and isSyncLlm. in, The cardId field is used to represent the unique identifier of the card; The sourceType field is used to characterize the source type of the card; The runtimeData field is used to represent business runtime data, including business context information such as payload, subject, submitData, and methodData. The type field is used to characterize the message type, including text message 1001, card message 1002, system message 1003, and synchronization context message MessageType.SyncContext.
3. The method as described in claim 2, wherein, The response to the structured intent triggering object, completing a business interaction or triggering a new round of business interaction, includes: The structured intent trigger object includes at least the triggerMsg field, the callbackData field, the messageType field, and the subject field; After generating and presenting the structured intent trigger object, respond to the user's trigger operation on the structured intent trigger object, translate the trigger operation into a message and feed it back to the preset LLM model to trigger a new round of execution plan generation and the distribution process of editable cards; In response to the user's confirmation operation on the editable card, the business API application programming interface corresponding to the business domain is invoked to execute the business operation, and the business execution result is rendered as a displayable card.
4. The method of claim 1, wherein, The step of responding to user interaction input, outputting structured intents, and matching the structured intents to the corresponding business domains includes: For the user's interactive input, perform one or more of the following processes in sequence: semantic parsing, entity recognition, and range recognition. If the parsing result is unclear, perform multiple rounds of clarification processing and output a structured intent. Based on the business domain capability registry, the structured intent is matched to the business capabilities of the corresponding business domain.
5. The method of claim 2, wherein, The step of generating editable cards for distribution to the business domain based on the structured intent and rendering them to the client, so as to generate and present structured intent trigger objects on the client, includes: The structured intent is broken down into a multi-step execution plan. Editable cards are generated for each business domain according to the steps described and distributed to the corresponding business domains. The editable cards are then rendered to the client according to the semantic card protocol.
6. The method of claim 1, wherein, The generation and presentation of the structured intent triggering object on the client includes: Define a session identifier (conversationId) for the edit-state card, whereby the conversationId is used to uniquely identify a session; Maintain the session context state, which includes the message list, conversation information, and business basic information of the current session. When a new editable card is generated, the card message is created in the session through the createNotAIMessage interface; If it is a new editable card in the same business domain and the same business object, the update notification is broadcast through the event bus so that the client can update the corresponding card status.
7. The method of claim 6, wherein, The method further includes: Broadcast a card status update instruction to the client's historical message interaction component, so that the client updates the corresponding card status in the component after receiving the instruction; The update command is broadcast via the event bus, and the event types include SYNC_CONTEXT (synchronization context event), allowSend (allow sending event), and getRecoQuestionForCard (get recommended question event).
8. The method of claim 6, wherein, The editable card is configured with a message type, which includes at least a plain text message 1001, a card message 1002, a text message saved to history 1003, a special processing message 1004, and a synchronization context message MessageType.SyncContext; The message processing flow is as follows: When a new message is generated, the message type is determined based on the type field; When type is 1002, it indicates a card message, and the corresponding card component needs to be rendered. When type is 1003, it indicates that the user-side message needs to be saved to the history. When the type is MessageType.SyncContext, it indicates a synchronization context message. The synchronization status needs to be queried through the retrieve interface, and the allowSend event and getRecoQuestionForCard event are triggered after completion.
9. A business interaction system applied to a business server, the system comprising: The input access module is configured to perform the following operations: receive interactive input from users in the form of text, voice, or clicks; the input access module supports in-application access points and external IM platform access points. The session context and memory module is configured to perform the following operations: maintain the session history messageList, the current session identifier conversationId, the business basic information businessBasicInfo, and the session configuration config; The LLM semantic understanding and multi-turn clarification module is configured to perform the following operations: semantic parsing, entity recognition, range recognition and multi-turn clarification of user interaction input, and output structured intent; The central routing and capability registration module is configured to perform the following operations: maintain a business domain capability registry, which includes assistant_id, app information (appCode and branchCode), business type (businessType), and authorization requirements (Authorization), and complete the matching of structured intents to business domain capabilities. The plan generation and orchestration module is configured to perform the following operations: break down structured intents into multi-step execution plans, and generate editable cards for each business domain step by step and distribute them to the corresponding business domains; The card protocol generation and rendering module is configured to perform the following operations: generate editable or displayable cards according to the semantic card protocol and render them to the client; The state consistency control module is configured to perform the following operations: maintain session state consistency, perform state synchronization through the event bus, and query the state of asynchronous operations through the retrieve interface; The Structured Intent Trigger Object module is configured to perform the following operations: generate and present structured intent trigger objects on the client side, and translate user click operations into messages and send them to the LLM model through methods such as handleSendMessage and handleChatAgentBranchClick.
10. A computer device comprising a memory, a processor, and a computer program stored in the memory, wherein, When the processor executes the computer program, it implements the business interaction method as described in any one of claims 1 to 8.
11. A computer-readable storage medium having stored thereon computer-executable instructions, which, when executed by a processor, implement the business interaction method as described in any one of claims 1 to 8.
12. A computer program product comprising computer-executable program instructions, wherein when executed by a processor, the computer-executable program instructions implement the business interaction method as described in any one of claims 1 to 8.