Method and apparatus for verifying dialog service, and device and medium

By automating the verification of model request consistency and historical context, this approach solves the problem of verifying the orchestration link of complex generative model dialogue services in traditional methods, thereby improving the accuracy and efficiency of dialogue services and enhancing the user experience.

WO2026123269A1PCT designated stage Publication Date: 2026-06-18BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date
2024-12-11
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Traditional methods struggle to effectively validate the orchestration chain of complex generative model dialogue services, especially when model responses are diverse and the orchestration chain becomes more complex, leading to uncertainty in response quality and a decline in user experience.

Method used

An automated verification method is provided to ensure the accuracy and integrity of the orchestration chain of a dialogue service by verifying the consistency of model requests, historical context, and number of tokens. The method includes an automated verification system and apparatus for verifying the protocol packets of model requests and responses.

🎯Benefits of technology

This improves the accuracy and efficiency of link verification in dialogue services, ensures that model responses meet user expectations, and enhances user experience.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN2024138616_18062026_PF_FP_ABST
    Figure CN2024138616_18062026_PF_FP_ABST
Patent Text Reader

Abstract

Provided in the present disclosure are a method and apparatus for verifying a dialog service, and a computing device, a computer storage medium and a computer program product. A dialog service comprises a target model and at least one extension capability component coupled to the target model. The method comprises: providing a dialog request to a dialog service; acquiring a model request corresponding to the dialog request, wherein the model request is inputted into a target model to generate a model response, and the model request comprises extension capability information associated with one or more of at least one extension capability component; and on the basis of the model request, verifying an orchestration link of the dialog service.
Need to check novelty before this filing date? Find Prior Art

Description

Methods, apparatus, equipment, and media for verifying dialogue services Technical Field

[0001] This disclosure relates to the field of computer technology, and more specifically, to a method, apparatus, computing device, computer storage medium, and computer program product for verifying a dialogue service. Background Technology

[0002] Dialogue services based on generative models (such as Large Language Models, LLM) provide the ability for runtime dialogue between users and models. Dialogue services primarily implement flow and extension capabilities (also known as atomic capabilities) through orchestration mechanisms, providing rich contextual information to the model to help it accurately understand task requirements and return the model-generated response.

[0003] To test such dialogue services, traditional methods involve writing a dedicated test function to verify the correctness of the service's orchestration chain. However, as model complexity increases and orchestration chains become more sophisticated, this approach gradually reveals its limitations, especially as model responses become more diverse and orchestration chains become more complex. Summary of the Invention

[0004] Embodiments of this disclosure provide a method, apparatus, computing device, computer storage medium, and computer program product for verifying a dialog service.

[0005] According to a first aspect of this disclosure, a method for verifying a dialogue service is provided, the dialogue service including a target model and at least one extended capability component coupled to the target model, the method comprising: providing a dialogue request to the dialogue service; obtaining a model request corresponding to the dialogue request, wherein the model request is input to the target model to generate a model response, the model request including at least one or more extended capability information associated with one or more of the at least one extended capability component; and verifying the orchestration chain of the dialogue service based on the model request.

[0006] According to a second aspect of this disclosure, an apparatus for verifying a dialogue service is provided, the dialogue service including a target model and at least one extended capability component coupled to the target model, the apparatus comprising: a dialogue request unit configured to provide a dialogue request to the dialogue service; an acquisition unit configured to acquire a model request corresponding to the dialogue request, wherein the model request is input to the target model to generate a model response, the model request including at least one or more extended capability information associated with one or more of the at least one extended capability component; and a verification unit configured to verify the orchestration links of the dialogue service based on the model request.

[0007] According to a third aspect of this disclosure, a computing device is provided, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the computing device to perform the method as described in the first aspect of this disclosure.

[0008] According to a fourth aspect of this disclosure, a non-transient computer storage medium is provided, including machine-executable instructions that, when executed by a device, cause the device to perform the method as described in the first aspect of this disclosure.

[0009] According to a fifth aspect of this disclosure, a computer program product is provided, including machine-executable instructions that, when executed by a device, cause the device to perform the method as described in the first aspect of this disclosure.

[0010] It should be understood that the summary section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description. Attached Figure Description

[0011] The above and other objects, features, and advantages of embodiments of the present disclosure will become more readily understood from the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present disclosure will be described by way of example and non-limitation, wherein:

[0012] Figure 1 shows a schematic diagram of an example environment in which embodiments of the present disclosure may be implemented;

[0013] Figure 2 shows a schematic flowchart of a method for verifying a dialogue service according to an embodiment of the present disclosure;

[0014] Figure 3 illustrates a schematic diagram of an automated verification system for a dialogue service according to an embodiment of the present disclosure;

[0015] Figure 4 shows a schematic flowchart of a process for verifying the consistency of a model request according to an embodiment of the present disclosure;

[0016] Figure 5 shows a schematic flowchart of a process for verifying the historical context of a model request according to an embodiment of the present disclosure;

[0017] Figure 6 shows a schematic flowchart of a process for verifying model tokens according to an embodiment of the present disclosure;

[0018] Figure 7 shows a block diagram of an apparatus for verifying a dialogue service according to an embodiment of the present disclosure; and

[0019] Figure 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.

[0020] In all the accompanying figures, the same or similar reference numerals denote the same or similar elements. Detailed Implementation

[0021] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0022] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", etc., may refer to different or the same objects unless explicitly stated. Other explicit and implicit definitions may also be included below.

[0023] The basic principles and implementation of this disclosure will be explained with reference to the accompanying drawings. It should be understood that the exemplary embodiments given are merely intended to enable those skilled in the art to better understand and implement the embodiments of this disclosure, and are not intended to limit the scope of this disclosure in any way.

[0024] Figure 1 illustrates a schematic diagram of an environment 100 in which various embodiments of the present disclosure can be implemented. In the environment 100 shown in Figure 1, at box 101, a user initiates a dialogue request and sends it to a model-based dialogue service (also known as a dialogue engine or bot). The dialogue service may include one or more models, each of which can receive input in text or other modalities (images, audio, video, etc.) and output a model response. The models may be generative models. As an example, a user can input the query "Beijing 1-day tour guide" into the dialogue service, and the dialogue service can then generate an answer to that query.

[0025] Specifically, in box 102, the dialogue service can obtain basic information about the dialogue service, including basic framework information of prompt words, and decide on the model to use.

[0026] In box 103, the dialogue service can determine the extended capabilities (also known as atomic capabilities) to use. Extended capabilities can include plugins, variables, knowledge bases, memories, timing tools, and other capabilities.

[0027] In box 104, based on the extended capabilities, the system prompt is obtained. In some implementations, different extended capabilities can call different services to query the database and inject the retrieved results into the system prompt.

[0028] In terms of extended capabilities, plugins can be Application Programming Interfaces (APIs). Models can answer user questions based on context and API results to improve response quality. Variables can provide parameters such as the current time and are appended to system prompts. A knowledge base provides textual knowledge. During dialogue, the model can query the knowledge base, and the results can be input into the model along with prompts to improve response quality. Memory provides historical dialogue information; during dialogue, the model can remember previous questions and answers, summarize the context, and retain key information for use in subsequent dialogues. Scheduled tasks can automatically create and execute tasks based on the user's dialogue context.

[0029] In box 105, system prompts can be input into the selected model along with historical context, model parameters, plugin information, etc., as a model request. In box 106, the model returns results to the user or an upstream task.

[0030] In the above process, the model request input to the model acts as a bridge connecting the user request and the model. Richer model requests facilitate effective communication between the user and the model, enabling the model to accurately understand the task and generate the corresponding response. However, the lack of some information in the model request will affect the answer quality. This effect is not deterministic; subtle issues are often difficult to detect intuitively but can lead to a poor user experience. On the other hand, in a single dialogue with the model, the model may provide different answers to the same question. This differs from traditional validation methods that rely on verifying response values ​​(e.g., assertions) because the answers from model-based systems are not fixed.

[0031] In view of this, embodiments of this disclosure provide an automated verification method for model-based dialogue services to improve the accuracy and efficiency of link verification for dialogue services. To ensure that the response meets expectations, it is necessary to guarantee the correctness of the orchestration links of the dialogue service, ensure the integrity of model requests, and prevent information loss that could affect the response. The automated verification system according to embodiments of this disclosure may include verification of orchestration capabilities (for model input, also known as model requests) and verification of interface returns (for model output, also known as model responses).

[0032] Orchestration capability validation ensures that extended capabilities can be correctly invoked and incorporated into model requests. This means validating that the service can accurately invoke the various expected extended capabilities, effectively integrate them, and provide them to the model so that the model can provide a more comprehensive and accurate response. In some embodiments, orchestration capability validation may include validation of consistency with model requests, validation of historical context, and validation of the number of tokens consumed by the model invocation.

[0033] The verification returned by the interface refers to verifying the correctness of the protocol packets in the model response, but not verifying the content of the model's response. In some embodiments, it can be verified whether the protocol packets in the model conform to the expected orchestration link, ensuring that they meet user needs and expectations, and that they meet requirements in terms of format, content integrity, and other aspects.

[0034] The exemplary embodiments of this disclosure will now be described in detail with reference to Figures 2 through 8.

[0035] Figure 2 illustrates a schematic flowchart of a method 200 for verifying a dialogue service according to an embodiment of the present disclosure. The dialogue service may include a target model (e.g., a large language model) and at least one extension capability component coupled to the target model.

[0036] As shown in Figure 2, in box 210, a dialogue request is provided to the dialogue service. For example, a user can enter text, an image, audio, or video and expect to receive a response from the dialogue service.

[0037] In box 220, a model request corresponding to a dialogue request is obtained, wherein the model request is input into a target model to generate a model response, and the model request includes at least one or more extension capability information associated with one or more of at least one extension capability component.

[0038] In the orchestration of a dialogue service, different requests or questions may traverse different links, invoke different extended capability components, and concatenate the results obtained from these components into the model request. Request parameters in the orchestration process and model request can be obtained by reporting and parsing key data points during the dialogue request process. In some embodiments, the model request may include system prompts, historical context, model parameters, and information about extended capability components (e.g., plugins, variables, knowledge bases, memories, scheduled tasks, etc.). When the information in the model request is complete, the expected dialogue effect can be obtained.

[0039] In box 230, the orchestration chain of the dialogue service is validated based on model requests. In some embodiments, validation may include: validation of consistency for model requests, validation of historical context, and validation of the number of tokens in model calls. Embodiments of validation for model requests are described in detail below with reference to Figures 3 through 6.

[0040] Figure 3 illustrates a schematic diagram of an automated verification system 300 for a dialogue service according to an embodiment of the present disclosure. Method 200 can be implemented using the system 300 shown in Figure 3. Generally, the automated verification system 300 may include automated verification based on model requests 310 and automated verification based on model responses 340.

[0041] As shown in the figure, in box 301, the dialogue request is input into the dialogue service, and in box 302, the basic information of the dialogue service is determined, including templates, prompt word frames, plugins, etc.

[0042] In box 303, business logic orchestration is implemented based on the basic information of the dialogue request and dialogue service. Business logic orchestration can invoke one or more extended capabilities and append the retrieved information to prompts to form a model request. For example, business logic orchestration can invoke extended capabilities such as variables, databases, workloads, and plugins, appending the queried information to the dialogue request. Additionally, business logic orchestration can also obtain historical context information, such as previous dialogues and responses, or their summaries, and append them to the dialogue request.

[0043] In box 304, model scheduling is performed to select a model from one or more models 306 of the dialogue service to respond to dialogue request 301. Then, model request 305 is input to the selected model. The model can stream a model response 307 based on the model request. After parsing and validation, model response 307 yields dialogue result 308, which serves as a response to dialogue request 301.

[0044] In some embodiments, automated verification 310 can be performed based on model request 306. For example, for a dialogue request, a message ID can be randomly generated, and the model request can be obtained through the message ID. Specifically, in box 312, the event tracking data is requested through the message ID to obtain the reported data for this request. In box 314, the required fields are obtained by parsing the reported data; these fields are included in the model request. In box 316, the model request is verified, including difference verification, historical message verification, token verification, and other possible verifications. In box 318, the verification results are returned.

[0045] Automated verification based on model response can also be performed 340. In some embodiments, to verify the model response, it can be verified whether the protocol packets in the model response conform to the expected orchestration chain. For example, it can be verified whether the model response includes intent recognition packets, search term splitting packets, packets that call search plugins, answer content packets, etc. Therefore, although the model's answer content may be different, the orchestration chain of the dialogue service can be verified in this way.

[0046] Figure 4 illustrates a schematic flowchart of a process for verifying the consistency of a model request according to an embodiment of the present disclosure. This process verifies the plugin information, variable information, and prompt word concatenation of the same dialogue request in benchmark and test environments, preventing the loss of relevant information from affecting model performance.

[0047] In some embodiments, a baseline model request 410 corresponding to the dialogue request can be obtained, wherein the baseline model request is generated based on a baseline version of the dialogue service. The baseline model request 410 and the model request to be verified 402 are input to the difference verification module 410, which verifies the orchestration chain of the dialogue service by comparing the model request to be verified 402 and the baseline model request 401. At 403, the difference between the two requests is returned.

[0048] When comparing two model requests, three result sets can be prepared: a difference set (A) for the baseline environment, a difference set (B) for the test environment, and a consistent dataset (AB) between the two environments. During comparison, first determine if a field exists in both environments. If it exists in the baseline environment but not in the test environment, insert a difference (diff) record into set A. If it exists in both environments, determine if the current field is a basic data type. If it is, directly compare the corresponding values. If the values ​​match, store them in the consistent result set AB; otherwise, store them in result sets A and B respectively. If the current field is not a basic data type, refer to procedure 405 to perform the comparison: if it is not a basic data type (e.g., map or slice type), parse it into a basic data type, and then compare their consistency. Finally, sets A, B, and AB are returned.

[0049] As mentioned above, a model request may include extensibility information and model parameters. The consistency of the extensibility information and model parameters in the test environment and the baseline environment can be verified separately. In some embodiments, if it is determined that the model request and the baseline model request include the same extensibility information, it can be determined that the orchestration chain of the dialogue service is consistent with its baseline version. If it is determined that the model request and the baseline model request include different extensibility information, it can be determined that the orchestration chain of the dialogue service is different from its baseline version.

[0050] In some embodiments, if it is determined that the model request and the baseline model request have the same model parameters, it can be determined that the orchestration chain of the dialogue service is consistent with its baseline version. If it is determined that the model request and the baseline model request have different model parameters, it is determined that the orchestration chain of the dialogue service is different from its baseline version.

[0051] Figure 5 illustrates a schematic flowchart of a process for verifying the historical context of a model request according to an embodiment of the present disclosure. This process is used to verify whether the loss of historical data has affected the response.

[0052] To verify the historical context of a model request, historical data of the dialogue request can be obtained. By comparing the historical context information in the model request with the historical data, it can be determined whether any historical context has been omitted. Historical context information may include multimodal historical messages, which may include at least one of text, images, audio, or video.

[0053] As shown in Figure 5, historical context 503 can be extracted from the model request 501 to be verified, and historical messages 504 can be obtained from the original dialogue request 502. Historical context 503 and historical messages 504 are input into the historical context verification module 510. After verification is completed, verification result 505 is returned.

[0054] When comparing historical context 503 and historical message 504, we can first check if the number of entries in historical message 504 and historical context 503 are the same. If they are not the same, we can return directly. If they are the same, we continue to iterate through historical message 504 and historical context 503, comparing them one by one. First, we determine the type of the current message and check the comparability of the two messages. If the types are the same, we compare whether the content of the messages is the same and check the integrity of the messages. In addition, if historical message 504 includes files, images, or other types of data, we check whether historical context 503 includes this data. After the comparison is complete, we return the verification result 505.

[0055] Figure 6 illustrates a schematic flowchart of a process for verifying model tokens according to an embodiment of the present disclosure. This process is used to verify the cost of model calls, including verification for a single model call and verification for all model calls (a single dialogue may include multiple model calls).

[0056] For a single model invocation associated with a model request, the dialogue service determines the actual number of input and output tokens for that single model invocation. The expected number of input and output tokens is determined using the model's tokenization tool. Then, the expected number of input and output tokens is compared with the actual number of input and output tokens. Specifically, at 601, the model request and its response are obtained. At 603, the expected number of tokens, including input and output tokens, is calculated by invoking the model's tokenization tool. At 602, the input and output tokens calculated by the engine are obtained. At 604, the actual number of tokens is calculated. At 610, token count verification for the single model invocation is performed.

[0057] For validation of all model calls, the expected input and output token counts for each model call are summed to obtain the total expected input and output token counts. The actual total input and output token counts for the dialogue request are obtained from the model response. Then, the total expected input and output token counts are compared with the actual total input and output token counts for validation. Specifically, at 605, the token counts for each model call are summed. At 606, the dialogue result is obtained. At 607, the input and output token counts calculated by the engine are extracted from the dialogue result as the actual total token count. At 620, token count validation for all model calls is performed.

[0058] The exemplary embodiments described above with reference to Figures 1 to 6 provide an automated verification method for model-based dialogue services. Compared to existing technologies, its advantages lie in its ability to verify orchestrated links in any dialogue scenario and to achieve more comprehensive verification. In this way, it ensures that the model's responses better meet user expectations, thereby improving the user experience of using the dialogue service.

[0059] Figure 7 shows a schematic block diagram of an apparatus 700 for verifying a dialogue service according to an embodiment of the present disclosure. The dialogue service includes a target model and at least one extended capability component coupled to the target model. The apparatus 700 can be used to implement the methods or steps described with reference to Figures 1 to 6.

[0060] As shown in Figure 7, the apparatus 700 includes a dialogue unit 710, an acquisition unit 720, and a verification unit 730. The dialogue unit 710 is configured to provide a dialogue request to the dialogue service. The acquisition unit 720 is configured to acquire a model request corresponding to the dialogue request, wherein the model request is input to the target model to generate a model response, and the model request includes at least one or more extension capability information associated with one or more of the at least one extension capability component. The verification unit 730 is configured to verify the orchestration links of the dialogue service based on the model request.

[0061] In some embodiments, the verification unit 730 may be configured to: obtain a baseline model request corresponding to the dialogue request, the baseline model request being generated based on a baseline version of the dialogue service; and verify the orchestration link of the dialogue service by comparing the model request and the baseline model request.

[0062] In some embodiments, the verification unit 730 may be configured to: determine that the orchestration link of the dialogue service is consistent with its baseline version in response to determining that the model request and the baseline model request include the same extension capability information; and determine that the orchestration link of the dialogue service is different from its baseline version in response to determining that the model request and the baseline model request include different extension capability information.

[0063] In some embodiments, the model request and the baseline model request each include model parameters. In some embodiments, the verification unit 730 may be configured to: determine that the orchestration link of the dialogue service is consistent with its baseline version in response to determining that the model request and the baseline model request have the same model parameters; and determine that the orchestration link of the dialogue service is different from its baseline version in response to determining that the model request and the baseline model request have different model parameters.

[0064] In some embodiments, the model request may further include historical context information, and the verification unit 730 may be configured to: obtain historical data of the dialogue request; and determine whether the historical context is missing by comparing the historical context information with the historical data.

[0065] In some embodiments, the historical context information includes multimodal historical messages, which include at least one of text, images, audio, or video.

[0066] In some embodiments, the verification unit 730 may also be configured to: determine the actual number of input and output tokens for a single model call associated with the model request; determine the expected number of input and output tokens using a tokenization tool for the target model; and compare the expected number of input and output tokens with the actual number of input and output tokens.

[0067] In some embodiments, the verification unit 730 may also be configured to: summarize the expected number of input and output tokens for each model call to obtain the total number of expected input and output tokens; obtain the total number of actual input and output tokens of the dialogue request from the model response; and compare the total number of expected input and output tokens with the total number of actual input and output tokens.

[0068] In some embodiments, the verification unit 730 may also be configured to: verify whether the protocol packets in the model response conform to the expected orchestration link, including verifying whether the model response includes one or more of the following: intent recognition packet, search term splitting packet, packet that calls the search plugin, and answer content packet.

[0069] In some embodiments, the at least one extended capability component may include one or more of the following: plugins, variables, knowledge bases, memories, and scheduled tasks.

[0070] It should be noted that further actions or steps shown in Figures 1 to 6 can be implemented using the device 700 shown in Figure 7. For example, device 700 may include more modules or units to implement the actions or steps described above, or some of the units or modules shown in Figure 7 may be further configured to implement the actions or steps described above. This will not be repeated here.

[0071] Figure 8 shows a schematic block diagram of an example device 800 that can be used to implement embodiments of the present disclosure. As shown, device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to computer program instructions stored in read-only memory (ROM) 802 or loaded from storage unit 806 into random access memory (RAM) 803. Various programs and data required for the operation of device 800 may also be stored in RAM 803. The computing unit 801, ROM 802, and RAM 803 are interconnected via bus 804. Input / output (I / O) interface 805 is also connected to bus 804.

[0072] Multiple components in device 800 are connected to I / O interface 805, including: input unit 806, such as keyboard, mouse, etc.; output unit 807, such as various types of monitors, speakers, etc.; storage unit 808, such as disk, optical disk, etc.; and communication unit 809, such as network card, modem, wireless transceiver, etc. Communication unit 809 allows device 800 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0073] The computing unit 801 can be a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the various methods and processes described above, such as method 200. For example, in some embodiments, method 200 may be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 808. In some embodiments, part or all of the computer program may be loaded and / or installed on device 800 via ROM 802 and / or communication unit 809. When the computer program is loaded into RAM 803 and executed by the computing unit 801, one or more steps of method 200 described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform method 200 by any other suitable means (e.g., by means of firmware).

[0074] In some embodiments, the methods and processes described above can be implemented as a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of this disclosure.

[0075] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example—but not limited to—electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital multifunction disc (DVD), memory sticks, floppy disks, mechanical encoding devices, such as punch cards or recessed protrusions storing instructions thereon, and any suitable combination thereof. The computer-readable storage media used herein are not to be construed as transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0076] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper cables, fiber optic cables, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to computer-readable storage media within the respective computing / processing device.

[0077] Computer program instructions used to perform the operations of this disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, status setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages ​​and conventional procedural programming languages. The computer-readable program instructions may execute entirely on a user's computer, partially on a user's computer, as a standalone software package, partially on a user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing the status information of the computer-readable program instructions to implement various aspects of this disclosure.

[0078] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0079] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more boxes of a flowchart and / or block diagram.

[0080] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of devices, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than those shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0081] The various embodiments of this disclosure have been described above. These descriptions are exemplary and not exhaustive, nor are they limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles, practical application, or technical improvements to the technology in the market, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

1. A method for validating a dialogue service, the dialogue service comprising a target model and at least one extension capability component coupled to the target model, the method comprising: Provide a dialogue request to the dialogue service; Obtain a model request corresponding to the dialogue request, wherein the model request is input into the target model to generate a model response, and the model request includes at least one or more extension capability information associated with one or more of the at least one extension capability component; as well as The model is used to request verification of the orchestration link of the dialogue service.

2. The method according to claim 1, wherein, The orchestration chain for verifying the dialogue service based on the model request includes: Obtain a baseline model request corresponding to the dialogue request, the baseline model request being generated based on a baseline version of the dialogue service; The orchestration link of the dialogue service is verified by comparing the model request with the baseline model request.

3. The method according to claim 2, wherein, Verifying the orchestration link of the dialogue service includes: In response to determining that the model request and the baseline model request include the same extensibility information, it is determined that the orchestration link of the dialogue service is consistent with its baseline version; and In response to determining that the model request and the baseline model request include different extension capability information, it is determined that the orchestration link of the dialogue service is different from its baseline version.

4. The method according to claim 2 or 3, wherein, The model request and the baseline model request each include model parameters, wherein the orchestration link for verifying the dialogue service includes: In response to determining that the model request and the baseline model request have the same model parameters, it is determined that the orchestration link of the dialogue service is consistent with its baseline version; and In response to determining that the model request and the baseline model request have different model parameters, it is determined that the orchestration link of the dialogue service is different from its baseline version.

5. The method according to any one of claims 1 to 4, wherein, The model request also includes historical context information, and the orchestration chain for verifying the dialogue service also includes: Obtain the historical data of the dialogue request; and By comparing the historical context information with the historical data, it can be determined whether any historical context has been omitted.

6. The method according to claim 5, wherein, The historical context information includes multimodal historical messages, which include at least one of text, images, audio, or video.

7. The method according to any one of claims 1 to 6, wherein verifying the orchestration link of the dialogue service further comprises: For a single model invocation associated with the model request, Determine the actual number of input and output tokens in a single model call; The expected number of input and output tokens is determined using the tokenization tool of the target model; and Compare the expected number of input and output tokens with the actual number of input and output tokens.

8. The method of claim 7, wherein verifying the orchestration link of the dialogue service further comprises: The total number of expected input and output tokens for each model call is obtained by summing the expected number of input and output tokens. Obtain the total number of actual input and output tokens of the dialogue request from the model response; as well as Compare the expected total number of input and output tokens with the actual total number of input and output tokens.

9. The method according to any one of claims 1 to 8, wherein verifying the orchestration link of the dialogue service further comprises: Verifying whether the protocol packets in the model response conform to the expected orchestration chain includes verifying whether the model response includes one or more of the following: intent recognition packet, search term splitting packet, packet calling search plugin, and answer content packet.

10. The method according to any one of claims 1 to 9, wherein the at least one extended capability component includes one or more of the following: plug-ins, variables, knowledge bases, memory, and timed tasks.

11. An apparatus for validating a dialogue service, the dialogue service comprising a target model and at least one extension capability component coupled to the target model, the apparatus comprising: A dialogue unit is configured to provide a dialogue request to the dialogue service; The acquisition unit is configured to acquire a model request corresponding to the dialogue request, wherein the model request is input to the target model to generate a model response, and the model request includes at least one or more extension capability information associated with one or more of the at least one extension capability component. as well as The verification unit is configured to verify the orchestration link of the dialogue service based on the model request.

12. A computing device, comprising: At least one processing unit; At least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the computing device to perform the method as described in any one of claims 1 to 10.

13. A computer storage medium comprising machine-executable instructions that, when executed by a device, cause the device to perform the method as described in any one of claims 1 to 10.

14. A computer program product comprising machine-executable instructions that, when executed by a device, cause the device to perform the method as described in any one of claims 1 to 10.