Communication method and apparatus

WO2026123731A1PCT designated stage Publication Date: 2026-06-18HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2025-08-06
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Artificial intelligence agents are unable to accurately determine the business that the terminal needs to use, resulting in inaccurate task execution by AI or ML models and giving irrelevant answers.

Method used

The system receives information from the second network device through the first network device, determines the services that the terminal needs to use, establishes a session with the terminal, receives user input information, determines description information, and generates media information using AI or ML models to improve task accuracy.

🎯Benefits of technology

It improves the accuracy of AI or ML models in performing tasks, avoids problems such as ambiguous role positioning and rules, and ensures the accuracy of task execution.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN2025113092_18062026_PF_FP_ABST
    Figure CN2025113092_18062026_PF_FP_ABST
Patent Text Reader

Abstract

The present application discloses a communication method and apparatus. The method comprises: a second network apparatus indicates to a first network apparatus a first service that needs to be used by a first terminal apparatus; the first network apparatus establishes a first session with the first terminal apparatus, the first session being used for implementing the first service; and the first network apparatus receives user input information from the first terminal apparatus by means of the first session, determines, according to the user input information, first description information used for describing a first target task corresponding to the first service, and interacts with an AI or ML model having a semantic analysis function to obtain first media information corresponding to the first service, the first media information being generated on the basis of the first description information. In the present application, after determining that an AI or ML model needs to execute a task corresponding to a first service, a first network apparatus can determine, according to user input information, description information of a first target task corresponding to the first service that the AI or ML model needs to execute, thereby improving the accuracy of executing the task by the AI or ML model.
Need to check novelty before this filing date? Find Prior Art

Description

A communication method and apparatus

[0001] Cross-reference to related applications

[0002] This application claims priority to Chinese Patent Application No. 202411813566.6, filed on December 10, 2024, entitled "A Communication Method and Apparatus", the entire contents of which are incorporated herein by reference. Technical Field

[0003] This application relates to the field of communication technology, and in particular to a communication method and apparatus. Background Technology

[0004] Artificial intelligence (AI) agents (or intelligent agents or smart agents) are intelligent entities capable of perceiving their environment, making decisions, and performing actions. They aim to efficiently execute and process complex tasks through natural language interaction using AI or machine learning (ML) models. For example, an AI agent can input a task description into an AI or ML model, enabling the model to perform that task.

[0005] Currently, AI agents are widely used in the field of telephone communication, capable of handling or assisting in calls. For example, if terminal 1 calls terminal 2 and terminal 2 is using a smart answering service, then terminal 2's AI agent can communicate with terminal 1 on behalf of terminal 2. Similarly, if terminal 1 calls terminal 2 and terminal 1 is using a smart assistance service, then terminal 1's AI agent can provide assistance during the call between terminal 1 and terminal 2. However, the terminal's AI agent may not accurately determine the service required by the terminal. For instance, if terminal 1 needs to use service 1, but its AI agent incorrectly determines that terminal 1 needs to use service 2 based on the user input, the task description information input to the AI ​​or ML model may be inaccurate, causing the AI ​​or ML model to execute the wrong task and provide irrelevant answers. Summary of the Invention

[0006] This application provides a communication method and apparatus to improve the accuracy of AI or ML models in performing tasks.

[0007] In a first aspect, embodiments of this application provide a communication method that can be applied to a first network device, which is a network-side device. For example, the first network device is a subscriber artificial intelligence agent function (SAAF) network element, or other devices including SAAF network element functions, or a chip system (or chip) or other functional module capable of implementing the SAAF network element functions, and the chip system or functional module is, for example, set in the SAAF network element. The method includes: receiving first information from a second network device, the first information indicating a first service required by a first terminal device; establishing a first session with the first terminal device, the first session being used to implement the first service; receiving user input information from the first terminal device through the first session, and determining first description information based on the user input information, the first description information describing a first target task corresponding to the first service; interacting with an AI or ML model to obtain first media information corresponding to the first service, the AI ​​or ML model having semantic analysis capabilities, and the first media information being generated based on the first description information.

[0008] In this embodiment, the second network device can instruct the first network device on the first service that the first terminal device needs to use, so that the first network device can determine the task corresponding to the first service that the AI ​​or ML model needs to perform. Then, based on the user input information from the first terminal device, the first network device can determine the description information of the first target task corresponding to the first service that the AI ​​or ML model needs to perform, thereby improving the accuracy of the AI ​​or ML model in performing the task and avoiding irrelevant answers.

[0009] In one possible implementation, the method further includes: sending the first media information to the first terminal device through the first session. In this implementation, the first network device can send the first media information corresponding to the first service to the first terminal device to realize the first service.

[0010] In one possible implementation, determining the first description information based on the user input information includes: determining the first description information based on the user input information and the task description information template corresponding to the first service.

[0011] In this embodiment, the first network device can determine the description information of the first target task corresponding to the first service based on the task description information template corresponding to the first service, thereby improving the accuracy of the description information of the first target task corresponding to the first service and avoiding problems such as ambiguous role positioning and ambiguous rules.

[0012] In one possible implementation, the first service includes multiple services; determining the first description information based on the user input information and the task description information template corresponding to the first service includes: selecting one service from the multiple services based on the user input information; and determining the first description information based on the user input information and the task description information template corresponding to the selected service.

[0013] In this embodiment, the first network device can determine the service that the first terminal device needs to use from multiple services based on user input information, and then determine the first description information based on the user input information and the task description information template corresponding to the service that the first terminal device needs to use, thereby improving the accuracy of the first description information.

[0014] In one possible implementation, the first information includes the identification information of the first service; the method further includes: determining a task description information template corresponding to the first service based on the identification information of the first service.

[0015] In this embodiment, a method is provided for a first network device to determine a task description information template corresponding to a first service. For example, the first network device may store the correspondence between multiple services and multiple task description information templates. When a second network device indicates the identification information of the first service to the first network device, the first network device can determine the task description information template corresponding to the first service based on the identification information of the first service. Alternatively, other methods can be used to determine the task description information template corresponding to the first service, and there are no limitations on this.

[0016] In one possible implementation, the first information includes a task description information template corresponding to the first service. In this implementation, the second network device can indicate the task description information template corresponding to the first service to the first network device, so that the first network device can determine the description information of the target task corresponding to the first service based on the task description information template, thereby improving the accuracy of the description information of the target task corresponding to the first service.

[0017] In one possible implementation, establishing a first session with the first terminal device includes: determining second description information based on parameter information of the first service and / or first preference information of the first terminal device, wherein the first preference information is related to the first service and the second description information is used to describe a second target task corresponding to the first service; interacting with the AI ​​or ML model to obtain second media information corresponding to the first service, wherein the second media information is generated based on the second description information; sending a first session establishment request to the first terminal device, wherein the first session establishment request includes the second media information; and receiving a first session establishment response from the first terminal device.

[0018] In this embodiment, the first network device may carry second media information corresponding to the first service in the first session establishment request sent to the first terminal device, so that the first terminal device can determine that the first session is for implementing the first service based on the second media information corresponding to the first service. Furthermore, a method is provided for the first network device to generate the second media information corresponding to the first service. For example, after determining that the first terminal device needs to use the first service, the first network device can determine the description information of the second target task corresponding to the first service that the AI ​​or ML model needs to execute, based on the parameter information of the first service and / or the first preference information of the first terminal device related to the first service, so that the AI ​​or ML model generates the second media information corresponding to the first service based on the description information of the second target task corresponding to the first service. Alternatively, the second media information corresponding to the first service can be generated in other ways, without limitation.

[0019] In one possible implementation, the first information includes the identification information of the first terminal device and the identification information of the first service; the method further includes: sending the identification information of the first terminal device and the identification information of the first service to a third network device, wherein the third network device is used to store the preference information of the terminal device; and receiving the first preference information from the third network device.

[0020] In this embodiment, a method is provided for a first network device to determine first preference information of a first terminal device related to a first service. A third network device may store preference information of multiple terminal devices. When a second network device sends the identification information of the first terminal device and the identification information of the first service to the third network device, the first network device may send the identification information of the first terminal device and the identification information of the first service to query the first preference information of the first terminal device related to the first service. Alternatively, other methods may be used to determine the first preference information of the first terminal device related to the first service; no limitation is imposed on this method.

[0021] Secondly, embodiments of this application also provide a communication method. This method can be applied to a second network device, which is a network-side communication device. For example, the second network device is a data channel application server (DCAS) network element, or other devices including DCAS network element functions, or a chip system (or chip) or other functional module. The chip system or functional module can implement the functions of the DCAS network element, and the chip system or functional module is, for example, set in the DCAS network element. The method includes: determining first information, the first information being used to indicate a first service that a first terminal device needs to use; receiving second information, the second information being used to indicate that a media channel between the first terminal device and a media function network element has been successfully established, the media channel being used to transmit media information corresponding to the first service; and sending the first information to the first network device according to the second information, instructing the first network device to establish a first session with the first terminal device, the first session being used to implement the first service.

[0022] In this embodiment, the second network device can determine the first service that the first terminal device needs to use, and when it is determined that the media channel between the first terminal device and the media function network element has been successfully established, it instructs the first network device on the first service that the first terminal device needs to use, so that the first network device can determine the task corresponding to the first service that the AI ​​or ML model needs to perform, thereby improving the accuracy of the AI ​​or ML model in performing the task and avoiding irrelevant questions.

[0023] It can be understood that the first information may include the identification information of the first terminal device and the identification information of the first service. The identification information of the first terminal device and the identification information of the first service are used to instruct the first network device and the first terminal device to establish a first session for implementing the first service. This method can also be understood as an indirect instruction method. Alternatively, the first information may include first instruction information, which is used to instruct the first network device and the first terminal device to establish a first session for implementing the first service. This instruction method can also be understood as a direct instruction method. Therefore, the method by which the first information instructs the first network device and the first terminal device to establish a first session for implementing the first service is quite flexible.

[0024] In one possible implementation, before receiving the second information, the method further includes: establishing a media channel between the first terminal device and the media function network element through a voice service processing network element when a first condition is met or a second session establishment request is received; wherein the second session establishment request is used to request the establishment of a session between the first terminal device and the second terminal device or the first network device.

[0025] In this embodiment, the establishment of a media channel between the first terminal device and the media function network element by the voice service processing network element can be understood as the voice service processing network element establishing media channels with both the first terminal device and the media function network element. The second network device can instruct the voice service processing network element to establish media channels with both the first terminal device and the media function network element when the first condition is met or when a second session establishment request is received. This allows the second network device to instruct the first network device on the first service required by the first terminal device once the media channel between the first terminal device and the media function network element has been successfully established.

[0026] In one possible implementation, the first condition includes the identification information of the first terminal device, the identification information of the first service, and the condition for triggering the first service. Determining the first information includes: determining the first information based on the identification information of the first terminal device and the identification information of the first service when the first condition is met.

[0027] In this embodiment, one method is provided for the second network device to determine the first information. For example, if a first condition is met, the second network device can determine the first service that the first terminal device needs to use based on the identification information of the first terminal device and the identification information of the first service in the first condition. Alternatively, the second network device can also determine the first information in other ways, without limitation.

[0028] In one possible implementation, the second session establishment request includes call information, and determining the first information includes: upon receiving the second session establishment request, determining the first service from the subscribed services of the first terminal device based on the call information.

[0029] In this embodiment, one method is provided for the second network device to determine the first information. For example, upon receiving a second session establishment request, the second network device can determine the first service that the first terminal device needs to use from the subscribed services of the first terminal device based on the call information in the second session establishment request. For instance, if the calling number is the number of service 1, the called number is the number of the user to whom the first terminal device belongs, and the first terminal device has subscribed to service 1, then the second network device can determine that the first terminal device needs to use service 1. Besides this, the second network device can also determine the first information through other methods, and there are no limitations on this.

[0030] In one possible implementation, the call information includes one or more of the following: calling number, called number, or call type.

[0031] This implementation provides various possibilities for call information, such as the caller's number, the called number, or the call type. In addition, call information may include other information, without limitation.

[0032] In one possible implementation, the first information includes one or more of the following: identification information of the first terminal device; identification information of the first service; parameter information of the first service; and a task description information template corresponding to the first service.

[0033] In this embodiment, multiple methods are provided for the second network device to indicate to the first network device the first service that the first terminal device needs to use. For example, the second network device can indicate the identification information of the first terminal device and the identification information of the first service; this indication method can also be understood as a direct indication method. Alternatively, the second network device can indicate the identification information of the first terminal device and the parameter information of the first service or the task description information template corresponding to the first service; this indication method can also be understood as an indirect indication method. Therefore, the methods by which the second network device indicates to the first network device the first service that the first terminal device needs to use are quite flexible.

[0034] In one possible implementation, the first information includes a task description information template corresponding to the first service. Before receiving the second information, the method further includes: receiving a registration request from a third-party application, wherein the registration request includes the identification information of the first service and the task description information template corresponding to the first service.

[0035] In this embodiment, a method is provided for the second network device to determine the task description information template corresponding to the first service. For example, the third-party application indicates the task description information template corresponding to the first service to the second network device through a registration request. In addition, the second network device can also determine the task description information template corresponding to the first service through other methods, without limitation.

[0036] Thirdly, embodiments of this application also provide a communication device. The communication device can be the first network device described in the first aspect above. The communication device possesses the functions of the first network device. The communication device is, for example, a SAAF network element, or other equipment including SAAF network element functions, or a chip system (or chip) or other functional module. This chip system or functional module can implement the functions of the SAAF network element, and is, for example, disposed within the SAAF network element. In one optional implementation, the communication device includes a baseband device and a radio frequency device. In another optional implementation, the communication device includes a processing unit (sometimes also called a processing module) and a transceiver unit (sometimes also called a transceiver module). The transceiver unit can implement both transmitting and receiving functions. When the transceiver unit implements the transmitting function, it can be called a transmitting unit (sometimes also called a transmitting module), and when the transceiver unit implements the receiving function, it can be called a receiving unit (sometimes also called a receiving module). The transmitting unit and the receiving unit can be the same functional module, which is called the transceiver unit. This functional module can realize the transmitting and receiving functions; or, the transmitting unit and the receiving unit can be different functional modules, and the transceiver unit is a general term for these functional modules.

[0037] In one embodiment, the transceiver unit is configured to receive first information from a second network device, the first information being used to indicate a first service that the first terminal device needs to use;

[0038] In one embodiment, the processing unit is configured to establish a first session with the first terminal device, the first session being used to implement the first service;

[0039] In one embodiment, the processing unit is configured to receive user input information from the first terminal device through the first session, and determine first description information based on the user input information, wherein the first description information is used to describe the first target task corresponding to the first service.

[0040] In one embodiment, the processing unit is configured to interact with an AI or ML model to obtain first media information corresponding to the first service, wherein the AI ​​or ML model has semantic analysis capabilities, and the first media information is generated based on the first description information.

[0041] Fourthly, embodiments of this application also provide a communication device. The communication device can be the second network device described in the second aspect above. The communication device possesses the functions of the aforementioned second network device. This communication device is, for example, a DCAS network element, or other equipment including DCAS network element functions, or a chip system (or chip) or other functional module. This chip system or functional module can implement the functions of the DCAS network element, and is, for example, disposed within the DCAS network element. In one optional implementation, the communication device includes a baseband device and a radio frequency device. In another optional implementation, the communication device includes a processing unit (sometimes also called a processing module) and a transceiver unit (sometimes also called a transceiver module). The transceiver unit can implement both transmitting and receiving functions. When the transceiver unit implements the transmitting function, it can be called a transmitting unit (sometimes also called a transmitting module), and when the transceiver unit implements the receiving function, it can be called a receiving unit (sometimes also called a receiving module). The transmitting unit and the receiving unit can be the same functional module, which is called the transceiver unit. This functional module can realize the transmitting and receiving functions; or, the transmitting unit and the receiving unit can be different functional modules, and the transceiver unit is a general term for these functional modules.

[0042] In one embodiment, the processing unit is configured to determine first information, the first information being used to indicate a first service that the first terminal device needs to use;

[0043] In one embodiment, the transceiver unit is configured to receive second information, the second information indicating that a media channel between the first terminal device and the media function network element has been successfully established, the media channel being used to transmit media information corresponding to the first service;

[0044] In one embodiment, the transceiver unit is configured to send the first information to the first network device according to the second information, instructing the first network device to establish a first session with the first terminal device, the first session being used to implement the first service.

[0045] Fifthly, a communication device is provided, which can be the first network device described in the first aspect above. The communication device possesses the functions of the first network device. The communication device is, for example, a SAAF network element, or other device including SAAF network element functions, or a system-on-a-chip (or chip) or other functional module capable of implementing the functions of the SAAF network element, and the system-on-a-chip or functional module is, for example, disposed within the SAAF network element. The communication device includes a processor for executing the functions of the first network device described in the first aspect above. Optionally, the communication device further includes a memory. The memory stores a computer program, and the processor is coupled to the memory. When the processor reads the computer program or instructions, it causes the communication device to execute the methods executed by the first network device in the above aspects. Optionally, the memory and the processor are integrated together.

[0046] Sixthly, a communication device is provided, which can be the second network device described in the second aspect above. The communication device possesses the functions of the second network device described above. The communication device is, for example, a DCAS network element, or other device including DCAS network element functions, or a system-on-a-chip (or chip) or other functional module capable of implementing the functions of the DCAS network element, and the system-on-a-chip or functional module is, for example, disposed within the DCAS network element. The communication device includes a processor for executing the functions of the second network device described in the second aspect above. Optionally, the communication device further includes a memory. The memory stores a computer program, and the processor is coupled to the memory. When the processor reads the computer program or instructions, it causes the communication device to execute the methods executed by the second network device in the above aspects. Optionally, the memory and the processor are integrated together.

[0047] A seventh aspect provides a communication system including a first network device and a second network device. The first network device is used to perform the method described in the first aspect. For example, the first network device can be implemented using the communication device described in the third or fifth aspect. The second network device is used to perform the method described in the second aspect. For example, the second network device can be implemented using the communication device described in the fourth or sixth aspect.

[0048] Eighthly, a computer-readable storage medium is provided for storing a computer program or instructions that, when executed, cause the method performed by the first or second network device in the preceding aspects to be implemented.

[0049] Ninthly, a computer program product containing instructions is provided, which, when the computer program or instructions are run on a computer, causes the methods described in the above aspects to be implemented.

[0050] In a tenth aspect, a chip system is provided, including a processor and an interface, the processor being configured to call and execute instructions from the interface to enable the chip system to implement the methods described above.

[0051] The beneficial effects of the second to tenth aspects and their embodiments described above can be referred to the beneficial effects of the first aspect and any of its embodiments, and will not be repeated here. Attached Figure Description

[0052] Figure 1 is a schematic diagram of a communication system provided in an embodiment of this application;

[0053] Figure 2 is a schematic diagram of another communication system provided in an embodiment of this application;

[0054] Figure 3 is a schematic diagram of an artificial intelligence agent provided in an embodiment of this application;

[0055] Figure 4 is a schematic diagram of a service that uses an artificial intelligence agent to process or assist in calls, as provided in this application example;

[0056] Figure 5 is a flowchart illustrating a communication method provided in an embodiment of this application;

[0057] Figure 6a is a flowchart illustrating another communication method provided in an embodiment of this application;

[0058] Figure 6b is a flowchart illustrating another communication method provided in an embodiment of this application;

[0059] Figure 6c is a flowchart illustrating another communication method provided in an embodiment of this application;

[0060] Figure 6d is a flowchart illustrating another communication method provided in an embodiment of this application;

[0061] Figure 6e is a flowchart illustrating another communication method provided in an embodiment of this application;

[0062] Figure 6f is a flowchart illustrating another communication method provided in an embodiment of this application;

[0063] Figure 6g is a flowchart illustrating another communication method provided in an embodiment of this application;

[0064] Figure 7 is a schematic diagram of a communication device provided in an embodiment of this application;

[0065] Figure 8 is a schematic diagram of another communication device provided in an embodiment of this application. Detailed Implementation

[0066] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments of this application will be further described in detail below with reference to the accompanying drawings.

[0067] The technical solutions provided in the embodiments of this application can be applied to communication systems related to the 3rd Generation Partnership Project (3GPP), such as Long Term Evolution (LTE) communication systems, 5th Generation (5G) mobile communication systems (specifically, New Radio (NR) communication systems, or NR communication systems that introduce Multi-Input Multi-Output (MIMO) technology), or they can also be applied to other next-generation mobile communication systems, other similar communication systems, or communication systems in the future evolution process. Other similar communication systems may include Wireless Fidelity (WiFi), Vehicle-to-Everything (V2X), Internet of Things (IoT) systems, Narrow Band Internet of Things (NB-IoT) systems, or the Industrial Internet, etc.

[0068] Referring to Figure 1, it is a schematic diagram of the structure of a communication system provided in an embodiment of this application. As shown in Figure 1, the communication system may include a radio access network (RAN) 100 and a core network (CN) 200. Optionally, the communication system may also include the Internet 300.

[0069] The wireless access network 100 may include at least one access network device (such as access network devices 110a and 110b in Figure 1, collectively referred to as access network device 110) and at least one terminal device (such as terminal devices 120a-120j in Figure 1, collectively referred to as terminal device 120). The wireless access network 100 may also include other devices, such as wireless relay devices and / or wireless backhaul devices (not shown in Figure 1). The terminal device 120 is wirelessly connected to the access network device 110. The access network device 110 is wirelessly or wired connected to the core network 200. The core network device 210 in the core network 200 and the access network device 110 in the wireless access network 100 may be different physical devices, or they may be the same physical device integrating core network logical functions and wireless access network logical functions.

[0070] The radio access network 100 can be a 3GPP-related communication system (such as a 5G mobile communication system) or a future mobile communication system. The radio access network 100 can also be an open RAN (O-RAN or ORAN), a cloud radio access network (CRAN), or a WiFi system. The radio access network 100 can also be a communication system that integrates two or more of the above systems.

[0071] Access network equipment 110, also known as RAN node, RAN entity, or access node, is used to help terminal equipment 120 achieve wireless access.

[0072] In one possible scenario, a RAN node can be a base station, an evolved NodeB (eNodeB), an access point (AP), a transmission reception point (TRP), a next-generation NodeB (gNB), a base station in a future mobile communication system, or an access node in a WiFi system. A RAN node can be a macro base station (as shown in Figure 1, 110a), a micro base station or indoor station (as shown in Figure 1, 110b), a relay node or donor node, or a radio controller in a CRAN scenario. Optionally, a RAN node can also be a server, wearable device, vehicle, or in-vehicle equipment. For example, in V2X technology, a RAN node can be a roadside unit (RSU).

[0073] In another possible scenario, multiple RAN nodes can collaborate to assist terminal device 120 in achieving wireless access, with different RAN nodes each implementing a portion of the base station's functions. For example, RAN nodes can be central units (CUs), distributed units (DUs), CU-control plane (CPs), CU-user plane (UPs), or radio units (RUs), etc. CUs and DUs can be configured separately or included in the same network element, such as a baseband unit (BBU). RUs can be included in radio frequency equipment or radio frequency units, such as remote radio units (RRUs), active antenna units (AAUs), or remote radio heads (RRHs). The CU can perform the functions of the radio resource control (RRC) protocol and packet data convergence protocol (PDCP) of the base station, and can also perform the functions of the service data adaptation protocol (SDAP). The DU can perform the functions of the radio link control (RLC) layer and medium access control (MAC) layer of the base station, and can also perform some or all of the physical (PHY) layer functions. For specific descriptions of the above protocol layers, please refer to the relevant technical specifications of 3GPP.

[0074] In different systems, CU (or CU-CP and CU-UP), DU, or RU may have different names, but those skilled in the art will understand their meaning. For example, in an ORAN system, CU can also be called O-CU (open CU), DU can also be called O-DU, CU-CP can also be called O-CU-CP, CU-UP can also be called O-CU-UP, and RU can also be called O-RU. For ease of description, this application uses CU, CU-CP, CU-UP, DU, and RU as examples. Any of the units among CU (or CU-CP, CU-UP), DU, and RU in this application can be implemented through software modules, hardware modules, or a combination of software modules and hardware modules.

[0075] In this embodiment, the access network device 110 and its components (such as chips, processing units, or processors) can be collectively referred to as network devices. For example, it can be the access network device 110 shown in FIG1, or it can be the chip (system) in the access network device 110 in FIG1.

[0076] The embodiments of this application do not limit the device form of the access network device 110. The apparatus for implementing the functions of the access network device 110 can be the access network device 110 itself; it can also be an apparatus capable of supporting the access network device 110 in implementing the functions, such as a chip system. This apparatus can be installed in the access network device 110 or used in conjunction with the access network device 110. In the embodiments of this application, the chip system can be composed of chips, or it can include chips and other discrete devices. All or part of the functions of the access network device 110 in this application can also be implemented through software functions running on hardware, or through virtualization functions instantiated on a platform (e.g., a cloud platform).

[0077] Terminal equipment 120, also known as user equipment (UE), mobile station (MS), mobile terminal (MT), etc., refers to a device that provides voice and / or data connectivity to a user.

[0078] Terminal device 120 can be a handheld device, vehicle-mounted device, or other device with wireless connectivity. For example, terminal device 120 can be a mobile phone, tablet computer, laptop computer, PDA, mobile internet device (MID), wearable device (e.g., smartwatch, smart bracelet, pedometer, smart glasses, etc.), vehicle-mounted device (e.g., car, bicycle, electric vehicle, airplane, ship, train, high-speed rail, etc.), satellite terminal, virtual reality (VR) device, augmented reality (AR) device, point of sale (POS) machine, customer-premises equipment (CPE), light user equipment (light UE), reduced capability user equipment (REDCAP UE), wireless terminal in industrial control, smart home device (e.g., refrigerator, television, air conditioner, electricity meter, etc.), smart robot, robotic arm, workshop equipment, wireless terminal in autonomous driving, wireless terminal in telemedicine, wireless terminal in smart grid, wireless terminal in transportation safety, wireless terminal in smart city, or wireless terminal in smart home, flying device (e.g., smart robot, hot air balloon, drone, airplane), etc. Terminal device 120 can also be a vehicle device, such as a complete vehicle device, vehicle module, vehicle chip, on-board unit (OBU), or telematics box (T-BOX). Terminal device 120 can also be other devices with terminal functions; for example, terminal device 120 can also be a device that plays a terminal function in device-to-device (D2D) communication.

[0079] In the embodiments of this application, the terminal device 120 and its components (such as chips, processing units, or processors) can be collectively referred to as a terminal device. For example, it can be the terminal device 120 shown in FIG1, or it can be the chip (system) in the terminal device 120 in FIG1.

[0080] The embodiments of this application do not limit the device form of the terminal device 120. The device used to implement the functions of the terminal device 120 can be the terminal device 120 itself; it can also be a device capable of supporting the terminal device 120 in implementing the functions, such as a chip system. This device can be installed in the terminal device 120 or used in conjunction with the terminal device 120. In the embodiments of this application, the chip system can be composed of chips, or it can include chips and other discrete devices. All or part of the functions of the terminal device 120 in this application can also be implemented through software functions running on hardware, or through virtualization functions instantiated on a platform (e.g., a cloud platform).

[0081] The core network device 210 may include different network elements in different communication systems. For example, see Figure 2, which is a schematic diagram of another communication system provided in an embodiment of this application. As shown in Figure 2, in this communication system, the core network equipment may include some or all of the following network elements: Internet Protocol Multimedia Subsystem-Access Media Gateway (IMS-AGW) network element, Proxy-Call Session Control Function (P-CSCF) network element, Serving-Call Session Control Function (S-CSCF) network element, Interrogating-Call Session Control Function (I-CSCF) network element, Media Function (MF) network element, Voice Over Long Term Evolution Application Server (VoLTE AS) network element, Data Channel Signaling Function (DCSF) network element, Data Channel Application Server (DCAS) network element, Artificial Intelligence Agent Management Function (AAMF) (or Intelligent Agent Management Function) network element, and Subscriber Artificial Intelligence Agent (Subscriber Artificial Intelligence Agent) network element. The network elements include User Agent Function (SAAF) network elements, Subscriber Vector Database Function (SVDF) network elements, and Artificial Intelligence (AI) or Machine Learning (ML) model network elements. Specifically, the SAAF network element provides AI agent functionality to users. The SVDF network element stores user information, such as user preferences. The AAMF network element allocates SAAF and SVDF network elements to users. The DCAS network element stores data channel applications and distributes them to users. The DCSF network element manages data channel media capabilities and data channel service capabilities.Of course, core network equipment may also include other network elements, which will not be listed here.

[0082] In this embodiment, the core network device 210 and its components (such as chips, processing units, or processors) can be collectively referred to as network devices. For example, it can be the core network device 210 shown in FIG1, or it can be the chip (system) in the core network device 210 in FIG1.

[0083] The embodiments of this application do not limit the device form of the core network device 210. The apparatus used to implement the functions of the core network device 210 can be the core network device 210 itself; it can also be an apparatus capable of supporting the core network device 210 in implementing the functions, such as a chip system. This apparatus can be installed in the core network device 210 or used in conjunction with the core network device 210. In the embodiments of this application, the chip system can be composed of chips, or it can include chips and other discrete devices. All or part of the functions of the core network device 210 in this application can also be implemented through software functions running on hardware, or through virtualization functions instantiated on a platform (e.g., a cloud platform).

[0084] The communication system applicable to the embodiments of this application has been briefly introduced above. The relevant technical solutions involved in the embodiments of this application are described below.

[0085] Artificial intelligence agents, also known as intelligent agents or intelligent agents, refer to intelligent entities that can perceive the environment, make decisions, and perform actions. They aim to efficiently execute and process complex tasks through AI or ML models using natural language interaction.

[0086] For example, Figure 3 is a schematic diagram of an artificial intelligence agent provided in an embodiment of this application. As shown in Figure 3, the AI ​​or ML model acts as the "brain" of the artificial intelligence agent, responsible for processing and generating text, and performing reasoning and decision-making. Planning, memory, tool use, and action together constitute the core capabilities of the artificial intelligence agent. Among them, planning refers to the ability of the artificial intelligence agent to formulate a series of steps or strategies to achieve the goal based on the current state and the goal. Memory refers to the ability of the artificial intelligence agent to store and retrieve information. Tool use refers to the ability of the artificial intelligence agent to use external tools to perform tasks. Action refers to the ability of the artificial intelligence agent to perform specific operations, including physical operations and virtual operations.

[0087] AI or ML models (such as large language models) have semantic analysis capabilities and can handle various natural language tasks, such as question answering and dialogue. Artificial intelligence agents can input task descriptions into AI or ML models to enable them to perform the task.

[0088] The task description information, also known as task prompt information or task guidance information, refers to a piece of descriptive text used to guide AI or ML models in performing tasks. For example, the components of task description information can be as shown in Table 1, including roles, rules, goals, context, examples, feedback, constraints, workflow prompts, user instructions, short-term memory, and long-term memory.

[0089] Table 1

[0090] Currently, artificial intelligence agents are widely used in the field of telephone communication, capable of processing or assisting in calls. For example, Figure 4 is a schematic diagram of a service using an artificial intelligence agent to process or assist in calls according to an embodiment of this application. As shown in Figure 4, the services using artificial intelligence agents to process or assist in calls include intelligent answering (referred to as "chatting on behalf") service, intelligent assistance (referred to as "chatting assistance") service, emotional companionship (referred to as "chatting companion") service, and personal assistant service.

[0091] The intelligent proxy service includes express delivery assistant services and food delivery assistant tasks. For example, if terminal 1 calls terminal 2, and terminal 2 uses the intelligent proxy service, then terminal 2's artificial intelligence agent can communicate with terminal 1 on behalf of terminal 2.

[0092] Intelligent assistance services include intelligent real-time reminder services. For example, if terminal 1 calls terminal 2 and terminal 1 is using intelligent assistance services, then terminal 1's AI agent can provide assistance to terminal 1 during the call between terminal 1 and terminal 2.

[0093] The emotional companionship service includes virtual character call services. For example, if terminal 1 uses the emotional companionship service, the AI ​​agent of terminal 1 can call terminal 1 and have a conversation with terminal 1. Here, the virtual character can refer to a virtual person with certain human characteristics.

[0094] Personal assistant services include basic life assistant services (knowledge Q&A, weather, search, schedule management), medical assistant services, hotel assistant services, communication assistant services (button-free), food ordering assistant services, travel assistant services, car insurance assistant services, telecommunications service assistant services, and enabling third-party assistant services. For example, if terminal 1 uses the personal assistant service, terminal 1 can call its AI agent, which can then provide services to terminal 1 according to its needs during the call.

[0095] However, the AI ​​agent on the terminal may not be able to accurately determine the service that the terminal needs to use. For example, if terminal 1 needs to use service 1, but the AI ​​agent on terminal 1 incorrectly determines that terminal 1 needs to use service 2 based on the user input information of terminal 1, the task description information input by the AI ​​agent of the terminal to the AI ​​or ML model will not be accurate enough. For example, there may be problems such as ambiguous role positioning, ambiguous rules, or too many tokens. This will cause the AI ​​or ML model to execute the wrong task, resulting in problems such as incorrect output format, irrelevant answers, and large output delays.

[0096] In view of this, embodiments of this application provide a communication method for improving the accuracy of AI or ML models in performing tasks.

[0097] In the embodiments of this application, "when," "if," and "if" all refer to the device taking corresponding actions under certain objective circumstances, and are not time-limited, nor do they require the device to perform a judgment action, nor do they imply any other limitations. Unless otherwise specified, "if" and "if" can be substituted, and "when" and "in the case of" can be substituted. "When" and "if" / "if" can be substituted.

[0098] In the embodiments of this application, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design that is described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of the terms "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.

[0099] In this document, "used for indication" can include both direct and indirect indication. For example, when descriptive information I is used to indicate information J, it can mean that information I directly indicates information J or indirectly indicates information J, but it does not necessarily mean that information I carries information J.

[0100] Let information J, indicated by information I, be called the information to be indicated. In practice, there are many ways to indicate the information to be indicated, such as, but not limited to, directly indicating the information to be indicated, such as the information itself or its index. It can also be indirectly indicated by indicating other information, where there is a relationship between the other information and the information to be indicated. It can also indicate only a part of the information to be indicated, while the other parts are known or pre-agreed upon. For example, the indication of specific information can be achieved by using a pre-agreed (e.g., protocol-defined) order of various pieces of information, thereby reducing indication overhead to some extent. Simultaneously, common parts of various pieces of information can be identified and indicated uniformly to reduce the indication overhead caused by individually indicating the same information.

[0101] Furthermore, the specific instruction method can also be any existing instruction method, such as, but not limited to, the above-mentioned instruction methods and their various combinations. As described above, for example, when multiple pieces of information of the same type need to be indicated, the instruction methods for different pieces of information may differ. In specific implementation, the required instruction method can be selected according to specific needs. This application embodiment does not limit the selected instruction method. Therefore, the instruction methods involved in this application embodiment should be understood to cover various methods that enable the party to be instructed to obtain the information to be indicated.

[0102] In the embodiments of this application, "send" and "receive" indicate the direction of signal transmission. For example, "send information to XX" can be understood as the destination of the information being XX, which may include direct transmission via the air interface or indirect transmission via the air interface by other units or modules. "Receive information from YY" can be understood as the source of the information being YY, which may include direct reception from YY via the air interface or indirect reception from YY via the air interface by other units or modules. "Send" can also be understood as the "output" of the chip interface, and "receive" can also be understood as the "input" of the chip interface.

[0103] Information may undergo necessary processing, such as encoding and modulation, between the source and destination ends, but the destination end can understand the valid information from the source end. Similar statements in the embodiments of this application can be understood in a similar way, and will not be repeated here.

[0104] In this application embodiment, the number of nouns, unless otherwise specified, refers to "singular nouns or plural nouns," that is, "one or more." "At least one" means one or more, and "more than one" means two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural. The character " / " can indicate that the related objects before and after are in an "or" relationship. For example, A / B means: A or B. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c means: a, b, c, a and b, a and c, b and c, or a and b and c, where a, b, and c can be single or multiple.

[0105] In this application, the ordinal numbers such as "first" and "second" are used to distinguish multiple objects, and are not used to limit the size, content, order, timing, priority, or importance of the multiple objects. For a technical feature, the technical features within that technical feature are distinguished by "A", "B", "C", and "D", and there is no sequential or size order among the technical features described by "A", "B", "C", and "D".

[0106] The solutions provided in the embodiments of this application are described in detail below with reference to the accompanying drawings. In the following description, the communication method provided in the embodiments of this application is used as an example applied to the communication system shown in Figures 1-2. The communication system and application scenarios described in the embodiments of this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided in the embodiments of this application. Those skilled in the art will understand that with the evolution of communication systems and the emergence of new application scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.

[0107] The communication method provided in this application embodiment is described below using a first terminal device, a first network device, a second network device, and a third network device as examples. The first terminal device can be the terminal equipment shown in Figures 1-2, or it can be a component (such as a chip, processing unit, or processor module) within the terminal equipment shown in Figures 1-2. The first network device can be the SAAF network element shown in Figure 2, or it can be a component (such as a chip, processing unit, or processor module) within the SAAF network element shown in Figure 2. The second network device can be the DCAS network element shown in Figure 2, or it can be a component (such as a chip, processing unit, or processor module) within the DCAS network element shown in Figure 2. The third network device can be the SVDF network element or DCSF network element shown in Figure 2, or it can be a component (such as a chip, processing unit, or processor module) within the SVDF network element or DCSF network element shown in Figure 2. When this communication method is implemented by components in the first terminal device, the first network device, the second network device, and the third network device, the receiving and transmitting steps can be understood as the component communicating with other components, for example, communication between a baseband chip and a radio frequency circuit. In the embodiments of this application, the processing performed by a single execution subject can also be divided into multiple execution subjects, which can be logically and / or physically separated.

[0108] Referring to Figure 5, which is a flowchart illustrating a communication method provided in an embodiment of this application, the communication method includes the following steps, as shown in Figure 5.

[0109] S501, the second network device determines the first information. The first information is used to indicate the first service that the first terminal device needs to use.

[0110] In this application embodiment, the first information may include one or more of the following: identification information of the first terminal device; identification information of the first service; parameter information of the first service; and a task description information template corresponding to the first service. This application embodiment does not limit this.

[0111] The first service may include one or more services required by the first terminal device, such as the virtual character caller service, food ordering assistant service, telecommunications service assistant service, and enabling third-party assistant service shown in Figure 4 above. This application embodiment does not limit this.

[0112] For example, the first piece of information can be shown in Table 2 below.

[0113] Table 2

[0114] The following section describes how the second network device determines the first information segment.

[0115] In scenario one, the second network device can establish a media channel between the first terminal device and the media function network element through the voice service processing network element, provided that the first condition is met.

[0116] The first condition can be pre-configured, standard-defined, or negotiated between the first terminal device and the second network device, such as being configured by the first terminal device to the second network device. This embodiment of the application does not limit this. The first condition may include the identification information of the first terminal device, the identification information of the first service, and the condition for triggering the first service. The condition for triggering the first service may include, for example, reaching a pre-configured time or receiving pre-configured user input information. This embodiment of the application does not limit this.

[0117] The media channel can be used to transmit media information corresponding to the first service required by the first terminal device. The media information can be one or more of audio, video, images, or text, and this application embodiment does not limit this.

[0118] Media function network elements (such as MF network elements) can be used to process media information, such as encoding and decoding media information, or format conversion of media information. Voice service processing network elements (such as VoLTE AS network elements) can be used to process voice services, such as handling call initiation, reception, and termination. Establishing a media channel between the first terminal device and the media function network element through the voice service processing network element can be understood as the voice service processing network element establishing media channels with both the first terminal device and the media function network element.

[0119] The second network device can also determine the first information based on the identification information of the first terminal device and the identification information of the first service, provided that the first condition is met.

[0120] For example, as shown in Figure 6a, taking the first service required by the first terminal device as the virtual character caller service, the first terminal device as UE1, the first network device as the SAAF network element, the second network device as the DCAS network element, the media function network element as the MF network element, and the voice service processing network element as the VoLTE AS network element as an example, this application can perform the following steps.

[0121] Steps A0-A4 are the virtual character caller service setup stage, and steps A5-A21 are the machine-to-consumer (M2C) call establishment stage.

[0122] Step A0: SAAF network elements are pre-configured to support virtual character caller service.

[0123] Step A1: UE1 sends a virtual character caller ID service subscription request to the DCAS network element. Correspondingly, the DCAS network element receives the virtual character caller ID service subscription request from UE1. The virtual character caller ID service subscription request includes parameter information for the virtual character caller ID service, such as the virtual character caller ID information, virtual character identifier information, and opening remarks configured by UE1.

[0124] Step A2: The DCAS network element sends a call event subscription request to the DCSF network element, and the DCSF network element receives the call event subscription request from the DCAS network element. The call event subscription request is used to subscribe to call events related to the virtual character call service. When a call event related to the virtual character call service occurs, the DCSF network element immediately notifies the DCAS network element.

[0125] Step A3: The DCSF network element sends a call event subscription response to the DCAS network element, and the DCAS network element receives the call event subscription response from the DCSF network element.

[0126] Step A4: The DCAS network element sends a virtual person caller service subscription response to UE1, and UE1 receives the virtual person caller service subscription response from the DCAS network element.

[0127] Step A5: Upon reaching the virtual character's incoming call time configured in UE1, the DCAS network element sends an outbound call request to the DCSF network element. Correspondingly, the DCSF network element receives the outbound call request from the DCAS network element. This outbound call request is used to request triggering the virtual character's incoming call, configure call information (e.g., caller ID, called number, or display number), and create media resources (e.g., audio and video).

[0128] Step A6: The DCSF network element sends an outbound call request to the VoLTE AS network element, and the VoLTE AS network element receives the outbound call request from the DCSF network element.

[0129] Step A7: The VoLTE AS network element sends an INVITE message to UE1, and UE1 receives the INVITE message from the VoLTE AS network element. The INVITE message is used to request the establishment of a session, such as a voice call, video call, or other types of multimedia session. The INVITE message includes push-to-talk over cellular (3POC) information and session description protocol (SDP) information (used to negotiate media parameters for the session, such as audio).

[0130] Step A8: The VoLTE AS network element sends a media resource creation request to the MF network element, and the MF network element receives the media resource creation request from the VoLTE AS network element.

[0131] Step A9: UE1 sends a 183 response message to the VoLTE AS network element, and the VoLTE AS network element receives the 183 response message from UE1. The 183 response message is a provisional response message, indicating that the session is in progress but not yet complete. The 183 response message includes SDP information (used to negotiate media parameters for the session, such as audio).

[0132] Step A10: The VoLTE AS network element sends a provisional response acknowledgment (PRACK) message to UE1. Correspondingly, UE1 receives the PRACK message from the VoLTE AS network element. It can be understood that after receiving the PRACK message from the VoLTE AS network element, UE1 can send a 200 response message 1 to the VoLTE AS network element, and the VoLTE AS network element can receive the 200 response message 1 from UE1. The PRACK message is used to confirm receipt of the 183 response message. The 200 response message 1 is used to confirm receipt of the PRACK message.

[0133] Step A11: UE1 sends an UPDATE message to the VoLTE AS network element, and the VoLTE AS network element receives the UPDATE message from UE1. It can be understood that after receiving the UPDATE message from UE1, the VoLTE AS network element can send a 200 response message 2 to UE1, and UE1 can receive a 200 response message 2 from the VoLTE AS network element. The UPDATE message is used to confirm the resource reservation status. The 200 response message 2 is used to confirm receipt of the UPDATE message.

[0134] In step A12, UE1 sends a 180 ringing message to the VoLTE AS network element, and the VoLTE AS network element receives the 180 ringing message from UE1. The 180 ringing message is used to notify the initiating party (such as the VoLTE AS network element) that the called party (such as UE1) has received the call request and is attempting to connect.

[0135] In step A13, UE1 sends a 200 response message 3 to the VoLTE AS network element, and the VoLTE AS network element receives the 200 response message 3 from UE1. The 200 response message 3 is used to confirm receipt of the 180 ringing message.

[0136] Step A14: The VoLTE AS network element sends an acknowledgment (ACK) message 1 to UE1. Correspondingly, UE1 receives the ACK message 1 from the VoLTE AS network element.

[0137] Step A15: UE1 sends a re-INVITE message to the VoLTE AS network element. Correspondingly, UE1 receives a re-INVITE message from the VoLTE AS network element. The re-INVITE message is used to request a session update and includes SDP information (used to negotiate media parameters for the session, such as audio and video).

[0138] Step A16: The VoLTE AS network element sends a 200 Response Message 4 to UE1. Correspondingly, UE1 receives the 200 Response Message 4 from the VoLTE AS network element. The 200 Response Message 4 is used to acknowledge receipt of the re-INVITE message. The 200 Response Message 4 includes SDP information (used to negotiate media parameters for the session, such as audio and video).

[0139] Step A17: The VoLTE AS network element sends an update media resource request to the MF network element, and the MF network element receives the update media resource request from the VoLTE AS network element.

[0140] Step A18: VoLTE AS network element UE1 sends ACK message 2, and correspondingly, UE1 receives ACK message 2 from VoLTE AS network element.

[0141] Step A19: The VoLTE AS network element sends an outbound call response to the DCAS network element, and correspondingly, the DCAS network element receives the outbound call response from the VoLTE AS network element. The outbound call response includes the UE1's audio stream address (Uniform Resource Locator, URL) 1 and video stream address (URL 2).

[0142] Step A20: The DCAS network element sends an outbound party response to the DCSF network element, and correspondingly, the DCSF network element receives the outbound party response from the DCAS network element.

[0143] Step A21: The DCSF network element sends a 200 response message 5 to the DCAS network element. Correspondingly, the DCAS network element receives a 200 response message from the DCSF network element. The 200 response message 5 is used to confirm receipt of a response from the outgoing caller.

[0144] It is understandable that in step A5 above, after the virtual person's call time configured for UE1 is reached, the DCAS network element can determine that UE1 needs to use the virtual person for the call. The DCAS network element can then send an outbound call request to the VoLTE AS network element, instructing the VoLTE AS network element to establish media channels with both UE1 and the MF network element. For example, in step A7 above, the VoLTE AS network element requests to establish a media channel with UE1 by sending an INVITE message to UE1; in step A8 above, the VoLTE AS network element requests to establish a media channel with the MF network element by sending a media resource creation request to the MF network element.

[0145] In scenario two, upon receiving a second session establishment request, the second network device can establish a media channel between the first terminal device and the media function network element through the voice service processing network element.

[0146] The second session establishment request can be used to request the establishment of a session between the first terminal device and the second terminal device or the first network device.

[0147] The second session establishment request may include call information. Call information may include one or more of the following: calling number, called number, or call type. Call types include consumer-to-consumer (C2C) calls and / or consumer-to-machine (C2M) calls. This application does not limit this specific type.

[0148] It is understood that the second session establishment request may be initiated by the first terminal device to request the establishment of a session between the first terminal device (i.e., the calling party) and the second terminal device (i.e., the called party); or, the second session establishment request may be initiated by the second terminal device to request the establishment of a session between the second terminal device (i.e., the calling party) and the first terminal device (i.e., the called party); or, the second session establishment request may be initiated by the first terminal device to request the establishment of a session between the first terminal device (i.e., the calling party) and the first network device (i.e., the called party). This application embodiment does not limit this.

[0149] The second network device can also, upon receiving a second session establishment request, determine the first service from the subscribed services of the first terminal device based on the call information.

[0150] For example, as shown in Figure 6b, taking the first services required by the first terminal device as the food ordering assistant service and the telecommunications service assistant service, the first terminal device as UE1, the first network device as the SAAF network element, the second network device as the DCAS network element, the media function network element as the MF network element, and the voice service processing network element as the VoLTE AS network element as an example, this application can perform the following steps.

[0151] Steps B0-B4 are the setup phases for the food ordering assistant service and the telecommunications service assistant service, while steps B5-B16 are the C2M call setup phases.

[0152] Step B0: SAAF network elements are pre-configured to support the food ordering assistant service and the telecommunications service assistant service.

[0153] Step B1: UE1 sends subscription requests for the Food Ordering Assistant service and the Telecommunications Service Assistant service to the DCAS network element. Correspondingly, the DCAS network element receives these subscription requests from UE1. The subscription requests include parameter information for the Food Ordering Assistant service and the Telecommunications Service Assistant service, such as the authorized login information for the food ordering application and the authorized login information for the telecommunications service configured by UE1.

[0154] Step B2: The DCAS network element sends a call event subscription request to the DCSF network element, and the DCSF network element receives the call event subscription request from the DCAS network element. The call event subscription request is used to subscribe to call events related to the Food Ordering Assistant service and the Telecommunications Service Assistant service. When a call event related to these services occurs, the DCSF network element immediately notifies the DCAS network element.

[0155] Step B3: The DCSF network element sends a call event subscription response to the DCAS network element, and the DCAS network element receives the call event subscription response from the DCSF network element.

[0156] Step B4: The DCAS network element sends a subscription response for the Meal Ordering Assistant service and the Telecommunications Service Assistant service to UE1. Correspondingly, UE1 receives the subscription response for the Meal Ordering Assistant service and the Telecommunications Service Assistant service from the DCAS network element.

[0157] Step B5: When a user belonging to UE1 dials a personal assistant number, UE1 sends an INVITE message to the VoLTE AS network element, and the VoLTE AS network element receives the INVITE message from UE1. The INVITE message is used to request the establishment of a session, such as a voice call, video call, or other types of multimedia session. The INVITE message includes 3POC information and SDP information (used to negotiate media parameters for the session, such as audio).

[0158] Step B6: The VoLTE AS network element sends a BEGIN message to the DCSF network element, and the DCSF network element receives the BEGIN message from the VoLTE AS network element. The BEGIN message indicates that a new session has begun.

[0159] Step B7: The DCSF network element sends a BEGIN message to the DCAS network element, and the DCAS network element receives the BEGIN message from the DCSF network element.

[0160] Step B8: The DCAS network element sends a CONTINUE message to the DCSF network element. Correspondingly, the DCSF network element receives the CONTINUE message from the DCAS network element. The CONTINUE message indicates that the DCAS network element has completed the relevant processing logic and allows the DCSF network element to continue processing the current session.

[0161] Step B9: The VoLTE AS network element sends a media resource creation request to the MF network element, and the MF network element receives the media resource creation request from the VoLTE AS network element.

[0162] In step B10, the VoLTE AS network element sends a 183 response message, a 180 ringing message, and a 200 response message 1 to UE1. Correspondingly, UE1 receives the 183 response message, the 180 ringing message, and the 200 response message 1 from the VoLTE AS network element. This step B10 can be referred to steps A9-A13 above, and will not be repeated here.

[0163] Step B11: UE1 sends an ACK message to the VoLTE AS network element, and the VoLTE AS network element receives the ACK message from UE1.

[0164] Step B12: The VoLTE AS network element sends a re-INVITE message to UE1. Correspondingly, UE1 receives the re-INVITE message from the VoLTE AS network element. The re-INVITE message is used to request a session update and includes SDP information (used to negotiate session media parameters, such as audio and video).

[0165] Step B13: UE1 sends a 200 Response Message 2 to the VoLTE AS network element. Correspondingly, UE1 receives a 200 Response Message 2 from the VoLTE AS network element. The 200 Response Message 2 is used to acknowledge receipt of the re-INVITE message. The 200 Response Message 2 includes SDP information (used to negotiate media parameters for the session, such as audio and video).

[0166] Step B14: The VoLTE AS network element sends an update media resource request to the MF network element, and the MF network element receives the update media resource request from the VoLTE AS network element.

[0167] Step B15: The VoLTE AS network element sends an ANSWER message to the DCSF network element, and correspondingly, the DCSF network element receives the ANSWER message from the VoLTE AS network element. The ANSWER message indicates that the session has been established.

[0168] Step B16: The DCSF network element sends an ANSWER message to the DCAS network element, and the DCAS network element receives the ANSWER message from the DCSF network element.

[0169] It is understandable that in step B7 above, after the DCAS network element receives the BEGIN message from the DCAS network element, the DCAS network element can determine that UE1 needs to use the food ordering assistant service and the telecommunications service based on the fact that the calling number is the phone number of the user to which UE1 belongs, the called number is the personal assistant number, the call type is C2M call, and UE1 has subscribed to the food ordering assistant service and the telecommunications service assistant service.

[0170] In step B5 above, UE1 requests to establish a media channel with the VoLTE AS network element by sending an INVITE message to the VoLTE AS network element; in step B9 above, the VoLTE AS network element requests to establish a media channel with the MF network element by sending a media resource creation request to the MF network element.

[0171] In one possible implementation, if the first information includes a task description information template corresponding to the first service, the second network device may receive a registration request from a third-party application before determining the first information. The registration request may include the identification information of the first service and the task description information template corresponding to the first service.

[0172] For example, as shown in Figure 6c, taking the first service required by the first terminal device as enabling the third-party assistant service, the first terminal device as UE1, the first network device as SAAF network element, the second network device as DCAS network element, the media function network element as MF network element, the voice service processing network element as VoLTE AS network element, and the third-party application as a simulated star operation application as an example, this application can perform the following steps.

[0173] Steps C0-C7 are the enabling third-party assistant service setup stage, and steps C8-C16 are the C2M call establishment stage.

[0174] Step C0: The third-party application sends a registration request to the DCAS network element, and the DCAS network element receives the registration request from the third-party application. The registration request is used to request the third-party application to register with the DCAS network element. The registration request may include an identifier enabling the third-party assistant service and a task description information template generated by the third-party application corresponding to enabling the third-party assistant service.

[0175] Step C1: The DCAS network element sends a registration enable third-party tool request to the SAAF network element. Correspondingly, the SAAF network element receives the registration enable third-party tool request from the DCAS network element. This registration enable third-party tool request is used to request the registration of third-party applications with the SAAF network element.

[0176] Step C2: The SAAF network element sends a registration enable third-party tool response to the DCAS network element, and the DCAS network element receives the registration enable third-party tool response from the SAAF network element.

[0177] Step C3: The DCAS network element sends a registration response to the third-party application, and the third-party application receives the registration response from the DCAS network element.

[0178] Step C4: UE1 sends an enable third-party assistant service subscription request to the DCAS network element. Correspondingly, the DCAS network element receives the enable third-party assistant service subscription request from UE1. The enable third-party assistant service subscription request includes parameter information for enabling the third-party assistant service, such as the third-party application access number information configured by UE1, and the third-party application authorized login information.

[0179] Step C5: The DCAS network element sends a call event subscription request to the DCSF network element, and the DCSF network element receives the call event subscription request from the DCAS network element. The call event subscription request is used to subscribe to call events related to enabling the third-party assistant service. When a call event related to enabling the third-party assistant service occurs, the DCSF network element will immediately notify the DCAS network element.

[0180] Step C6: The DCSF network element sends a call event subscription response to the DCAS network element, and the DCAS network element receives the call event subscription response from the DCSF network element.

[0181] Step C7: The DCAS network element sends an enable third-party assistant service subscription response to UE1. Correspondingly, UE1 receives the enable third-party assistant service subscription response from the DCAS network element.

[0182] Step C8: When a user belonging to UE1 dials a third-party application number, UE1 sends an INVITE message to the VoLTE AS network element, and the VoLTE AS network element receives the INVITE message from UE1. The INVITE message is used to request the establishment of a session, such as a voice call, video call, or other types of multimedia session. The INVITE message includes 3POC information and SDP information (used to negotiate media parameters for the session, such as audio).

[0183] Step C9: The VoLTE AS network element sends a BEGIN message to the DCSF network element, and correspondingly, the DCSF network element receives the BEGIN message from the VoLTE AS network element. The BEGIN message indicates that a new session has begun.

[0184] Step C10: The DCSF network element sends a BEGIN message to the DCAS network element, and correspondingly, the DCAS network element receives the BEGIN message from the DCSF network element.

[0185] Step C11: The DCAS network element sends a CONTINUE message to the DCSF network element, and correspondingly, the DCSF network element receives the CONTINUE message from the DCAS network element. The CONTINUE message indicates that the DCAS network element has completed the relevant processing logic and allows the DCSF network element to continue processing the current session.

[0186] Step C12: The VoLTE AS network element sends a media resource creation request to the MF network element, and the MF network element receives the media resource creation request from the VoLTE AS network element.

[0187] Step C13: The VoLTE AS network element sends a 183 response message, a 180 ringing message, and a 200 response message 1 to UE1. Correspondingly, UE1 receives the 183 response message, the 180 ringing message, and the 200 response message 1 from the VoLTE AS network element. This step C13 can be referred to steps A9-A13 above, and will not be repeated here.

[0188] Step C14: UE1 sends an ACK message to the VoLTE AS network element, and the VoLTE AS network element receives the ACK message from UE1 accordingly.

[0189] Step C15: The VoLTE AS network element sends a re-INVITE message to UE1, and UE1 receives the re-INVITE message from the VoLTE AS network element. The re-INVITE message is used to request a session update and includes SDP information (used to negotiate session media parameters, such as audio and video).

[0190] Step C16: UE1 sends a 200 Response Message 2 to the VoLTE AS network element. Correspondingly, UE1 receives a 200 Response Message 2 from the VoLTE AS network element. The 200 Response Message 2 is used to acknowledge receipt of the re-INVITE message. The 200 Response Message 2 includes SDP information (used to negotiate media parameters for the session, such as audio and video).

[0191] Step C17: The VoLTE AS network element sends an update media resource request to the MF network element, and correspondingly, the MF network element receives the update media resource request from the VoLTE AS network element.

[0192] Step C18: The VoLTE AS network element sends an ANSWER message to the DCSF network element, and correspondingly, the DCSF network element receives the ANSWER message from the VoLTE AS network element. The ANSWER message indicates that the session has been established.

[0193] Step C19: The DCSF network element sends an ANSWER message to the DCAS network element, and correspondingly, the DCAS network element receives the ANSWER message from the DCSF network element.

[0194] It is understandable that in step C10 above, after the DCAS network element receives the BEGIN message from the DCAS network element, the DCAS network element can determine that UE1 needs to use the enabled third-party assistant service and has been matched with a third-party application (e.g., analog star application) based on the fact that the calling number is the phone number of the user to which UE1 belongs, the called number is the third-party application number, the call type is C2M call, and UE1 has subscribed to enable the third-party assistant service and has been matched with a third-party application (e.g., analog star application).

[0195] In step C8 above, UE1 requests to establish a media channel with the VoLTE AS network element by sending an INVITE message to the VoLTE AS network element; in step C12 above, the VoLTE AS network element requests to establish a media channel with the MF network element by sending a media resource creation request to the MF network element.

[0196] S502, the second network device receives second information. The second information indicates that a media channel has been successfully established between the first terminal device and the media function network element. The media channel is used to transmit media information corresponding to the first service required by the first terminal device.

[0197] In this embodiment of the application, the second information may be encapsulated or carried in a Hypertext Transfer Protocol (HTTP) message, or it may be encapsulated or carried in other messages. This embodiment of the application does not limit this.

[0198] The media channel between the first terminal device and the media function network element has been successfully established. This can be understood as the first terminal device being successfully called as the called terminal device or as the calling terminal device in an M2C call, C2M call, or C2C call, and the media channel between the first terminal device and the MF network element has been successfully established.

[0199] For example, as shown in Figure 6a, in step A19 above, after the DCAS network element receives the outgoing party's response from the VoLTE AS network element, the DCAS network element can determine that UE1 has been successfully called as the called terminal in the M2C call, and that the media channel between UE1 and the MF network element has been successfully established.

[0200] For example, as shown in Figure 6b, in step B16 above, after the DCAS network element receives the ANSWER message from the DCSF network element, the DCAS network element can determine that UE1 has successfully made a call as the calling terminal in the C2M call, and that the media channel between UE1 and the MF network element has been successfully established.

[0201] For example, as shown in Figure 6c, in step C19 above, after the DCAS network element receives the ANSWER message from the DCSF network element, the DCAS network element can determine that UE1 has successfully made a call as the calling terminal in the C2M call, and that the media channel between UE1 and the MF network element has been successfully established.

[0202] S503, the second network device sends the first information, and correspondingly, the first network device receives the first information.

[0203] In the embodiments of this application, the first information may be encapsulated or carried in an HTTP message, or it may be encapsulated or carried in other messages. The embodiments of this application do not limit this.

[0204] It is understood that the first information may include the identification information of the first terminal device and the identification information of the first service. The identification information of the first terminal device and the identification information of the first service are used to instruct the first network device to establish a first session with the first terminal device to implement the first service. This instruction method can be understood as an indirect instruction method. Alternatively, the first information may include first instruction information, which is used to instruct the first network device to establish a first session with the first terminal device to implement the first service. This instruction method can also be understood as a direct instruction method. This application does not limit this aspect.

[0205] For example, as shown in Figure 6d, taking the first service required by the first terminal device as the virtual character caller service, the first terminal device as UE1, the first network device as the SAAF network element, the second network device as the DCAS network element, the media function network element as the MF network element, and the voice service processing network element as the VoLTE AS network element as an example, after performing step A21 shown in Figure 6a above, this application can also perform the following steps.

[0206] Step A22: The DCAS network element sends a service activation request to the SAAF network element through the DCSF network element. Correspondingly, the SAAF network element receives the service activation request from the DCAS network element through the DCSF network element. The service activation request includes information indicating the virtual person caller service that UE1 needs to use.

[0207] Step A23: The SAAF network element sends a service initiation response to the DCAS network element through the DCSF network element. Correspondingly, the DCAS network element receives the service initiation response from the SAAF network element through the DCSF network element.

[0208] Step A24: The DCSF network element sends a media stream copy request to the MF network element, and correspondingly, the MF network element receives the media stream copy request from the DCSF network element.

[0209] Step A25: The MF network element sends a push media replication request to the SAAF network element, and correspondingly, the SAAF network element receives the push media replication request from the MF network element.

[0210] Step A26: The SAAF network element sends a push media replication response to the MF network element, and the MF network element receives the push media replication response from the SAAF network element.

[0211] Step A27: The MF network element sends a media stream replication response to the DCSF network element, and the corresponding DCSF network element receives the media stream replication response from the MF network element.

[0212] It can be understood that steps A24-A27 are used to copy the media channel between UE1 and the MF network element to the media channel between UE1 and the SAAF network element, that is, to establish a media channel between UE1 and the SAAF network element. The media channel between UE1 and the SAAF network element can be used to transmit the media information corresponding to the virtual character call service that UE1 needs to use.

[0213] For example, as shown in Figure 6e, taking the first services required by the first terminal device as the food ordering assistant service and the telecommunications service assistant service, the first terminal device as UE1, the first network device as the SAAF network element, the second network device as the DCAS network element, the media function network element as the MF network element, and the voice service processing network element as the VoLTE AS network element as an example, after executing step B16 shown in Figure 6b above, this application can also execute the following steps.

[0214] Step B17: The DCAS network element sends a service activation request to the SAAF network element through the DCSF network element. Correspondingly, the SAAF network element receives the service activation request from the DCAS network element through the DCSF network element. The service activation request includes information indicating the food ordering assistant service and telecommunications service assistant service that UE1 needs to use.

[0215] Step B18: The SAAF network element sends a service initiation response to the DCAS network element through the DCSF network element. Correspondingly, the DCAS network element receives the service initiation response from the SAAF network element through the DCSF network element.

[0216] Step B19: The DCSF network element sends a media stream copy request to the MF network element, and correspondingly, the MF network element receives the media stream copy request from the DCSF network element.

[0217] Step B20: The MF network element sends a push media replication request to the SAAF network element, and correspondingly, the SAAF network element receives the push media replication request from the MF network element.

[0218] Step B21: The SAAF network element sends a push media replication response to the MF network element, and the MF network element receives the push media replication response from the SAAF network element.

[0219] Step B22: The MF network element sends a media stream replication response to the DCSF network element, and the corresponding DCSF network element receives the media stream replication response from the MF network element.

[0220] It can be understood that steps B19-B22 are used to copy the media channel between UE1 and the MF network element to the media channel between UE1 and the SAAF network element, that is, to establish a media channel between UE1 and the SAAF network element. The media channel between UE1 and the SAAF network element can be used to transmit the media information corresponding to the food ordering assistant service and the telecommunications service assistant service that UE1 needs to use.

[0221] For example, as shown in Figure 6f, taking the service required by the first terminal device as enabling the third-party assistant service, the first terminal device as UE1, the first network device as SAAF network element, the second network device as DCAS network element, the media function network element as MF network element, the voice service processing network element as VoLTE AS network element, and the third-party application as a simulated star operation application as an example, after executing step C19 shown in Figure 6c above, this application can also execute the following steps.

[0222] In step C20, the DCAS network element sends a service initiation request to the SAAF network element through the DCSF network element. Correspondingly, the SAAF network element receives the service initiation request from the DCAS network element through the DCSF network element. The service initiation request includes information indicating the enabled third-party assistant service that UE1 needs to use.

[0223] Step C21: The SAAF network element sends a service initiation response to the DCAS network element through the DCSF network element. Correspondingly, the DCAS network element receives the service initiation response from the SAAF network element through the DCSF network element.

[0224] Step C22: The DCSF network element sends a media stream copy request to the MF network element, and correspondingly, the MF network element receives the media stream copy request from the DCSF network element.

[0225] Step C23: The MF network element sends a push media replication request to the SAAF network element, and correspondingly, the SAAF network element receives the push media replication request from the MF network element.

[0226] Step C24: The SAAF network element sends a push media replication response to the MF network element, and the MF network element receives the push media replication response from the SAAF network element.

[0227] Step C25: The MF network element sends a media stream replication response to the DCSF network element, and the corresponding DCSF network element receives the media stream replication response from the MF network element.

[0228] It can be understood that steps C22-C25 are used to copy the media channel between UE1 and the MF network element to the media channel between UE1 and the SAAF network element, that is, to establish a media channel between UE1 and the SAAF network element. The media channel between UE1 and the SAAF network element can be used to transmit media information corresponding to the enabling third-party assistant service that UE1 needs to use.

[0229] S504. The first network device and the first terminal device establish a first session. The first session is used to implement the first service.

[0230] In this embodiment of the application, after the first network device determines that the first terminal device needs to use the first service, it can establish a first session with the first terminal device to implement the first service.

[0231] In specific implementation, the first network device may send a first session establishment request to the first terminal device, and correspondingly, the first terminal device may receive the first session establishment request from the first network device. The first session establishment request can be used to request the establishment of a first session between the first network device and the first terminal device for implementing a first service.

[0232] The first terminal device can send a first session establishment response to the first network device, and correspondingly, the first network device can receive a first session establishment response from the first network device. The first session establishment response can be used to indicate that a first session for implementing the first service between the first network device and the first terminal device has been successfully established.

[0233] In one possible implementation, the first session establishment request may include second media information corresponding to the first service. The second media information corresponding to the first service may be one or more of audio, video, images, or text, and this application embodiment does not limit this. It is understood that the second media information corresponding to the first service can be used to indicate that the first session requested by the first session establishment request is for implementing the first service.

[0234] The first network device can generate the second media information corresponding to the first service in various ways. These will be described below.

[0235] Method 1: The first network device can generate second media information corresponding to the first service based on the parameter information of the first service and / or the first preference information of the first terminal device.

[0236] The first preference information of the first terminal device is related to the first service. For example, if the first service is a virtual character caller service, then the first preference information of the first terminal device could be the sports preference information of the user to whom the first terminal device belongs, such as the user liking basketball. As another example, if the first service is a food ordering assistant service and a telecommunications service assistant service, then the first preference information of the first terminal device could be the dining preference information and call package preference information of the user to whom the first terminal device belongs, such as the user liking Sichuan cuisine and calling package 1. As yet another example, if the first service is an enabled third-party assistant service and the matched third-party application is a simulated star trajectory application, then the first preference information of the first terminal device could be the star preference information of the user to whom the first terminal device belongs, such as the user liking Proxima Centauri.

[0237] For example, the first service is a virtual character call service. The parameter information for the first service may include the virtual character's identifier and opening remarks, such as the virtual character's name being Alice and the total number of characters in the opening remarks not exceeding 20. The first preference information of the first terminal device may be the sports preference information of the user to whom the first terminal device belongs, such as the user liking basketball. Then, the first network device can generate second media information corresponding to the first service. The second media information corresponding to the first service is a piece of text, the content of which is "Good morning, Alice is sending you good morning greetings. There's a basketball game at 10 o'clock, remember to watch!"

[0238] For example, the first service could be a food ordering assistant service and a telecommunications service assistant service. The parameter information for the first service could include authorization login information for the food ordering application and authorization login information for the telecommunications service application. The first preference information for the first terminal device could be the dining preferences and call package preferences of the user to whom the first terminal device belongs, such as the user liking Sichuan cuisine and calling package 1. Then, the first network device could generate second media information corresponding to the first service. This second media information is a text message with the content: "Hello, how can I help you? For example, if you like Sichuan cuisine, I've found several highly-rated Sichuan restaurants for you. If you're interested, I can help you make a reservation. Or, your preferred calling package 1 currently has a special offer. If you're interested, I can help you check the details and apply for it."

[0239] For example, the first service is to enable the third-party assistant service, and the matched third-party application is an application that simulates the orbital path of a star. The parameter information of the first service may include the access number information and the authorized login information of the third-party application. The first preference information of the first terminal device may be the star preference information of the user to which the first terminal device belongs. For example, if the user likes Proxima Centauri, then the first network device can generate the second media information corresponding to the first service. The second media information corresponding to the first service is a text message with the content "Hello, how can I help you now? For example, simulating the orbital path of Proxima Centauri."

[0240] Method 2: The first network device can determine the second description information based on the parameter information of the first service and / or the first preference information of the first terminal device, and interact with the AI ​​or ML model to obtain the second media information corresponding to the first service.

[0241] The second descriptive information describes the second objective task corresponding to the first business. The second descriptive information can be text and may consist of one or more of the following: roles, rules, objectives, context, examples, constraints, thought chain cues, user instructions, short-term memory, or long-term memory.

[0242] The second media information corresponding to the first service is generated based on the second descriptive information. This can be understood as follows: the first network device can input the second descriptive information into an AI or ML model, which can then perform semantic analysis on the second descriptive information and execute the second target task corresponding to the first service to generate the third media information corresponding to the first service. The third media information corresponding to the first service can be one or more of audio, video, images, or text; this embodiment does not limit this.

[0243] The first network device can use the third media information corresponding to the first service as the second media information corresponding to the first service. In other words, the second description information is the input information of the AI ​​or ML model, and the second media information corresponding to the first service is the output information of the AI ​​or ML model.

[0244] Alternatively, the first network device can also process the third media information corresponding to the first service (e.g., text-to-speech (TTS) processing) to obtain the second media information corresponding to the first service. In other words, the second descriptive information is the input information of the AI ​​or ML model, and the second media information corresponding to the first service is the processed output information of the AI ​​or ML model.

[0245] For example, the first service is a virtual character call service. The parameter information of the first service includes virtual character identification information and opening remarks, such as the virtual character's name being Alice and the total number of characters in the opening remarks not exceeding 20. The first preference information of the first terminal device is the sports preference information of the user to which the first terminal device belongs, such as the user liking basketball. The first network device can determine the second description information as shown in Table 3 below based on the virtual character identification information, the opening remarks information, and the sports preference information of the user to which the first terminal device belongs.

[0246] Table 3

[0247] The first network device inputs the second description information as shown in Table 3 into the AI ​​or ML model to generate the second media information corresponding to the first service. The second media information corresponding to the first service is a piece of text, the content of which is "Good morning, Alice is sending you good morning, there is a basketball game at 10 o'clock, remember to watch it".

[0248] For example, the first service is a food ordering assistant service and a telecommunications service assistant service. The parameter information of the first service includes the authorization login information of the food ordering application and the authorization login information of the telecommunications service application. The first preference information of the first terminal device is the dining preference information and call package preference information of the user to which the first terminal device belongs, such as the user liking Sichuan cuisine and liking call package 1. Then, the first network device can determine the second description information as shown in Table 4 below based on the authorization login information of the food ordering application, the authorization login information of the telecommunications service application, the dining preference information and call package preference information of the user to which the first terminal device belongs.

[0249] Table 4

[0250] The first network device inputs the second description information as shown in Table 4 into the AI ​​or ML model to generate the second media information corresponding to the first service. The second media information corresponding to the first service is a piece of text, the content of which is: "Hello, is there anything I can help you with? For example, if you like Sichuan cuisine, I have found several highly rated Sichuan restaurants for you. If you are interested, I can help you make a reservation. Or, your preferred call package 1 currently has a special offer. If you are interested, I can help you check the details and apply for it."

[0251] For example, if the first service is to enable the third-party assistant service and the matched third-party application is a simulation of star orbits, the parameter information of the first service includes the third-party application access number information and the third-party application authorization login information. The first preference information of the first terminal device can be the star preference information of the user to which the first terminal device belongs, such as the user liking Proxima Centauri. Then, the first network device can determine the second description information as shown in Table 5 below based on the third-party application access number information, the third-party application authorization login information, and the star preference information of the user to which the first terminal device belongs.

[0252] Table 5

[0253] The first network device inputs the second description information as shown in Table 5 into the AI ​​or ML model to generate the second media information corresponding to the first service. The second media information corresponding to the first service is a piece of text, the content of which is "Hello, is there anything I can help you with? For example, simulating the orbit of Proxima Centauri".

[0254] In one possible implementation, as shown in FIG6g, after executing S503, the present application may further perform the following steps.

[0255] S503a, the first network device sends the identification information of the first terminal device and the identification information of the first service to the third network device. Correspondingly, the third network device receives the description information of the first terminal device and the identification information of the first service from the first network device. The third network device is used to store the terminal device's preference information.

[0256] S503b: The third network device sends the first preference information of the first terminal device to the first network device, and correspondingly, the first network device receives the first preference information of the first terminal device from the third network device. The first preference information is related to the first service.

[0257] S505. The first network device receives user input information from the first terminal device through a first session, and determines first description information based on the user input information. The first description information describes the first target task corresponding to the first service.

[0258] In this embodiment of the application, the user input information may be one or more of audio, video, image or text, and this embodiment of the application does not limit this.

[0259] The first descriptive information can be text and consists of one or more of the following: roles, rules, goals, context, examples, constraints, mind chain cues, user instructions, short-term memory, or long-term memory.

[0260] In the specific implementation process, the first network device can determine the first description information based on the user input information and the task description information template corresponding to the first service.

[0261] For example, the first service is a virtual character call service. The user input information is a text message that reads, "Teacher Alice, when is your next TV series?" After receiving the user input information from the first terminal device through the first session, the first network device can obtain the first description information as shown in Table 4 based on the user input information and the task description information template corresponding to the first service.

[0262] Table 6

[0263] For example, the first service is the food ordering assistant service and the telecommunications service assistant service. The user input information is a text message that reads, "Introduce the benefits of call package 1." After receiving the user input information from the first terminal device through the first session, the first network device can determine, based on the user input information, that the service currently needed by the first terminal device is the telecommunications service assistant service between the food ordering assistant service and the telecommunications service assistant service. Then, based on the user input information and the task description information template corresponding to the telecommunications service assistant service, it obtains the first description information as shown in Table 7.

[0264] Table 7

[0265] It is understandable that when the first service includes multiple services, the first network device can select the service that the first terminal device needs to use from the multiple services according to the user input information, and then determine the first description information according to the user input information and the task description information template corresponding to the service that the first terminal device needs to use.

[0266] For example, the first service is to enable the third-party assistant service, and the matched third-party application is a simulation of star orbit. The user input information is a text message containing the phrase "simulate the orbit of Proxima Centauri". After receiving the user input information from the first terminal device through the first session, the first network device can obtain the first description information as shown in Table 8 based on the user input information and the task description information template corresponding to the first service.

[0267] Table 8

[0268] In one possible implementation, different services correspond to different task description information templates. For example, the correspondence between services and task description information templates is shown in Table 9 below.

[0269] Table 9

[0270] It is understood that different task description information templates corresponding to different services can be generated by the first network device or by a third-party application. For example, the task description information templates for the virtual character caller service, the food ordering assistant service, and the telecommunications service assistant service are generated by the first network device, while the task description information template for enabling the third-party assistant service is generated by the third-party application that matches the enabling third-party assistant service. This application does not limit this aspect.

[0271] The task description information template generated by the first network device can be stored in the first network device. When the second network device determines that the first terminal device needs to use the first service, it can send the identification information of the first service to the first network device through the first information. The first network device can then determine the task description information template corresponding to the first service based on the identification information of the first service. For example, if the first service is service 1, then the first network device can determine that the task description information template corresponding to the first service is task description information template 1.

[0272] The task description information template generated by the third-party application can be stored in the second network device. For example, as shown in Figure 6c above, in step C0, the registration request for the third-party application may include the task description information template corresponding to enabling the third-party assistant service generated by the third-party application. After the second network device determines that the first terminal device needs to use the enabled third-party assistant service, it can send the task description information template corresponding to enabling the third-party assistant service to the first network device through the first information.

[0273] S506. The first network device interacts with an AI or ML model to obtain first media information corresponding to the first service. The AI ​​or ML model has semantic analysis capabilities, and the first media information is generated based on the first descriptive information.

[0274] In this embodiment of the application, after the first network device determines the first description information based on the user input information, it can interact with the AI ​​or ML model to obtain the first media information corresponding to the first service.

[0275] The first media information corresponding to the first service can be one or more of audio, video, images, or text.

[0276] The first media information corresponding to the first service is generated based on the first description information. This can be understood as follows: the first network device can input the first description information into an AI or ML model, and the AI ​​or ML model can perform semantic analysis on the first description information and then execute the first target task corresponding to the first service to generate the fourth media information corresponding to the first service. The fourth media information corresponding to the first service can be one or more of audio, video, images, or text; this application embodiment does not limit this.

[0277] The first network device can use the fourth media information corresponding to the first service as the first media information corresponding to the first service. That is, the first description information is the input information of the AI ​​or ML model, and the first media information corresponding to the first service is the output information of the AI ​​or ML model.

[0278] Alternatively, the first network device can also process the fourth media information corresponding to the first service (e.g., TTS processing) to obtain the first media information corresponding to the first service. That is, the first description information is the input information of the AI ​​or ML model, and the first media information corresponding to the first service is the processed output information of the AI ​​or ML model.

[0279] For example, the first network device inputs the first description information shown in Table 6 above into the AI ​​or ML model to generate the first media information corresponding to the first service. The first media information is a piece of text, the content of which is "There is a basketball game by Jack on the 15th of next month, stay tuned".

[0280] For example, the first network device inputs the first description information shown in Table 7 above into the AI ​​or ML model to generate the first media information corresponding to the first service. The first media information is a piece of text, the content of which is "Call Package 1 currently offers an automatic 10 yuan discount after the first top-up of 100 yuan. Would you like to apply?"

[0281] For example, the first network device inputs the first description information shown in Table 8 above into the AI ​​or ML model to generate the first media information corresponding to the first service. The first media information is a video, and the content of the video is the orbital path of Proxima Centauri.

[0282] In one possible implementation, as shown in FIG6g, after executing S506, the present application may further perform the following steps.

[0283] S506a. The first network device sends the first media information corresponding to the first service to the first terminal device through the first session. Correspondingly, the first terminal device receives the first media information corresponding to the first service from the first network device through the first session.

[0284] According to the above scheme, the second network device can instruct the first network device that the first terminal device needs to use the first service, so that the first network device can determine that the AI ​​or ML model needs to perform the task corresponding to the first service. Then, based on the user input information from the first terminal device, the description information of the first target task corresponding to the first service that the AI ​​or ML model needs to perform is determined, which improves the accuracy of the AI ​​or ML model in performing the task and avoids the problem of answering the wrong question.

[0285] In the embodiments provided above, the methods provided by the embodiments of this application are described using the execution of a first network device and a second network device as examples. In this application, each embodiment can be implemented independently or in combination based on certain inherent connections; in each embodiment, different implementation methods can be implemented in combination or independently. To achieve the functions in the methods provided by the embodiments of this application above, the steps executed by the terminal device can be implemented by different functional entities constituting the terminal device. The steps executed by the network device can be implemented by different functional entities constituting the network device. To achieve the functions in the methods provided by the embodiments of this application above, the first network device and the second network device may include hardware structures and / or software modules, implementing the above functions in the form of hardware structures, software modules, or hardware structures plus software modules. Whether a certain function is executed in the form of hardware structures, software modules, or hardware structures plus software modules depends on the specific application and design constraints of the technical solution.

[0286] The methods provided by the embodiments of this application have been described above with reference to the accompanying drawings. The apparatus provided by the embodiments of this application will be described below with reference to the accompanying drawings.

[0287] Based on the same technical concept, embodiments of this application provide a communication device, which includes a module / unit / means for executing the method performed by the device in the above-described method embodiments. This module / unit / means can be implemented in software, or in hardware, or implemented by hardware executing corresponding software.

[0288] For example, referring to Figure 7, which is a schematic diagram of a communication device 700, the device 700 includes a transceiver module 701 and a processing module 702. This device can be the first network device or the second network device described above.

[0289] When the device 700 is a first network device, the functions of each module of the device 700 are as follows:

[0290] Transceiver module 701 is used to receive first information, the first information being used to indicate the first service that the first terminal device needs to use;

[0291] Processing module 702 is used to establish a first session with the first terminal device, the first session being used to implement the first service;

[0292] The processing module 702 is further configured to receive user input information from the first terminal device through the first session, and determine first description information based on the user input information, wherein the first description information is used to describe the first target task corresponding to the first service;

[0293] The processing module 702 is also used to interact with an AI or ML model to obtain first media information corresponding to the first service. The AI ​​or ML model has semantic analysis function, and the first media information is generated based on the first description information.

[0294] In one possible implementation, the transceiver module 701 is further configured to send the first media information to the first terminal device through the first session.

[0295] In one possible implementation, the processing module 702 is further configured to determine the first description information based on the user input information and the task description information template corresponding to the first service.

[0296] In one possible implementation, the first service includes multiple services; the processing module 702 is further configured to select one service from the multiple services based on the user input information; and determine the first description information based on the user input information and the task description information template corresponding to the service.

[0297] In one possible implementation, the first information includes the identification information of the first service; the processing module 702 is further configured to determine the task description information template corresponding to the first service based on the identification information of the first service.

[0298] In one possible implementation, the first information includes a task description information template corresponding to the first service.

[0299] In one possible implementation, the processing module 702 is further configured to determine second description information based on the parameter information of the first service and / or the first preference information of the first terminal device, wherein the first preference information is related to the first service and the second description information is used to describe the second target task corresponding to the first service; the processing module 702 is further configured to interact with the AI ​​or ML model to obtain second media information corresponding to the first service, wherein the second media information is generated based on the second description information; the transceiver module 701 is further configured to send a first session establishment request to the first terminal device, wherein the first session establishment request includes the second media information; the transceiver module 701 is further configured to receive a first session establishment response from the first terminal device.

[0300] In one possible implementation, the first information includes the identification information of the first terminal device and the identification information of the first service; the transceiver module 701 is further configured to send the identification information of the first terminal device and the identification information of the first service to a third network device, wherein the third network device is configured to store the preference information of the terminal device; and to receive the first preference information from the third network device.

[0301] Alternatively, when the device 700 is a second network device, the functions of each module of the device 700 are as follows:

[0302] Processing module 702 is used to determine first information, the first information being used to indicate the first service that the first terminal device needs to use;

[0303] The transceiver module 701 is used to receive second information, which indicates that a media channel between the first terminal device and the media function network element has been successfully established, and the media channel is used to transmit media information corresponding to the first service.

[0304] The transceiver module 701 is used to send the first information to the first network device according to the second information, instructing the first network device to establish a first session with the first terminal device, and the first session is used to implement the first service.

[0305] In one possible implementation, before receiving the second information, the processing module 702 is further configured to establish the media channel between the first terminal device and the media function network element through the voice service processing network element when the first condition is met or when the second session establishment request is received; wherein, the second session establishment request is used to request the establishment of a session between the first terminal device and the second terminal device or the first network device.

[0306] In one possible implementation, the first condition includes the identification information of the first terminal device, the identification information of the first service, and the condition for triggering the first service. The processing module 702 is specifically used to determine the first information based on the identification information of the first terminal device and the identification information of the first service when the first condition is met.

[0307] In one possible implementation, the second session establishment request includes call information. The processing module 702 is specifically configured to, upon receiving the second session establishment request, determine the first service from the subscribed services of the first terminal device based on the call information.

[0308] In one possible implementation, the call information includes one or more of the following: calling number, called number, or call type.

[0309] In one possible implementation, the first information includes one or more of the following: identification information of the first terminal device; identification information of the first service; parameter information of the first service; and a task description information template corresponding to the first service.

[0310] In one possible implementation, the first information includes a task description information template corresponding to the first service. Before receiving the second information, the transceiver module 701 is further configured to receive a registration request from a third-party application. The registration request includes the identification information of the first service and the task description information template corresponding to the first service.

[0311] In one possible implementation, the processing module 702 is further configured to determine the first service from the subscribed services of the first terminal device based on the call information of the first terminal device.

[0312] In one possible implementation, the call information includes one or more of the following: calling number, called number, or call type.

[0313] In one possible implementation, the first information includes one or more of the following: identification information of the first terminal device; identification information of the first service; parameter information of the first service; and a task description information template corresponding to the first service.

[0314] In practical implementation, the above-mentioned device 700 can have various product forms. Several possible product forms are introduced below.

[0315] Referring to Figure 8, which is a schematic diagram of another communication device, the communication device 800 includes a processor 801, which uses logic circuits or execution instructions to implement the methods executed by the communication device or terminal device in the above method embodiments.

[0316] Optionally, the communication device 800 may further include an interface circuit 802, which is used to receive signals from other communication devices outside the communication device and transmit them to the processor 801, or to send signals from the processor 801 to other communication devices outside the communication device. The processor 801 and the interface circuit 802 are coupled to each other. It is understood that the interface circuit 802 can be a transceiver or an input / output interface.

[0317] Optionally, the communication device 800 may also include a memory 803 for storing instructions executed by the processor 801, or storing input data required by the processor 801 to execute instructions, or storing data generated after the processor 801 executes instructions.

[0318] It should be understood that the processor mentioned in the embodiments of this application can be implemented in hardware or software. When implemented in hardware, the processor can be a logic circuit, integrated circuit, etc. When implemented in software, the processor can be a general-purpose processor, implemented by reading software code stored in memory.

[0319] For example, the processor can be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), microprocessor units (MPUs), microcontroller units (MCUs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), artificial intelligence processors (AI processors) or neural processing units (NPUs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor.

[0320] It should be understood that the memory mentioned in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. Non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory can be cache or random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM).

[0321] It is understandable that when the processor is a general-purpose processor, DSP, ASIC, FPGA, or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, the memory (storage module) can be integrated into the processor.

[0322] It should be noted that the memories described herein are intended to include, but are not limited to, these and any other suitable types of memories.

[0323] Based on the same technical concept, embodiments of this application also provide a computer-readable storage medium storing a computer program or instructions, which, when executed by a processor, causes the methods executed by the first network device and the second network device in the above method embodiments to be implemented.

[0324] Based on the same technical concept, this application also provides a computer program product, which includes a computer program or instructions that, when executed by a processor, cause the methods executed by the first network device and the second network device in the above method embodiments to be implemented.

[0325] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0326] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to this application. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more blocks of the flowchart illustrations and / or one or more blocks of the block diagrams.

[0327] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and / or one or more block diagrams.

[0328] These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable apparatus, provide steps for implementing the functions specified in one or more flowcharts and / or one or more block diagrams.

Claims

1. A communication method, characterized in that, Applied to a first network device, the method includes: Receive first information from a second network device, the first information being used to indicate a first service that the first terminal device needs to use; A first session is established with the first terminal device, and the first session is used to implement the first service; The system receives user input information from the first terminal device through the first session, and determines first description information based on the user input information. The first description information is used to describe the first target task corresponding to the first service. Interacting with an artificial intelligence (AI) or machine learning (ML) model to obtain first media information corresponding to the first service, wherein the AI ​​or ML model has semantic analysis capabilities, and the first media information is generated based on the first descriptive information.

2. The method according to claim 1, characterized in that, The method further includes: The first media information is sent to the first terminal device through the first session.

3. The method according to claim 1 or 2, characterized in that, Based on the user input information, the first description information is determined, including: The first description information is determined based on the user input information and the task description information template corresponding to the first service.

4. The method according to claim 3, characterized in that, The first service includes multiple services; based on the user input information and the task description information template corresponding to the first service, the first description information is determined, including: Based on the user input information, select one service from the plurality of services; The first description information is determined based on the user input information and the task description information template corresponding to the service.

5. The method according to claim 3 or 4, characterized in that, The first information includes the identification information of the first service; the method further includes: Based on the identification information of the first service, determine the task description information template corresponding to the first service.

6. The method according to any one of claims 1-5, characterized in that, The first information includes a task description information template corresponding to the first service.

7. The method according to any one of claims 1-6, characterized in that, Establishing a first session with the first terminal device includes: Based on the parameter information of the first service and / or the first preference information of the first terminal device, second description information is determined, wherein the first preference information is related to the first service, and the second description information is used to describe the second target task corresponding to the first service. Interact with the AI ​​or ML model to obtain second media information corresponding to the first service, wherein the second media information is generated based on the second description information; Send a first session establishment request to the first terminal device, wherein the first session establishment request includes the second media information; Receive a first session establishment response from the first terminal device.

8. The method according to any one of claims 1-7, characterized in that, The first information includes the identification information of the first terminal device and the identification information of the first service; the method further includes: The identification information of the first terminal device and the identification information of the first service are sent to a third network device, wherein the third network device is used to store the preference information of the terminal device. Receive the first preference information from the third network device.

9. A communication method, characterized in that, The method, using a second network device, includes: Determine first information, which is used to indicate the first service that the first terminal device needs to use; Receive second information, the second information being used to indicate that a media channel between the first terminal device and the media function network element has been successfully established, the media channel being used to transmit media information corresponding to the first service; The first information is sent to the first network device according to the second information, instructing the first network device to establish a first session with the first terminal device, and the first session is used to implement the first service.

10. The method according to claim 9, characterized in that, Before receiving the second information, the method further includes: If the first condition is met or a second session establishment request is received, the media channel is established between the first terminal device and the media function network element through the voice service processing network element. The second session establishment request is used to request the establishment of a session between the first terminal device and the second terminal device or the first network device.

11. The method according to claim 10, characterized in that, The first condition includes the identification information of the first terminal device, the identification information of the first service, and the condition for triggering the first service. Determining the first information includes: If the first condition is met, the first information is determined based on the identification information of the first terminal device and the identification information of the first service.

12. The method according to claim 10, characterized in that, The second session establishment request includes call information, and the determination of the first information includes: Upon receiving the second session establishment request, the first service is determined from the subscribed services of the first terminal device based on the call information.

13. The method according to claim 12, characterized in that, The call information includes one or more of the following: the calling number, the called number, or the call type.

14. The method according to any one of claims 9-13, characterized in that, The first information includes one or more of the following: The identification information of the first terminal device; The identification information of the first service; The parameter information of the first service; The task description information template corresponding to the first service.

15. The method according to claim 14, characterized in that, The first information includes a task description information template corresponding to the first service. Before receiving the second information, the method further includes: Receive a registration request from a third-party application, the registration request including the identification information of the first service and the task description information template corresponding to the first service.

16. A communication device, characterized in that, The communication device includes a module for performing the method as described in any one of claims 1 to 8, or a module for performing the method as described in any one of claims 9 to 15.

17. A communication device, characterized in that, The communication device includes a processor, which is configured to perform the method as described in any one of claims 1 to 8, or the method as described in any one of claims 9 to 15.

18. A computer-readable storage medium, characterized in that, The computer-readable storage medium is used to store a computer program that, when run on a computer, causes the method as described in any one of claims 1 to 8 to be performed, or causes the method as described in any one of claims 9 to 15 to be performed.

19. A computer program product, characterized in that, The computer program product includes a computer program that, when run on a computer, causes the method as described in any one of claims 1 to 8 to be performed, or causes the method as described in any one of claims 9 to 15 to be performed.