Handling of a natural-language service provision request, and corresponding entities and computer programs
By employing vectorization and homomorphic encryption, the chatbot system ensures secure and confidential data processing and response generation, addressing privacy concerns in chatbot data management.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- ORANGE SA
- Filing Date
- 2025-12-17
- Publication Date
- 2026-06-25
Smart Images

Figure EP2025087539_25062026_PF_FP_ABST
Abstract
Description
Processing a request for the provision of a service in natural language, corresponding entities and computer programs
[0001] The invention belongs to the general field of communications, for example in the field of telecommunications.
[0002] It falls more specifically within the context of conversational agents or "chatbots" in English.
[0003] The invention more specifically aims at mechanisms to improve the processing of requests issued by user equipment.
[0004] Chatbots, also known as conversational agents, are essential tools in the modern digital landscape. These computer programs, capable of simulating a conversation with a user, are deployed across a variety of sectors to simplify interactions, automate tasks, and offer solutions that are accessible at any time.
[0005] More specifically, a conversational agent is designed to answer questions or perform tasks by interpreting requests received from user devices via a text or voice interface of the latter.
[0006] There are two main types of chatbots. The first type, based on a set of rules, follows pre-programmed scenarios and offers limited responses within a defined framework; this is the case, for example, with the interactive menus found on some websites. The second type, based on the use of artificial intelligence, uses technologies such as Natural Language Processing (NLP) and machine learning to analyze the context of the query and offer more fluid and personalized responses. Finally, a third type, called a hybrid, combines these two approaches to offer a balance between accuracy and adaptability.
[0007] The operation of a conversational agent is based on three main steps: receiving a request or service request, analyzing it, and providing a response to the user who made the service request.
[0008] When a chatbot receives a request from a user device (UE), it first breaks down the request using natural language processing algorithms to understand its meaning. Next, it determines the best possible response to the request using predefined rules or artificial intelligence models. Finally, the response is formulated as text or a voice message and sent back to the UE.
[0009] The advantages of chatbots are numerous. Available at any time, they can provide a quick response to a large number of simultaneous requests, thus reducing wait times and operational costs. Their adaptability, particularly in versions based on artificial intelligence, allows them to provide personalized and contextual responses.
[0010] The widespread use of chatbots raises concerns about the confidentiality of data exchanged during service requests, particularly when this data is sensitive, such as personal information like the user's name, postal or email address, behavioral data from past interactions with the chatbot, or even medical or financial information. This data is used to personalize responses, improve the chatbot's performance, or train the underlying artificial intelligence models.
[0011] However, collecting and handling this data when processing a service request is not without risk. Poor information management or a security breach can lead to privacy violations, thereby undermining user trust.
[0012] Consequently, there is a need for solutions to improve the protection of data used by a chatbot in the context of processing a service request.
[0013] To this end, and according to a first aspect, the invention relates to a communication method implemented by a first communication network management entity implementing a vectorization function, the method comprising: the generation of a plurality of encrypted vectors representing information relating to a service, obtained from an orchestration entity distinct from the first management entity, by applying said vectorization function to said information; the generation of a first encrypted vector by applying said vectorization function to a request for the provision of said service, said request being transmitted from a linguistic model implemented by a second communication network management entity; and the selection, from among the plurality of encrypted vectors, of at least a second encrypted vector.at least one second encrypted vector being selected by means of a comparison between the first encrypted vector and the plurality of encrypted vectors, the transmission, to the orchestration entity, of said at least one second selected encrypted vector.
[0014] The invention also relates to a method for orchestrating the processing of a service request implemented by an orchestration entity belonging to a telecommunications network, the method comprising: the transmission, to a first management entity (HE), distinct from the orchestration entity, of the communication network implementing a vectorization function of information relating to a service intended to be vectorized in the form of a plurality of encrypted vectors, and the reception, from the first management entity, of at least one second encrypted vector from among the plurality of encrypted vectors, representing information relating to said service, selected by means of a comparison between a first encrypted vector, representing a request for the provision of said service and the plurality of encrypted vectors, the transmission, to the second management entity,of an identifier of at least one second selected encrypted vector,
[0015] Correspondingly, the invention relates to a method for processing a request for the provision of a service by a second management entity of a communication network, said communication network further comprising a first management entity implementing a vectorization function, said second management entity implementing a linguistic model, the method comprising: the generation of a request for the provision of a service based on a set of parameters obtained beforehand, the transmission to the first management entity (HE) of said request for vectorization into a first vector and encryption of the first vector, the reception, from the orchestration entity, of at least one identifier of at least one second encrypted vector representing information relating to said service, the at least one second vector having been previously selected by means of a comparison with the first encrypted vector,representative of the request for the provision of said service and a plurality of encrypted vectors, representative of information relating to said service, the generation of a response to the request for the provision of said service based on the identifier of at least one second encrypted vector received.
[0016] The invention relies on the use of vectorization and encryption of vectorized data to ensure the confidentiality of data exchanged between the various entities involved in processing a service request. Thus, only the user's equipment that initiated the service request and the entities possessing the encryption function have access to the unencrypted content of the service request and the message containing the response to that service request. These entities include, in particular, the management entity implementing the vectorization function and the management entity implementing the language model.
[0017] In the present invention, a conversational agent relies on the use of several entities that collaborate to process a service request issued by a user device (UE). Thus, the conversational agent comprises a service request processing orchestration entity, a first management entity, distinct from the orchestration entity, implementing vectorization, and a second management entity implementing a language model. These different entities can be distributed within a communications network or co-located, for example, within the same data center.
[0018] The invention thus combines two new functions in the first management entity: a vectorization function and a comparison function, for example, of a semantic type. The vectorization is complemented by a vector encryption function, for example, of a homographic type. These vectors are communicated to the second management entity, which implements a linguistic model in prompts. The orchestration entity and the linguistic model call these vectorization and comparison functions to generate and compare the vectors. The orchestration entity thus delegates the production of the vectors to the first management entity, which is separate from the orchestration entity. The orchestration entity is responsible for calling the vectorization function and returning the URLs of the vectors stored in a database.
[0019] First, the orchestration entity requests the vectorization function to process the data in natural language. Then, the language model requests the vectorization function for the query. The vectors returned by this vectorization function are then used in the comparison function. A score is returned from this function to the second management entity, which uses this score to return the most relevant solution.
[0020] The orchestration entity can implement a RAG (Retrieval-Augmented Generation) function. Such a function combines two complementary steps: information retrieval and text generation. Thus, the orchestration entity implements a portion of the RAG function that involves searching for relevant data in one or more distributed databases, a document set, or external sources, using a search engine or a specialized model.
[0021] A second part of the RAG function concerns the generation of a response intended to be presented to the user of the EU. More specifically, this part of the RAG function is implemented by a generative model, such as the linguistic model discussed earlier.
[0022] Thus, the linguistic model uses this information to produce a coherent and contextual text. This method makes it possible to generate accurate and up-to-date responses, even in situations where the model's static knowledge is insufficient or outdated.
[0023] Finally, the first entity implements a vectorization function. In the solution object of the invention, the vectorization function combines a vectorization function and an encryption function.
[0024] A vectorization function allows complex objects to be represented as numerical vectors while preserving the essential relationships between these objects. More specifically, this technique relies on transforming data into a vector space, where initial similarities or relationships, such as semantic proximities or logical correspondences, are maintained. For example, in natural language processing, algorithms like Word2Vec or BERT allow words to be represented in such a way that their semantic relationships are reflected by vector operations.
[0025] Other transformations also allow us to assess the extent to which two elements (texts, structures, objects or concepts) share a similar meaning or significance, including in the field of cryptography when the transformations give similar or identical results without revealing the original data.
[0026] The encryption function can be, among other things, a homomorphic encryption function. A homomorphic encryption function allows operations to be performed on encrypted data without requiring decryption. This encryption technique is based on a single principle: operations performed on encrypted data, once decrypted, produce a result identical to that obtained if the calculations had been performed on plaintext data.
[0027] Homomorphic encryption is given as an example, but any encryption solution exhibiting the characteristics of homomorphic encryption, namely manipulating, processing, or analyzing data securely without compromising its confidentiality, i.e., not requiring decryption of the data to perform operations on the data, can be used.
[0028] There are two main forms of homomorphic encryption: partial homomorphism, which limits possible operations to addition or multiplication, and full homomorphism, or FHE for "Fully Homomorphic Encryption," which allows complex calculations combining these two types of operations. This capability makes homomorphic encryption particularly well-suited to contexts where confidentiality is paramount. For example, it allows companies to process sensitive data in distributed databases without ever exposing this information in plaintext. Similarly, in the medical field, it guarantees the security of patient data while facilitating its analysis.
[0029] In particular implementations of the communication process, the selection of said at least one second cipher vector includes the determination of a score from the comparison between the first cipher vector and the plurality of cipher vectors.
[0030] In particular implementations of the communication process, which of the at least second selected cipher vector is the vector with the highest score?
[0031] In particular implementations of the communication process, said at least one second selected cipher vector is the vector whose score is greater than or equal to a predetermined threshold.
[0032] In particular implementations of the communication process, the plurality of encrypted vectors representing information relating to a service is associated with a conversational agent exploiting the linguistic model.
[0033] In particular implementations of the communication process, prior to the reception of the first encrypted vector, it includes: the reception, from the orchestration entity, of a message including at least one instruction intended to be used by the language model to process the request for the provision of said service, the vectorization of at least one received instruction and the encryption of the vector representing at least one received instruction and the transmission, to the management entity, of the encrypted vector representing at least one received instruction and a set of parameters of the vectorization function.
[0034] The instructions for processing the service request constitute what is called a "prompt system".
[0035] In general, a "prompt" is a set of instructions provided to the language model so that it generates a response to a request issued by a UE. There are three types of "prompts": "user prompts", "assistant prompts", and "system prompts".
[0036] A "user prompt" corresponds to the request issued by a UE such as, for example, "what is the best restaurant in Lannion?".
[0037] A "prompt assistant" represents the response generated by the language model, such as, "The best restaurant in Lannion is 'Chez Orange' and is located near Lannion airport." A "prompt assistant" is a function of the "user prompt" being processed and the "system prompt."
[0038] Finally, a "prompt system" provides an overall context for processing received requests and influences how the exchanges between the language model and the EU take place. A "prompt system" includes a set of instructions defining the behavior of the language model (e.g., advise, inform, etc.), the tone of the conversation (formal, friendly, professional), the specific knowledge required (remain factual or be creative in the answers given), and the limitations to be applied when a response is generated (do not give medical advice, do not give illegal answers, etc.).
[0039] An example of a "prompt system" might be: "respond concisely and informatively, in a friendly tone" or "provide detailed and factual explanations. Avoid opinions."
[0040] The parameter set of the vectorization function includes all the parameters that determine how textual data is transformed into numerical vectors usable by a machine learning model.
[0041] Such a set of parameters includes, among other things: parameters representing the preprocessing applied to the data to be vectorized, such as cleaning the texts to remove special or unnecessary characters, converting to lowercase to standardize the data, removing stopwords (frequent but uninformative words, such as "the", "and", "of") and using techniques like "stemming" which reduce words to their root or canonical form; a parameter relating to the size of the vocabulary, which defines the number of words taken into account; it can be chosen, for example, to keep only the most frequent words or exclude those appearing too rarely; parameters relating to the representation of the vectors: the generated vectors can be sparse, that is, contain mainly zeros, or dense, that is, with continuous values;parameters relating to the dimensionality of the vectors, i.e. the number of coefficients constituting the vectors; dimensionality is a function of the vectorization method chosen and the size of the vocabulary; parameters representing the level of analysis to be applied to the data; the data can, in fact, be analyzed at the level of characters, words, or using n-grams (sequences of words or characters, such as bigrams or trigrams); and parameters for generating missing values, for example by filling in the vectors for words absent from the vocabulary or incomplete texts.
[0042] In particular implementations of the communication process, the comparison between the first ciphertext vector and the plurality of ciphertext vectors is a semantic type comparison.
[0043] In particular implementations of the communication process, the encryption function applied to the vectors obtained by applying the vectorization function is an encryption function identical or equivalent to a homomorphic encryption function.
[0044] In particular implementations of the orchestration process, the sending to the first management entity of a message including at least one instruction intended to be used by the language model of the second management entity to process the request for the provision of said service, the reception, from the first management entity: of a set of parameters of the vectorization function implemented by the first management entity, and of a third encrypted vector representing at least one instruction, the transmission to the second management entity implementing the language model of an initialization message including the set of parameters of the vectorization function and an identifier of the third encrypted vector representing at least one instruction.
[0045] The invention further relates to a telecommunications network entity implementing a vectorization function, said entity comprising: a generation module configured to generate a plurality of ciphertext vectors representing information relating to a service, obtained from an orchestration entity distinct from the first management entity, by applying said vectorization function to said information; a generation module configured to generate a first ciphertext vector by applying said vectorization function to a request for the provision of said service, said request being transmitted from a linguistic model implemented by a second management entity of the communication network; and the selection, from among the plurality of ciphertext vectors, of at least one ciphertext vector, the at least one second ciphertext vector being selected by means of a comparison between the first ciphertext vector and the plurality of ciphertext vectors.a transmission module configured to transmit, to the orchestration entity, said at least one second selected encrypted vector.
[0046] The invention also relates to an orchestration entity for processing a service request belonging to a telecommunications network, the orchestration entity comprising: a transmission module configured to transmit, to a first management entity, distinct from the communication network orchestration entity implementing a vectorization function, information relating to a service intended to be vectorized in the form of a plurality of encrypted vectors, said information being obtained from a second communication network management entity implementing a linguistic model, and a reception module configured to receive, from the first management entity, at least one second encrypted vector from among the plurality of encrypted vectors, representative of information relating to said service, selected by means of a comparison with a first encrypted vector,representative of a request for the provision of said service and the plurality of encrypted vectors, a transmission module configured to transmit, to the second management entity, an identifier of at least one second selected encrypted vector.
[0047] The invention also relates to a telecommunications network entity implementing a linguistic model capable of processing a service request, the entity comprising: a generation module configured to generate a service request based on a set of previously obtained parameters, a transmission module configured to transmit said request to the first management entity (HE) for vectorization into a first vector and encryption of the first vector, a reception module configured to receive, from the orchestration entity, at least one identifier of at least one second encrypted vector representing information relating to said service, the at least one second vector having been previously selected by means of a comparison between the first encrypted vector, representing the service request, and a plurality of encrypted vectors.representative of information relating to said service, a generation module configured to generate a response to the request for the provision of said service based on the identifier of at least one second encrypted vector received.
[0048] Finally, the invention relates to a service request processing system comprising at least: an entity implementing a vectorization function according to the invention, an entity orchestrating the processing of a service request according to the invention, and a telecommunications network entity implementing a linguistic model capable of processing a service request according to the invention.
[0049] The invention also relates to a computer program on a recording medium, this program being capable of being implemented in a computer or more generally in an AP conforming to the invention and comprising instructions adapted to the implementation of a communication method as described above.
[0050] The invention further relates to a computer program on a recording medium, this program being capable of being implemented in a computer or more generally in an AP conforming to the invention and comprising instructions adapted to the implementation of a processing method as described above.
[0051] The invention finally relates to a computer program on a recording medium, this program being capable of being implemented in a computer or more generally in an AP conforming to the invention and comprising instructions adapted to the implementation of an orchestration process as described above.
[0052] Each of these programs can use any programming language, and be in the form of source code, object code, or code somewhere between source code and object code, such as in a partially compiled form, or in any other desirable form.
[0053] The invention also relates to an information carrier or a recording medium readable by a computer, and comprising instructions for a computer program as mentioned above.
[0054] The information or recording medium can be any entity or device capable of storing programs. For example, the medium may include a storage means, such as a ROM, for example a CD-ROM or a microelectronic circuit ROM, or a magnetic recording means, for example a hard drive, or a flash memory.
[0055] On the other hand, the information or recording medium can be a transmissible medium such as an electrical or optical signal, which can be carried via an electrical or optical cable, by radio link, by wireless optical link or by other means.
[0056] The programs according to the invention can in particular be downloaded onto an Internet-type network.
[0057] Alternatively, the information or recording medium may be an integrated circuit in which a program is incorporated, the circuit being adapted to execute or to be used in the execution of the processes according to the invention.
[0058] It can also be envisaged, in other embodiments, that the processes according to the invention and the AP according to the invention have in combination all or part of the aforementioned characteristics.
[0059] Other features and advantages of the present invention will become apparent from the description below, with reference to the accompanying drawings, which illustrate an example of an embodiment without being limiting in any way. In the figures:
[0060] lare represents a communication system 1, conforming to the invention in a particular embodiment;
[0061] larepresents schematically the hardware architecture of a computer on which the different entities conforming to the invention, belonging to the system of the, are based;
[0062] larepresente, in the form of a flowchart, the main steps of the processes which are the subject of the invention, as they are implemented by the different entities conforming to the invention belonging to the system of la. Description of the invention
[0063] Lare represents a communication system 1, conforming to the invention in a particular embodiment.
[0064] In this embodiment, system 1 includes: at least one HE entity implementing a vectorization function FV, an encryption function FC and a vector comparison function FComp; at least one RAG orchestration entity for processing a service request issued by a user device UE, at least one LLM processing entity for the service request, a database DB, and at least one user device UE.
[0065] In the following description and diagram, for the sake of simplicity, we consider only one HE entity, one RAG entity, one LLM entity, one DB database, and one UE user equipment, these entities belonging to an NW telecommunications network. Of course, a larger number of entities, databases, and UEs can be considered.
[0066] In the example above, the RAG orchestration entity implements a RAG function for "Retrieval-Augmented Generation." Such a function combines two complementary steps: information retrieval and text generation. Thus, the RAG orchestration entity implements a portion of the RAG function, which consists of searching for relevant data in one or more distributed databases, such as, but not limited to, the DB database, a set of documents, or external sources, using a search engine or a specialized model.
[0067] A second part of the RAG function concerns the generation of a response intended to be presented to the EU. More specifically, this part of the RAG function is also implemented by a generative model, such as a linguistic model implemented by the LLM entity.
[0068] The LLM entity implements several functions that constitute a linguistic model that can be used by a conversational agent or "chatbot". These functions include, but are not limited to, a FAN analysis function whose role is to process service requests received from a UE by analyzing them, and at least one FA assistant function to interact with the UE user in order to refine the analyzed service request and obtain data to generate a response to the service request, notably by exchanging information with the RAG entity, and then to generate a response to the service request that will be presented to the UE user.
[0069] The interactions between the different components of system 1 are described in more detail with reference to the.
[0070] In the embodiment described herein, the HE, RAG, and LLM entities have the hardware architecture of a computer as illustrated in Figure 1. This hardware architecture includes, in particular, a PROC processor, MEM random access memory, ROM read-only memory, NVM non-volatile memory, and COM communication means enabling the respective HE, RAG, and LLM entities to communicate with each other and optionally with the UE and with one or more other devices (not shown in the figures) belonging to the NW network. The NVM non-volatile memory constitutes a storage medium according to the invention, readable by the PROC processor, on which one or more programs according to the invention are stored.
[0071] A first program, denoted PROG1 when the hardware architecture of computer 2 is that of entity HE, is stored in the non-volatile NVM memory and contains instructions defining the main steps of a communication method according to the invention as implemented by entity HE. More specifically, it defines the functional modules of entity HE, which rely on and / or control all or part of the PROC, MEM, ROM, NVM, and COM elements of computer 2 mentioned above.
[0072] In the embodiment described herein, this first program PROG1 defines in particular the following functional modules of the HE entity (represented in the figure), which are activated during the implementation of the invention: a generation module (3A) configured to generate a plurality of ciphertext vectors representing information relating to a service, obtained from an orchestration entity distinct from the first management entity, by applying said vectorization function to said information; a generation module (3B) configured to generate a first ciphertext vector by applying said vectorization function to a request for the provision of said service, said request being transmitted from a linguistic model implemented by a second communication network management entity, and the selection, from among the plurality of ciphertext vectors, of at least one ciphertext vector.where at least one second ciphertext vector is selected by means of a comparison between the first ciphertext vector and the plurality of ciphertext vectors, a transmission module (3C) configured to transmit, to the orchestration entity (RAG), said at least one second selected ciphertext vector.
[0073] A second program, denoted PROG2 when the hardware architecture of computer 2 is that of the RAG entity, is stored in the NVM non-volatile memory and contains instructions defining the main steps of an orchestration process according to the invention as implemented by the RAG entity. More specifically, it defines the functional modules of the RAG entity, which rely on and / or control all or part of the PROC, MEM, ROM, NVM, and COM elements of computer 2 mentioned above.
[0074] In the embodiment described herein, this second program PROG2 defines in particular the following functional modules of the RAG entity (represented in the figure), which are activated during the implementation of the invention: a transmission module (4A) configured to transmit, to a first management entity, distinct from the communication network orchestration entity implementing a vectorization function, information relating to a service intended to be vectorized in the form of a plurality of encrypted vectors, said information being obtained from a second communication network management entity implementing a linguistic model, and a reception module (4B) configured to receive, from the first management entity, at least one second encrypted vector from among the plurality of encrypted vectors, representative of information relating to said service, selected by means of a comparison with a first encrypted vector,representative of a request for the provision of said service and the plurality of encrypted vectors, a transmission module (4C) configured to transmit, to the second management entity, an identifier of at least one second selected encrypted vector.
[0075] A third program, designated PROG3 when the hardware architecture of computer 2 is that of the LLM entity, is stored in the non-volatile NVM memory and contains instructions defining the main steps of a service request processing method according to the invention as implemented by the LLM entity. More specifically, it defines the functional modules of the LLM entity, which rely on and / or control all or part of the PROC, MEM, ROM, NVM, and COM elements of computer 2 mentioned above.
[0076] In the embodiment described here, this third program PROG3 defines in particular the following functional modules of the LLM entity (represented on the diagram), which are activated to process the service request: a generation module (5A) configured to generate a request for the provision of a service based on a set of parameters obtained beforehand, a transmission module (5B) configured to transmit, to the first management entity (HE), said request for its vectorization into a first vector and the encryption of the first vector, a reception module (5C) configured to receive, from the orchestration entity, at least one identifier of at least one second encrypted vector representing information relating to said service, the at least one second vector having been previously selected by means of a comparison between the first encrypted vector, representing the request for the provision of said service and a plurality of encrypted vectors,representative of information relating to said service, a generation module (5D) configured to generate a response to the request for the provision of said service based on the identifier of at least one second encrypted vector received.
[0077] The operation of modules 3A to 3C, 4A to 4C and 5A to 5D is detailed further later with reference to the steps of the communication, orchestration and processing methods according to the invention.
[0078] The document describes the main steps of the communication, processing and orchestration processes according to the invention, in a particular embodiment in which it is implemented by the entities HE, RAG and LLM.
[0079] In step E000, a conversational agent dedicated to a particular topic is created. In the remainder of this document, such a conversational agent is dedicated to searching for restaurants in a given geographic area. Of course, the invention applies to the creation and implementation of conversational agents that can relate to any other topics.
[0080] In an E010 step implemented by the RAG entity, the latter transmits a request to collect information relating to a given type of service, here information relating to restaurants such as their address, their opening hours, their menus, their prices, a URL (“Uniform Resource Locator” in English) pointing to their website, etc.
[0081] The data collected by this RAG entity is transmitted to the HE entity in step E020. An identifier of the conversational agent IdCBT is possibly also transmitted to the HE entity during this step.
[0082] Upon receiving the collected data and the chatbot identifier (IdCBT), the HE entity processes them. Thus, in step E030, each collected data point is subjected to a vectorization function (FV) which produces a vector representing the processed data or a set of processed data, referred to hereafter as a vector representing information related to a service.
[0083] A vectorization function allows complex objects to be represented as numerical vectors while preserving the essential relationships between these objects. More specifically, this technique relies on transforming data into a vector space, where initial similarities or relationships, such as semantic proximities or logical correspondences, are maintained. For example, in natural language processing, algorithms like Word2Vec or BERT allow words to be represented in such a way that their semantic relationships are reflected by vector operations.
[0084] An example of a vectorization function is the following: tokenization of the sentence. The sentence "A pizzeria in Lannion" is tokenized as follows: [A, pizzeria, in, Lannion]. Each token is then associated with a fixed-size vector, or "embedding matrix." Such an embedding matrix is, for example, a lookup table where each token is associated with a fixed-size vector. Such a matrix can be obtained using a neural network trained on a large word database.
[0085] One: [0.1, 0.2, 0.3, 0.4],
[0086] pizzeria: [0.5, 0.6, 0.7, 0.8],
[0087] of: [0.9, 1.0, 1.1, 1.2],
[0088] Lannion: [1.3, 1.4, 1.5, 1.6]
[0089] }the replacement of the tokens by the corresponding vector(s) or "embedding lookup"[0.1, 0.2, 0.3, 0.4], # 'A'
[0090] [0.5, 0.6, 0.7, 0.8], # 'pizzeria'
[0091] [0.9, 1.0, 1.1, 1.2], # 'of'
[0092] [1.3, 1.4, 1.5, 1.6] # 'Lannion'
[0093] ], and finally the grouping of the vectors representing the different tokens into a single fixed-size vector representing the processed sentence, or "pooling." Such grouping conforms to a given pooling strategy. Common pooling strategies include "mean pooling," "max pooling," etc. Without this example, we will use "mean pooling" (0.1 + 0.5 + 0.9 + 1.3) / 4.
[0094] (0.2 + 0.6 + 1.0 + 1.4) / 4,
[0095] (0.3 + 0.7 + 1.1 + 1.5) / 4,
[0096] (0.4 + 0.8 + 1.2 + 1.6) / 4
[0097] ] .
[0098] Thus, the fixed-size vector that represents the phrase "A pizzeria in Lannion" is: [0.7, 0.8, 0.9, 1.0].
[0099] Once the data is vectorized, the HE entity implements, in step E040, an encryption function FC which is applied to each vector representing information relating to a service obtained during step E030. The encryption function may be, in particular but not exclusively, a homomorphic encryption function.
[0100] A homomorphic encryption function allows operations to be performed on encrypted data without requiring decryption. This encryption technique is based on a single principle: operations performed on encrypted data, once decrypted, produce a result identical to that obtained if the calculations had been performed on plaintext data.
[0101] Once all the vectors representing information relating to an encrypted service have been collected, the HE entity transmits them (E050), along with possibly the conversational identifier IdCBT, to one or more distributed databases BD.
[0102] For each of the encrypted vectors representing information relating to a service thus stored, an identifier pointing to the database DB in which it is stored, such as a URL, is transmitted to the RAG entity in an E060 step.
[0103] The RAG entity generates (E070) service request processing instructions to configure the language model implemented by the LLM entity. These instructions constitute what are called a "prompt system," a "prompt user," and a "prompt assistant." This step E070 can be implemented before steps E020 or E060, after step E060, or concurrently with steps E020 or E060.
[0104] In general, a "prompt" is a set of instructions provided to the language model so that it generates a response to a request issued by a UE. There are three types of "prompts": "user prompts", "assistant prompts", and "system prompts".
[0105] An example of a "prompt system" might be: "respond concisely and informatively, in a friendly tone" or "provide detailed and factual explanations. Avoid opinions."
[0106] Once these instructions relating to the processing of a service request have been generated, the RAG entity transmits them (E080) to the HE entity which proceeds to vectorize them and then encrypt them (E090).
[0107] Once the vector representing the instructions for processing an encrypted service request is encrypted, the HE entity transmits them (E100) to one or more distributed databases.
[0108] Next, an identifier pointing to the database DB in which this vector is stored, such as a URL, is passed to the RAG entity in an E110 step.
[0109] In order to initialize the CBT conversational agent, the RAG entity obtains the encrypted vector representing the instructions for processing a service request from the DB database and transmits it (E120) to the LLM entity along with a set of parameters for the FC vectorization function, including all the parameters that determine how the text data is transformed into vectors.
[0110] An example of such a parameter set for a vectorization function includes, among other things: parameters representing the preprocessing applied to the data to be vectorized, such as cleaning the text to remove special or unnecessary characters, converting to lowercase to standardize the data, removing stopwords (frequent but uninformative words, such as "the", "and", "of") and using techniques like "stemming" which reduce words to their root or canonical form; a parameter relating to the size of the vocabulary, which defines the number of words taken into account; it can be chosen, for example, to keep only the most frequent words or exclude those appearing too rarely; parameters relating to the representation of the vectors: the generated vectors can be sparse, that is, mainly contain zeros, or dense, that is, with continuous values;Parameters relating to the dimensionality of the vectors, i.e., the number of coefficients constituting the vectors; dimensionality is a function of the vectorization method chosen and the size of the vocabulary; parameters representing the level of analysis to be applied to the data. The data can, in fact, be analyzed at the level of characters, words, or using n-grams (sequences of words or characters, such as bigrams or trigrams); and parameters for generating missing values, for example by filling in the vectors for words absent from the vocabulary or incomplete texts.
[0111] Once the chatbot is configured, it can be run to respond to service requests issued from UEs.
[0112] Thus, still referring to the previous example, a user of a UE (or a UE if it is a standalone UE that does not necessarily require a user. In the remainder of this embodiment, "user of a UE" can be replaced by UE in the context of a standalone UE with capabilities to implement the required actions) enters a service request via a human-machine interface of the UE, such as a touchscreen. Such a service request is, for example, "find a gourmet restaurant in the Lannion area."
[0113] In step E130, the UE transmits the service request to the HE entity so that the latter can vectorize the service request. In one embodiment, the resulting vector can also be encrypted by the HE entity. The HE entity then transmits (E140) the vector representing the service request to the UE, which in turn transmits it to the LLM entity (E150).
[0114] Upon receiving the vector representing the service request, the LLM entity's "role assistant" performs the analysis (E160) of the vector. This analysis is made possible because the "role assistant" has been previously trained using a large database of vectors representing words and sentences and has access to the parameters of the vectorization function used by the HE entity.
[0115] The role assistant then implements (E170) a question-and-answer loop during which it seeks to clarify the service request received from the user. This question-and-answer loop represents the interaction cycle between the user and the language model implemented within the LLM entity. Upon receiving the service request from the user, the role assistant determines the user's intent by extracting the relevant information contained within the service request. If the determined intent is ambiguous—that is, if it is not possible to provide a clear answer to the service request or if the user has follow-up questions—the question-and-answer loop continues until a clear intent from the user can be identified.
[0116] Thus, during the implementation of this question-and-answer loop, the role assistant generates a question for the user interface (UI). This question is transmitted as text so that the UI user can understand it. The UI then transmits the user's entered answer as text to the role assistant, who, before analyzing it, vectorizes the answer. These exchanges are repeated until the role assistant believes it has enough information to understand the UI user's need. The information collected in this way is called contextual data.
[0117] In step E180, the LLM entity transmits a consolidated service provision request, generated based on the service request received from the UE during step E150 and the contextual data collected during step E170, to the RAG entity. In a particular implementation of the invention, such a consolidated service provision request can also be encrypted. For this purpose, the LLM entity implements an FC encryption function identical to that implemented by the HE entity.
[0118] The RAG entity transmits a selection request including this consolidated service provision request and the chatbot identifier IdCBT to the HE entity in an E190 step. The HE entity then applies the FC vectorization function to this consolidated service provision request.
[0119] Using the chatbot identifier IdCBT, the HE entity obtains (E200) a set of encrypted vectors representing information about a service from the database DB in which it was stored during step E050.
[0120] The HE entity then implements the FC (E210) comparison function. Such a semantic comparison function, like the cosine similarity function, allows for the comparison of vectors. Semantic comparison is just one example of an implementation, and any solution that allows for the comparison of vectors and measures a sharing of meaning, a similar or close meaning, of data represented by two vectors can be used as an alternative to semantic comparison.
[0121] A semantic comparison involves measuring the similarity between two concepts, words, phrases, or texts based on their meanings. It aims to determine the extent to which two expressions share a common meaning or how different their meanings are. This can be done by comparing individual words, for example, by calculating the proximity between the terms "cat" and "feline," by analyzing the semantic similarity between two phrases or text fragments, such as "The sky is blue" and "The sky is clear," or by measuring the extent to which two documents (or longer texts) discuss similar topics or convey similar ideas.
[0122] To achieve this, the comparison function relies on the use of lexical databases in which hierarchical relationships (synonymy, antonymy, hypernymy, etc.) between words allow for measuring their proximity. The FC comparison function can also be based on the co-occurrence of words in large text corpora.
[0123] When dealing with vectors, as is the case in the present solution, the comparison function FC can be a function for determining a geometric proximity between two vectors reflecting the semantic similarity of the words or phrases that these vectors represent.
[0124] Such a comparison function calculates a semantic similarity score reflecting the similarity between two concepts, words, phrases, or texts represented by the vectors being compared, based on their meanings. Thus, the higher the value of this similarity score, the closer the meanings of the phrases or words represented by the compared vectors.
[0125] Once all vectors representing service information have been compared to the vector representing the consolidated service request, the HE entity transmits (E220) to the RAG entity a list of service information vectors with a high similarity score. The HE entity can choose the set of service information vectors with a similarity score above a given threshold, or the X service information vectors with the highest similarity scores among all service information vectors. In one example implementation, X might be 3. Of course, X can be any other value.
[0126] The RAG entity transmits (E230) to the LLM entity, for each of the vectors identified by the HE entity during step E210, the identifier pointing to the DB database in which the vectors representing information relating to an identified service are stored.
[0127] Following the receipt of this information, the LLM entity obtains (E250) the vectors representing information relating to an identified service from the DB database in which they are stored and transmits them to the HE entity so that it can decipher them (E260).
[0128] In a particular implementation of the invention in which the LLM entity implements an FC encryption function identical to that implemented by the HE entity, step E260 is not implemented and the LLM entity proceeds directly to decrypt the vectors representing information relating to a service.
[0129] Once the vectors representing information relating to a service have been deciphered, the LLM entity generates a response to the service request received during step E150, for example "the restaurant "Orange" is an excellent gourmet restaurant, it is located at No. Y rue Z in Lannion, it is open every evening from 7pm except Sunday. Proper attire is required" and transmits it (E270) to the EU.
Claims
A communication method implemented by a first management entity (HE) of a communication network implementing a vectorization function, the method comprising: the generation (E030) of a plurality of cipher vectors representing information relating to a service, obtained (E020) from an orchestration entity (RAG) distinct from the first management entity, by applying said vectorization function to said information; the generation (E180) of a first cipher vector by applying said vectorization function to a request for the provision of said service, said request being transmitted from a language model implemented by a second management entity (LLM) of the communication network; and the selection, from among the plurality of cipher vectors, of at least a second cipher vector.at least one second ciphertext vector being selected by means of a comparison between the first ciphertext vector and the plurality of ciphertext vectors, the transmission (E260), to the orchestration entity (RAG), of said at least one second selected ciphertext vector. Communication method according to claim 1, wherein the selection of said at least a second cipher vector includes determining a score from the comparison between the first cipher vector and the plurality of cipher vectors. Communication method, according to claim 2, wherein said at least a second selected cipher vector is the vector with the highest score. Communication method according to claim 2 or claim 3, wherein said at least a second selected cipher vector is the vector whose score is greater than or equal to a predetermined threshold. A communication method according to any one of the preceding claims, wherein the plurality of encrypted vectors representing information relating to a service is associated with a conversational agent exploiting the linguistic model. A communication method according to any one of the preceding claims, comprising prior to the reception of the first cipher vector: the reception (E070), from the orchestration entity (RAG), of a message comprising at least one instruction intended to be used by the language model to process the request for the provision of said service, the vectorization (E080) of the at least one instruction received and the encryption of the vector representing the at least one instruction received and the transmission (E120), to the second management entity (LLM), of the cipher vector representing the at least one instruction received and a set of parameters of the vectorization function. A communication method according to any one of the preceding claims, wherein the comparison between the first ciphertext vector and the plurality of ciphertext vectors is a semantic type comparison. A communication method according to any one of the preceding claims, wherein the encryption function applied to the vectors obtained by applying the vectorization function is an encryption function identical or equivalent to a homomorphic encryption function. A method for orchestrating the processing of a service request implemented by an orchestration entity (RAG) belonging to a communication network, the method comprising: the transmission (E020), to a first management entity (HE), distinct from the orchestration entity, of the communication network implementing a vectorization function, of information relating to a service intended to be vectorized in the form of a plurality of encrypted vectors, and the reception (E220), from the first management entity (HE), of at least a second encrypted vector from among the plurality of encrypted vectors, representing information relating to said service, selected by means of a comparison between a first encrypted vector, representing a request for the provision of said service and the plurality of encrypted vectors, the transmission (E230), to the second management entity (LLM),of an identifier of at least one second selected encrypted vector. An orchestration method according to claim 9 comprising: the transmission (E070) to the first management entity (HE) of a message comprising at least one instruction intended to be used by the language model of the second management entity (LLM) to process the request for the provision of said service; the reception (E090) from the first management entity (HE) of a set of parameters of the vectorization function implemented by the first management entity, and of a third cipher vector representing at least one instruction; the transmission (E120) to the second management entity (LLM) implementing the language model of an initialization message comprising the set of parameters of the vectorization function and an identifier of the third cipher vector representing at least one instruction. A method for processing a service provision request by a second management entity (LLM) of a communication network, said communication network further comprising a first management entity (HE) implementing a vectorization function, said second management entity implementing a language model, the method comprising: the generation of a service provision request based on a set of previously obtained parameters, the transmission (E130) to the first management entity (HE) of said request for vectorization into a first vector and encryption of the first vector, the reception (E230), from the orchestration entity, of at least one identifier of at least one second encrypted vector representing information relating to said service, the at least one second vector having been previously selected by means of a comparison with the first encrypted vector,representative of the request for the provision of said service and a plurality of encrypted vectors, representative of information relating to said service, the generation (E270) of a response to the request for the provision of said service based on the identifier of at least one second encrypted vector received. A telecommunications network management entity (HE) implementing a vectorization function, said entity comprising: a generation module (3A) configured to generate a plurality of cipher vectors representing information relating to a service, obtained (E020) from an orchestration entity (RAG) distinct from the first management entity, by applying said vectorization function to said information; a generation module (3B) configured to generate a first cipher vector by applying said vectorization function to a request for the provision of said service, said request being transmitted from a language model implemented by a second management entity (LLM) of the communication network; and the selection, from the plurality of cipher vectors, of at least one cipher vector.where at least one second ciphertext vector is selected by means of a comparison between the first ciphertext vector and the plurality of ciphertext vectors, a transmission module (3C) configured to transmit, to the orchestration entity (RAG), said at least one second selected ciphertext vector. An orchestration entity for processing a service request belonging to a telecommunications network, the orchestration entity comprising: a transmission module (4A) configured to transmit, to a first management entity (HE), distinct from the orchestration entity of the communication network implementing a vectorization function, information relating to a service intended to be vectorized in the form of a plurality of encrypted vectors, said information being obtained from a second management entity (LLM) of the communication network implementing a language model, and a reception module (4B) configured to receive, from the first management entity (HE), at least one second encrypted vector from among the plurality of encrypted vectors, representative of information relating to said service, selected by means of a comparison with a first encrypted vector,representative of a request for the provision of said service and the plurality of encrypted vectors, a transmission module (4C) configured to transmit, to the second management entity (LLM), an identifier of at least one second selected encrypted vector. A telecommunications network management entity implementing a linguistic model capable of processing a service request, the entity comprising: a generation module (5A) configured to generate a service provision request based on a set of previously obtained parameters; a transmission module (5B) configured to transmit said request to the first management entity (HE) for vectorization into a first vector and encryption of the first vector; a reception module (5C) configured to receive, from the orchestration entity, at least one identifier of at least one second encrypted vector representing information relating to said service, the at least one second vector having been previously selected by means of a comparison between the first encrypted vector, representing the service provision request, and a plurality of encrypted vectors, representing information relating to said service.a generation module (5D) configured to generate a response to the request for the provision of said service based on the identifier of at least one second encrypted vector received. A service request processing system comprising at least: an entity implementing a vectorization function according to claim 12, an entity orchestrating the processing of a service request according to claim 13, and a telecommunications network entity implementing a language model capable of processing a service request according to claim 14.