Intelligent answering method and apparatus
By training the encoder using contrastive learning techniques, the distance between anchor samples and positive samples is narrowed, while the distance between anchor samples and negative samples is widened. This solves the problem of low search result accuracy in question-and-answer products and achieves more efficient information retrieval.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2023-06-27
- Publication Date
- 2026-06-19
AI Technical Summary
The accuracy of search results in existing question-and-answer products is low, making it difficult for users to obtain results that truly solve their problems.
Contrastive learning is introduced: the encoder is trained so that the distance between anchor samples and positive samples is reduced and the distance between anchor samples and negative samples is increased. The trained encoder is then used to find, in the question-answer pair dataset, the answers whose encodings are most similar to the encoding of the new question.
This improves the effectiveness of information retrieval and yields more accurate search results.
[Figure: abstract drawing, CN116991988B_ABST]
Abstract
Description
Technical Field
[0001] This application relates to the fields of artificial intelligence and natural language processing, specifically to an intelligent answering method and apparatus.
Background Technology
[0002] In current question-and-answer products, the most commonly used search method is information retrieval based on the open-source project Solr. However, during the information retrieval process, the accuracy of search results is often low, making it difficult for users to obtain results that truly solve their problems.
Summary of the Invention
[0003] In view of the problems in the prior art, embodiments of this application provide an intelligent answering method and apparatus, which can at least partially solve the problems existing in the prior art.
[0004] On the one hand, this application proposes an intelligent answering method, including:
[0005] Get the new questions sent by the client;
[0006] The new question is encoded using a pre-trained first encoder to generate a sentence representation vector for the new question;
[0007] The answer to each question in the dataset is sequentially input into the pre-trained second encoder to obtain the sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on the question-answer pairs in the dataset;
[0008] Calculate the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer;
[0009] The answers are sorted in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question;
[0010] The sorting results of the answers are sent to the client.
[0011] In some embodiments, the training process of the first encoder and the second encoder is as follows:
[0012] For a question in the dataset, the question is used as an anchor sample and input into the first encoder to obtain the sentence representation vector of the anchor sample. The answer to the question is used as a positive sample and input into the second encoder to obtain the sentence representation vector of the positive sample. The answers to at least some of the other questions in the dataset are used as negative samples and input into the second encoder to obtain the sentence representation vector of the negative samples.
[0013] Based on the sentence representation vectors of the anchor samples, positive samples, and negative samples for each question, the first encoder and the second encoder are trained using the contrastive learning loss function until the trained first encoder and the second encoder are obtained.
[0014] In some embodiments, the dataset includes at least two question-answer pair sets and a mapping relationship between each question-answer pair set and an expert code.
[0015] In some embodiments, for a question in the dataset, the method for obtaining negative samples of that question includes:
[0016] For the answers to other questions in the question-answer pair set to which this question belongs, the weight of each answer is set to the first value;
[0017] For the answer to each question in every other question-answer pair set outside the question-answer pair set to which the question belongs, the weight of the answer to each question in the other question-answer pair set is determined based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs;
[0018] The weights of each answer are concatenated to obtain the weight vector of the negative samples;
[0019] The weight vector of the negative samples is processed using the target activation function to obtain the sampling probability distribution of the negative samples;
[0020] Based on the sampling probability distribution of the negative samples, negative samples are sampled using a preset number of samples to obtain the negative samples of the question.
[0021] In some embodiments, determining the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs includes:
[0022] Obtain the expert organizational structure tree, wherein the organizational structure tree is divided according to business areas, each expert has a department to which they belong, and each expert is a leaf node in the organizational structure tree;
[0023] Calculate the distance on the organizational tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which this question belongs;
[0024] The weight of the answer to each question in the other question-answer pair set is determined based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question in the question-answer pair set to which the question belongs in the organizational structure tree.
[0025] In some embodiments, determining the weight of the answer to each question in the other question-answer pair set based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs in the organizational structure tree includes:
[0026] Calculate the reciprocal of the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, and use the reciprocal as the weight of the answer to each question in the other question-answer pair set.
[0027] In some embodiments, the first value is greater than or equal to 1.
[0028] On the other hand, this application provides an intelligent answering device, comprising:
[0029] The retrieval module is used to retrieve new questions sent by the client.
[0030] A generation module is used to encode the new question using a pre-trained first encoder to generate a sentence representation vector for the new question;
[0031] The input module is used to sequentially input the answer to each question in the dataset into the pre-trained second encoder to obtain the sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on question-answer pairs in the dataset;
[0032] A calculation module is used to calculate the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer;
[0033] The sorting module is used to sort the answers in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question;
[0034] The sending module is used to send the sorting results of the answers to the client.
[0035] In some embodiments, the input module is further configured to: for a question in the dataset, input the question as an anchor sample into a first encoder to obtain a sentence representation vector of the anchor sample; input the answer to the question as a positive sample into a second encoder to obtain a sentence representation vector of the positive sample; and input the answers to at least some other questions in the dataset as negative samples into a second encoder to obtain a sentence representation vector of the negative samples.
[0036] The device further includes:
[0037] The training module is used to train the first encoder and the second encoder using the contrastive learning loss function based on the sentence representation vectors of the anchor samples, the sentence representation vectors of the positive samples, and the sentence representation vectors of the negative samples for each question, until the trained first encoder and the second encoder are obtained.
[0038] In some embodiments, the dataset includes at least two question-answer pair sets and a mapping relationship between each question-answer pair set and an expert code.
[0039] In some embodiments, the apparatus further includes:
[0040] The first weight setting module is used to, for a question in the dataset, set to a first value the weight of each answer to the other questions in the question-answer pair set to which the question belongs;
[0041] The second weight setting module determines the weight of the answer to each question in each other question-answer pair set outside the question-answer pair set to which the question belongs, based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs;
[0042] The concatenation module is used to concatenate the weights of each answer to obtain the weight vector of the negative samples;
[0043] The processing module is used to process the weight vector of the negative sample using the target activation function to obtain the sampling probability distribution of the negative sample;
[0044] The sampling module is used to sample negative samples according to the sampling probability distribution of the negative samples, using a preset number of samples, to obtain the negative samples of the question.
[0045] In some embodiments, the second weight setting module determines the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, including:
[0046] Obtain the expert organizational structure tree, wherein the organizational structure tree is divided according to business areas, each expert has a department to which they belong, and each expert is a leaf node in the organizational structure tree;
[0047] Calculate the distance on the organizational tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which this question belongs;
[0048] The weight of the answer to each question in the other question-answer pair set is determined based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question in the question-answer pair set to which the question belongs in the organizational structure tree.
[0049] In some embodiments, the second weight setting module determines the weight of the answer to each question in the other question-answer pair set based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question's question-answer pair set in the organizational structure tree, including:
[0050] Calculate the reciprocal of the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, and use the reciprocal as the weight of the answer to each question in the other question-answer pair set.
[0051] In some embodiments, the first value is greater than or equal to 1.
[0052] This application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the steps of the intelligent answering method described in any of the above embodiments.
[0053] This application also provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the intelligent answering method described in any of the above embodiments.
[0054] The intelligent answering method and apparatus provided in this application introduce contrastive learning technology into the intelligent question answering scenario. The encoder is trained by narrowing the distance between the anchor sample and the positive sample and widening the distance between the anchor sample and the negative sample. When a new question is raised, the trained encoder is used to find, in the question-answer pair dataset, at least a portion of the answers whose encodings are most similar to that of the new question, and this portion of the answers is sent to the client as the answer to the new question, thereby achieving better information retrieval results.
Attached Figure Description
[0055] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings:
[0056] Figure 1 is a flowchart of an intelligent answering method provided in an embodiment of this application.
[0057] Figure 2 is a partial flowchart of an intelligent answering method provided in an embodiment of this application.
[0058] Figure 3 is a partial flowchart of an intelligent answering method provided in an embodiment of this application.
[0059] Figure 4 is a partial flowchart of an intelligent answering method provided in an embodiment of this application.
[0060] Figure 5 is a schematic diagram of the organizational structure tree provided in an embodiment of this application.
[0061] Figure 6 is a schematic diagram of the structure of an intelligent answering device provided in an embodiment of this application.
[0062] Figure 7 is a schematic diagram of the physical structure of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0063] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments of this application will be further described in detail below with reference to the accompanying drawings. Here, the illustrative embodiments and their descriptions are used to explain this application, but are not intended to limit this application. It should be noted that, unless otherwise specified, the embodiments and the features in the embodiments of this application can be arbitrarily combined.
[0064] The terms “first,” “second,” etc., used in this document are not intended to specifically refer to order or sequence, nor are they used to limit this application; they are merely used to distinguish elements or operations described using the same technical terms.
[0065] The terms “include,” “including,” “have,” “contain,” etc., used in this document are all open-ended terms, meaning "including but not limited to."
[0066] The term "and / or" as used in this document includes any or all of the items mentioned.
[0067] To better understand this application, the research background of this application will be introduced in detail below.
[0068] Figure 1 is a flowchart of an intelligent answering method provided in an embodiment of this application. As shown in Figure 1, the intelligent answering method provided in the embodiment of this application includes:
[0069] S101. Obtain a new question sent by the client;
[0070] S102. Encode the new question using the pre-trained first encoder to generate a sentence representation vector for the new question;
[0071] S103. Input the answer to each question in the dataset into the pre-trained second encoder in sequence to obtain the sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on the question-answer pairs in the dataset;
[0072] S104. Calculate the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer;
[0073] S105. Sort the answers in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question;
[0074] S106. Send the sorting results of the answers to the client.
[0075] Specifically, the first encoder and the second encoder can be LSTM, BERT, etc. The two encoders can be of different types. The training objective of the encoder is to make the similarity between the question (anchor sample) and the correct answer (positive sample) increasingly higher and the similarity between the question and the wrong answer (negative sample) increasingly lower.
[0076] After the encoder parameters are optimized, when a new question is input, the similarity is calculated between the sentence representation vector of the new question and the sentence representation vectors that the encoder produces for all the answers in the dataset. The answers are then sorted from highest to lowest similarity, and the sorted list is the recommendation result, with the most recommended answer first.
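Steps S101 to S106 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the source does not name the similarity measure, so cosine similarity is assumed here, and `toy_encoder` is a hypothetical keyword-count stand-in for the trained first and second encoders (which the source says may be LSTM, BERT, etc.). The function names are not from the source.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity (assumed measure); 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_answers(
    new_question: str,
    answers: Sequence[str],
    encode_question: Callable[[str], Vector],
    encode_answer: Callable[[str], Vector],
) -> List[Tuple[str, float]]:
    """Return (answer, similarity) pairs sorted by descending similarity."""
    q_vec = encode_question(new_question)                      # S102
    scored = [
        (answer, cosine_similarity(q_vec, encode_answer(answer)))  # S103, S104
        for answer in answers
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)  # S105


def toy_encoder(text: str) -> Vector:
    """Hypothetical stand-in for a trained encoder: counts of a few keywords."""
    vocabulary = ["card", "loan", "password", "reset"]
    words = text.lower().split()
    return [float(words.count(word)) for word in vocabulary]


ranked = rank_answers(
    "how do i reset my card password",
    [
        "visit a branch to apply for a loan",
        "reset the card password in the mobile app",
    ],
    toy_encoder,
    toy_encoder,
)
for answer, score in ranked:
    print(f"{score:.3f}  {answer}")
```

In a real deployment the answer vectors would presumably be computed once and cached rather than re-encoded for every query, but the source does not describe such an optimization.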
[0077] The intelligent answering method provided in this application introduces contrastive learning technology into the intelligent question answering scenario. By narrowing the distance between anchor samples and positive samples and widening the distance between anchor samples and negative samples, the encoder is trained. When a new question is raised, the trained encoder is used to find at least a portion of the answers with the highest encoding similarity to the new question in the question-answer pair dataset. This portion of the answers is then sent to the client as the answer to the new question, resulting in better information retrieval performance.
[0078] As shown in Figure 2, in some embodiments, the training process of the encoder is as follows:
[0079] S001. For a question in the dataset, the question is used as an anchor sample and input into the first encoder to obtain the sentence representation vector of the anchor sample. The answer to the question is used as a positive sample and input into the second encoder to obtain the sentence representation vector of the positive sample. The answers to at least some other questions in the dataset are used as negative samples and input into the second encoder to obtain the sentence representation vector of the negative samples.
[0080] S002. Based on the sentence representation vectors of the anchor samples, positive samples, and negative samples for each question, the encoder is trained using the contrastive learning loss function until a trained encoder is obtained.
[0081] Specifically, given a question-answer pair <q, a>, q is processed by encoder1 (the first encoder) to obtain the sentence representation vector q, and a is processed by encoder2 (the second encoder) to obtain the sentence representation vector a (the encoders can be LSTM, BERT, etc.); each sampled negative sample a⁻ is processed by encoder2 to obtain the sentence representation vector a⁻. The InfoNCE loss can be used as the contrastive learning loss function, as shown in the following formula:
[0082] [Formula not reproduced in this text.]
[0083] Here, τ represents the temperature coefficient. Additionally, the mini-batch gradient descent method and the Adam optimizer can be used to optimize the model.
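The formula of paragraph [0082] does not survive in this text. As a hedged illustration, the commonly used form of the InfoNCE loss for one anchor is L = -log( exp(s(q, a)/τ) / ( exp(s(q, a)/τ) + Σ_i exp(s(q, a_i⁻)/τ) ) ), where s is a similarity score; the patent's own formula may differ in detail. The Python sketch below computes this quantity for a single anchor. The dot product is assumed for s, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product, used here as the (assumed) similarity score s."""
    return sum(x * y for x, y in zip(u, v))


def info_nce_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: Sequence[Sequence[float]],
    temperature: float = 0.1,
) -> float:
    """InfoNCE loss for one anchor, one positive, and a list of negatives."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    # Index 0 holds the positive logit; the rest are negative logits.
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, negative) / temperature for negative in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits)
    )
    return log_sum_exp - logits[0]


# The loss is small when the anchor is close to the positive ...
aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], temperature=1.0)
# ... and larger when the anchor is closer to a negative than to the positive.
misaligned = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]], temperature=1.0)
print(aligned, misaligned)
```

Minimizing this value over mini-batches (for example with the Adam optimizer mentioned above) pushes the anchor toward its positive sample and away from the sampled negatives.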
[0084] In some embodiments, the dataset includes at least two question-answer pair sets and a mapping relationship between each question-answer pair set and expert codes. Specifically, each expert corresponds to one question-answer pair set, and the experts may have the same, similar, or different business domains.
[0085] As shown in Figure 3, in some embodiments, for a question in the dataset, the method for obtaining negative samples of that question includes:
[0086] S003. For the answers to other questions in the question-answer pair set to which this question belongs, set the weight of each answer to the first value;
[0087] S004. For the answer to each question in every other question-answer pair set outside the question-answer pair set to which the question belongs, determine the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs;
[0088] S005. Concatenate the weights of each answer to obtain the weight vector of the negative samples;
[0089] S006. The weight vector of the negative sample is processed using the target activation function to obtain the sampling probability distribution of the negative sample;
[0090] S007. Based on the sampling probability distribution of the negative samples, negative samples are sampled using a preset sampling quantity to obtain the negative samples of the question.
[0091] Specifically, to address the low accuracy of search results, this embodiment proposes a contrastive learning method based on hierarchical classification of negative samples. In conventional contrastive learning, positive samples for an anchor sample are generated through data augmentation techniques such as inversion and back-translation, and data from classes different from that of the anchor sample are randomly selected from the dataset as negative samples. By narrowing the distance between the anchor sample and the positive samples and widening the distance between the anchor sample and the negative samples, better representation learning results are achieved. The quality of negative samples is a crucial factor affecting the learning outcome, and extensive research has investigated how to generate and select them. In current bank question-and-answer products, the people answering questions are all bank employees, who are often experts in a specific business area. Since the business area of a given expert remains largely unchanged, the questions and answers handled by the same expert lie close together in the representation space and are hard to distinguish, which makes them the most difficult part for the model to learn; they are the strong negative samples in contrastive learning. At the same time, different experts may have the same or similar professional areas. In short, the similarity between two experts is positively correlated with the closeness of the answers they have given. This relationship can guide the evaluation of negative sample quality in contrastive learning. This approach integrates the relevance between question-answer pairs with expert relevance information and grades the negative samples accordingly, improving their quality and accuracy, thereby enhancing the performance of contrastive learning and achieving better information retrieval results. Specifically:
[0092] Each expert e in the dataset is assigned a set of question-answer pairs <q, a>. The goal is to learn the correspondence between q and a. When a new question q' is asked, the most likely answer list is returned from all answers. Prior knowledge includes not only the q, a correspondence but also the stronger correlation between question-answer pairs belonging to the same expert. This application defines this similarity as question-answer pair similarity. Furthermore, each expert has a business domain. The closer the business domains, the higher the expert similarity and the greater the similarity of the questions answered. This application defines this similarity as expert similarity. Expert similarity and question-answer pair similarity are fused together. By comprehensively judging the similarity between negative samples and anchor samples using both types of information, a negative sample sampling probability distribution is generated, improving the quality of negative samples. The specific process is as follows:
[0093] Step 1: For anchor sample q, there is a positive sample a, given by expert e. Obtain the other answers N_e given by expert e. These answers are strong negative samples, and the weight of each answer in the set N_e is set to the first value;
[0094] Step 2: Calculate the similarity between expert e's business domain and those of the other experts, and assign weights to the answers of the other experts based on that similarity;
[0095] Step 3: For positive sample a, the weights of the answers in N_e, such as [1, 1, ..., 1], are concatenated with the weights of the answers in all other experts' answer sets, such as [0.1, 0.2, 0.3, ..., 0.5], to obtain the weight vector of all negative samples [1, 1, ..., 1, 0.1, 0.2, 0.3, ..., 0.5]. This vector is then processed using the target activation function softmax to obtain the negative sample sampling probability distribution p. The larger a probability value in p, the more likely the corresponding negative sample is to be sampled.
[0096] Step 4: Based on the negative sample sampling probability distribution p calculated in Step 3, perform negative sample sampling. The number of samples is determined by the hyperparameter n (the preset number of samples), and n can be adjusted according to the training results.
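Steps 3 and 4 can be sketched in Python as follows, using only the standard library. This is an illustrative sketch, not the patent's implementation: the function names are hypothetical, the candidate answers are assumed to be ordered with the same-expert answers first, and sampling is done with replacement via `random.choices` because the source does not say whether a negative sample may be drawn more than once.

```python
import math
import random
from typing import List, Sequence, Tuple


def softmax(weights: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of weights."""
    max_weight = max(weights)
    exps = [math.exp(w - max_weight) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(
    candidates: Sequence[str],
    same_expert_weights: Sequence[float],
    other_expert_weights: Sequence[float],
    n: int,
    rng: random.Random,
) -> Tuple[List[str], List[float]]:
    """Concatenate weights, apply softmax, and draw n negatives (with replacement)."""
    weights = list(same_expert_weights) + list(other_expert_weights)  # Step 3
    if len(weights) != len(candidates):
        raise ValueError("one weight is required per candidate answer")
    probabilities = softmax(weights)                                  # Step 3
    sampled = rng.choices(candidates, weights=probabilities, k=n)     # Step 4
    return sampled, probabilities


# Two answers from the anchor's own expert (weight 1) and two from other experts.
candidates = ["e_answer_1", "e_answer_2", "f_answer_1", "g_answer_1"]
sampled, probabilities = sample_negatives(
    candidates,
    same_expert_weights=[1.0, 1.0],
    other_expert_weights=[0.5, 0.25],
    n=3,
    rng=random.Random(0),
)
print(probabilities)
print(sampled)
```

Because softmax exponentiates the weights, a weight of 1 against a weight of 0.25 gives a sampling ratio of exp(0.75), roughly 2.1 to 1, rather than 4 to 1.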
[0097] As shown in Figure 4, in some embodiments, determining the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs includes:
[0098] S0041. Obtain the expert organizational structure tree, wherein the organizational structure tree is divided according to business areas, each expert has a department to which they belong, and each expert is a leaf node in the organizational structure tree;
[0099] S0042. Calculate the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs on the organizational structure tree;
[0100] S0043. Based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs on the organizational structure tree, determine the weight of the answer to each question in the other question-answer pair set.
[0101] Specifically, each expert belongs to a department. In the organizational structure tree, each expert is a leaf node. The organizational structure itself is divided according to business areas, so the closer two experts are in the tree, the more closely related their business is. For example, as shown in Figure 5, expert Zhang belongs to the Platform R&D Department, and expert Li belongs to the Risk Management Department. The tree distance from expert Li to expert Zhang is: Risk Management Department → Bank Card Business Department, Bank Card Business Department → Head Office, Head Office → Business R&D Center, Business R&D Center → Platform R&D Department, with a total distance d of 4.
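The distance in this example can be reproduced with a small Python sketch. The parent map below is an assumption: it contains only the departments named in the example, with Head Office taken as the root, and it measures distance between the experts' departments, which is how the example counts the four steps. The names `PARENT`, `path_to_root`, and `tree_distance` are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed parent map covering only the departments named in the Figure 5 example.
PARENT: Dict[str, Optional[str]] = {
    "Head Office": None,
    "Bank Card Business Department": "Head Office",
    "Risk Management Department": "Bank Card Business Department",
    "Business R&D Center": "Head Office",
    "Platform R&D Department": "Business R&D Center",
}

# Department to which each expert belongs, as stated in the example.
EXPERT_DEPARTMENT: Dict[str, str] = {
    "Zhang": "Platform R&D Department",
    "Li": "Risk Management Department",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """List of nodes from `node` up to the root, inclusive."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def tree_distance(a: str, b: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of edges on the path between a and b via their lowest common ancestor."""
    steps_from_a = {node: steps for steps, node in enumerate(path_to_root(a, parent))}
    for steps_from_b, node in enumerate(path_to_root(b, parent)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("nodes are not in the same tree")


d = tree_distance(EXPERT_DEPARTMENT["Li"], EXPERT_DEPARTMENT["Zhang"], PARENT)
print(d)  # 4, matching the example above
```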
[0102] In some embodiments, determining the weight of the answer to each question in the other question-answer pair set based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs in the organizational structure tree includes: calculating the reciprocal of the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs in the organizational structure tree, and determining the reciprocal as the weight of the answer to each question in the other question-answer pair set.
[0103] Specifically, the reciprocal of this distance represents the weight of the answers from different experts. More specifically: if the distance between experts f and e on the tree is d, then the similarity between the set of answers N_f given by f and N_e is 1 / d. On this basis, the first value is greater than or equal to 1, indicating that the other answers N_e of expert e are strong negative samples; for example, the first value is 1.
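The weighting rule can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and the distance d to every other expert is assumed to be at least 1, since the source does not say how a distance of 0 would be handled.

```python
from typing import Dict, List, Sequence


def negative_weights(
    anchor_expert: str,
    answer_experts: Sequence[str],
    distance_to_anchor: Dict[str, int],
    first_value: float = 1.0,
) -> List[float]:
    """Weight per candidate negative: first_value for the anchor's own expert, 1/d otherwise."""
    if first_value < 1.0:
        raise ValueError("the first value must be greater than or equal to 1")
    weights: List[float] = []
    for expert in answer_experts:
        if expert == anchor_expert:
            # Other answers by the same expert are strong negatives.
            weights.append(first_value)
        else:
            d = distance_to_anchor[expert]
            if d < 1:
                # Assumption: d = 0 is not covered by the source, so it is rejected here.
                raise ValueError("tree distance to another expert must be at least 1")
            weights.append(1.0 / d)
    return weights


# Expert f is at tree distance 4 from anchor expert e, and expert g at distance 2.
weights = negative_weights(
    anchor_expert="e",
    answer_experts=["e", "e", "f", "g"],
    distance_to_anchor={"f": 4, "g": 2},
)
print(weights)  # [1.0, 1.0, 0.25, 0.5]
```

Since 1 / d is at most 1 whenever d is at least 1, a first value of 1 or more keeps the same-expert answers weighted at least as heavily as any answer from another expert.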
[0104] This application embodiment classifies negative samples based on the correspondence between expert organizational structure information and expert question-answer pairs, forming a negative sample sampling distribution. This effectively improves the accuracy of negative samples in the contrastive learning method and enhances the precision of recommended answers.
[0105] Figure 6 is a schematic diagram of the structure of an intelligent answering device provided in an embodiment of this application. As shown in Figure 6, the intelligent answering device provided in the embodiment of this application includes:
[0106] The retrieval module 21 is used to retrieve new questions sent by the client;
[0107] The generation module 22 is used to encode the new question using a pre-trained first encoder to generate a sentence representation vector for the new question;
[0108] The input module 23 is used to sequentially input the answer to each question in the dataset into the pre-trained second encoder to obtain the sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on the question-answer pairs in the dataset;
[0109] Calculation module 24 is used to calculate the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer;
[0110] The sorting module 25 is used to sort the answers in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question;
[0111] The sending module 26 is used to send the sorting result of the answer to the client.
[0112] The intelligent answering device provided in this application introduces contrastive learning technology into the intelligent question answering scenario. By narrowing the distance between anchor samples and positive samples and widening the distance between anchor samples and negative samples, the encoder is trained. When a new question is raised, the trained encoder is used to find at least a portion of the answers with the highest encoding similarity to the new question in the question-answer pair dataset. This portion of the answers is then sent to the client as the answer to the new question, resulting in better information retrieval performance.
[0113] In some embodiments, the input module is further configured to: for a question in the dataset, input the question as an anchor sample into a first encoder to obtain a sentence representation vector of the anchor sample; input the answer to the question as a positive sample into a second encoder to obtain a sentence representation vector of the positive sample; and input the answers to at least some other questions in the dataset as negative samples into a second encoder to obtain a sentence representation vector of the negative samples.
[0114] The device further includes:
[0115] The training module is used to train the first encoder and the second encoder using the contrastive learning loss function based on the sentence representation vectors of the anchor samples, the sentence representation vectors of the positive samples, and the sentence representation vectors of the negative samples for each question, until the trained first encoder and the second encoder are obtained.
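As an illustrative sketch of what the training module optimizes, the following Python fragment computes a contrastive loss for a single anchor sample. It assumes an InfoNCE-style loss with dot-product similarity and a temperature parameter; the text above names only a "contrastive learning loss function", so the specific form, the name `info_nce_loss`, and the toy vectors are assumptions.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed form of the contrastive loss)."""
    # The positive logit comes first, followed by one logit per negative sample.
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy sentence representation vectors (illustrative values only).
anchor = [1.0, 0.0]
close_positive = [0.9, 0.1]
far_positive = [0.1, 0.9]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_good = info_nce_loss(anchor, close_positive, negatives)
loss_bad = info_nce_loss(anchor, far_positive, negatives)
```

The loss is smaller when the anchor is closer to the positive sample than to the negative samples, so minimizing it narrows the anchor-positive distance and widens the anchor-negative distance.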
[0116] In some embodiments, the dataset includes at least two question-answer pair sets and a mapping relationship between each question-answer pair set and expert coding.
[0117] In some embodiments, the apparatus further includes:
[0118] The first weight setting module is used to, for a question in the dataset, set the weight of each answer to the other questions in the question-answer pair set to which the question belongs to a first value;
[0119] The second weight setting module is used to determine, for the answer to each question in every other question-answer pair set outside the question-answer pair set to which the question belongs, the weight of the answer based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs;
[0120] The concatenation module is used to concatenate the weights of each answer to obtain the weight vector of the negative samples;
[0121] The processing module is used to process the weight vector of the negative sample using the target activation function to obtain the sampling probability distribution of the negative sample;
[0122] The sampling module is used to draw a preset number of negative samples according to the sampling probability distribution of the negative samples, to obtain the negative samples of the question.
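As an illustrative sketch of the concatenation, processing, and sampling modules, the following Python fragment turns a weight vector into a sampling probability distribution and draws a preset number of negative samples. It assumes softmax as the target activation function and sampling without replacement, neither of which is stated above; the identifiers and example weights are hypothetical.

```python
import math
import random

def softmax(weights):
    """Softmax over a list of weights (assumed target activation function)."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

def sample_negatives(candidate_ids, weights, num_samples, rng=None):
    """Draw up to num_samples distinct candidates according to softmax(weights)."""
    rng = rng or random.Random()
    pool = list(zip(candidate_ids, softmax(weights)))
    chosen = []
    for _ in range(min(num_samples, len(pool))):
        # Draw one index in proportion to the remaining probabilities,
        # then remove it so the same answer is not picked twice.
        remaining_probs = [p for _, p in pool]
        pick = rng.choices(range(len(pool)), weights=remaining_probs, k=1)[0]
        chosen.append(pool.pop(pick)[0])
    return chosen

same_set_weights = [1.0, 1.0]            # first value, same question-answer pair set
other_set_weights = [0.5, 0.25, 0.25]    # weights from the business relationship
weights = same_set_weights + other_set_weights   # concatenated weight vector
candidate_ids = ["a1", "a2", "b1", "c1", "c2"]
probs = softmax(weights)
negatives = sample_negatives(candidate_ids, weights, num_samples=3,
                             rng=random.Random(0))
```

Answers with larger weights receive a larger sampling probability, so answers from the same or closely related question-answer pair sets are drawn as negative samples more often.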
[0123] In some embodiments, the second weight setting module determines the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, including:
[0124] Obtain the expert organizational structure tree, wherein the organizational structure tree is divided according to business areas, each expert has a department to which they belong, and each expert is a leaf node in the organizational structure tree;
[0125] Calculate the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs;
[0126] Determine the weight of the answer to each question in the other question-answer pair set based on the distance, on the organizational structure tree, between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs.
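As an illustrative sketch of the distance calculation, the following Python fragment computes the distance between two experts on an organizational structure tree. It assumes that distance means the number of edges on the path between the two leaf nodes and that the tree is stored as a child-to-parent map; the node names and both function names are hypothetical.

```python
def path_to_root(node, parent):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def tree_distance(a, b, parent):
    """Number of edges on the path between nodes a and b (assumed definition)."""
    depth_from_a = {n: d for d, n in enumerate(path_to_root(a, parent))}
    for depth_from_b, n in enumerate(path_to_root(b, parent)):
        if n in depth_from_a:
            # n is the lowest common ancestor of a and b.
            return depth_from_a[n] + depth_from_b
    raise ValueError("nodes are not in the same tree")

# Hypothetical organization: root -> departments -> experts (leaf nodes).
parent = {
    "retail": "bank", "corporate": "bank",
    "expert_A": "retail", "expert_B": "retail", "expert_C": "corporate",
}
same_department = tree_distance("expert_A", "expert_B", parent)
different_department = tree_distance("expert_A", "expert_C", parent)
```

Under this definition, experts in the same department are closer than experts in different departments, which matches the intent of weighting answers by business relationship.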
[0127] In some embodiments, the second weight setting module determines the weight of the answer to each question in the other question-answer pair set based on the distance between the expert corresponding to the other question-answer pair set and the expert corresponding to the question's question-answer pair set in the organizational structure tree, including:
[0128] Calculate the reciprocal of the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, and take this reciprocal as the weight of the answer to each question in the other question-answer pair set.
[0129] In some embodiments, the first value is greater than or equal to 1.
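As an illustrative sketch of the reciprocal weighting, the following Python fragment maps tree distances to weights. It assumes distance is counted in edges and that experts sit under department nodes, so two distinct experts are at least 2 edges apart; `cross_set_weight` and `FIRST_VALUE` are hypothetical names.

```python
def cross_set_weight(distance):
    """Weight for answers from another expert's set: reciprocal of tree distance."""
    if distance <= 0:
        raise ValueError("distance between distinct experts must be positive")
    return 1.0 / distance

# Weight for other answers in the same question-answer pair set (>= 1 per the text).
FIRST_VALUE = 1.0

# Example distances between distinct experts under the stated assumptions.
weights_by_distance = {d: cross_set_weight(d) for d in (2, 4, 6)}
```

Under these assumptions every cross-set weight is at most 0.5, so a first value of at least 1 keeps answers from the same question-answer pair set as the most heavily weighted negative samples.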
[0130] The embodiments of the apparatus provided in this application can be used to execute the processing flows of the method embodiments described above. Their functions are not repeated here; refer to the detailed description of the above method embodiments.
[0131] It should be noted that the intelligent answering method and device provided in the embodiments of this application can be used in the financial field or in any other technical field. The embodiments of this application do not limit the application field of the intelligent answering method and device.
[0132] Figure 7 is a schematic diagram of the physical structure of an electronic device provided in an embodiment of this application. As shown in Figure 7, the electronic device may include: a processor 301, a communications interface 302, a memory 303, and a communication bus 304, wherein the processor 301, the communications interface 302, and the memory 303 communicate with each other via the communication bus 304. The processor 301 can call logical instructions in the memory 303 to execute the method described in any of the above embodiments, for example including: acquiring a new question sent by a client; encoding the new question using a pre-trained first encoder to generate a sentence representation vector for the new question; sequentially inputting the answer to each question in the dataset into the pre-trained second encoder to obtain a sentence representation vector for each answer, wherein the first encoder and the second encoder are obtained through contrastive learning based on question-answer pairs in the dataset; calculating the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer; sorting the answers in descending order of similarity between the sentence representation vectors of the answers and the sentence representation vector of the new question; and sending the sorted answers to the client.
[0133] Furthermore, the logical instructions in the aforementioned memory 303 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0134] This embodiment discloses a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions, and when the program instructions are executed by a computer, the computer can perform the methods provided in the above-described method embodiments, for example: acquiring a new question sent by a client; encoding the new question using a pre-trained first encoder to generate a sentence representation vector for the new question; sequentially inputting the answer to each question in the dataset into the pre-trained second encoder to obtain a sentence representation vector for each answer, wherein the first encoder and the second encoder are obtained through contrastive learning based on question-answer pairs in the dataset; calculating the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer; sorting the answers in descending order of similarity between the sentence representation vectors of the answers and the sentence representation vector of the new question; and sending the sorted answers to the client.
[0135] This embodiment provides a computer-readable storage medium storing a computer program that causes a computer to execute the methods provided in the above-described method embodiments, for example: acquiring a new question sent by a client; encoding the new question using a pre-trained first encoder to generate a sentence representation vector for the new question; sequentially inputting the answer to each question in the dataset into the pre-trained second encoder to obtain a sentence representation vector for each answer, wherein the first encoder and the second encoder are obtained through contrastive learning based on question-answer pairs in the dataset; calculating the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer; sorting the answers in descending order of similarity between the sentence representation vectors of the answers and the sentence representation vector of the new question; and sending the sorted answers to the client.
[0136] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0137] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0138] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0139] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0140] In the description of this specification, the references to terms such as "an embodiment," "a specific embodiment," "some embodiments," "for example," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0141] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of this application. It should be understood that the above descriptions are merely specific embodiments of this application and are not intended to limit the scope of protection of this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application.
Claims
1. An intelligent answering method, characterized by comprising: obtaining a new question sent by a client; encoding the new question using a pre-trained first encoder to generate a sentence representation vector of the new question; sequentially inputting the answer to each question in a dataset into a pre-trained second encoder to obtain a sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on question-answer pairs in the dataset; calculating the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer; sorting the answers in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question; and sending the sorted answers to the client; wherein the training process of the first encoder and the second encoder comprises: for a question in the dataset, inputting the question as an anchor sample into the first encoder to obtain a sentence representation vector of the anchor sample; inputting the answer to the question as a positive sample into the second encoder to obtain a sentence representation vector of the positive sample; inputting the answers to at least some of the other questions in the dataset as negative samples into the second encoder to obtain sentence representation vectors of the negative samples; and training the first encoder and the second encoder using a contrastive learning loss function based on the sentence representation vectors of the anchor sample, the positive sample, and the negative samples of each question, until the trained first encoder and second encoder are obtained;
wherein, for a question in the dataset, obtaining the negative samples of the question comprises: setting the weight of each answer to the other questions in the question-answer pair set to which the question belongs to a first value; for the answer to each question in every other question-answer pair set outside the question-answer pair set to which the question belongs, determining the weight of the answer based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs; concatenating the weights of the answers to obtain a weight vector of the negative samples; processing the weight vector of the negative samples using a target activation function to obtain a sampling probability distribution of the negative samples; and drawing a preset number of negative samples according to the sampling probability distribution of the negative samples, to obtain the negative samples of the question.
2. The method of claim 1, wherein the dataset includes at least two question-answer pair sets and a mapping relationship between each question-answer pair set and expert coding.
3. The method of claim 1, wherein the step of determining the weight of the answer to each question in the other question-answer pair set based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs includes: obtaining an expert organizational structure tree, wherein the organizational structure tree is divided according to business areas, each expert has a department to which they belong, and each expert is a leaf node in the organizational structure tree; calculating the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs; and determining the weight of the answer to each question in the other question-answer pair set based on the distance, on the organizational structure tree, between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs.
4. The method of claim 3, wherein the step of determining the weight of the answer to each question in the other question-answer pair set based on the distance, on the organizational structure tree, between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs includes: calculating the reciprocal of the distance on the organizational structure tree between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs, and taking this reciprocal as the weight of the answer to each question in the other question-answer pair set.
5. The method of claim 4, wherein the first value is greater than or equal to 1.
6. An intelligent answering apparatus, characterized by comprising: an acquisition module, used to acquire a new question sent by a client; a generation module, used to encode the new question using a pre-trained first encoder to generate a sentence representation vector of the new question; an input module, used to sequentially input the answer to each question in a dataset into a pre-trained second encoder to obtain a sentence representation vector of each answer, wherein the first encoder and the second encoder are obtained by contrastive learning based on question-answer pairs in the dataset; a calculation module, used to calculate the similarity between the sentence representation vector of the new question and the sentence representation vector of each answer; a sorting module, used to sort the answers in descending order of similarity between the sentence representation vector of the answer and the sentence representation vector of the new question; and a sending module, used to send the sorted answers to the client; wherein the input module is further configured to: for a question in the dataset, input the question as an anchor sample into the first encoder to obtain a sentence representation vector of the anchor sample; input the answer to the question as a positive sample into the second encoder to obtain a sentence representation vector of the positive sample; and input the answers to at least some other questions in the dataset as negative samples into the second encoder to obtain sentence representation vectors of the negative samples;
the apparatus further comprising: a training module, used to train the first encoder and the second encoder using a contrastive learning loss function based on the sentence representation vectors of the anchor sample, the positive sample, and the negative samples of each question, until the trained first encoder and second encoder are obtained; a first weight setting module, used to, for a question in the dataset, set the weight of each answer to the other questions in the question-answer pair set to which the question belongs to a first value; a second weight setting module, used to determine, for the answer to each question in every other question-answer pair set outside the question-answer pair set to which the question belongs, the weight of the answer based on the business relationship between the expert corresponding to the other question-answer pair set and the expert corresponding to the question-answer pair set to which the question belongs; a concatenation module, used to concatenate the weights of the answers to obtain a weight vector of the negative samples; a processing module, used to process the weight vector of the negative samples using a target activation function to obtain a sampling probability distribution of the negative samples; and a sampling module, used to draw a preset number of negative samples according to the sampling probability distribution of the negative samples, to obtain the negative samples of the question.
7. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 5.
8. A computer-readable storage medium having stored thereon a computer program, characterized in that when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 5.
9. A computer program product comprising a computer program, characterized in that when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 5.