Context learning method and device, equipment and storage medium
By using a large language model and information entropy value to filter reference questions, and combining the context of customer service dialogue to generate reference vectors for multi-class voting, the prediction error problem caused by relying on a single sample in existing technologies is solved, and the efficiency and reliability of question category acquisition are improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- PING AN TECH (SHENZHEN) CO LTD
- Filing Date
- 2024-08-26
- Publication Date
- 2026-06-26
AI Technical Summary
Existing context learning methods rely on predictions of single samples, which can easily lead to errors, resulting in human intervention and wasted time and resources, and affecting the efficiency of problem category acquisition.
By obtaining the context template of the preset query question, a large language model is used to generate a predicted probability distribution, the information entropy value is calculated to filter reference questions, and a reference vector is generated by combining the context content of the customer service dialogue. Multi-category voting is then conducted to determine the question category.
It reduces manual intervention, improves the efficiency and reliability of problem category acquisition, and reduces the time required to acquire text topics.
[Figure CN119128087B_ABST: abstract drawing]
Description
Technical Field
[0001] This invention relates to the fields of artificial intelligence and financial technology, and particularly to context learning methods, apparatus, devices, and storage media.
Background Technology
[0002] Problem categorization plays a crucial role in information processing. By clearly defining problem categories, the nature and domain of the problem can be clearly identified, allowing for more targeted solutions and methods. This categorization process improves both work efficiency and user satisfaction.
[0003] However, the current process of obtaining question categories for querying questions is cumbersome, which hinders the improvement of question category acquisition efficiency. This is because existing context learning methods rely on individual samples during prediction, making them susceptible to the influence of these single samples and resulting in prediction errors. These errors sometimes lead customer service personnel to manually input question categories, which consumes significant time and increases the time required to obtain question categories, thus hindering the improvement of efficiency.
Summary of the Invention
[0004] This invention provides a context learning method, apparatus, computer device, and storage medium to solve the technical problem that the process of obtaining the question category for current query questions is cumbersome and does not improve the efficiency of question category acquisition.
[0005] Firstly, a context learning method is provided, including:
[0006] Obtain a preset query question and the context template corresponding to the preset query question;
[0007] Based on the preset query question and the context template, obtain the example question, and obtain the predicted probability distribution output by the large language model based on the example question;
[0008] Based on the predicted probability distribution, obtain the information entropy value corresponding to the example problem, and based on the information entropy value and a predefined acquisition method, obtain the reference problem;
[0009] Form a set from multiple of the reference questions;
[0010] Obtain the vector of each of the reference questions in the set, obtain the customer service dialogue, obtain the context content of the currently queried question in the customer service dialogue, and obtain the vector of the context content;
[0011] Based on the vectors of each reference question, the vector of the context content, and the predefined generation method, generate each reference vector, obtain each category predicted by the large language model based on each reference vector, obtain the votes for each category, and select the category with the most votes as the question category of the current query question.
[0012] Further, the step of obtaining an example question based on the preset query question and the context template, and obtaining the predicted probability distribution output by the large language model based on the example question, includes:
[0013] Replace the placeholder in the context template with the preset query question, and use the replaced context template as the example question;
[0014] The example question is input into a preset large language model, and the predicted probability distribution output by the large language model based on the example question is obtained.
[0015] Further, the step of obtaining the information entropy value corresponding to the example question based on the predicted probability distribution, and obtaining the reference question based on the information entropy value and a predefined acquisition method, includes:
[0016] Based on the predicted probability distribution, obtain the information entropy value corresponding to the example question, and determine whether the information entropy value is higher than a preset value;
[0017] When the information entropy value is higher than the preset value, the example problem is marked as a reference problem.
[0018] Further, forming a set from multiple of the reference questions includes:
[0019] Obtain the preset number corresponding to the preset query question;
[0020] Form a set from the preset number of reference questions.
[0021] Further, obtaining the vectors of each of the reference questions in the set, obtaining the customer service dialogue, obtaining the context content of the currently queried question in the customer service dialogue, and obtaining the vector of the context content includes:
[0022] Perform feature extraction on the semantic units of each of the reference questions to obtain the vector of each reference question, then connect to the target system and obtain the customer service dialogue sent by the target system.
[0023] Obtain the context content of the current query question in the customer service dialogue, extract features from the semantic units of the context content, and obtain the vector of the context content.
[0024] Further, the step of generating reference vectors based on the vectors of each reference question, the vector of the context content, and a predefined generation method; obtaining the categories predicted by the large language model based on each reference vector; obtaining the votes for each category; and selecting the category with the most votes as the question category of the current query question includes:
[0025] The vectors of each of the reference questions and the vectors of the context content are concatenated to generate reference vectors. Each reference vector is then input into the large language model to obtain the categories predicted by the large language model based on each of the reference vectors.
[0026] Voting is conducted among the categories to obtain the number of votes for each category, and the category with the most votes is selected as the question category for the current query question.
[0027] Further, after generating reference vectors based on the vectors of each reference question, the vector of the context content, and a predefined generation method, obtaining the categories predicted by the large language model based on each reference vector, obtaining the votes for each category, and selecting the category with the most votes as the question category of the current query question, the context learning method includes:
[0028] Obtain the solution corresponding to the problem category and display the solution in the customer service conversation.
[0029] Secondly, a context learning device is provided, comprising:
[0030] The first acquisition module is used to acquire a preset query question and acquire the context template corresponding to the preset query question.
[0031] The second acquisition module is used to acquire example questions based on the preset query question and the context template, and to acquire the predicted probability distribution output by the large language model based on the example questions;
[0032] The third acquisition module is used to acquire the information entropy value corresponding to the example question based on the predicted probability distribution, and to acquire the reference question based on the information entropy value and a predefined acquisition method.
[0033] A composition module is used to form a set from multiple of the reference questions;
[0034] The fourth acquisition module is used to acquire the vector of each of the reference questions in the set, acquire the customer service dialogue, acquire the context content of the currently queried question in the customer service dialogue, and acquire the vector of the context content.
[0035] The generation module is used to generate reference vectors based on the vectors of each reference question, the vector of the context content, and a predefined generation method; obtain the categories predicted by the large language model based on each reference vector; obtain the votes for each category; and select the category with the most votes as the question category of the current query question.
[0036] Thirdly, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the above-described context learning method.
[0037] Fourthly, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, implements the steps of the aforementioned context learning method.
[0038] This application provides a context learning method, apparatus, computer device, and storage medium. The method includes: obtaining a preset query question and the context template corresponding to it; obtaining an example question based on the preset query question and the context template; obtaining the predicted probability distribution output by a large language model based on the example question; obtaining the information entropy value corresponding to the example question based on the predicted probability distribution; obtaining a reference question based on the information entropy value and a predefined acquisition method; forming a set from multiple reference questions; obtaining the vector of each reference question in the set; obtaining a customer service dialogue; obtaining the context content of the current query question in the customer service dialogue; obtaining the vector of the context content; generating reference vectors based on the vector of each reference question, the vector of the context content, and a predefined generation method; obtaining the categories predicted by the large language model based on each reference vector; obtaining the votes for each category; and selecting the category with the most votes as the question category of the current query question. This has two advantages. First, because the question category is selected by voting over the categories that the large language model predicts from the reference vectors, no manual acquisition is required, which reduces the time spent acquiring the text topic of the text to be processed and thus improves the efficiency of acquiring the text topic. Second, because the topic generation model is not affected by human intervention, the reliability of the acquired text topic of the text to be processed is improved.
Attached Figure Description
[0039] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0040] Figure 1 is a schematic diagram of an application environment for the context learning method in one embodiment of the present invention;
[0041] Figure 2 is a flowchart of a context learning method provided in an embodiment of the present invention;
[0042] Figure 3 is a flowchart of a specific implementation of step S23 in Figure 1;
[0043] Figure 4 is a schematic diagram of a specific implementation of step S25 in Figure 1;
[0044] Figure 5 is a schematic diagram of a specific implementation of step S26 in Figure 1;
[0045] Figure 6 is a schematic diagram of the structure of a context learning device in one embodiment of the present invention;
[0046] Figure 7 is a schematic diagram of the structure of a computer device according to an embodiment of the present invention;
[0047] Figure 8 is another schematic diagram of the structure of a computer device according to one embodiment of the present invention.
Detailed Implementation
[0048] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0049] Please see Figure 1. Figure 1 is a schematic diagram of an application environment for the context learning method in one embodiment of the present invention. The context learning method provided in this embodiment can be applied in the application environment shown in Figure 1, in which the client communicates with the server via a network.
[0050] The server obtains a preset query question from the client and then obtains the context template corresponding to the preset query question.
[0051] Based on the preset query question and the context template, obtain the example question, and obtain the predicted probability distribution output by the large language model based on the example question;
[0052] Based on the predicted probability distribution, obtain the information entropy value corresponding to the example problem, and based on the information entropy value and a predefined acquisition method, obtain the reference problem;
[0053] Form a set from multiple of the reference questions;
[0054] Obtain the vector of each of the reference questions in the set, obtain the customer service dialogue, obtain the context content of the currently queried question in the customer service dialogue, and obtain the vector of the context content;
[0055] Based on the vectors of each reference question, the vector of the context content, and the predefined generation method, generate each reference vector, obtain each category predicted by the large language model based on each reference vector, obtain the votes for each category, and select the category with the most votes as the question category of the current query question.
[0056] The beneficial effects of the above-described context learning method, device, equipment, and medium are twofold. Firstly, reference vectors are generated based on the vector of each reference question, the vector of the context content, and a predefined generation method; the categories predicted by the large language model based on each reference vector are obtained; the votes for each category are counted; and the category with the most votes is selected as the question category of the current query question. Since no manual acquisition is required, the time for acquiring the text topic of the text to be processed is reduced, which improves the efficiency of acquiring the text topic of the text to be processed. Secondly, since the topic generation model is not affected by human intervention, the reliability of the acquired text topic of the text to be processed is improved.
[0057] The client can include, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices.
[0058] The server can be implemented using a separate task database or a task database cluster consisting of multiple task databases. The invention will be described in detail below through specific embodiments.
[0059] Please see Figure 2. Figure 2 is a flowchart of a context learning method according to an embodiment of the present invention, which includes the following steps:
[0060] S21, Obtain a preset query question and obtain the context template corresponding to the preset query question;
[0061] For example, obtaining a preset query question and obtaining the context template corresponding to the preset query question includes:
[0062] Obtain the problem set of the target business, select any one problem from the problem set of the target business as the preset query problem, and obtain the context template corresponding to the preset query problem.
[0063] The target business includes one or a combination of financial services, technology services, and insurance services.
[0064] For ease of explanation, the following example is provided:
[0065] Obtain a set of questions related to financial business, select any question from the set of questions as a preset query question, and obtain the context template corresponding to the preset query question;
[0066] Obtain a set of questions related to technology business, select any question from the set of questions related to technology business as a preset query question, and obtain the context template corresponding to the preset query question;
[0067] Obtain a set of questions related to insurance business, select any question from the set of questions as a preset query question, and obtain the context template corresponding to the preset query question.
[0068] S22, Based on the preset query question and the context template, obtain an example question, and obtain the predicted probability distribution output by the large language model based on the example question;
[0069] The step of obtaining an example question based on the preset query question and the context template, and obtaining the predicted probability distribution output by the large language model based on the example question, includes:
[0070] Replace the placeholder in the context template with the preset query question, and use the replaced context template as the example question;
[0071] The example question is input into a preset large language model, and the predicted probability distribution output by the large language model based on the example question is obtained.
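Step S22 can be illustrated with a minimal sketch. The placeholder token `{question}`, the template text, and the `stub_llm_scores` function are all assumptions made for illustration; the source names neither a placeholder syntax nor a specific model interface. The stub stands in for the preset large language model and returns fixed per-category scores, which a softmax turns into a predicted probability distribution.

```python
import math

PLACEHOLDER = "{question}"  # assumed placeholder token; the source does not name one


def build_example_question(context_template: str, preset_query: str) -> str:
    """Replace the placeholder in the context template with the preset query question."""
    if PLACEHOLDER not in context_template:
        raise ValueError("context template has no placeholder")
    return context_template.replace(PLACEHOLDER, preset_query)


def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Turn per-category scores into a probability distribution."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {category: math.exp(value - peak) for category, value in scores.items()}
    total = sum(exps.values())
    return {category: value / total for category, value in exps.items()}


def stub_llm_scores(example_question: str) -> dict[str, float]:
    """Hypothetical stand-in for the large language model; returns fixed scores."""
    return {"Category A": 2.0, "Category B": 0.0}


template = "Question: {question}\nCategory:"  # hypothetical context template
example_question = build_example_question(template, "When does my policy expire?")
distribution = softmax(stub_llm_scores(example_question))
```

In a real deployment the stub would be replaced by a call to whichever model is used, provided that call exposes per-category probabilities.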
[0072] S23, based on the predicted probability distribution, obtain the information entropy value corresponding to the example problem, and based on the information entropy value and a predefined acquisition method, obtain the reference problem;
[0073] For example, obtaining the information entropy value corresponding to the example question based on the predicted probability distribution includes:
[0074] Based on the predicted probability distribution and the formula for calculating information entropy, the information entropy value corresponding to the example problem is obtained.
[0075] The formula for calculating information entropy is detailed below:
[0076] $L(X) = -\sum_{i} p(X_i)\,\log_2 p(X_i)$;
[0077] Where $L(X)$ is the information entropy value of the X-th example question, $X_i$ denotes the i-th category in the predicted probability distribution output for the X-th example question, and $p(X_i)$ is the probability that the large language model assigns to the i-th category in that predicted probability distribution.
[0078] For ease of explanation, the following example is provided:
[0079] For example, there are two example questions, namely example question A and example question B;
[0080] The large language model's predicted probability distribution based on the output of the example question A is as follows:
[0081] Category A: 0.9;
[0082] Category B: 0.1;
[0083] The large language model's predicted probability distribution based on the output of example question B is as follows:
[0084] Category A: 0.2;
[0085] Category B: 0.8;
[0086] The information entropy value of example question A is as follows:
[0087] $L_1 = -(0.9 \times \log_2 0.9 + 0.1 \times \log_2 0.1) \approx 0.469$;
[0088] The information entropy value of example question B is as follows:
[0089] $L_2 = -(0.2 \times \log_2 0.2 + 0.8 \times \log_2 0.8) \approx 0.722$;
[0090] In this example, the higher information entropy value of example question B indicates that the large language model is less certain about the prediction of this example, and that example question B contains more useful information or needs more attention.
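[0091] When the preset value is 0.5, the information entropy value of example question B (0.722) is higher than the preset value, so example question B is marked as a reference question. The information entropy value of example question A (0.469) is not higher than the preset value, so example question A is not marked.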
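The worked example above can be checked with a short sketch. The function names are illustrative and not taken from the source; the distributions and the preset value of 0.5 are the ones given in the example.

```python
import math


def information_entropy(distribution: dict[str, float]) -> float:
    """Shannon entropy in bits: -sum(p * log2(p)) over the categories."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


def select_reference_questions(distributions: dict[str, dict[str, float]],
                               preset_value: float) -> list[str]:
    """Mark an example question as a reference question when its entropy exceeds the preset value."""
    return [
        name
        for name, distribution in distributions.items()
        if information_entropy(distribution) > preset_value
    ]


distributions = {
    "example question A": {"Category A": 0.9, "Category B": 0.1},
    "example question B": {"Category A": 0.2, "Category B": 0.8},
}

entropy_a = information_entropy(distributions["example question A"])
entropy_b = information_entropy(distributions["example question B"])
reference_questions = select_reference_questions(distributions, preset_value=0.5)
```

With these inputs the entropies round to 0.469 and 0.722, and only example question B passes the filter.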
[0092] S24, form a set from multiple of the reference questions;
[0093] Forming a set from multiple of the reference questions includes:
[0094] Obtain the preset number corresponding to the preset query question;
[0095] Form a set from the preset number of reference questions.
[0096] The preset quantity is either user-defined or system default, and no limit is set here.
[0097] For ease of explanation, the following example is provided:
[0098] For example, 1000 different reference questions can be stored together to form a set;
[0099] For example, 2000 different reference questions can be stored together to form a set;
[0100] For example, 10,000 different reference questions can be stored together to form a set.
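Step S24 can be sketched as follows. The function name is illustrative. Deduplication follows from the examples above, which speak of "different" reference questions. Returning fewer items when not enough reference questions are available is an assumption, since the source does not say what happens in that case.

```python
def build_reference_set(reference_questions: list[str], preset_number: int) -> list[str]:
    """Collect up to preset_number distinct reference questions, preserving their order."""
    if preset_number <= 0:
        raise ValueError("preset_number must be positive")
    # dict.fromkeys drops duplicates while keeping first-seen order
    distinct = list(dict.fromkeys(reference_questions))
    return distinct[:preset_number]


reference_set = build_reference_set(["q1", "q2", "q1", "q3"], preset_number=2)
```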
[0101] S25, obtain the vector of each of the reference questions in the set, obtain the customer service dialogue, obtain the context content of the currently queried question in the customer service dialogue, and obtain the vector of the context content;
[0102] For ease of explanation, the following example is provided:
[0103] For example, the context of the current query is: "Is my car insurance policy, policy number 123456789, currently valid? When does it expire?"
[0104] The current query is: Is it currently valid? When does it expire?
[0105] For ease of explanation, the following example is provided:
[0106] For example, the context of the current query is: I submitted a medical insurance claim last week, application number: A123456. What is the current progress of the claim? When can I expect to receive the payment?
[0107] The current inquiry is: What is the current progress of the claim? When is it expected that I will receive the compensation payment?
[0108] S26. Generate reference vectors based on the vectors of each reference question, the vector of the context content, and a predefined generation method. Obtain the categories predicted by the large language model based on each reference vector. Obtain the votes for each category. Select the category with the most votes as the question category of the current query question.
[0109] For ease of explanation, the following example is provided:
[0110] For example, the context of the current query is: "Is my car insurance policy, policy number 123456789, currently valid? When does it expire?"
[0111] The current query is: Is it currently valid? When does it expire?
[0112] The reference questions are reference question 1, reference question 2, and reference question 3.
[0113] The vector of reference question 1 and the vector of the context content are concatenated to generate reference vector 1. Reference vector 1 is then input into the large language model to obtain the policy status query category predicted by the large language model based on reference vector 1.
[0114] The vector of reference question 2 and the vector of the context content are concatenated to generate reference vector 2. Reference vector 2 is then input into the large language model to obtain the claim progress query category predicted by the large language model based on reference vector 2.
[0115] The vector of reference question 3 and the vector of the context content are concatenated to generate reference vector 3. Reference vector 3 is then input into the large language model to obtain the policy status query category predicted by the large language model based on reference vector 3.
[0116] Therefore, the predicted categories are: the policy status query category and the claim progress query category;
[0117] Count the number of votes for each category:
[0118] Policy status query category: 2 votes; claim progress query category: 1 vote;
[0119] The category with the most votes is the policy status query category. Therefore, the policy status query category is selected as the question category for the current query question "Is it currently valid? When does it expire?".
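The vote count in this example can be reproduced with a few lines. The three predictions are those listed for reference vectors 1 to 3; the variable names are illustrative.

```python
from collections import Counter

# Categories predicted from reference vectors 1, 2 and 3 in the example above
predicted_categories = [
    "policy status query",
    "claim progress query",
    "policy status query",
]

votes = Counter(predicted_categories)
question_category, vote_count = votes.most_common(1)[0]
```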
[0120] The context learning method, after generating reference vectors based on the vectors of each reference question, the vector of the context content, and a predefined generation method, obtaining the categories predicted by the large language model based on each reference vector, obtaining the votes for each category, and selecting the category with the most votes as the question category of the current query question, includes:
[0121] Obtain the solution corresponding to the problem category and display the solution in the customer service conversation.
[0122] In this embodiment of the invention, the beneficial effects are twofold. Firstly, reference vectors are generated based on the vector of each reference question, the vector of the context content, and a predefined generation method; the categories predicted by the large language model based on each reference vector are obtained; the votes for each category are counted; and the category with the most votes is selected as the question category of the current query question. Since no manual acquisition is required, the time for obtaining the text topic of the text to be processed is reduced, which improves the efficiency of obtaining the text topic of the text to be processed. Secondly, since the topic generation model is not affected by manual intervention, the reliability of the obtained text topic of the text to be processed is improved.
[0123] Please see Figure 3. Figure 3 is a flowchart of a specific implementation of step S23 in Figure 1, described in detail below:
[0124] S31, Based on the predicted probability distribution, obtain the information entropy value corresponding to the example problem, and determine whether the information entropy value is higher than a preset value;
[0125] For example, obtaining the information entropy value corresponding to the example question based on the predicted probability distribution includes:
[0126] Based on the predicted probability distribution and the formula for calculating information entropy, the information entropy value corresponding to the example problem is obtained.
[0127] The formula for calculating information entropy is detailed below:
[0128] $L(X) = -\sum_{i} p(X_i)\,\log_2 p(X_i)$;
[0129] Where $L(X)$ is the information entropy value of the X-th example question, $X_i$ denotes the i-th category in the predicted probability distribution output for the X-th example question, and $p(X_i)$ is the probability that the large language model assigns to the i-th category in that predicted probability distribution.
[0130] S32, when the information entropy value is higher than the preset value, the example problem is marked as a reference problem.
[0131] The information entropy value is used to describe the amount of information in the example problem. The larger the information entropy, the more information the example problem contains, and the smaller the information entropy, the less information the example problem contains.
[0132] In this embodiment of the invention, when the information entropy value is higher than the preset value, the example problem is marked as a reference problem, so that a reference problem with more information than expected can be obtained.
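When choosing the preset value, it may help to recall the range of the information entropy. The bounds below are a standard property of Shannon entropy, not something stated in the source; $K$ denotes the number of categories and is introduced here only for illustration.

```latex
L(X) = -\sum_{i=1}^{K} p(X_i)\,\log_2 p(X_i),
\qquad 0 \le L(X) \le \log_2 K .
% L(X) = 0        when one category has probability 1 (the model is certain);
% L(X) = \log_2 K when p(X_i) = 1/K for every i (the model is maximally uncertain).
% For K = 2 the upper bound is \log_2 2 = 1 bit.
```

For a two-category distribution the entropy therefore lies between 0 and 1, so a preset value such as 0.5 sits inside the attainable range.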
[0133] Please see Figure 4. Figure 4 is a flowchart of a specific implementation of step S25 in Figure 1, described in detail below:
[0134] S41, extract features from the semantic units of each reference question to obtain the vector of each reference question, connect to the target system, and obtain the customer service dialogue sent by the target system;
[0135] The target system includes one or a combination of financial systems, technology systems, and insurance systems.
[0136] For ease of explanation, the following example is provided:
[0137] Feature extraction is performed on the semantic units of each of the reference questions to obtain the vector of each reference question; a connection is then made to the financial system to obtain the customer service dialogue sent by the financial system.
[0138] Feature extraction is performed on the semantic units of each of the reference questions to obtain the vector of each reference question; a connection is then made to the technology system to obtain the customer service dialogue sent by the technology system.
[0139] Feature extraction is performed on the semantic units of each of the reference questions to obtain the vector of each reference question; a connection is then made to the insurance system to obtain the customer service dialogue sent by the insurance system.
[0140] S42, obtain the context content of the current query question in the customer service dialogue, extract features from the semantic units of the context content, and obtain the vector of the context content.
[0141] The vector of the context content serves as a feature representation of the context content and can be used to distinguish different types of context content.
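The source does not specify how features are extracted from semantic units. As a purely illustrative sketch, the code below treats whitespace-separated words as semantic units and maps them into a fixed-length vector with a hashing trick. The tokenization, the dimension of 8, and the use of SHA-256 are all assumptions; a real system would more likely use a learned embedding model.

```python
import hashlib
import math

DIMENSION = 8          # assumed vector length, chosen only for illustration
PUNCTUATION = ".,?!:;\"'"


def semantic_units(text: str) -> list[str]:
    """Assumed segmentation: lowercase words with surrounding punctuation removed."""
    tokens = (token.strip(PUNCTUATION).lower() for token in text.split())
    return [token for token in tokens if token]


def extract_vector(text: str, dimension: int = DIMENSION) -> list[float]:
    """Hash each semantic unit into a bucket, count occurrences, then L2-normalize."""
    vector = [0.0] * dimension
    for unit in semantic_units(text):
        digest = hashlib.sha256(unit.encode("utf-8")).digest()
        vector[digest[0] % dimension] += 1.0
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector


context_content = ("Is my car insurance policy, policy number 123456789, "
                   "currently valid? When does it expire?")
context_vector = extract_vector(context_content)
```

The same function can be applied to each reference question so that all vectors share one dimension.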
[0142] In this embodiment of the invention, the first embedding vector is reconstructed based on the attention score to generate a second embedding vector corresponding to the semantic unit. This allows for a greater focus on important semantic units, improving the accuracy and efficiency of semantic unit selection.
[0143] Please see Figure 5. Figure 5 is a flowchart of a specific implementation of step S26 in Figure 1, described in detail below:
[0144] S51, concatenate the vectors of each of the reference questions and the vectors of the context content to generate each reference vector, input each reference vector into the large language model, and obtain each category predicted by the large language model based on each of the reference vectors;
[0145] S52, vote on each category, obtain the number of votes for each category, and select the category with the most votes as the question category for the current query question.
[0146] In this embodiment of the invention, the category with the most votes is selected as the question category of the current query question. This avoids relying on certain specific, potentially biased features, thereby reducing the prediction error that may be caused by a single reference vector. The application of this strategy allows us to not rely too much on a single reference vector, but to comprehensively consider the opinions of multiple reference vectors, thereby improving the robustness and accuracy of context learning.
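Steps S51 and S52 can be sketched together. The `predict_category` callable is a hypothetical stand-in for the large language model, and the toy vectors and threshold rule in `stub_predictor` are invented solely to make the example runnable. The source does not state how ties are broken; here `Counter.most_common` returns the first-seen category among equal counts.

```python
from collections import Counter
from typing import Callable, Sequence


def classify_by_vote(
    question_vectors: Sequence[Sequence[float]],
    context_vector: Sequence[float],
    predict_category: Callable[[list[float]], str],
) -> tuple[str, Counter]:
    """Concatenate each reference-question vector with the context vector,
    predict a category per reference vector, and return the majority category."""
    if not question_vectors:
        raise ValueError("at least one reference question vector is required")
    # S51: one reference vector per reference question (question vector + context vector)
    reference_vectors = [list(q) + list(context_vector) for q in question_vectors]
    categories = [predict_category(vector) for vector in reference_vectors]
    # S52: count the votes and pick the category with the most votes
    votes = Counter(categories)
    return votes.most_common(1)[0][0], votes


def stub_predictor(reference_vector: list[float]) -> str:
    """Hypothetical model stand-in: decides from the first component only."""
    return "policy status query" if reference_vector[0] >= 0.5 else "claim progress query"


question_vectors = [[0.9], [0.1], [0.7]]   # toy vectors for reference questions 1 to 3
context_vector = [0.3, 0.4]                # toy vector for the context content
question_category, votes = classify_by_vote(question_vectors, context_vector, stub_predictor)
```

Because each prediction sees a different reference question alongside the same context, one misleading reference question changes only one vote rather than the final result.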
[0147] Please refer to Figure 6. Figure 6 is a schematic structural diagram of a context learning device in one embodiment of the present invention. As shown in Figure 6, the context learning device includes a first acquisition module 101, a second acquisition module 102, a third acquisition module 103, a composition module 104, a fourth acquisition module 105, and a generation module 106. The functional modules are described in detail as follows:
[0148] The first acquisition module 101 is used to acquire a preset query question and acquire the context template corresponding to the preset query question;
[0149] The second acquisition module 102 is used to acquire an example question based on the preset query question and the context template, and to acquire the predicted probability distribution output by the large language model based on the example question;
[0150] The third acquisition module 103 is used to acquire the information entropy value corresponding to the example question according to the predicted probability distribution, and to acquire the reference question according to the information entropy value and a predefined acquisition method;
[0151] The composition module 104 is used to form a set from multiple reference questions;
[0152] The fourth acquisition module 105 is used to acquire the vector of each of the reference questions in the set, acquire the customer service dialogue, acquire the context content of the current query question in the customer service dialogue, and acquire the vector of the context content;
[0153] The generation module 106 is used to generate each reference vector according to the vector of each reference question, the vector of the context content and the predefined generation method, obtain each category predicted by the large language model based on each reference vector, obtain the votes of each category, and select the category with the most votes as the question category of the current query question.
[0154] In one embodiment, the second acquisition module 102 includes:
[0155] The replacement subunit is used to replace the placeholder in the context template with the preset query question, and the replaced context template is used as the example question.
[0156] The first acquisition subunit is used to input the example question into a preset large language model and acquire the predicted probability distribution output by the large language model based on the example question.
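The two subunits above can be sketched as follows, assuming the placeholder is the literal string `{question}` and that the large language model is exposed as a callable returning one probability per category. The names `PLACEHOLDER`, `build_example_question`, `predict_distribution`, and `toy_model` are hypothetical and do not come from this disclosure.

```python
from typing import Callable, Dict

PLACEHOLDER = "{question}"  # assumed placeholder token; the text does not fix one


def build_example_question(context_template: str, preset_query_question: str) -> str:
    """Replace the placeholder in the context template with the preset query question."""
    if PLACEHOLDER not in context_template:
        raise ValueError("context template contains no placeholder")
    return context_template.replace(PLACEHOLDER, preset_query_question)


def predict_distribution(model: Callable[[str], Dict[str, float]],
                         example_question: str) -> Dict[str, float]:
    """Query the model and check that its output sums to 1."""
    distribution = model(example_question)
    if abs(sum(distribution.values()) - 1.0) > 1e-6:
        raise ValueError("model output is not a probability distribution")
    return distribution


def toy_model(prompt: str) -> Dict[str, float]:
    """Hypothetical stand-in for the large language model; returns a fixed distribution."""
    return {"claims": 0.5, "billing": 0.3, "policy change": 0.2}


template = "Question: {question}\nCategory:"
example_question = build_example_question(template, "How do I file a claim?")
predicted_distribution = predict_distribution(toy_model, example_question)
```

Keeping the model behind a plain callable means the same code works with any model that can return per-category probabilities for a prompt.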
[0157] In one embodiment, the third acquisition module 103 includes:
[0158] The judgment subunit is used to obtain the information entropy value corresponding to the example question based on the predicted probability distribution, and to determine whether the information entropy value is higher than a preset value.
[0159] The marking subunit is used to mark the example question as a reference question when the information entropy value is higher than the preset value.
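The judgment and marking subunits can be illustrated with Shannon entropy, H = -Σ p·log₂(p). The sketch below assumes base-2 logarithms and a preset value of 1.0 bit; both choices, along with the names `information_entropy` and `mark_reference_questions` and the sample distributions, are hypothetical and are not specified in this passage.

```python
import math
from typing import Dict, List, Tuple


def information_entropy(distribution: Dict[str, float]) -> float:
    """Shannon entropy in bits; zero-probability categories contribute nothing."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0.0)


def mark_reference_questions(candidates: List[Tuple[str, Dict[str, float]]],
                             preset_value: float) -> List[str]:
    """Keep the example questions whose entropy is higher than the preset value."""
    return [question for question, distribution in candidates
            if information_entropy(distribution) > preset_value]


PRESET_VALUE = 1.0  # assumed threshold in bits

candidates = [
    ("How do I file a claim?",
     {"claims": 0.97, "billing": 0.02, "policy change": 0.01}),
    ("Why was I charged twice after my claim?",
     {"claims": 0.45, "billing": 0.45, "policy change": 0.10}),
]

reference_questions = mark_reference_questions(candidates, PRESET_VALUE)
```

Under this rule the first example question, on which the model is nearly certain, is filtered out, while the second, whose probability mass is split between two categories, is marked as a reference question.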
[0160] In one embodiment, the composition module 104 includes:
[0161] The second acquisition subunit is used to acquire the preset number corresponding to the preset query question;
[0162] The composition subunit is used to form a set from the preset number of reference questions.
[0163] In one embodiment, the fourth acquisition module 105 includes:
[0164] The first extraction subunit is used to extract features from the semantic units of each of the reference questions to obtain the vector of each reference question, and to connect to the target system to obtain the customer service dialogue sent by the target system.
[0165] The second extraction subunit is used to obtain the context content of the current query question in the customer service dialogue, extract features from the semantic units of the context content, and obtain the vector of the context content.
[0166] In one embodiment, the generation module 106 includes:
[0167] The input subunit is used to concatenate the vectors of each of the reference questions and the vectors of the context content to generate each reference vector, and input each reference vector into the large language model to obtain each category predicted by the large language model based on each of the reference vectors;
[0168] The voting subunit is used to vote on each of the categories, obtain the number of votes for each category, and select the category with the most votes as the question category for the current query question.
[0169] In one embodiment, the context learning device further includes:
[0170] The display module is used to obtain the solution corresponding to the question category and display the solution in the customer service dialogue.
[0171] This embodiment of the invention has two beneficial effects. First, reference vectors are generated from the vector of each reference question, the vector of the context content, and a predefined generation method; the large language model predicts a category for each reference vector; the votes for each category are counted; and the category with the most votes is selected as the question category of the current query question. Because no manual acquisition is required, the time needed to obtain the text topic of the text to be processed is reduced, which improves the efficiency of obtaining it. Second, because the topic generation model is not affected by manual intervention, the reliability of the obtained text topic of the text to be processed is improved.
[0172] For specific limitations on the context learning device, refer to the limitations on the context learning method described above; they are not repeated here.
[0173] Each module in the aforementioned context learning device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can invoke and execute the operations corresponding to each module.
[0174] Please refer to Figure 7. Figure 7 is a schematic diagram of a computer device according to one embodiment of the present invention. In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 7. This computer device includes a processor, memory, a network interface, and a database connected via a system bus.
[0175] The processor of this computer device provides computing and control capabilities. The memory of the computer device includes non-volatile and/or volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and databases. The internal memory provides an environment for running the operating system and computer programs stored in the non-volatile storage media. The network interface of the computer device is used to communicate with external clients via a network connection. When executed by the processor, the computer program implements the server-side functions or steps of the context learning method.
[0176] Please refer to Figure 8. Figure 8 is another schematic structural diagram of a computer device according to one embodiment of the present invention. In one embodiment, a computer device is provided, which may be a client, and its internal structure may be as shown in Figure 8. The computer device includes a processor, memory, a network interface, a display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for running the operating system and computer programs stored in the non-volatile storage media. The network interface is used to communicate with an external task database via a network connection. When executed by the processor, the computer program implements the functions or steps of the context learning method.
[0177] In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
[0178] It should be noted that, for the functions or steps that can be implemented by the computer-readable storage medium or computer device described above, reference may be made to the relevant server-side and client-side descriptions in the foregoing method embodiments. To avoid repetition, they are not described one by one here.
[0179] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), graphics processing units (GPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0180] The foregoing description and accompanying drawings fully illustrate embodiments of this disclosure to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, procedural, and other changes. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the order of operations may vary. Parts and features of some embodiments may be included in or replace parts and features of other embodiments. Moreover, the terminology used in this application is for describing embodiments only and is not intended to limit the claims. As used in the description of embodiments and claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise. Similarly, the term "and/or" as used herein means including one or more of the associated listed items and all possible combinations thereof. Additionally, when used in this application, the term "comprise" and its variations "comprises" and/or "comprising" refer to the presence of stated features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, or apparatus that includes the element. In this document, each embodiment may focus on its differences from other embodiments, and for similar or identical parts the embodiments may be referred to one another. Where methods, products, etc., disclosed in the embodiments correspond to the method section disclosed in the embodiments, refer to the description of the method section for the relevant parts.
[0181] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the embodiments of this disclosure. Those skilled in the art will clearly understand that, for convenience and brevity, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
[0182] The methods and products (including but not limited to devices and equipment) disclosed in the embodiments herein can be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division of units may be merely a logical functional division, and in actual implementation there may be other division methods: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or of other forms. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected to implement this embodiment according to actual needs. Furthermore, the functional units in the embodiments of this disclosure may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
[0183] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than that shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. In the descriptions corresponding to the flowcharts and block diagrams in the accompanying drawings, the operations or steps corresponding to different blocks may also occur in a different order than disclosed in the description, and sometimes there is no specific order between different operations or steps. For example, two consecutive operations or steps may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. Each block in a block diagram and / or flowchart, and combinations of blocks in a block diagram and / or flowchart, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.
Claims
1. A context learning method, characterized in that it comprises:
obtaining a preset query question and the context template corresponding to the preset query question;
obtaining an example question based on the preset query question and the context template, and obtaining the predicted probability distribution output by the large language model based on the example question;
obtaining, based on the predicted probability distribution, the information entropy value corresponding to the example question, and obtaining a reference question based on the information entropy value and a predefined acquisition method;
forming a set from multiple reference questions;
obtaining the vector of each of the reference questions in the set, obtaining a customer service dialogue, obtaining the context content of the current query question in the customer service dialogue, and obtaining the vector of the context content;
generating each reference vector based on the vector of each reference question, the vector of the context content, and a predefined generation method, obtaining each category predicted by the large language model based on each reference vector, obtaining the votes for each category, and selecting the category with the most votes as the question category of the current query question.
2. The context learning method according to claim 1, characterized in that the step of obtaining an example question based on the preset query question and the context template, and obtaining the predicted probability distribution output by the large language model based on the example question, includes: replacing the placeholder in the context template with the preset query question, and using the replaced context template as the example question; and inputting the example question into a preset large language model, and obtaining the predicted probability distribution output by the large language model based on the example question.
3. The context learning method according to claim 1, characterized in that the step of obtaining the information entropy value corresponding to the example question based on the predicted probability distribution, and obtaining the reference question based on the information entropy value and a predefined acquisition method, includes: obtaining, based on the predicted probability distribution, the information entropy value corresponding to the example question, and determining whether the information entropy value is higher than a preset value; and, when the information entropy value is higher than the preset value, marking the example question as a reference question.
4. The context learning method according to claim 1, characterized in that the step of forming a set from multiple reference questions includes: obtaining the preset number corresponding to the preset query question; and forming a set from the preset number of reference questions.
5. The context learning method according to claim 1, characterized in that the step of obtaining the vector of each of the reference questions in the set, obtaining the customer service dialogue, obtaining the context content of the current query question in the customer service dialogue, and obtaining the vector of the context content includes: performing feature extraction on the semantic units of each of the reference questions to obtain the vector of each reference question, and connecting to the target system to obtain the customer service dialogue sent by the target system; and obtaining the context content of the current query question in the customer service dialogue, extracting features from the semantic units of the context content, and obtaining the vector of the context content.
6. The context learning method according to claim 1, characterized in that the step of generating each reference vector based on the vector of each reference question, the vector of the context content, and a predefined generation method, obtaining each category predicted by the large language model based on each reference vector, obtaining the votes for each category, and selecting the category with the most votes as the question category of the current query question includes: concatenating the vector of each of the reference questions with the vector of the context content to generate each reference vector, and inputting each reference vector into the large language model to obtain each category predicted by the large language model based on each reference vector; and voting on the categories to obtain the number of votes for each category, and selecting the category with the most votes as the question category of the current query question.
7. The context learning method according to claim 1, characterized in that, after generating each reference vector based on the vector of each reference question, the vector of the context content, and a predefined generation method, obtaining each category predicted by the large language model based on each reference vector, obtaining the votes for each category, and selecting the category with the most votes as the question category of the current query question, the context learning method further includes: obtaining the solution corresponding to the question category and displaying the solution in the customer service dialogue.
8. A context learning device, characterized in that it comprises:
a first acquisition module, used to acquire a preset query question and the context template corresponding to the preset query question;
a second acquisition module, used to acquire an example question based on the preset query question and the context template, and to acquire the predicted probability distribution output by the large language model based on the example question;
a third acquisition module, used to acquire the information entropy value corresponding to the example question according to the predicted probability distribution, and to acquire a reference question according to the information entropy value and a predefined acquisition method;
a composition module, used to form a set from multiple reference questions;
a fourth acquisition module, used to acquire the vector of each of the reference questions in the set, acquire the customer service dialogue, acquire the context content of the current query question in the customer service dialogue, and acquire the vector of the context content;
a generation module, used to generate each reference vector based on the vector of each reference question, the vector of the context content, and a predefined generation method, obtain each category predicted by the large language model based on each reference vector, obtain the votes for each category, and select the category with the most votes as the question category of the current query question.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the context learning method as described in any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the context learning method as described in any one of claims 1 to 7.