Test set generation method, test method, device, equipment and medium
By generating test examples by determining the semantically correct intent path in the flowchart, the problem of high testing costs in existing dialogue systems is solved, and the testing efficiency and accuracy of dialogue systems are improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- IFLYTEK SOUTH CHINA ARTIFICIAL INTELLIGENCE RES INST GUANGZHOU CO LTD
- Filing Date
- 2022-01-07
- Publication Date
- 2026-06-26
AI Technical Summary
Existing methods for testing dialogue systems are costly and struggle to accurately identify intents in complex dialogues and the connections between multiple turns.
By defining the preset path for semantically correct intent in the flowchart, test examples are generated, the test set is expanded, and the quantity and quality of test examples are increased.
It effectively increases the number of test examples with correct semantic recognition, reduces testing costs, and improves the testing efficiency and accuracy of dialogue systems.
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, specifically to a method for generating a test set, a testing method, an apparatus, a device, and a medium.
Background
[0002] With the development of artificial intelligence technology in natural language understanding, dialogue systems are increasingly being applied in various scenarios, such as voice assistants, smart speakers, and chatbots. Dialogue systems need to accurately understand the user's expressed intent in order to better communicate with the user, meet their needs, and improve the user experience. Generally, testing the dialogue system can reveal its semantic recognition performance. However, existing testing methods are too costly.
Summary of the Invention
[0003] In view of this, embodiments of this application provide a method for generating a test set, a testing method, an apparatus, a device, and a medium, which can effectively increase the number of test examples with correct semantic recognition.
[0004] In a first aspect, one embodiment of this application provides a method for generating a test set. The method includes: determining at least one first intent with correct semantic recognition in a first dialogue sample; determining at least one preset path passing through at least one first intent in a flowchart, the flowchart including multiple nodes, each node corresponding to a round of dialogue, and each node including at least one intent, the flowchart further including the connection relationship between each node and the intent; generating at least one test example based on the at least one preset path, the at least one test example constituting a test set, and each test example in the at least one test example including path information corresponding to the test example.
[0005] In conjunction with the first aspect, in some implementations of the first aspect, generating at least one test example based on at least one preset path includes: determining a second intent located after and immediately adjacent to at least one first intent in each of the at least one preset path; determining semantically correct sample text content corresponding to the second intent from the second dialogue sample; and generating a test example corresponding to the preset path based on the sample text content and the portion of the first dialogue sample corresponding to at least one first intent.
[0006] In conjunction with the first aspect, in some implementations of the first aspect, the generation method further includes: taking the intent corresponding to at least one round of dialogue preceding a round of dialogue in the first dialogue sample that has an intent recognition error marker as at least one first intent.
[0007] In conjunction with the first aspect, in some implementations of the first aspect, the method further includes: randomly determining the intent corresponding to at least one round of dialogue from each round of dialogue in the first dialogue sample as at least one first intent.
[0008] In conjunction with the first aspect, in some implementations of the first aspect, at least one first intent includes multiple first intents. The generation method further includes: selecting intents corresponding to N consecutive rounds of dialogue from a first dialogue sample as multiple first intents. Generating at least one test example based on at least one preset path includes: determining a second dialogue sample corresponding to each preset path in the at least one preset path; determining semantically correct sample text content corresponding to the multiple first intents from the second dialogue sample; and generating a test example corresponding to the preset path based on the sample text content and the N rounds of dialogue.
[0009] In conjunction with the first aspect, in some implementations of the first aspect, a test example corresponding to a preset path is generated based on the sample text content and the N-round dialogue, including: determining the similarity between the N-round dialogue and the sample text content; when the similarity is greater than or equal to a first preset threshold, concatenating the text content located after the sample text content in the second dialogue sample with the N-round dialogue to generate a test example.
[0010] In conjunction with the first aspect, in some implementations of the first aspect, determining the similarity between N rounds of dialogue and sample text content includes: determining the first structured information of the dialogue content corresponding to the N rounds of dialogue, and determining the second structured information of the sample text content; determining the similarity based on the first structured information and the second structured information.
[0011] In conjunction with the first aspect, in some implementations of the first aspect, determining the first structured information of the dialogue content corresponding to N rounds of dialogue includes: determining the token label tree of the dialogue content corresponding to each round of dialogue in the N rounds of dialogue; and obtaining the first structured information based on the N token label trees corresponding to the N rounds of dialogue.
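The similarity check and threshold-gated concatenation of paragraphs [0009] to [0011] can be sketched as follows. This section does not define the token label tree or the similarity measure, so the sketch swaps in a Jaccard overlap on flat label sets as a stand-in. The names `jaccard`, `splice_if_similar`, and the `labels_of` callable are hypothetical and are not taken from the application.

```python
from typing import Callable, Iterable, List, Optional


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap of two label collections (stand-in similarity measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def splice_if_similar(
    n_rounds: List[str],          # N rounds taken from the first dialogue sample
    sample_rounds: List[str],     # matching sample text content in the second sample
    following_rounds: List[str],  # content after sample_rounds in the second sample
    labels_of: Callable[[str], List[str]],  # hypothetical structuring function
    threshold: float,             # plays the role of the first preset threshold
) -> Optional[List[str]]:
    """Return the concatenated test example, or None if similarity is too low."""
    first_info = [label for r in n_rounds for label in labels_of(r)]
    second_info = [label for r in sample_rounds for label in labels_of(r)]
    if jaccard(first_info, second_info) >= threshold:
        return n_rounds + following_rounds
    return None
```

Any structured representation could replace `labels_of`; only the comparison against the threshold and the concatenation order follow the text.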
[0012] In a second aspect, one embodiment of this application provides a testing method, which includes: obtaining test examples from a test set, wherein the test examples include at least one round of dialogue and corresponding path information, the path information being used to indicate the intent jump path of the at least one round of dialogue in a flowchart, the flowchart including multiple nodes, each of the multiple nodes corresponding to one round of dialogue, and each node including at least one intent, the flowchart further including the connection relationship between each node and the intent; and testing the semantic understanding system based on the path information using at least one round of dialogue to obtain test results.
[0013] In conjunction with the second aspect, in some implementations of the second aspect, the semantic understanding system is tested using at least one round of dialogue based on path information to obtain test results, including: testing the semantic understanding system using at least one round of dialogue based on path information to determine the recognition accuracy of each intent in the flowchart.
[0014] In conjunction with the second aspect, in some implementations of the second aspect, the method further includes: optimizing the design of the intent part corresponding to the recognition accuracy in the semantic understanding system when the recognition accuracy is less than a second preset threshold.
[0015] In conjunction with the second aspect, in some implementations of the second aspect, the semantic understanding system is tested using at least one round of dialogue based on path information to obtain test results, including: testing the semantic understanding system using at least one round of dialogue based on path information to determine high-frequency paths, nodes or intents in the flowchart, and / or determining paths, nodes or intents with high conversion rates in the flowchart.
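The testing method of paragraphs [0012] to [0015] can be illustrated with a minimal sketch. It assumes each test example is a dictionary with `turns` (text per round) and `path` (expected intent per round), and that the semantic understanding system is exposed as a per-round callable `recognize`. These names and the stateless per-round interface are assumptions for illustration only; a real multi-turn system would likely carry dialogue state.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def evaluate(
    test_examples: List[dict],
    recognize: Callable[[str], str],  # hypothetical stand-in for the system under test
) -> Tuple[Dict[str, float], Counter]:
    """Return per-intent recognition accuracy and a count of each path."""
    correct: Counter = Counter()
    total: Counter = Counter()
    path_counts: Counter = Counter()
    for example in test_examples:
        path_counts[tuple(example["path"])] += 1
        for text, expected in zip(example["turns"], example["path"]):
            total[expected] += 1
            if recognize(text) == expected:
                correct[expected] += 1
    accuracy = {intent: correct[intent] / total[intent] for intent in total}
    return accuracy, path_counts


def intents_below(accuracy: Dict[str, float], threshold: float) -> List[str]:
    """Intents whose accuracy is below the (second preset) threshold."""
    return sorted(i for i, acc in accuracy.items() if acc < threshold)
```

`path_counts.most_common()` then gives one simple reading of "high-frequency paths"; conversion rates would need outcome data that this sketch does not model.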
[0016] In a third aspect, one embodiment of this application provides a test set generation apparatus, which includes: a first determining module, configured to determine at least one first intent with correct semantic recognition in a first dialogue sample; a second determining module, configured to determine at least one preset path passing through at least one first intent in a flowchart, the flowchart including multiple nodes, each node corresponding to a round of dialogue, and each node including at least one intent, the flowchart further including the connection relationship between each node and the intent; and a generation module, configured to generate at least one test example based on at least one preset path, the at least one test example constituting a test set, each test example in the at least one test example including path information corresponding to the test example.
[0017] In a fourth aspect, one embodiment of this application provides a testing apparatus, which includes: an acquisition module for acquiring test examples from a test set, the test examples including at least one round of dialogue and corresponding path information, the path information indicating the intent jump path of the at least one round of dialogue in a flowchart, the flowchart including multiple nodes, each of the multiple nodes corresponding to one round of dialogue, and each node including at least one intent, the flowchart further including the connection relationship between each node and the intent; and a testing module for testing the semantic understanding system based on the path information using at least one round of dialogue to obtain test results.
[0018] In a fifth aspect, one embodiment of this application provides an electronic device, the electronic device comprising: a processor; and a memory for storing processor-executable instructions, wherein the processor is configured to perform the methods mentioned in the first and / or second aspects above.
[0019] In a sixth aspect, one embodiment of this application provides a computer-readable storage medium storing a computer program for performing the methods mentioned in the first and / or second aspects above.
[0020] This application provides a method for generating a test set, a testing method, an apparatus, a device, and a medium. By determining at least one preset path in the flowchart that passes through the first intent with correct semantic recognition in the first dialogue sample, and generating test examples based on the preset path, the number of test examples can be effectively increased and the test set expanded.
Brief Description of the Drawings
[0021] Figure 1 is a flowchart of a test set generation method provided in an exemplary embodiment of this application.
[0022] Figure 2 is a flowchart of a test set generation method provided in another exemplary embodiment of this application.
[0023] Figure 3 is a flowchart of a testing method provided in an exemplary embodiment of this application.
[0024] Figure 4 is a schematic diagram of a dialogue flowchart provided in an exemplary embodiment of this application.
[0025] Figure 5 is a schematic diagram of a dialogue flow jump provided in an exemplary embodiment of this application.
[0026] Figure 6 is a schematic diagram of a token-based user text structure provided in an exemplary embodiment of this application.
[0027] Figure 7 is a schematic diagram of a token-based structured model provided in an exemplary embodiment of this application.
[0028] Figure 8 is a flowchart of a method for intent recognition based on a structured model, provided in an exemplary embodiment of this application.
[0029] Figure 9 is a schematic structural diagram of a test set generation apparatus provided in an exemplary embodiment of this application.
[0030] Figure 10 is a schematic structural diagram of a testing apparatus provided in an exemplary embodiment of this application.
[0031] Figure 11 is a block diagram of an electronic device for performing the test set generation method or the testing method, provided in an exemplary embodiment of this application.
Detailed Description
[0032] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0033] Application Overview
[0034] Dialogue systems can be broadly categorized by purpose into task-oriented, question-answering, and chit-chat systems, and by the number of interaction rounds into single-turn and multi-turn dialogue systems. Through multiple rounds of information exchange with the user, multi-turn dialogue can obtain more precise information about the user's needs, thereby offering a more diverse and better experience and satisfying more complex requirements.
[0035] For multi-turn dialogues, multiple rounds of interaction with the user are typically required to achieve the desired outcome. Each turn in a multi-turn dialogue may or may not be related. Accurately identifying the intent of each turn and the connections between them is crucial for better understanding the user's needs. Existing dialogue systems struggle to accurately identify the intent in complex dialogues and the relationships between multiple turns. Furthermore, while the semantic understanding capabilities of a dialogue system can be tested online, this method is prohibitively expensive.
[0036] Exemplary methods
[0037] Figure 1 is a flowchart of a test set generation method provided in an exemplary embodiment of this application. The method of Figure 1 can be executed by a computing device, such as a server or a terminal device. As shown in Figure 1, the test set generation method includes the following steps:
[0038] S110, determine at least one first intent in the first dialogue sample that is semantically correct.
[0039] Specifically, a flowchart can include multiple nodes, each corresponding to a round of dialogue. For example, each node can be understood as the topic of that round of dialogue. Each node can include one or more intents. For example, if the topic of a node is gender, then that node can include two intents: male and female. As another example, if the topic of a node is job type, then that node can include three intents: entrepreneur, employee, and freelancer.
[0040] In human-computer interaction, each intent in a flowchart corresponds to the intent expressed in the user's text response. For example, in a product promotion scenario, for a round of dialogue in the human-computer interaction process, a node can correspond to the topic (category) of the text sent by the machine, and the intent included in that node can correspond to the intent expressed in the text that the user might reply with.
[0041] As shown in Figure 4, each intent in the flowchart can point to a different node. For example, node 1 can include intent 11 and intent 12; intent 11 can point to node 2, intent 12 can point to node 3, and intent 1N can point to node M. Intent 31 can point to node 4, and intent 32 can point to node 5. Intent 3N represents the last intent on its path. Flowcharts can be constructed by experts in the relevant domain based on user psychology, potential user needs and expressions in real-world scenarios, and past human-computer interaction data. Interconnected intents within a flowchart constitute a path, and a flowchart can contain multiple paths. The number of paths a flowchart can contain varies depending on the dialogue scenario.
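As a rough illustration of the flowchart described above (nodes that contain intents, and intents that point to the next node), a minimal data structure might look like the following. The class name `Flowchart`, its methods, and the identifier strings are hypothetical; the application does not prescribe a representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Flowchart:
    """Hypothetical container for a dialogue flowchart."""
    # node id -> intents contained in that node
    node_intents: Dict[str, List[str]] = field(default_factory=dict)
    # intent id -> node the intent points to (None if it ends its path)
    intent_target: Dict[str, Optional[str]] = field(default_factory=dict)

    def add_node(self, node: str, intents: List[str]) -> None:
        self.node_intents[node] = list(intents)
        for intent in intents:
            self.intent_target.setdefault(intent, None)

    def connect(self, intent: str, node: str) -> None:
        self.intent_target[intent] = node

    def next_intents(self, intent: str) -> List[str]:
        """Intents reachable in one jump, i.e. those in the node the intent points to."""
        target = self.intent_target.get(intent)
        if target is None:
            return []
        return list(self.node_intents.get(target, []))


# Part of the Figure 4 example: node 1 holds intents 11 and 12, node 3 holds 31 and 32.
chart = Flowchart()
chart.add_node("node_1", ["intent_11", "intent_12"])
chart.add_node("node_3", ["intent_31", "intent_32"])
chart.connect("intent_11", "node_2")
chart.connect("intent_12", "node_3")
chart.connect("intent_31", "node_4")
chart.connect("intent_32", "node_5")
```

Here `chart.next_intents("intent_12")` returns the two intents of node 3, which is the adjacency that path enumeration relies on.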
[0042] Different scenarios can correspond to different flowcharts. For example, different flowcharts can be used for dialogue scenarios with no contextual connection (no transition) and those with contextual connection (with transition). Dialogue scenarios with no contextual connection can be information confirmation, information collection, etc., while dialogue scenarios with contextual connection can be product promotion, after-sales service, etc.
[0043] Dialogue samples can be actual collected human-computer interaction dialogue data. The number of these samples is limited, and some may contain errors in intent recognition. For example, when collecting dialogue samples, the machine can identify the user's intent to answer questions or ask questions. In each round of dialogue, the machine identifies the user's intent and labels that round with the identified intent. When the machine misidentifies the user's intent in a round of dialogue, that round will be labeled with an incorrect intent tag. These dialogue samples can be manually reviewed, and dialogues corresponding to incorrectly identified intent tags can be marked. For example, an intent recognition error tag can be added to the intent tag of that round of dialogue. Other marking methods can also be used, and this application does not limit this approach.
[0044] Taking the first dialogue sample among multiple dialogue samples as an example, the first dialogue sample may include one or more rounds of dialogue. If the intent of each round of dialogue in the first dialogue sample is correctly identified, a path completely corresponding to the first dialogue sample can be determined in the flowchart. This path is composed of the intents corresponding to each round of dialogue (which can be determined based on intent labels). If the first dialogue sample includes one round of dialogue, the path corresponding to the first dialogue sample determined in the flowchart is a point. If the first dialogue sample includes multiple rounds of dialogue, the path corresponding to the first dialogue sample determined in the flowchart is a line composed of multiple intents. Based on the flowchart, the first node containing the first intent in the first dialogue sample that is semantically correctly identified can be determined.
[0045] S120, determine, in the flowchart, at least one preset path passing through the at least one first intent. The flowchart includes multiple nodes, each of which corresponds to a round of dialogue, and each node includes at least one intent. The flowchart further includes the connection relationship between each node and the intent.
[0046] S130, generate at least one test example based on at least one preset path, the at least one test example is used to constitute a test set, and each test example in the at least one test example includes path information corresponding to the test example.
[0047] Let the length of a preset path be represented by the number of intents in the preset path. A preset path containing only one intent has a length of 1, and a preset path containing k intents has a length of k, where k is greater than or equal to 2. A preset path of length k can be composed of different intents connected together.
[0048] Specifically, at least one preset path can be identified in the flowchart that passes through at least one first intent.
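One way to picture step S120 is a depth-first enumeration of every path that starts with the first intents and extends along the flowchart's connections. This is only a sketch under stated assumptions: the flowchart is reduced to a hypothetical `next_intents` mapping from an intent to the intents it can jump to, and `max_len` bounds the path length as defined in paragraph [0047].

```python
from typing import Dict, List


def preset_paths(
    next_intents: Dict[str, List[str]],  # hypothetical adjacency: intent -> following intents
    first_intents: List[str],            # correctly recognized intents, in dialogue order
    max_len: int,                        # longest path length (number of intents) to emit
) -> List[List[str]]:
    """Enumerate paths that begin with first_intents, up to max_len intents long."""
    paths: List[List[str]] = []

    def extend(path: List[str]) -> None:
        paths.append(list(path))
        if len(path) >= max_len:
            return
        for nxt in next_intents.get(path[-1], []):
            if nxt not in path:  # guard against cycles in the flowchart
                extend(path + [nxt])

    if first_intents:
        extend(list(first_intents))
    return paths
```

The path consisting of the first intents alone is included, which matches the later remark that the first dialogue sample itself can serve as a test example.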
[0049] Optionally, for multi-turn dialogues with no contextual connection, replacing the sample text content of a particular turn with different intents will not affect the overall effect of the multi-turn dialogue and will not cause the reconstructed test examples to be illogical. Therefore, the first node containing the first intent can be determined, and the preset path passing through the first node can be determined. When the first node includes multiple intents, the preset path passing through the first node includes paths passing through each of the multiple intents.
[0050] Once the preset path is determined, the various intents along that path can be identified. For each intent, a corresponding dialogue round can be selected from existing dialogue samples. There can be at least one such dialogue round, meaning each intent can correspond to one or more dialogue rounds (sample text content). By combining the sample text content corresponding to each intent according to the order of the intents in the preset path, multiple test examples can be obtained. These test examples follow the same path, but their text content is not entirely identical.
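The combination step in the paragraph above amounts to a Cartesian product over the candidate texts of each intent on the path. A minimal sketch, assuming a hypothetical `texts_by_intent` mapping built from existing dialogue samples:

```python
from itertools import product
from typing import Dict, List


def build_test_examples(
    path: List[str],
    texts_by_intent: Dict[str, List[str]],  # hypothetical: intent -> sample text contents
) -> List[dict]:
    """Combine one candidate text per intent, in path order, into test examples."""
    candidates = [texts_by_intent.get(intent, []) for intent in path]
    return [
        {"path": list(path), "turns": list(combo)}  # each example keeps its path information
        for combo in product(*candidates)
    ]
```

If any intent on the path has no candidate text, the product is empty and no example is produced for that path.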
[0051] For example, once a preset path is determined, each second intent following the first intent on that path can be identified. For each second intent, a dialogue round corresponding to that second intent can be selected from the remaining dialogue samples. There can be at least one such dialogue round (sample text content). By concatenating the dialogue portions in the first dialogue sample that correspond to the first intent and to the intents preceding it with the sample text content corresponding to each second intent, multiple test examples can be obtained. As shown in Figure 5, when the intent recognition of text content 530 in the first dialogue sample is erroneous, text content 510 in the first dialogue sample and text content 520 in the second dialogue sample can be concatenated to form a new test example.
[0052] The first dialogue sample in this embodiment can also serve as a test example to constitute the test set. The flowchart can be configured as needed, and multiple intentions that are prone to intention entanglement can be placed in different nodes, thus avoiding the problem of intention entanglement. For example, a user's reply "um" may correspond to the following intentions: agreeing to purchase, polite expression, and interjection. Placing these three intentions in different nodes can prevent the intention that is actually an interjection from being identified as agreeing to purchase.
[0053] This application provides a method for generating a test set. By determining at least one preset path in the flowchart that passes through the first intent that is semantically correctly identified in the first dialogue sample, and generating test examples based on the preset path, the number of test examples can be effectively increased and the test set can be expanded.
[0054] According to one embodiment of this application, generating at least one test example based on at least one preset path includes: determining a second intent located after and adjacent to at least one first intent in each of the at least one preset path; determining semantically correct sample text content corresponding to the second intent from a second dialogue sample; and generating a test example corresponding to the preset path based on the sample text content and the portion in the first dialogue sample corresponding to at least one first intent.
[0055] In one implementation, the first dialogue sample contains multiple rounds of dialogue (at least two rounds), and these rounds are contextually unrelated. One of the rounds is semantically misrecognized, meaning that its intent label is incorrect. Each round preceding it is semantically correctly recognized; the intents of these rounds are the first intents, which can be determined, for example, from their intent labels. A preset path passing through the first intents can be determined in the flowchart.
[0056] Specifically, the method for generating the test set further includes: taking the intent corresponding to at least one round of dialogue preceding the first round of dialogue in the first dialogue sample that has an intent recognition error marker as at least one first intent.
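Selecting the first intents as described here can be sketched as taking the intent labels of every round that precedes the first round carrying an intent recognition error marker. The `Turn` record and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """Hypothetical record for one round of a dialogue sample."""
    text: str
    intent: str                      # intent label assigned during collection
    recognition_error: bool = False  # set during manual review


def first_intents_before_error(sample: List[Turn]) -> List[str]:
    """Intents of the rounds preceding the first error-marked round."""
    prefix: List[str] = []
    for turn in sample:
        if turn.recognition_error:
            break
        prefix.append(turn.intent)
    return prefix
```

When no round is marked, the function returns every intent in the sample, which corresponds to the fully correct case handled elsewhere in the description.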
[0057] For example, a preset path passes through various intents, and a second intent that follows and is immediately adjacent to the first intent within the preset path is identified. Then, from multiple dialogue samples other than the first dialogue sample (multiple second dialogue samples), a semantically correctly recognized round of dialogue corresponding to the second intent is determined. The semantically correctly recognized portion of the first dialogue sample is concatenated with the determined round of dialogue corresponding to the second intent, which replaces the semantically misrecognized round in the first dialogue sample. The result is a reconstructed multi-turn dialogue in which all semantic recognition is correct; this is the test example. Multiple semantically correctly recognized rounds of dialogue corresponding to the second intent can be identified from the multiple second dialogue samples. These rounds correspond to the same second intent, but their specific sample text content differs. Concatenating each of them with the semantically correctly recognized portion of the first dialogue sample yields multiple different test examples.
[0058] For example, the first dialogue sample includes five rounds of dialogue. The semantic recognition of the first two rounds is correct, while the semantic recognition of the third round is incorrect. A preset path for the first intent corresponding to the first two rounds of dialogue can be determined in the flowchart. Multiple second intents located after and adjacent to the two first intents in the preset path are then identified. Multi-round dialogues corresponding to these multiple second intents are determined from the second dialogue sample. These multi-round dialogues can be determined from a single dialogue sample or from different dialogue samples. Concatenating the semantically correctly recognized parts of the first dialogue sample with the multi-round dialogues corresponding to the multiple second intents yields a reconstructed multi-round dialogue where all semantic recognition is correct, thus obtaining a test example. Based on the different content of each round of dialogue corresponding to each determined second intent, multiple test examples can be obtained by combining them.
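The reconstruction described in the two paragraphs above can be sketched as follows, for the single-replacement case. Rounds are modelled as `(text, intent, is_error)` tuples and the flowchart as a hypothetical `next_intents` mapping; both are assumptions made for illustration, not the application's data format.

```python
from typing import Dict, List, Tuple

Round = Tuple[str, str, bool]  # (text, intent label, recognition-error marker)


def reconstruct(
    first_sample: List[Round],
    next_intents: Dict[str, List[str]],  # hypothetical adjacency from the flowchart
    second_samples: List[List[Round]],
) -> List[dict]:
    """Replace the first misrecognized round with correct rounds from other samples."""
    prefix: List[Tuple[str, str]] = []
    for text, intent, is_error in first_sample:
        if is_error:
            break
        prefix.append((text, intent))
    if not prefix:
        return []
    # Second intents: those immediately following the last first intent.
    second_intents = next_intents.get(prefix[-1][1], [])
    examples = []
    for sample in second_samples:
        for text, intent, is_error in sample:
            if intent in second_intents and not is_error:
                turns = prefix + [(text, intent)]
                examples.append({
                    "turns": [t for t, _ in turns],
                    "path": [i for _, i in turns],  # path information of the example
                })
    return examples
```

Extending the result by further second intents along a longer preset path follows the same pattern applied repeatedly.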
[0059] According to one embodiment of this application, the method for generating the test set further includes: randomly determining the intent corresponding to at least one round of dialogue from each round of dialogue in the first dialogue sample as at least one first intent.
[0060] Specifically, the first dialogue sample contains multiple rounds of dialogue (at least two rounds), and these rounds are unrelated. The semantic recognition of each round of dialogue is correct. At this point, the intent corresponding to a particular round of dialogue can be randomly determined as the first intent. The first node containing the first intent can be determined in the flowchart, and a preset path passing through the first node can be determined. When the first node includes multiple intents, the preset path passing through the first node includes paths passing through each of the multiple intents. A second intent located after and immediately adjacent to the first intent in the preset path is determined. A round of dialogue corresponding to the second intent is determined from multiple second dialogue samples. There can be multiple rounds of dialogue corresponding to the same second intent, but the specific sample text content is different. Alternatively, multiple second intents located after and immediately adjacent to the first intent in the preset path can be determined. Multiple rounds of dialogue corresponding to these multiple second intents are determined from multiple second dialogue samples. These multiple rounds of dialogue can be determined from one dialogue sample or from different dialogue samples. By concatenating the dialogue portions corresponding to the first intent and the intents preceding the first intent in the first dialogue sample with the multi-turn dialogues corresponding to multiple second intents, a new multi-turn dialogue with correct semantic recognition can be obtained, i.e., a test example. Based on the different content of each turn of dialogue corresponding to each determined second intent, multiple test examples can be obtained through combination.
[0061] This means that multiple test examples can be obtained according to a preset path, and the number of dialogue rounds of the obtained test examples can be the same as or different from the original first dialogue sample. For example, the number of dialogue rounds of the test examples is equal to or greater than the number of rounds of the original first dialogue sample.
[0062] Multiple test examples can form a test set for testing the dialogue system (or speech recognition system). Each test example includes at least one round of dialogue and the path information of that at least one round of dialogue (test example) in the flowchart.
[0063] This application provides a method for generating a test set. By determining at least one preset path through a first intention that is semantically correctly identified in a first dialogue sample in a flowchart, determining a second intention following the first intention based on the preset path, and then determining one or more rounds of dialogue corresponding to the second intention from the second dialogue sample, and concatenating the dialogue parts corresponding to the first intention and the intentions preceding the first intention in the first dialogue sample with one or more rounds of dialogue, a reconstructed test example can be obtained. In this way, the semantically incorrect parts in the first dialogue sample can be replaced, and new test examples can be constructed based on existing dialogue samples, effectively increasing the number of test examples with correct semantic recognition.
[0064] When the first dialogue sample includes a single round of dialogue, this round should be semantically correct. Based on the first intent corresponding to this round of dialogue, the first node containing that first intent can be determined in the flowchart. The length of the preset path passing through the first node can be 1 or greater than 1. When the first node includes L intents, there can be L preset paths of length 1. Of course, the length of the preset path passing through the first node can be 2 or greater than 2, and the length of the longest preset path can be determined according to the flowchart. Therefore, the length of the preset path passing through the first node can be diverse, thus obtaining a large number of preset paths and consequently, a large number of test examples.
[0065] For example, based on a preset path of length 1, a single-turn dialogue corresponding to the intent contained in the first node can be determined from multiple second dialogue samples. Test examples can be generated based on this single-turn dialogue. In this way, multiple different test examples of single-turn dialogues can be generated, which can simulate real-world interaction scenarios, such as when the user only performs one round of interaction and then ends the human-computer interaction process.
[0066] Based on a preset path of length greater than 1, one or more second intents located after and adjacent to the first intent within the preset path can be identified. Then, a semantically correct round of dialogue corresponding to each second intent is determined from multiple second dialogue samples. The first dialogue sample is then concatenated with the one or more rounds of dialogue corresponding to the one or more second intents to generate a test example. For the process of determining a round of dialogue corresponding to each second intent, see the description of the above embodiments.
[0067] The first dialogue sample with correct semantic recognition in this embodiment can be used as a test example to constitute the test set.
[0068] According to one embodiment of this application, the at least one first intent includes multiple first intents. The method for generating the test set further includes: selecting the intents corresponding to N consecutive rounds of dialogue from the first dialogue sample as the multiple first intents. Generating at least one test example based on the at least one preset path includes: determining a second dialogue sample corresponding to each preset path in the at least one preset path; determining semantically correct sample text content corresponding to the multiple first intents from the second dialogue sample; and generating a test example corresponding to the preset path based on the sample text content and the N rounds of dialogue.
[0069] When the first dialogue sample is a multi-turn dialogue and the context of the multi-turn dialogue is related, the sample text content that can replace that turn of dialogue cannot be determined solely based on the intent of a single turn of dialogue. It is necessary to consider the meaning of the preceding and following contexts; otherwise, the reconstructed test example will have illogical context.
[0070] For example, the intents corresponding to N rounds of dialogue can be selected from the first dialogue sample as multiple consecutive first intents, and N can be set according to the needs of the scenario.
[0071] By determining the preset path through the multiple first intentions in the flowchart, a second dialogue sample can be determined from multiple dialogue samples other than the first dialogue sample. That is, the second dialogue sample includes multi-turn dialogues (sample text content) corresponding to multiple consecutive first intentions.
[0072] In some cases, the content in the first dialogue sample that follows multiple first intentions can be directly replaced by one or more rounds of dialogue in the second dialogue sample to obtain a reconstructed test example. Whether to use a single-round or multi-round dialogue to replace the content in the first dialogue sample that follows multiple first intentions can be determined based on the length of the preset path. The sample text content in the second dialogue sample that follows multiple first intentions, determined based on the preset path, can replace the content in the first dialogue sample that follows multiple first intentions.
[0073] This embodiment can construct test examples that conform to the semantic logic of the context for the first dialogue sample that is related to the context, thus expanding the test set for complex dialogue scenarios.
[0074] According to one embodiment of this application, generating a test example corresponding to a preset path based on the sample text content and the N rounds of dialogue includes: determining the similarity between the N rounds of dialogue and the sample text content; and, when the similarity is greater than or equal to a first preset threshold, concatenating the text content located after the sample text content in the second dialogue sample with the N rounds of dialogue to generate the test example.
[0075] In some cases, even if the second dialogue sample includes multiple rounds of dialogue (sample text content) corresponding to several consecutive first intentions, the text content corresponding to these first intentions in the first and second dialogue samples may differ too much to be directly replaced, otherwise illogical situations would occur. For example, the text content corresponding to multiple first intentions in the first dialogue sample might involve a scenario about buying a car, while the text content corresponding to multiple first intentions in the second dialogue sample might involve a scenario about buying shoes. Even if both dialogue samples are about inquiring about prices or after-sales service, they cannot be directly replaced because the objects are different. Therefore, the similarity between N rounds of dialogue in the first dialogue sample and N rounds of dialogue corresponding to multiple first intentions in the second dialogue sample can be calculated in advance. When the similarity meets a threshold, one or more rounds of dialogue (text content) following multiple first intentions in the second dialogue sample can replace the content following multiple first intentions in the first dialogue sample.
[0076] This embodiment determines the semantic similarity between the N rounds of dialogue in the first dialogue sample and the corresponding N rounds of dialogue in the second dialogue sample through similarity calculation. When the similarity meets the threshold, the one or more rounds of dialogue (text content) that follow the multiple first intentions in the second dialogue sample replace the content that follows the multiple first intentions in the first dialogue sample, yielding a test example that is closer to a real chat context.
[0077] Specifically, selecting the intents corresponding to N consecutive rounds of dialogue from the first dialogue sample as multiple first intents may include: selecting the intents corresponding to the N rounds of dialogue preceding the round of dialogue in the first dialogue sample that have intent recognition error markers as multiple first intents.
[0078] The N consecutive rounds of dialogue can be randomly selected from the first dialogue sample. For example, if the semantic recognition of each round of dialogue in the first dialogue sample is correct, then the N consecutive rounds of dialogue can be randomly selected from the first dialogue sample. A preset path corresponding to multiple first intentions through the N rounds of dialogue is determined in the flowchart. Based on the preset path, a second dialogue sample is determined from the remaining dialogue samples. The second dialogue sample includes multiple rounds of dialogue (sample text content) corresponding to multiple consecutive first intentions. The similarity between the N rounds of dialogue in the first dialogue sample and the N rounds of dialogue corresponding to multiple first intentions in the second dialogue sample is calculated. When the similarity meets a threshold, one or more rounds of dialogue in the second dialogue sample that follow multiple first intentions can replace the content in the first dialogue sample that follows multiple first intentions. This allows for the construction of new test examples and expansion of the test set.
[0079] Optionally, if the first dialogue sample contains semantically misidentified intents, then the intents corresponding to the N rounds of dialogue preceding the round with the intent misidentification marker in the first dialogue sample are taken as multiple first intents, and a preset path passing through multiple first intents is determined in the flowchart. Based on the preset path, a second dialogue sample is determined from the existing dialogue samples. The second dialogue sample includes multi-round dialogues (sample text content) corresponding to multiple consecutive first intents. The similarity between the N rounds of dialogue in the first dialogue sample and the N rounds of dialogue in the second dialogue sample corresponding to multiple first intents is calculated. When the similarity meets a threshold, one or more rounds of dialogue in the second dialogue sample following multiple first intents can replace the content in the first dialogue sample following multiple first intents. This corrects the semantically misidentified samples, constructs new test examples, and expands the test set.
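The two ways of choosing the N consecutive rounds described above (the rounds preceding an error-marked round, or a random window when every round is correct) can be sketched as follows. The turn layout (a dict with "intent" and "error" keys) and the function name select_n_rounds are assumptions made for illustration. Returning fewer than N rounds when the error occurs early is also an assumption, since the source does not specify that case.

```python
import random
from typing import List, Optional


def select_n_rounds(turns: List[dict], n: int,
                    rng: Optional[random.Random] = None) -> List[dict]:
    """Pick the N consecutive rounds whose intents serve as the first intents."""
    rng = rng or random.Random()
    # Index of the first round carrying an intent recognition error marker.
    error_idx = next((i for i, t in enumerate(turns) if t["error"]), None)
    if error_idx is not None:
        # Rounds immediately preceding the error-marked round.
        return turns[max(0, error_idx - n):error_idx]
    if len(turns) <= n:
        return list(turns)
    # Every round is correct: choose a random window of N consecutive rounds.
    start = rng.randrange(len(turns) - n + 1)
    return turns[start:start + n]
```

Passing a seeded random.Random instance makes the random selection reproducible, which may be convenient when regenerating a test set.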
[0080] According to an embodiment of this application, determining the similarity between N rounds of dialogue and sample text content includes: determining the first structured information of the dialogue content corresponding to the N rounds of dialogue, and determining the second structured information of the sample text content; determining the similarity based on the first structured information and the second structured information.
[0081] Specifically, the structured information can be a vector or matrix representing the dialogue content corresponding to N rounds of dialogue. For example, the dialogue content corresponding to N rounds of dialogue can be input into a neural network model, which can then extract features from the dialogue content to obtain the first structured information corresponding to the N rounds of dialogue. Similarly, the semantically correct sample text content corresponding to multiple first intentions in a second dialogue sample can be input into a neural network model, which can then extract features from the sample text content to obtain the second structured information corresponding to that sample text content.
[0082] The similarity between the first and second structured information can be calculated using, for example, Euclidean distance or cosine distance. The similarity characterizes the correlation between the N rounds of dialogue and the sample text content corresponding to the multiple first intents in the second dialogue sample.
[0083] The higher the similarity, the higher the correlation between the N rounds of dialogue and the text content of the sample. In this way, the test example obtained by splicing the dialogue content corresponding to multiple first intentions in the first dialogue sample and the dialogue content located after multiple first intentions in the second dialogue sample will be more reasonable. That is, the contextual information of the test example will be more coherent, fluent and logical. Therefore, the test example will be more in line with the real scene and more realistic.
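As an illustration of the similarity calculation, the sketch below assumes that the first and second structured information have already been encoded as equal-length numeric vectors. The function names are hypothetical, and mapping Euclidean distance to a similarity via 1 / (1 + d) is one common convention chosen here, not something specified by this application.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance d to a similarity in (0, 1] as 1 / (1 + d)."""
    return 1.0 / (1.0 + math.dist(a, b))
```

With either measure, a larger value indicates that the two dialogue segments are more closely related, so the result can be compared directly against the first preset threshold.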
[0084] According to an embodiment of this application, determining the first structured information of the dialogue content corresponding to the N rounds of dialogue includes: determining the token tag tree of the dialogue content corresponding to each round of dialogue in the N rounds of dialogue; and obtaining the first structured information based on the N token tag trees corresponding to the N rounds of dialogue.
[0085] Specifically, the dialogue content of each round of dialogue can be input into a neural network model, which processes it to obtain the corresponding token tag tree. Alternatively, only the text output by the user in each round of dialogue can be input into the neural network model to obtain the token tag tree corresponding to that text content. Each text content corresponds to a different intent tag, which can be machine-recognized.
[0086] Then, multiple token tag trees are combined based on a predefined general semantic structure tree to obtain a tree diagram, from which structured information can be obtained.
[0087] The following describes the process of generating a test set in an embodiment of this application, taking a scenario where contextual content is inherited as an example.
[0088] Figure 2 is a flowchart illustrating a method for generating a test set according to another exemplary embodiment of this application. The method of Figure 2 is an example of the method of Figure 1, and details already described are omitted here. As shown in Figure 2, the method for generating the test set includes the following:
[0089] S210, determine at least one first intent in the first dialogue sample that is semantically correctly identified.
[0090] S220, in the flowchart, determine at least one preset path that passes through at least one first intent.
[0091] S230, Select the intents corresponding to N consecutive rounds of dialogue from the first dialogue sample as multiple first intents.
[0092] S240, determine a second dialogue sample corresponding to each of the at least one preset path.
[0093] S250, determine the semantically correct sample text content corresponding to multiple first intentions from the second dialogue sample.
[0094] S260, determine the first structured information of the dialogue content corresponding to N rounds of dialogue, and determine the second structured information of the sample text content.
[0095] S270, determine the similarity based on the first structured information and the second structured information.
[0096] S280, determine whether the similarity is greater than or equal to the first preset threshold. If yes, proceed to S290; otherwise, proceed to S210.
[0097] S290, concatenate the text content following the sample text content in the second dialogue sample with the N rounds of dialogue to generate a test example.
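The control flow of steps S210 to S290 can be sketched as a loop. In this hedged sketch, steps S210 to S250 are assumed to be performed by caller-supplied functions, and all names (generate_test_set, find_candidates, encode, similarity, and the dict keys "n_rounds", "sample_text" and "continuation") are hypothetical. When the threshold check at S280 fails, the sketch simply moves on to the next candidate or first dialogue sample, which is one possible reading of returning to S210.

```python
from typing import Callable, Iterable, List


def generate_test_set(first_samples: Iterable[dict],
                      find_candidates: Callable[[dict], Iterable[dict]],
                      encode: Callable[[List[str]], object],
                      similarity: Callable[[object, object], float],
                      threshold: float) -> List[List[str]]:
    """Build test examples by splicing continuations of similar dialogues."""
    test_set: List[List[str]] = []
    for sample in first_samples:                       # S210, S230 (done upstream)
        first_info = encode(sample["n_rounds"])        # S260, first structured info
        for cand in find_candidates(sample):           # S220, S240, S250
            second_info = encode(cand["sample_text"])  # S260, second structured info
            score = similarity(first_info, second_info)            # S270
            if score >= threshold:                                 # S280
                test_set.append(sample["n_rounds"] + cand["continuation"])  # S290
            # Otherwise continue with the next candidate or first sample.
    return test_set
```

For instance, with a word-set encoder and Jaccard similarity, a car-purchase prefix would be spliced only with continuations taken from other car-purchase dialogues, and a shoe-purchase candidate would be rejected by the threshold.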
[0098] Figure 3 is a flowchart illustrating a testing method provided in an exemplary embodiment of this application. The method of Figure 3 can be executed by a computing device, which can be a server, a terminal device, or the like. As shown in Figure 3, the testing method includes the following:
[0099] S310, obtain a test example from the test set.
[0100] The test example includes at least one round of dialogue and corresponding path information. The path information is used to indicate the intent jump path of at least one round of dialogue in the flowchart. The flowchart includes multiple nodes, each of which corresponds to one round of dialogue, and each node includes at least one intent. The flowchart further includes the connection relationship between each node and intent.
[0101] S320, test the semantic understanding system using the at least one round of dialogue based on the path information to obtain test results.
[0102] Specifically, the test set can be generated using the test set generation method described in the above embodiments.
[0103] Test examples may include a dialogue round and the corresponding intent tag for that round. The position of the intent tag in the flowchart can serve as path information. By testing the semantic understanding system using this dialogue round, we can check whether the semantic understanding system accurately recognizes the intent of the text content output by the user in that round of dialogue. If the intent recognized by the semantic understanding system matches the intent tag, it means that the semantic understanding system accurately recognizes the intent; otherwise, it means that the semantic understanding system misrecognizes the intent.
[0104] Test examples can include multi-turn dialogues and the corresponding path information, which can correspond to multiple intents in the flowchart. Each intent can be located under a different node. Testing the semantic understanding system using these multi-turn dialogues can detect whether the semantic understanding system accurately recognizes the intent of each text content output by the user in the multi-turn dialogue.
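The per-round comparison between the recognized intent and the intent tag can be sketched as follows. The test example layout (parallel "turns" and "path" lists) and the recognize callable, which stands in for the semantic understanding system under test, are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def run_test_example(example: Dict[str, List[str]],
                     recognize: Callable[[str], str]) -> List[dict]:
    """Compare the recognized intent of each turn with the intent on the path."""
    results = []
    # zip pairs each turn with the intent at the same position on the path.
    for turn, expected in zip(example["turns"], example["path"]):
        predicted = recognize(turn)
        results.append({
            "turn": turn,
            "expected": expected,
            "predicted": predicted,
            "correct": predicted == expected,
        })
    return results
```

A single-turn test example is simply the case where both lists contain one element.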
[0105] Test results can include whether the semantic understanding system's intent recognition result for each round of dialogue in the test example matches the intent in the path information.
[0106] The flowchart can be configured as needed, allowing multiple intentions that are prone to entanglement to be placed in different nodes, thus avoiding the problem of intention entanglement. For example, a user's reply "Mmm" may correspond to the following intentions: agreeing to purchase, polite expression, and interjection. Placing these three intentions in different nodes can prevent the intention of actually being an interjection from being identified as agreeing to purchase.
[0107] This application provides a testing method that uses test examples containing path information to test a semantic understanding system, thereby obtaining test results for the semantic understanding system. This allows for an understanding of the current performance of the semantic understanding system and provides guidance for further improvements.
[0108] According to one embodiment of this application, testing a semantic understanding system based on path information using at least one round of dialogue to obtain test results includes: testing the semantic understanding system based on path information using at least one round of dialogue to determine the recognition accuracy of each intent in the flowchart.
[0109] If the intent identified by the semantic understanding system in a certain round of dialogue is inconsistent with the intent tag, it indicates that the semantic understanding system's intent recognition is inaccurate. Furthermore, when testing the semantic understanding system using multiple test examples, the accuracy rate of the semantic understanding system in recognizing each intent in the flowchart can be statistically analyzed. Alternatively, the accuracy rate of the semantic understanding system in recognizing all intents under a specific node in the flowchart can be statistically analyzed. This allows for a quick understanding of which intents or nodes the semantic understanding system performs poorly in recognizing, enabling targeted optimization and improvement of the semantic understanding system.
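The per-intent statistics can be sketched as a simple aggregation. The input is assumed to be a flat list of per-round results, each with an "expected" intent and a boolean "correct" flag. These key names and the function name intent_accuracy are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable


def intent_accuracy(results: Iterable[dict]) -> Dict[str, float]:
    """Fraction of rounds recognized correctly, grouped by expected intent."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for r in results:
        totals[r["expected"]] += 1
        if r["correct"]:
            hits[r["expected"]] += 1
    return {intent: hits[intent] / totals[intent] for intent in totals}
```

Grouping by node rather than by intent would only require a mapping from each intent to its node, used as the grouping key instead.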
[0110] Furthermore, the testing method also includes: optimizing the design of the intent part corresponding to the recognition accuracy in the semantic understanding system when the recognition accuracy is less than a second preset threshold.
[0111] Specifically, the second preset threshold can be set according to actual needs. When the recognition accuracy of a certain intent or node is less than the second preset threshold, the recognition part of the semantic understanding system for that intent or node is optimized. However, when the recognition accuracy of a certain intent or node is greater than or equal to the second preset threshold, the recognition part of the semantic understanding system for that intent or node is not adjusted or optimized.
[0112] Because different scenarios involve varying degrees of dialogue complexity, different second preset thresholds can be set for different scenarios. Alternatively, for the same scenario, some intents or nodes are more important, and the preset thresholds corresponding to these important intents or nodes can be set higher. That is, within the same flowchart, different intents or nodes may have different performance requirements, and different preset thresholds can be set for nodes or intents with different performance requirements. This allows for more flexible optimization of the semantic understanding system.
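Applying different second preset thresholds to different intents can be sketched as a lookup with a default. The function name intents_to_optimize and the overrides mapping are assumptions, and the threshold values in the usage note are illustrative only.

```python
from typing import Dict, List, Optional


def intents_to_optimize(accuracy: Dict[str, float],
                        default_threshold: float,
                        overrides: Optional[Dict[str, float]] = None) -> List[str]:
    """Return intents whose accuracy is below their (possibly custom) threshold."""
    overrides = overrides or {}
    return sorted(
        intent for intent, acc in accuracy.items()
        if acc < overrides.get(intent, default_threshold)
    )
```

For example, with a default threshold of 0.9 and a stricter threshold of 0.95 for a more important intent, that intent would be flagged at 0.92 accuracy even though 0.92 exceeds the default.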
[0113] According to one embodiment of this application, testing the semantic understanding system using at least one round of dialogue based on the path information to obtain test results includes: testing the semantic understanding system using the at least one round of dialogue based on the path information to determine high-frequency paths, nodes, or intents in the flowchart, and/or to determine paths, nodes, or intents with high conversion rates in the flowchart.
[0114] Specifically, by testing the semantic understanding system using multiple test examples, we can identify the high-frequency paths, nodes, or intents involved in these test examples. This allows us to focus on the recognition accuracy corresponding to each high-frequency path, node, or intent. Test results can include high-frequency paths, nodes, or intents. For example, if there are 100 test examples, and 80 of them correspond to the same path in the flowchart, then that path is a high-frequency path. Similarly, if the number of test examples corresponding to a certain path exceeds a threshold, that path can be identified as a high-frequency path. Likewise, high-frequency nodes or intents can be identified.
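Counting how many test examples follow each path can be sketched as follows. The function name high_frequency_paths and the count-based threshold min_count are assumptions. A ratio-based threshold would work equally well.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def high_frequency_paths(paths: Iterable[List[str]],
                         min_count: int) -> Dict[Tuple[str, ...], int]:
    """Return the paths followed by more than min_count test examples."""
    counts = Counter(tuple(p) for p in paths)
    return {path: c for path, c in counts.items() if c > min_count}
```

With the figures from the example above (100 test examples, 80 of them on the same path) and a threshold of 50, only that path is reported. Counting nodes or intents instead of whole paths follows the same pattern.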
[0115] Furthermore, test results can include high-conversion paths, nodes, or intents in the flowchart. In product sales scenarios, the purpose of the dialogue is generally to encourage users to purchase the product. Therefore, the conversion rate of the corresponding path information in a test example can be determined by whether the multi-turn dialogue in the test example includes the user's intent to purchase the product. For example, test examples in a product sales scenario can correspond to multiple types of path information. If there are 100 test examples corresponding to the first type of path information, and the user ultimately agrees to purchase the product in 80 of these 100 test examples, then the conversion rate corresponding to the first type of path information is 80%. Further, if the conversion rate corresponding to the second type of path information is 50%, then the path corresponding to the first type of path information can be considered a high-conversion path. Similarly, the conversion rates of each node or intent can be statistically analyzed to determine the nodes or intents with high conversion rates.
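The conversion-rate arithmetic in the example above can be sketched as follows. Each test example is assumed to carry a hypothetical "path_type" label identifying its type of path information and a boolean "purchased" flag indicating whether the dialogue contains the user's intent to purchase.

```python
from collections import Counter
from typing import Dict, Iterable


def conversion_rates(examples: Iterable[dict]) -> Dict[str, float]:
    """Share of test examples per path type in which the user agrees to buy."""
    totals: Counter = Counter()
    converted: Counter = Counter()
    for ex in examples:
        totals[ex["path_type"]] += 1
        if ex["purchased"]:
            converted[ex["path_type"]] += 1
    return {ptype: converted[ptype] / totals[ptype] for ptype in totals}
```

With 80 purchases among 100 examples of the first type and 50 among 100 of the second, the rates are 80/100 = 80% and 50/100 = 50%, so the first type would be treated as the high-conversion path.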
[0116] By identifying high-frequency paths, nodes, or intents in a flowchart, optimization directions can be provided for the semantic understanding system, facilitating targeted and efficient system optimization. Similarly, identifying high-conversion-rate paths, nodes, or intents in a flowchart can also provide optimization directions for the semantic understanding system. Furthermore, it can provide communication strategies for actual human-computer interaction, thereby enabling further updates to the semantic understanding system based on these communication strategies.
[0117] The following takes a marketing task scenario as an example and, with reference to Figures 6 to 8, describes in detail how to structure the user's dialogue text when contextual content is inherited.
[0118] Figure 6 is a schematic diagram of a user text structure based on token tags, provided in an exemplary embodiment of this application.
[0119] When contextual content is inherited, the content between adjacent turns of a multi-turn dialogue can be inherited, leading to intent entanglement. If the test dataset is constructed using the concatenation method described in the non-inheritance scenario, the test results will be significantly compromised due to intent entanglement. To address this, combinations of different segments can be filtered based on user tag matching (e.g., token tags), making the complete dialogue constructed from multiple dialogue segments closer to real-world dialogue data in terms of content inheritance and fluency.
[0120] The following describes the process of constructing a user text structure based on token tags.
[0121] First, the user's dialogue text (utterance) can be structured using token tags. As shown in Figure 6, the token tags corresponding to the user's dialogue text in each round can be extracted, and the extracted token tags can be combined into a token tag tree according to a predefined general semantic structure tree. Furthermore, the intent tags corresponding to the user's dialogue text in each round can be obtained through semantic judgment.
[0122] Then, the structured information of the multi-turn dialogue text in the current dialogue text is matched with the structured information of other dialogue texts. When the structured text information of the two dialogues is similar, it can be determined that the two are similar dialogue data and can be directly combined.
[0123] Because this construction method requires the information mentioned above, its basic unit of construction differs from that of the splicing method in the non-inheritance scenario described above: it relies on combining multiple interconnected dialogue segments. It should be understood that, generally speaking, the more dialogue turns a correct dialogue segment contains, the better the coherence and relevance of a complete dialogue constructed from multiple segments. Each constructed complete dialogue also corresponds to a dialogue flow path in the dialogue flowchart, and all dialogue flow paths can be covered through different combinations. The test set data combined in the embodiments of this application conforms to the fluency and rationality of real dialogue data.
[0124] Figure 7 is a schematic diagram of a token-based structured model provided in an exemplary embodiment of this application. Figure 8 is a flowchart illustrating a method for intent recognition based on the structured model, provided in an exemplary embodiment of this application.
[0125] Taking a marketing scenario as an example, firstly, experts can manually define the types of user intent to reject marketing in a marketing task scenario (L = [l1, l2, ..., ln]). For example, in the 4G to 5G upgrade service of China Telecom, the user rejection type labels include "busy," "with Wi-Fi," "sufficient data," "able to receive speed reduction," "forgot to cancel," and "unclear intent," etc. Then, intent recognition methods are used to identify the intent of the current user's dialogue content. User intent recognition may present many challenges, such as: multiple intents, negation intents, calculation intents, consultation intents, etc. Mainstream semantic representation methods based on BERT encoding cannot effectively capture the aforementioned complex intents. The embodiments of this application adopt an intent recognition method based on semantic structure parsing. This method mainly structures the current user's dialogue content based on token tags, then constructs a semantic tag graph using graph decoding methods, and finally identifies the user's intent through matching metrics.
[0126] The intent recognition method is described in detail below with reference to Figure 7 and Figure 8:
[0127] 810. Perform concept identification on the current user's dialogue content to obtain the main intent concept.
[0128] For example, the BERT encoder 710 can be used to encode the current user's dialogue content to obtain a semantic representation of the intent.
[0129] 820. Use a Conditional Random Field (CRF) decoder 720 to predict the start and end positions of interval-type token tags in the dialogue content, while using attention interaction to predict tag-type tokens. The token tags extracted by the token extractor 730 are also referred to as concept nodes in the constructed graph.
[0130] Next, through steps 830 to 870, an utterance-level semantic label graph is constructed based on the identified primary intent concepts. It should be understood that if there are multiple primary intent concepts, multiple trees need to be constructed.
[0131] 830. Initialization: initialize the tree/graph with the primary intent concept node.
[0132] In other words, the primary intent concept can be represented as the initial graph state. 840. Select a node t from the extracted concept nodes.
[0133] 850. In the partially constructed graph, select a parent node for t.
[0134] It should be understood that information from partially constructed subtrees can be incorporated when searching for a parent node.
[0135] 860. Update the graph and the graph state.
[0136] 870. Repeat steps 840 to 860 until the end node is selected.
[0137] 880. Obtain the intent of the user's dialogue content through matching.
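The incremental construction in steps 830 to 870 can be sketched as a greedy loop. This is a simplified stand-in for the graph decoding described above, not the model itself: the scoring function score(parent, node) is a hypothetical placeholder for the learned parent selection, and the loop ends when the extracted concept nodes are exhausted rather than when a dedicated end node is selected.

```python
from typing import Callable, Dict, List, Optional


def build_tag_tree(root: str, concepts: List[str],
                   score: Callable[[str, str], float]) -> Dict[str, Optional[str]]:
    """Attach each concept node to the best-scoring node already in the tree.

    Returns a child -> parent mapping; the root maps to None.
    """
    parent: Dict[str, Optional[str]] = {root: None}  # 830: initial graph state
    for t in concepts:                                # 840: select a node t
        # 850: choose a parent among nodes in the partially constructed graph.
        best = max(parent, key=lambda p: score(p, t))
        parent[t] = best                              # 860: update the graph
    return parent                                     # 870: no nodes remain
```

If there are several primary intent concepts, the function would be called once per concept to obtain one tree each.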
[0138] According to the embodiments of this application, the user's dialogue content in each round of dialogue can be input as the query into the token extractor 730, and an utterance-level semantic tag tree corresponding to the natural sentence structure under each intent in the matching resource can be constructed offline. In addition, the token tag tree and the query can be encoded by the graph analyzer (transformer) 740 and the BERT encoder 710, respectively, to obtain their respective semantic representations. Attention interaction is then performed through the attention layer 731 to obtain the final representation of the query.
[0139] Following the above method, utterance-level semantic tag trees corresponding to natural sentence structures under each intent in the matching resources of the knowledge base are constructed offline. The semantic tag trees are then encoded using the graph analyzer 740, and the encoded sentence representations are stored offline in a graph storage device 750. When a current query is received, its semantic tag graph and encoded semantic representation are first constructed in the same way. Finally, the intent of the current query is obtained through matching.
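The final matching step can be sketched as a nearest-neighbor lookup over the representations stored offline. The source does not specify the matching metric, so cosine similarity is used here as an assumption. The store layout (an intent label mapped to a list of encoded vectors) and the function names are likewise hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_intent(query_vec: Sequence[float],
                 store: Dict[str, List[Sequence[float]]]
                 ) -> Tuple[Optional[str], float]:
    """Return the intent whose stored representation is closest to the query."""
    best_intent, best_score = None, float("-inf")
    for intent, vectors in store.items():
        for vec in vectors:
            s = _cosine(query_vec, vec)
            if s > best_score:
                best_intent, best_score = intent, s
    return best_intent, best_score
```

Because the stored representations are computed offline, only the current query needs to be encoded at matching time.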
[0140] Exemplary device
[0141] The method embodiments of this application have been described in detail above with reference to Figures 1 to 8. The apparatus embodiments of this application are described in detail below with reference to Figures 9 to 11. Furthermore, it should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; therefore, any parts not described in detail can be found in the foregoing method embodiments.
[0142] Figure 9 is a schematic diagram of a test set generation apparatus provided in an embodiment of this application. As shown in Figure 9, the test set generation apparatus provided in this application embodiment includes a first determining module 910, a second determining module 920, and a generation module 930. Specifically, the first determining module 910 is used to determine at least one first intent with correct semantic recognition in a first dialogue sample. The second determining module 920 is used to determine at least one preset path passing through at least one first intent in a flowchart. The flowchart includes multiple nodes, each node corresponding to a round of dialogue, and each node includes at least one intent. The flowchart further includes the connection relationship between each node and intent. The generation module 930 is used to generate at least one test example based on at least one preset path. The at least one test example is used to constitute a test set, and each test example in the at least one test example includes path information corresponding to the test example.
[0143] This application provides a test set generation apparatus. By determining at least one preset path in the flowchart that passes through the first intention that is semantically correctly identified in the first dialogue sample, and generating test examples based on the preset path, the number of test examples can be effectively increased and the test set expanded.
[0144] In some embodiments, the generation module 930 is configured to determine a second intent located after and adjacent to at least one first intent in each of at least one preset path; determine semantically correct sample text content corresponding to the second intent from the second dialogue sample; and generate a test example corresponding to the preset path based on the sample text content and the portion in the first dialogue sample corresponding to at least one first intent.
[0145] In some embodiments, the first determining module 910 is further configured to take the intent corresponding to at least one round of dialogue prior to the first round of dialogue in the first dialogue sample that has an intent recognition error mark as at least one first intent.
[0146] In some embodiments, the first determining module 910 is further configured to randomly determine the intent corresponding to at least one round of dialogue from each round of dialogue in the first dialogue sample as at least one first intent.
[0147] In some embodiments, the at least one first intent includes multiple first intents. The first determining module 910 is further configured to select the intents corresponding to N consecutive rounds of dialogue from the first dialogue sample as the multiple first intents. In addition, the generation module 930 is further configured to: determine a second dialogue sample corresponding to each of the at least one preset path; determine semantically correct sample text content corresponding to the multiple first intents from the second dialogue sample; and generate test examples corresponding to the preset paths based on the sample text content and the N rounds of dialogue.
[0148] In some embodiments, the generation module 930 is configured to determine the similarity between the N rounds of dialogue and the sample text content and, when the similarity is greater than or equal to a first preset threshold, to concatenate the text content located after the sample text content in the second dialogue sample with the N rounds of dialogue to generate a test example.
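The threshold-and-concatenate step might be sketched as follows, purely as an illustration. The default similarity here is a character-level ratio from the Python standard library `difflib` module, used only as a stand-in. The function names, the `start` and `end` indices marking the sample text content, and the threshold value of 0.8 are all assumptions.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional


def text_similarity(a: List[str], b: List[str]) -> float:
    """Stand-in similarity: character-level ratio over the joined rounds."""
    return SequenceMatcher(None, " ".join(a), " ".join(b)).ratio()


def concat_if_similar(
    n_rounds: List[str],
    second_sample: List[str],
    start: int,
    end: int,
    threshold: float = 0.8,
    similarity: Callable[[List[str], List[str]], float] = text_similarity,
) -> Optional[List[str]]:
    """Append the tail of the second sample to the N rounds when similar enough.

    second_sample[start:end] is taken to be the sample text content that
    corresponds to the first intents.
    """
    sample_text = second_sample[start:end]
    if similarity(n_rounds, sample_text) >= threshold:
        return n_rounds + second_sample[end:]
    return None
```

Passing the similarity as a parameter allows a different measure, such as one based on structured information, to be substituted without changing the concatenation logic.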
[0149] In some embodiments, the generation module 930 is used to determine the first structured information of the dialogue content corresponding to N rounds of dialogue, and to determine the second structured information of the sample text content; and to determine the similarity based on the first structured information and the second structured information.
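This application does not fix a particular similarity measure over structured information. One possible choice, shown here only as an assumption, is the Jaccard overlap of the parent-child edges of two trees represented as nested dictionaries. The names `edge_set` and `structural_similarity` and the example tags are hypothetical.

```python
from typing import Dict, Set, Tuple

Tree = Dict[str, dict]  # {label: {child_label: {...}}}


def edge_set(tree: Tree) -> Set[Tuple[str, str]]:
    """Return the set of (parent, child) label pairs in a nested-dict tree."""
    edges: Set[Tuple[str, str]] = set()
    stack = [tree]
    while stack:
        node = stack.pop()
        for parent, children in node.items():
            for child in children:
                edges.add((parent, child))
            stack.append(children)
    return edges


def structural_similarity(first: Tree, second: Tree) -> float:
    """Jaccard overlap of the edge sets of the two structures."""
    a, b = edge_set(first), edge_set(second)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical structured information for two pieces of dialogue.
first_info = {"book_flight": {"destination": {"city": {}}, "date": {}}}
second_info = {"book_flight": {"destination": {"airport": {}}, "date": {}}}
```

For the two example structures, two of the four distinct edges are shared, so `structural_similarity(first_info, second_info)` evaluates to 0.5.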
[0150] In some embodiments, the generation module 930 is used to determine the token tag tree of the dialogue content corresponding to each round of dialogue in N rounds of dialogue; and to obtain the first structured information based on the N token tag trees corresponding to the N rounds of dialogue.
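A token tag tree of the kind referred to here, with the main intent as root and tags as child nodes, might be built as sketched below. The rule used to find each parent, namely the smallest earlier span that encloses the child, is an assumption, as are the `Node` class, the `build_tree` function, and the span tuple layout.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    start: int = 0
    end: int = 0
    children: List["Node"] = field(default_factory=list)


def build_tree(main_intent: str, spans: List[Tuple[int, int, str]], length: int) -> Node:
    """Build a tree rooted at the main intent from (start, end, label) spans.

    Assumption: a span's parent is the smallest other span that encloses it;
    spans enclosed by nothing attach directly to the root.
    """
    root = Node(main_intent, 0, length)
    # Sort so that enclosing spans are visited before the spans they contain.
    ordered = sorted(spans, key=lambda s: (s[0], -s[1]))
    nodes = [Node(label, start, end) for start, end, label in ordered]
    for i, node in enumerate(nodes):
        parent = root
        for cand in nodes[:i]:
            encloses = cand.start <= node.start and node.end <= cand.end
            if encloses and (cand.end - cand.start) <= (parent.end - parent.start):
                parent = cand
        parent.children.append(node)
    return root


def first_structured_info(rounds: List[Tuple[str, List[Tuple[int, int, str]], int]]) -> List[Node]:
    """One tree per round; the N trees together form the structured information."""
    return [build_tree(intent, spans, length) for intent, spans, length in rounds]
```

Here the first structured information is simply the list of N trees; other ways of combining the trees are equally possible.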
[0151] Figure 10 is a structural schematic of a testing device provided in an embodiment of this application. As shown in Figure 10, the testing device provided in this embodiment includes an acquisition module 1010 and a testing module 1020.
[0152] Specifically, the acquisition module 1010 is used to acquire test examples from the test set. The test examples include at least one round of dialogue and corresponding path information. The path information indicates the intent jump path of the at least one round of dialogue in the flowchart. The flowchart includes multiple nodes, each node corresponding to one round of dialogue, and each node includes at least one intent. The flowchart further includes the connection relationships between the nodes and intents. The testing module 1020 is used to test the semantic understanding system based on the path information using at least one round of dialogue to obtain test results.
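For illustration, a test example carrying path information, and a test run that uses it, might look like the sketch below. The `Example` class, the `run_example` function, and the `recognize` callable standing in for the semantic understanding system are assumptions and are not defined in this application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    rounds: List[str]  # at least one round of dialogue
    path: List[str]    # expected intent for each round (the path information)


def run_example(
    example: Example, recognize: Callable[[List[str], str], str]
) -> List[Dict]:
    """Feed each round to the system under test and compare against the path.

    `recognize(history, utterance)` is a hypothetical interface that returns
    the intent recognized for `utterance` given the earlier rounds.
    """
    results: List[Dict] = []
    history: List[str] = []
    for utterance, expected in zip(example.rounds, example.path):
        predicted = recognize(history, utterance)
        results.append(
            {"expected": expected, "predicted": predicted, "correct": predicted == expected}
        )
        history.append(utterance)
    return results
```

The per-round results retain both the expected and the recognized intent, which is sufficient for computing statistics over intents, nodes, or paths.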
[0153] This application provides a testing device that uses test examples containing path information to test a semantic understanding system, thereby obtaining test results for the semantic understanding system. This allows for an understanding of the current performance of the semantic understanding system and provides guidance for further improvements.
[0154] In some embodiments, the testing module 1020 is used to test the semantic understanding system based on path information using at least one round of dialogue to determine the recognition accuracy of each intent in the flowchart.
[0155] In some embodiments, the testing module 1020 is further configured to optimize the design of the intent portion corresponding to the recognition accuracy in the semantic understanding system when the recognition accuracy is less than a second preset threshold.
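The accuracy computation and threshold check might be sketched as follows. The result dictionaries with `expected` and `predicted` keys, the function names, and the example threshold are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def intent_accuracy(results: List[Dict]) -> Dict[str, float]:
    """Recognition accuracy per expected intent."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for r in results:
        totals[r["expected"]] += 1
        hits[r["expected"]] += int(r["expected"] == r["predicted"])
    return {intent: hits[intent] / totals[intent] for intent in totals}


def intents_to_optimize(accuracy: Dict[str, float], threshold: float) -> List[str]:
    """Intents whose accuracy is below the (second) preset threshold."""
    return sorted(intent for intent, acc in accuracy.items() if acc < threshold)


# Hypothetical test results for two intents.
RESULTS = [
    {"expected": "greet", "predicted": "greet"},
    {"expected": "greet", "predicted": "greet"},
    {"expected": "ask_balance", "predicted": "ask_balance"},
    {"expected": "ask_balance", "predicted": "ask_loan"},
]
```

With these results and a threshold of 0.8, only `ask_balance` (accuracy 0.5) would be flagged for optimization.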
[0156] In some embodiments, the testing module 1020 is used to test the semantic understanding system based on path information using at least one round of dialogue to determine high-frequency paths, nodes or intents in the flowchart, and / or to determine paths, nodes or intents with high conversion rates in the flowchart.
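As one possible reading of this embodiment, frequencies can be obtained by counting the paths and intents observed during testing. This application does not define the conversion rate; the sketch below assumes it is the fraction of dialogues passing through a given intent that also reach a designated goal intent. All names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def frequency(paths: List[List[str]]) -> Tuple[Counter, Counter]:
    """Count how often each full path and each intent occurs."""
    path_counts = Counter(tuple(p) for p in paths)
    # Count an intent once per dialogue, even if it appears twice in a path.
    intent_counts = Counter(intent for p in paths for intent in set(p))
    return path_counts, intent_counts


def conversion_rate(paths: List[List[str]], intent: str, goal: str) -> float:
    """Share of dialogues through `intent` that also reach `goal` (assumed definition)."""
    through = [p for p in paths if intent in p]
    if not through:
        return 0.0
    return sum(goal in p for p in through) / len(through)


# Hypothetical intent jump paths observed during testing.
OBSERVED = [
    ["greet", "ask_balance", "confirm"],
    ["greet", "ask_balance", "goodbye"],
    ["greet", "ask_loan", "confirm"],
]
```

Calling `most_common` on either counter lists the high-frequency paths or intents first.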
[0157] Figure 11 is a structural schematic of an electronic device provided in an embodiment of this application. The electronic device 1100 shown in Figure 11 (which may specifically be a computer device) includes a memory 1101, a processor 1102, a communication interface 1103, and a bus 1104. The memory 1101, processor 1102, and communication interface 1103 are interconnected via the bus 1104.
[0158] The memory 1101 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1101 may store programs. When a program stored in the memory 1101 is executed by the processor 1102, the processor 1102 and the communication interface 1103 are used to execute the steps of the test set generation method and / or test method of the embodiments of this application.
[0159] The processor 1102 may be a general-purpose central processing unit (CPU), microprocessor, application specific integrated circuit (ASIC), graphics processing unit (GPU), or one or more integrated circuits, used to execute relevant programs to achieve the functions required to be performed by the units in the test set generation apparatus and / or test apparatus of the embodiments of this application.
[0160] The processor 1102 can also be an integrated circuit chip with signal processing capabilities. During implementation, the steps of the test set generation method and / or test method of this application can be completed by integrated logic circuits in the hardware of the processor 1102 or by instructions in software form. The aforementioned processor 1102 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media that are mature in the art. The storage medium is located in the memory 1101. The processor 1102 reads the information in the memory 1101 and, in conjunction with its hardware, performs the functions required of the units included in the test set generation apparatus and / or test apparatus of the embodiments of this application, or executes the test set generation method and / or test method of the method embodiments of this application.
[0161] The communication interface 1103 uses a transceiver apparatus, such as but not limited to a transceiver, to enable communication between the electronic device 1100 and other devices or communication networks. For example, a first intent can be obtained through the communication interface 1103.
[0162] Bus 1104 may include a pathway for transmitting information between various components of electronic device 1100 (e.g., memory 1101, processor 1102, communication interface 1103).
[0163] It should be noted that although the electronic device 1100 shown in Figure 11 shows only the memory, processor, and communication interface, those skilled in the art should understand that in specific implementations the electronic device 1100 may also include other devices necessary for normal operation. Furthermore, depending on specific needs, the electronic device 1100 may also include hardware devices for implementing other additional functions. Moreover, the electronic device 1100 may include only the devices necessary for implementing the embodiments of this application, and need not include all the devices shown in Figure 11.
[0164] All of the above-mentioned optional technical solutions can be combined in any way to form optional embodiments of this application, and will not be described in detail here.
[0165] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0166] Those skilled in the art will understand that, for convenience and brevity, the specific working processes of the systems, devices, and units described above correspond to the processes in the foregoing method embodiments, to which reference may be made, and are not repeated here.
[0167] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0168] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0169] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0170] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes over the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0171] It should be noted that in the description of this application, the terms "first," "second," "third," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance. Furthermore, in the description of this application, unless otherwise stated, "a plurality of" means two or more.
[0172] The above description is merely a preferred embodiment of this application and is not intended to limit this application. Any modifications or equivalent substitutions made within the spirit and principles of this application should be included within the protection scope of this application.
Claims
1. A method for generating a test set, characterized by comprising:
determining at least one first intent that is semantically correctly identified in a first dialogue sample, wherein the at least one first intent includes a plurality of first intents, and the plurality of first intents are the intents corresponding to N consecutive rounds of dialogue in the first dialogue sample;
determining, in a flowchart, at least one preset path that passes through the at least one first intent, wherein the flowchart includes a plurality of nodes, each of the plurality of nodes corresponds to one round of dialogue, each node includes at least one intent, and the flowchart further includes the connection relationships between the nodes and the intents;
generating at least one test example based on the at least one preset path, wherein the at least one test example is used to constitute a test set, and each of the at least one test example includes path information corresponding to that test example;
wherein generating the at least one test example based on the at least one preset path comprises:
determining a second dialogue sample corresponding to each of the at least one preset path;
determining, from the second dialogue sample, semantically correct sample text content corresponding to the plurality of first intents;
when the similarity between the N rounds of dialogue and the sample text content is greater than or equal to a first preset threshold, concatenating the text content located after the sample text content in the second dialogue sample with the N rounds of dialogue to generate the test example;
wherein the similarity between the N rounds of dialogue and the sample text content is determined based on the similarity between the structured information of the N rounds of dialogue and that of the sample text content;
and wherein the structured information of the N rounds of dialogue is obtained through the following steps:
performing concept recognition on the N rounds of dialogue to obtain a main intent;
predicting the start and end positions of the interval-type token tags in the N rounds of dialogue, and their corresponding tag-type token tags;
taking the main intent as the root node of a token tag tree to be constructed, and taking the interval-type token tags and the tag-type token tags as child nodes;
for each child node, finding its parent node and connecting the child node to that parent node, to obtain the structured information of the N rounds of dialogue, wherein the parent nodes include the root node.
2. The method for generating a test set according to claim 1, characterized in that generating the at least one test example based on the at least one preset path comprises: determining, in each of the at least one preset path, a second intent that is located after and immediately adjacent to the at least one first intent; determining, from the second dialogue sample, semantically correct sample text content corresponding to the second intent; and generating a test example corresponding to the preset path based on the sample text content and the portion of the first dialogue sample that corresponds to the at least one first intent.
3. The method for generating a test set according to claim 1, characterized by further comprising: taking, as the at least one first intent, the intent corresponding to at least one round of dialogue that precedes the first round of dialogue bearing an intent recognition error mark in the first dialogue sample.
4. The method for generating a test set according to claim 1, characterized by further comprising: randomly selecting, from the rounds of dialogue in the first dialogue sample, the intent corresponding to at least one round of dialogue as the at least one first intent.
5. The method for generating a test set according to claim 1, characterized in that the structured information of the N rounds of dialogue and of the sample text content includes first structured information of the dialogue content corresponding to the N rounds of dialogue and second structured information of the sample text content; and determining the similarity between the N rounds of dialogue and the sample text content comprises: determining the first structured information of the dialogue content corresponding to the N rounds of dialogue, and determining the second structured information of the sample text content; and determining the similarity based on the first structured information and the second structured information.
6. The method for generating a test set according to claim 5, characterized in that determining the first structured information of the dialogue content corresponding to the N rounds of dialogue comprises: determining the token tag tree of the dialogue content corresponding to each of the N rounds of dialogue; and obtaining the first structured information based on the N token tag trees corresponding to the N rounds of dialogue.
7. A testing method, characterized by comprising:
obtaining a test example from a test set, wherein the test example includes at least one round of dialogue and corresponding path information, the path information indicates the intent jump path of the at least one round of dialogue in a flowchart, the flowchart includes a plurality of nodes, each of the plurality of nodes corresponds to one round of dialogue and includes at least one intent, the flowchart further includes the connection relationships between the nodes and the intents, and the test example is obtained using the method of any one of claims 1 to 6;
testing a semantic understanding system based on the path information using the at least one round of dialogue, to obtain a test result.
8. The test method according to claim 7, characterized in that testing the semantic understanding system based on the path information using the at least one round of dialogue to obtain a test result comprises: testing the semantic understanding system based on the path information using the at least one round of dialogue to determine the recognition accuracy of each intent in the flowchart.
9. The test method according to claim 8, characterized by further comprising: when the recognition accuracy is less than a second preset threshold, optimizing the design of the intent portion of the semantic understanding system that corresponds to the recognition accuracy.
10. The test method according to claim 8, characterized in that testing the semantic understanding system based on the path information using the at least one round of dialogue to obtain a test result comprises: testing the semantic understanding system based on the path information using the at least one round of dialogue to determine high-frequency paths, nodes, or intents in the flowchart, and / or to determine paths, nodes, or intents with high conversion rates in the flowchart.
11. A test set generation apparatus, characterized by comprising:
a first determining module, configured to determine at least one first intent that is semantically correctly identified in a first dialogue sample, wherein the at least one first intent includes a plurality of first intents, and the plurality of first intents are the intents corresponding to N consecutive rounds of dialogue in the first dialogue sample;
a second determining module, configured to determine, in a flowchart, at least one preset path that passes through the at least one first intent, wherein the flowchart includes a plurality of nodes, each of the plurality of nodes corresponds to one round of dialogue, each node includes at least one intent, and the flowchart further includes the connection relationships between the nodes and the intents;
a generation module, configured to generate at least one test example based on the at least one preset path, wherein the at least one test example is used to constitute a test set, and each of the at least one test example includes path information corresponding to that test example;
wherein generating the at least one test example based on the at least one preset path comprises:
determining a second dialogue sample corresponding to each of the at least one preset path;
determining, from the second dialogue sample, semantically correct sample text content corresponding to the plurality of first intents;
when the similarity between the N rounds of dialogue and the sample text content is greater than or equal to a first preset threshold, concatenating the text content located after the sample text content in the second dialogue sample with the N rounds of dialogue to generate the test example;
wherein the similarity between the N rounds of dialogue and the sample text content is determined based on the similarity between the structured information of the N rounds of dialogue and that of the sample text content;
and wherein the structured information of the N rounds of dialogue is obtained through the following steps:
performing concept recognition on the N rounds of dialogue to obtain a main intent;
predicting the start and end positions of the interval-type token tags in the N rounds of dialogue, and their corresponding tag-type token tags;
taking the main intent as the root node of a token tag tree to be constructed, and taking the interval-type token tags and the tag-type token tags as child nodes;
for each child node, finding its parent node and connecting the child node to that parent node, to obtain the structured information of the N rounds of dialogue, wherein the parent nodes include the root node.
12. A testing device, characterized by comprising:
an acquisition module, configured to acquire a test example from a test set, wherein the test example includes at least one round of dialogue and corresponding path information, the path information indicates the intent jump path of the at least one round of dialogue in a flowchart, the flowchart includes a plurality of nodes, each of the plurality of nodes corresponds to one round of dialogue, each node includes at least one intent, the flowchart further includes the connection relationships between the nodes and the intents, and the test example is obtained using the method of any one of claims 1 to 6;
a testing module, configured to test a semantic understanding system based on the path information using the at least one round of dialogue, to obtain a test result.
13. An electronic device, characterized by comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the test set generation method of any one of claims 1 to 6 or the test method of any one of claims 7 to 10.
14. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the test set generation method of any one of claims 1 to 6 or the test method of any one of claims 7 to 10.