Intention feature extraction model training method, intention recognition method, and related device

By constructing an intent knowledge graph and a Siamese network model, the method addresses the poor intent classification accuracy caused by insufficient or low-quality labeled data, enabling efficient intent recognition and rapid construction of an intent vector space, and thereby improving the user experience in intelligent customer service scenarios.

CN116150353B · Active · Publication Date: 2026-06-12 · MASHANG CONSUMER FINANCE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
MASHANG CONSUMER FINANCE CO LTD
Filing Date
2022-08-01
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In existing technologies, intent recognition models based on machine learning or deep learning are prone to poor intent classification accuracy when there is insufficient labeled data or poor data quality.

Method used

By constructing an intent knowledge graph, keywords and relationships are extracted from the corpus to generate a training sample set. Then, a Siamese network model is used to extract intent features and generate an intent vector space to achieve intent classification.

Benefits of technology

It improves the accuracy and efficiency of intent recognition, reduces reliance on manually labeled data, simplifies the semantic parsing process, and enhances recognition speed and accuracy.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides an intention feature extraction model training method, an intention recognition method and related devices. The method comprises: obtaining an initial corpus set and constructing an intention knowledge graph according to the initial corpus set; classifying each path in the intention knowledge graph according to the corresponding intention information of each path, obtaining a plurality of intention categories and at least one path corresponding to each intention category; generating a training sample set according to the plurality of intention categories and the at least one path corresponding to each intention category, and iteratively training a to-be-trained intention feature extraction model using the training sample set to obtain a trained intention feature extraction model; inputting a plurality of initial corpora in the initial corpus set to the trained intention feature extraction model to obtain first intention vectors corresponding to the plurality of initial corpora, and an intention vector space composed of the first intention vectors corresponding to the plurality of initial corpora. The application is beneficial to improving the accuracy of intention recognition.

Description

Technical Field

[0001] This application relates to the field of knowledge graphs, and in particular to an intent feature extraction model training method, an intent recognition method, and related apparatus.

Background Technology

[0002] In intelligent customer service scenarios, intent recognition can effectively analyze users' core needs. Accurate intent recognition can significantly improve search accuracy and dialogue intelligence, thereby enhancing the user experience in intelligent customer service scenarios.

[0003] In existing technologies, mainstream intent recognition typically involves transforming the user's question into a text classification task to identify the user's intent. The classifier for this text classification task is usually trained using machine learning or deep learning models.

[0004] However, classification models based on machine learning or deep learning usually require a large amount of labeled data for training. An insufficient amount of labeled data or poor data quality can easily lead to poor classification accuracy of the intent classifier.

Summary of the Invention

[0005] This application provides an intent feature extraction model training method, an intent recognition method, and related apparatus to solve the problem that an insufficient amount of labeled data or poor data quality easily leads to poor intent classification accuracy of the classification model.

[0006] In a first aspect, this application provides an intent feature extraction model training method, comprising:

[0007] An initial corpus set is obtained, and an intent knowledge graph is constructed based on the initial corpus set. Each node in the intent knowledge graph is a keyword of each initial corpus in the initial corpus set, each edge in the intent knowledge graph is used to indicate the association between two adjacent nodes, and each path composed of nodes and edges is used to indicate the intent information of each initial corpus.

[0008] Based on each path in the intent knowledge graph and its corresponding intent information, each path is classified to obtain multiple intent categories and at least one path corresponding to each intent category.

[0009] Based on the multiple intent categories and at least one path corresponding to each intent category, a training sample set is generated, and the intent feature extraction model to be trained is iteratively trained using the training sample set to obtain the trained intent feature extraction model.

[0010] Multiple initial corpora from the initial corpus set are input into the trained intent feature extraction model to obtain first intent vectors corresponding to the multiple initial corpora, and an intent vector space composed of the first intent vectors corresponding to the multiple initial corpora.

[0011] In a second aspect, this application provides an intent recognition method, comprising:

[0012] The statement to be recognized is input into the trained intent feature extraction model in the first aspect and any possible design of the first aspect to obtain the first intent vector of the statement to be recognized. The statement to be recognized is a statement that includes user intent information. The intent feature extraction model is used to extract features from the statement to be recognized to obtain the first intent vector. The first intent vector is used to determine the intent of the statement to be recognized.

[0013] The first intent vector is projected onto the intent vector space in the first aspect and any possible design of the first aspect, and the matching degree between the first intent vector and each second intent vector in the intent vector space is calculated. The intent vector space includes a plurality of second intent vectors, each second intent vector corresponds to an intent category, and each intent category corresponds to an intent keyword.

[0014] The intent keyword of the intent category corresponding to the second intent vector with the highest matching degree is used as the recognition result, and the recognition result is used to indicate the intent of the statement to be recognized.

[0015] In a third aspect, this application provides an intent feature extraction model training device, comprising:

[0016] The knowledge graph construction module is used to obtain an initial corpus set and construct an intent knowledge graph based on the initial corpus set. Each node in the intent knowledge graph is a keyword of each initial corpus in the initial corpus set, each edge in the intent knowledge graph is used to indicate the association between two adjacent nodes, and each path composed of nodes and edges is used to indicate the intent information of each initial corpus.

[0017] The intent classification module is used to classify each path according to each path in the intent knowledge graph and its corresponding intent information, so as to obtain multiple intent categories and at least one path corresponding to each intent category.

[0018] The model training module is used to generate a training sample set based on the multiple intent categories and at least one path corresponding to each intent category, and to use the training sample set to iteratively train the intent feature extraction model to be trained, so as to obtain the trained intent feature extraction model.

[0019] The space construction module is used to input multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain the first intent vectors corresponding to the multiple initial corpora, and the intent vector space composed of the first intent vectors corresponding to the multiple initial corpora.

[0020] In a fourth aspect, this application provides an intent recognition device, comprising:

[0021] The feature extraction module is used to input the statement to be recognized into the trained intent feature extraction model in the first aspect and any possible design of the first aspect to obtain the first intent vector of the statement to be recognized. The statement to be recognized is a statement that includes user intent information. The intent feature extraction model is used to extract features from the statement to be recognized to obtain the first intent vector. The first intent vector is used to determine the intent of the statement to be recognized.

[0022] An intent recognition module is configured to project the first intent vector onto the intent vector space in the first aspect and any possible design of the first aspect, and calculate the matching degree between the first intent vector and each second intent vector in the intent vector space, wherein the intent vector space includes multiple second intent vectors, each second intent vector corresponds to an intent category, and each intent category corresponds to an intent keyword; the intent keyword of the intent category corresponding to the second intent vector with the highest matching degree is taken as the recognition result, and the recognition result is used to indicate the intent of the statement to be recognized.

[0023] In a fifth aspect, this application provides a server, including: a memory and a processor;

[0024] The memory is used to store computer programs; the processor is used to execute the intent feature extraction model training method in the first aspect, or the intent recognition method in the second aspect, according to the computer programs stored in the memory.

[0025] In a sixth aspect, this application provides a computer-readable storage medium storing a computer program, which, when executed by at least one processor of a server, enables the server to execute either the intent feature extraction model training method of the first aspect or the intent recognition method of the second aspect.

[0026] In a seventh aspect, this application provides a computer program product comprising a computer program that, when at least one processor of a server executes the computer program, executes either the intent feature extraction model training method of the first aspect or the intent recognition method of the second aspect.

[0027] In the intent feature extraction model training method, intent recognition method, and related apparatus provided in this application, initial corpora are collected from a corpus, and keywords and the relationships between them are extracted from each initial corpus. The intent knowledge graph is constructed from all initial corpora in the initial corpus set, with the keywords as nodes and with edges added between related keywords to connect their corresponding nodes. The paths in the intent knowledge graph are classified to obtain multiple intent categories, each indicating one type of intent, and the paths included in each intent category are determined. A training sample set is generated based on the paths of each intent category, and the intent feature extraction model to be trained is iteratively trained using the training sample set to obtain a trained intent feature extraction model. The trained model extracts features from the initial corpora to obtain a first intent vector corresponding to each initial corpus, and an intent vector space is constructed from these first intent vectors. Because this application constructs training sample data from an intent knowledge graph covering multiple intent categories, training sample data is generated automatically, which speeds up both the training of the intent feature extraction model and the construction of the intent vector space. Furthermore, when the initial corpus set grows, the intent feature extraction model and the intent vector space can be optimized quickly, thus improving recognition accuracy.

Description of the Drawings

[0028] To more clearly illustrate the technical solutions in this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0029] Figure 1 is a system flowchart of an intent recognition method provided in an embodiment of this application;

[0030] Figure 2 is a flowchart of an intent recognition method provided in an embodiment of this application;

[0031] Figure 3 is a schematic diagram of the structure of an intent knowledge graph provided in an embodiment of this application;

[0032] Figure 4 is a flowchart of model training provided in an embodiment of this application;

[0033] Figure 5 is a flowchart of an intent recognition method provided in an embodiment of this application;

[0034] Figure 6 is a flowchart of a prediction process provided in an embodiment of this application;

[0035] Figure 7 is a schematic diagram of the structure of an intent knowledge graph provided in an embodiment of this application;

[0036] Figure 8 is a schematic diagram of the structure of an intent feature extraction model training device provided in an embodiment of this application;

[0037] Figure 9 is a schematic diagram of the structure of an intent recognition device provided in an embodiment of this application;

[0038] Figure 10 is a schematic diagram of the hardware structure of a server provided in an embodiment of this application.

Detailed Implementation

[0039] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0040] The terms "first," "second," "third," "fourth," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be used interchangeably where appropriate. For example, without departing from the scope of this document, first information can also be referred to as second information, and similarly, second information can also be referred to as first information.

[0041] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," or "in response to determining."

[0042] Furthermore, as used herein, the singular forms “a,” “one,” and “the” are intended to also include the plural forms, unless the context indicates otherwise.

[0043] It should be further understood that the terms “comprising” or “including” indicate the presence of features, steps, operations, elements, components, items, kinds, and / or groups, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, kinds, and / or groups.

[0044] The terms “or” and “and / or” as used herein are interpreted as inclusive, or mean any one or any combination thereof. Therefore, “A, B, or C” or “A, B, and / or C” means “any one of the following: A; B; C; A and B; A and C; B and C; A, B, and C”. Exceptions to this definition occur only when combinations of elements, functions, steps, or operations are inherently mutually exclusive in some way.

[0045] In intelligent customer service scenarios, the Dialogue Management (DM) module first needs to determine the user's core needs through intent recognition. For example, a user in an intelligent customer service scenario might have needs such as checking the weather, checking express delivery, or checking flight tickets. When the DM module identifies different needs from the user's questions, the answers returned to the user after parsing will also differ. Incorrect identification of user needs may lead to the DM module returning incorrect answers to the user, resulting in a poor user experience. Therefore, in human-computer dialogue, the accuracy of intent recognition significantly affects the accuracy and intelligence of the response. Intent recognition is a challenging task, as different industries or scenarios typically have different topic categories and different intent categories. For example, in consumer finance scenarios, dialogue content can be categorized into intents such as "inquiry about processing channels," "inquiry about interest rates," and "inquiry about fee standards." The DM module can analyze and process user questions using text classification methods to determine the user's intent. The determination of this intent will have a particularly important impact on the direction of subsequent rounds of dialogue. Therefore, higher intent recognition accuracy can improve the customer experience of many downstream Natural Language Processing (NLP) applications. Intelligent customer service scenarios are a common example of downstream natural language processing applications.

[0046] In existing technologies, servers can use regular expressions to parse user questions and then determine the user's intent based on the parsing results and the intent knowledge graph. However, regular expressions perform strict matching, while user questions are often colloquial, so the matching results are poor. Alternatively, existing technologies can use node mask vectors to train each intent recognition node, with all nodes sharing the same model. While this reduces the number of models deployed, it does not guarantee the accuracy of intent recognition at each node. It also makes every node depend on a single model service; if that service malfunctions, the entire recognition system becomes unusable. Furthermore, current mainstream intent recognition typically transforms user questions into text classification tasks to identify user intent. In this process, the server can use traditional machine learning or deep learning methods to train an intent recognition classifier. The server then converts user input into the data format specified by the classifier and uses the classifier to predict the intent. However, training machine learning or deep learning models usually requires a large amount of labeled data, consisting of the data itself and its labels. To ensure the accuracy of the labels, labeling typically requires manual work or proofreading. This labeling process is very labor-intensive and yields inconsistent quality. Moreover, models trained using machine learning or deep learning often exhibit poor intent classification accuracy when only small datasets are available.

[0047] To address the aforementioned issues and problems in existing technologies, this application proposes an intent recognition method based on a Siamese network model. Considering that the paths in an intent knowledge graph contain a large amount of node and intent information, the server can convert the path information of the intent knowledge graph into a first intent vector using an intent feature extraction model. The server can also map all paths in the graph to a vector space using the intent feature extraction model to obtain a first intent vector for each path. Furthermore, the server can calculate a second intent vector corresponding to each intent category based on the average value of all first intent vectors in each intent category. After acquiring the statement to be recognized, the server can obtain the first intent vector through encoding and the intent feature extraction model. The server can also determine the second intent vector corresponding to the first intent vector by matching it with each second intent vector. The intent keyword corresponding to the second intent vector is the recognition result of this intent recognition method. The intent feature extraction model can be a Siamese network model.
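The averaging step described above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation of this application: the function name `build_intent_vector_space`, the toy two-dimensional vectors, and the dictionary layout are all assumptions made for the example. In practice the first intent vectors would come from the trained intent feature extraction model.

```python
from collections import defaultdict


def build_intent_vector_space(first_vectors, categories):
    """Average the first intent vectors of each intent category.

    first_vectors: list of equal-length lists of floats (one per path/corpus).
    categories:    list of intent category names, aligned with first_vectors.
    Returns a dict mapping each category to its second intent vector.
    """
    grouped = defaultdict(list)
    for vector, category in zip(first_vectors, categories):
        grouped[category].append(vector)

    space = {}
    for category, vectors in grouped.items():
        dim = len(vectors[0])
        # Component-wise mean over all first intent vectors in the category.
        space[category] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return space


# Hypothetical toy data: three first intent vectors in two categories.
first_vectors = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
categories = ["Transfer", "Transfer", "Processing"]
intent_space = build_intent_vector_space(first_vectors, categories)
```

Here `intent_space["Transfer"]` is the mean of the first two vectors, and `intent_space["Processing"]` equals the single vector in that category.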

[0048] In this application, multiple subgraph paths formed by connecting nodes (i.e., semantic information) fused within the intent knowledge graph are used as input. Through continuous expansion of the knowledge graph, the input data for the intent feature extraction model is synchronously increased, enabling rapid and efficient enrichment of the knowledge graph and iterative optimization of the intent feature extraction model. This application overcomes the limitation of relying on a large amount of manually labeled data for the intent classifier during intent recognition, improving data acquisition efficiency and avoiding the problems present in existing machine learning or deep learning-based classifiers. In this application, for all user statements to be recognized, the server can convert all statements into a first intent vector through the intent feature extraction model and complete matching by calculating similarity, offering advantages such as simple computation, fewer comparisons, and fast inference speed.
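The similarity-based matching mentioned above can be illustrated with a short sketch. The text here does not name a specific similarity measure, so cosine similarity is used purely as an assumption; the function names `cosine_similarity` and `recognize_intent` and the toy three-dimensional vectors are likewise hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def recognize_intent(first_vector, intent_vector_space):
    """Return the intent keyword whose second intent vector matches best.

    intent_vector_space maps intent keyword -> second intent vector.
    """
    best_keyword, best_score = None, float("-inf")
    for keyword, second_vector in intent_vector_space.items():
        score = cosine_similarity(first_vector, second_vector)
        if score > best_score:
            best_keyword, best_score = keyword, score
    return best_keyword, best_score


# Toy vectors; real ones would be produced by the trained model.
space = {
    "inquiry about interest rates": [1.0, 0.0, 0.0],
    "inquiry about fee standards": [0.0, 1.0, 0.0],
}
keyword, score = recognize_intent([0.9, 0.1, 0.0], space)
```

Only one comparison per intent category is needed, which is consistent with the "fewer comparisons" advantage described above.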

[0049] This application proposes an intent recognition method in which, once the intent knowledge graph has been fully constructed, all subgraph paths in the intent knowledge graph are divided into multiple intent categories according to intent. Training data for the intent feature extraction model is then constructed based on the different intent categories. This application addresses the problem that, with traditional knowledge graphs, the server cannot use the entity information extracted from the statement to be recognized to understand intent across all matched candidate paths. Therefore, after constructing the intent knowledge graph based on the collected initial corpus, this application divides the initial corpus into multiple intent categories according to different intents. The server also maps the initial corpus in each intent category using a Siamese network to obtain its corresponding first intent vector. The server can obtain a second intent vector for each intent category by calculating the average of the first intent vectors in that category. The server can then match these second intent vectors with the first intent vector corresponding to the user's statement to be recognized to obtain the recognition result. Furthermore, this application also addresses the problem that, in traditional intent recognition, semantic parsing requires a large number of templates and rules, which are then learned by machine learning or deep learning models and used as a separate channel for intent parsing. In this application, the server does not require a large number of templates for semantic extraction. During the semantic parsing phase, the server can collect, per intent category, the multiple semantic information items contained in each statement to be recognized under that category. The server can then convert these semantic information items into nodes and place them into a graph. As the data continuously iterates, the graph will be continuously enriched and will feed back into the semantic parsing part; the two deeply interact and complement each other.

[0050] The technical solutions of this application will be described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.

[0051] Figure 1 is a schematic diagram of the system flow of an intent recognition method provided in an embodiment of this application. Figure 1 shows an example of such an intent recognition system. This intent recognition system can be an application or interface running on a server. When applications such as intelligent customer service need to call the intent recognition system, they can do so by calling the application or interface of the intent recognition system to perform predictions. The intent recognition system can include two parts: model training and inference prediction.

[0052] The model training component comprises three parts: knowledge graph construction, intent feature extraction model training, and intent vector space output. In the knowledge graph construction phase, after acquiring an initial corpus, the server extracts entities from the initial corpus based on a preset keyword set and determines the relationships between entities in each initial corpus. These entities are keywords, and the relationships are the associations between keywords. The server constructs the knowledge graph based on the entities and relationships in the initial corpus. During this phase, the server can also cluster the initial corpus into multiple intent categories. In the intent feature extraction model training phase, the server generates a training dataset based on the intent categories and paths in the knowledge graph. The server uses this training dataset to train the intent feature extraction model, which extracts the first intent vector for each statement to be recognized. This intent feature extraction model is a Siamese network model. In the intent vector space output phase, the server uses the trained intent feature extraction model to map multiple initial corpora within each intent category, obtaining the first intent vector corresponding to each initial corpus. The server can also determine a second intent vector for each intent category based on the average of the first intent vectors from multiple initial corpora within each intent category. These second intent vectors from intent categories constitute the intent vector space.
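One way a training dataset could be derived from intent categories and paths is sketched below. The text above does not specify how samples are formed, so this is an assumption: since Siamese networks are commonly trained on pairs, the sketch labels a pair 1 when both paths share an intent category and 0 otherwise. The function name `make_pair_samples` and the example paths are hypothetical.

```python
from itertools import combinations


def make_pair_samples(paths_by_category):
    """Build (path_a, path_b, label) samples from categorized paths.

    label is 1 when both paths belong to the same intent category and
    0 otherwise (an assumed pairing scheme, not taken from the source).
    """
    labeled = [
        (path, category)
        for category, paths in paths_by_category.items()
        for path in paths
    ]
    samples = []
    for (path_a, cat_a), (path_b, cat_b) in combinations(labeled, 2):
        samples.append((path_a, path_b, 1 if cat_a == cat_b else 0))
    return samples


# Hypothetical paths, written as arrow-joined keyword sequences.
paths_by_category = {
    "Transfer": [
        "Shenzhen → Transfer → Collective → Conditions",
        "Shenzhen → Transfer → Procedures",
    ],
    "Processing": ["household registration → Shenzhen → apply"],
}
samples = make_pair_samples(paths_by_category)
```

Because the pairs are generated from the graph rather than labeled by hand, the sample set grows automatically as new paths are added to a category.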

[0053] The inference and prediction section includes an intent recognition module and an intent result output module. This inference and prediction section is typically invoked by applications such as intelligent customer service systems. These applications input the statement to be recognized into the intent recognition module on the server. The server uses this module to determine the recognition result corresponding to the statement. The server then uses the recognition result output module to feed the recognition result back to the applications for further processing.

[0054] The server can periodically update the initial corpus set and use the updated corpus set to optimize the intent feature extraction model and intent vector space. The trained intent feature extraction model and intent vector space will then be applied to the intent recognition module.

[0055] In this application, a server is used as the execution entity to perform the intent recognition method of the following embodiments. Specifically, the execution entity can be a hardware device of the server, a software application implementing the following embodiments in the server, a computer-readable storage medium installed with the software application implementing the following embodiments, or code implementing the software application.

[0056] Figure 2 is a flowchart of an intent recognition method provided in an embodiment of this application. Building on the embodiment shown in Figure 1, and as shown in Figure 2, with the server as the execution entity, the method in this embodiment may include the following steps:

[0057] S101. Obtain the initial corpus set and construct an intent knowledge graph based on the initial corpus set. Each node in the intent knowledge graph is a keyword of each initial corpus in the initial corpus set. Each edge in the intent knowledge graph is used to indicate the association between two adjacent nodes. Each path composed of nodes and edges is used to indicate the intent information of each initial corpus.

[0058] In this embodiment, the server can collect initial data from a corpus to form an initial data set. The corpus may include initial data obtained from auxiliary robot devices such as intelligent customer service systems. For example, in an intelligent customer service window for household registration management, the initial data collected by the server may include: "What are the conditions for transferring a Shenzhen household registration?", "What are the procedures for transferring a Shenzhen household registration within the city?", "What are the procedures for moving from Shenzhen out of Zhongshan?", "What are the conditions for moving from Shenzhen to Beijing?", "What are the conditions for moving from Beijing to Shenzhen?", "What are the procedures for transferring within Shenzhen?", "What are the materials required for transferring within Shenzhen?", etc.

[0059] In one example, when collecting the initial corpus, the server can perform a preliminary screening, retaining only the valid data. For instance, the corpus might contain customer input statements obtained by the intelligent customer service system. The server can remove meaningless phrases such as "hello," "human customer service," and "query" from these statements to improve the effectiveness of the initial corpus.

[0060] In one example, the server can continuously collect statements input by the client from the corpus and, after filtering, store the valid statements into the initial corpus set.
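The preliminary screening described in the two examples above might look like the following sketch. The filler phrases are the ones named in the text; the regular-expression approach, the function names `clean_statement` and `filter_corpus`, and the rule that a statement is kept only if some alphanumeric text remains are assumptions for illustration.

```python
import re

# Filler phrases named in the text; a real deployment would likely use a longer list.
FILLER_PHRASES = ("hello", "human customer service", "query")

_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in FILLER_PHRASES) + r")\b",
    re.IGNORECASE,
)


def clean_statement(statement):
    """Remove filler phrases and tidy leftover whitespace and punctuation."""
    text = _FILLER_RE.sub(" ", statement)
    text = " ".join(text.split())   # collapse runs of whitespace
    return text.lstrip(" ,.!;:")    # drop punctuation left at the start


def filter_corpus(statements):
    """Keep only statements that still contain meaningful text."""
    kept = []
    for statement in statements:
        cleaned = clean_statement(statement)
        if any(ch.isalnum() for ch in cleaned):
            kept.append(cleaned)
    return kept


raw = [
    "Hello, what are the conditions for moving from Beijing to Shenzhen?",
    "human customer service",
    "Hello!",
]
initial_corpus_set = filter_corpus(raw)
```

In this toy run only the first statement survives, with its greeting removed.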

[0061] The server can extract keywords and their relationships from each initial corpus. It can then use these keywords as nodes in an intent knowledge graph. Furthermore, the server can add edges between related keywords to connect their corresponding nodes. The server can construct the intent knowledge graph based on all the initial corpora in the initial corpus set. It's important to note that if multiple initial corpora share the same keywords and their relationships, these multiple initial corpora can be considered the same. The server can then select one of these initial corpora to retain in the initial corpus set and remove the others.
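The deduplication rule in the preceding paragraph can be sketched as follows, assuming each initial corpus has already been reduced to an ordered keyword path. The text only says that one of the equivalent corpora is retained; keeping the first one encountered is an assumption, as are the function name `deduplicate_corpora` and the second and third example sentences.

```python
def deduplicate_corpora(corpora_with_paths):
    """Keep one corpus per distinct keyword path.

    corpora_with_paths: list of (text, keyword_path) pairs, where
    keyword_path is an ordered sequence of keywords, so it captures both
    the keywords and the relationships between them.
    """
    seen = set()
    kept = []
    for text, path in corpora_with_paths:
        signature = tuple(path)
        if signature not in seen:  # first occurrence wins (assumption)
            seen.add(signature)
            kept.append((text, signature))
    return kept


corpora = [
    ("How to apply for a household registration in Shenzhen",
     ["household registration", "Shenzhen", "apply"]),
    ("How do I apply for household registration in Shenzhen?",   # hypothetical
     ["household registration", "Shenzhen", "apply"]),
    ("What are the conditions for moving from Beijing to Shenzhen?",
     ["Beijing", "Shenzhen", "conditions"]),                     # hypothetical path
]
unique_corpora = deduplicate_corpora(corpora)
```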

[0062] In one example, the specific steps by which the server constructs an intent knowledge graph based on the keywords of each initial corpus and the relationships between those keywords may include:

[0063] Step 1: The server obtains the initial corpus set.

[0064] Step 2: The server extracts at least one keyword and the relationships between keywords from each initial corpus in the initial corpus set. Specifically, the server can store a keyword table. The server can use this keyword table to extract keywords from the initial corpus. The server can also determine the relationships between keywords based on the order in which they appear in each initial corpus. For example, when the initial corpus is "How to apply for a household registration in Shenzhen", the extracted keywords could be "household registration", "Shenzhen", and "apply". Based on the order in which these three keywords appear in the initial corpus, the relationship between these three keywords can be determined as "household registration → Shenzhen → apply".

[0065] Step 3: The server constructs an intent knowledge graph based on the keywords of each initial corpus in the initial corpus set and the relationships between those keywords. Specifically, after obtaining the keywords of each initial corpus in the initial corpus set and the relationships between them, the server can construct the intent knowledge graph, which can be as shown in Figure 3. Each keyword corresponds to a node in the intent knowledge graph. The association between two keywords corresponds to an edge in the intent knowledge graph. For example, in the initial corpus "How to apply for household registration in Shenzhen," there is an association between the keyword "household registration" and the keyword "Shenzhen," and the keyword "household registration" points to the keyword "Shenzhen" in this association. Accordingly, in the intent knowledge graph shown in Figure 3, there is an edge between the node "household registration" and the node "Shenzhen", and this edge points from "household registration" to "Shenzhen".
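Steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the implementation of this application: the keyword table contents, the function names `extract_path` and `build_intent_graph`, and the adjacency-set representation of the graph are all assumptions. The example statements are phrased so that the keywords appear in the same order as in the path "household registration → Shenzhen → apply".

```python
# Hypothetical keyword table; the text only states that the server stores one.
KEYWORD_TABLE = ["household registration", "Shenzhen", "apply", "move out", "Dongguan"]


def extract_path(statement, keyword_table):
    """Return the keywords found in `statement`, ordered by first appearance."""
    lowered = statement.lower()
    hits = []
    for keyword in keyword_table:
        position = lowered.find(keyword.lower())
        if position != -1:
            hits.append((position, keyword))
    return [keyword for _, keyword in sorted(hits)]


def build_intent_graph(statements, keyword_table):
    """Build a directed graph: keywords are nodes, consecutive keywords share an edge."""
    edges = {}   # node -> set of nodes it points to
    paths = []   # one keyword path per distinct statement
    for statement in statements:
        path = extract_path(statement, keyword_table)
        if not path or path in paths:
            continue  # drop empty statements and statements with an identical path
        paths.append(path)
        for keyword in path:
            edges.setdefault(keyword, set())
        for source, target in zip(path, path[1:]):
            edges[source].add(target)
    return edges, paths


statements = [
    "Household registration in Shenzhen: how to apply",
    "Household registration in Shenzhen, move out to Dongguan",
    "household registration Shenzhen apply",  # same keywords and order: treated as a duplicate
]
graph, paths = build_intent_graph(statements, KEYWORD_TABLE)
```

Each entry of `paths` corresponds to one retained initial corpus, and `graph["Shenzhen"]` collects every keyword that "Shenzhen" points to across the corpus set.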

[0066] S102. Based on each path in the intent knowledge graph and its corresponding intent information, classify each path to obtain multiple intent categories and at least one path corresponding to each intent category.

[0067] In this embodiment, the keywords of each initial corpus and the relationships between those keywords correspond to a path in the intent knowledge graph. For example, the initial corpus "how to apply for household registration in Shenzhen" corresponds to the path "household registration → Shenzhen → apply". As another example, the intent knowledge graph shown in Figure 3 can also include paths such as "Shenzhen → Intra-city transfer → Procedures" and "Shenzhen → Transfer → Collective → Conditions". Each path in this intent knowledge graph corresponds to a type of intent information. This intent information indicates the intent of the initial corpus. The server can classify the paths in this intent knowledge graph to obtain multiple intent categories. Each intent category indicates a type of intent. For example, the intent knowledge graph shown in Figure 3 can be divided into six intent categories: "migration in," "migration out," "processing," "transfer," "advantages and disadvantages," and "intra-city migration." Each intent category can correspond to a cluster center. The nodes corresponding to the cluster centers are drawn as rectangles in Figure 3. Each path obtained by converting an initial corpus includes one and only one cluster center, which determines its intent category. For example, in the intent knowledge graph shown in Figure 3, the paths corresponding to each intent category can be as follows:

[0068] The intent category for “transfer” includes:

[0069] 1. Household registration -> Shenzhen -> Transfer -> Collective -> Conditions

[0070] 2. Household registration -> Shenzhen -> Transfer -> Individual -> Conditions

[0071] The "migration out" intent category includes:

[0072] 1. Household registration -> Shenzhen -> Move out -> Zhongshan

[0073] 2. Household registration -> Shenzhen -> Move out -> Huizhou

[0074] 3. Household registration -> Shenzhen -> Move out -> Dongguan

[0075] 4. Household registration -> Shenzhen -> Move out -> Beijing

[0076] The "migration in" intent category includes:

[0077] 1. Household registration -> Shenzhen -> Move in -> Dongguan

[0078] 2. Household registration -> Shenzhen -> Move in -> Beijing

[0079] 3. Household registration -> Shenzhen -> Move in -> Henan

[0080] The "processing" intent category includes:

[0081] 1. Household registration -> Shenzhen -> Processing

[0082] The "Advantages and Disadvantages" intention category includes:

[0083] 1. Household registration -> Shenzhen -> Advantages and disadvantages

[0084] The intent category for "intra-city transfer" includes:

[0085] 1. Household registration -> Shenzhen -> Intra-city transfer -> Procedures

[0086] 2. Household registration -> Shenzhen -> Intra-city transfer -> Documents

[0087] In one example, the server clusters the intent knowledge graph, resulting in multiple cluster centers. Each cluster center corresponds to a node in the intent knowledge graph, and each cluster center corresponds to an intent category. The keyword indicated by the node corresponding to the cluster center is the intent keyword for that intent category. The server then assigns each path that includes a cluster center to the intent category of that cluster center.

[0088] In this step, the keyword corresponding to the cluster center is the intent keyword of the intent category. During recognition, the server determines the intent category of the statement to be recognized and then outputs the recognition result based on this intent keyword.

[0089] In one implementation, the server encodes the path of each initial corpus to obtain encoded features. The server can then use a clustering algorithm to cluster these encoded features, resulting in multiple intent categories. The server can determine which paths belong to each intent category. Based on the paths within each intent category, the server can determine the corresponding cluster center. Each cluster center corresponds to a keyword. Within each intent category, every path includes the keyword corresponding to that category's cluster center.

[0090] In another implementation, after obtaining the intent knowledge graph, the server can cluster the intent knowledge graph. The server can obtain at least one cluster center. This cluster center is a node in the intent knowledge graph. Each cluster center corresponds to an intent category. The server can add the path containing the node to that intent category.
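Once the cluster centers are known, assigning paths to intent categories can be sketched as below. The clustering algorithm itself is not specified in the text, so this sketch takes the cluster-center keywords as given; the function name `classify_paths` and the error raised for a path with zero or several cluster centers are assumptions.

```python
def classify_paths(paths, cluster_centers):
    """Group paths by the single cluster-center keyword each one contains."""
    categories = {center: [] for center in cluster_centers}
    for path in paths:
        centers_in_path = [keyword for keyword in path if keyword in categories]
        if len(centers_in_path) != 1:
            # The text states that each path contains one and only one cluster center.
            raise ValueError(f"expected exactly one cluster center in {path}")
        categories[centers_in_path[0]].append(path)
    return categories


example_paths = [
    ["household registration", "Shenzhen", "transfer", "collective", "conditions"],
    ["household registration", "Shenzhen", "transfer", "individual", "conditions"],
    ["household registration", "Shenzhen", "processing"],
]
example_categories = classify_paths(example_paths, ["transfer", "processing"])
```

Here the keys of `example_categories` double as the intent keywords of the categories.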

[0091] In one implementation, the initial corpus set can be updated in real time. The server can periodically update the intent categories and the paths included in each intent category based on the updated initial corpus set. Furthermore, the server can optimize the intent feature extraction model and the intent vector space based on the updated intent categories and the paths included in each intent category.

[0092] S103. Generate a training sample set based on multiple intent categories and at least one path corresponding to each intent category, and use the training sample set to iteratively train the intent feature extraction model to be trained, so as to obtain the trained intent feature extraction model.

[0093] In this embodiment, after determining multiple intent categories, the server can determine the paths included in each intent category. The server can then generate a training sample set based on the paths of each intent category. The training samples in this set are used to train an intent feature extraction model. This intent feature extraction model is a Siamese network model. The Siamese network model can include two sub-networks with the same architecture and shared weights. When the two data items of a training sample's data pair are input into the two sub-networks respectively, each sub-network extracts features from its own input, yielding two feature vectors. The Siamese network model also includes a similarity calculation module. This module determines the prediction result of the training sample by calculating the similarity between the two feature vectors. The prediction result is either that the two feature vectors are similar or that they are dissimilar. The Siamese network model can calculate the loss function based on the prediction result and the label of the training sample, and then optimize the weights of the sub-networks through backpropagation.

[0094] Therefore, each training sample in the server-generated training sample set can include a data pair. This data pair can include two paths. When a training sample is input into the intent feature extraction model, the two sub-networks in the model extract features from the two paths respectively, obtaining two first intent vectors. The server can predict the similarity between the two first intent vectors through the similarity calculation module in the intent feature extraction model. The server can calculate the loss function of the intent feature extraction model based on the similarity between the two first intent vectors and the label of the training sample. The server can then optimize the intent feature extraction model through backpropagation based on the loss function.

[0095] In one example, the specific steps for generating a training sample set and completing the training of the Siamese network model may include:

[0096] Step 1: The server obtains multiple paths in the intent knowledge graph and the intent category corresponding to each path.

[0097] Step 2: The server combines paths with the same intent category in multiple paths to obtain a positive sample set, and combines paths with different intent categories in multiple paths to obtain a negative sample set. The training sample set includes both the positive sample set and the negative sample set.

[0098] In this step, the training sample set can include multiple positive samples and multiple negative samples. Each training sample can include two paths. A positive sample can include two paths from one intent category. A negative sample can include two paths from two intent categories. During the generation of positive samples, the server can first randomly select one intent category from multiple intent categories. Then, the server can randomly select two paths from that intent category to form a data pair. This data pair is a positive sample. Since the two initial corpora in this positive sample come from the same intent category, the similarity between the two initial corpora in this positive sample is high. During the generation of negative samples, the server can first randomly select two intent categories from multiple intent categories. The server can then randomly select one path from each of these two intent categories. These two paths form a data pair. This data pair is a negative sample. Since the two initial corpora in this negative sample come from different intent categories, the similarity between the two initial corpora in this negative sample is low. For example, the training samples can be as shown in Table 1:

[0099] Table 1

[0100]

[0101]

[0102] As shown in Table 1, sentences 1 and 2 represent two paths. The intents of sentences 1 and 2 represent the intent categories of these two paths, respectively. The label indicates whether the sample is positive or negative. A positive label indicates a positive sample, and a negative label indicates a negative sample. For example, if the intent of sentences 1 ("Household Registration -> Shenzhen -> Transfer -> Collective -> Conditions") and 2 ("Household Registration -> Shenzhen -> Transfer -> Individual -> Conditions") is "Transfer," then the training sample composed of sentences 1 and 2 is a positive sample, and its label is "Positive Example." Similarly, if the intents of sentences 1 ("Household Registration -> Shenzhen -> Transfer -> Collective -> Conditions") and 2 ("Household Registration -> Shenzhen -> Processing") are "Transfer" and "Processing," respectively, then the training sample composed of sentences 1 and 2 is a negative sample, and its label is "Negative Example."
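The sampling procedure for positive and negative samples can be sketched as follows. The numeric labels (1 for a positive sample, 0 for a negative sample), the function names, and the fixed random seed are assumptions made for illustration; the paths are taken from the examples above. A category with a single path, such as "processing", cannot supply a positive pair, so the sketch skips it when drawing positive samples.

```python
import random

# Paths grouped by intent category, taken from the examples in the text.
CATEGORIES = {
    "transfer": [
        ("household registration", "Shenzhen", "transfer", "collective", "conditions"),
        ("household registration", "Shenzhen", "transfer", "individual", "conditions"),
    ],
    "processing": [
        ("household registration", "Shenzhen", "processing"),
    ],
    "intra-city transfer": [
        ("household registration", "Shenzhen", "intra-city transfer", "procedures"),
        ("household registration", "Shenzhen", "intra-city transfer", "documents"),
    ],
}


def make_positive_sample(categories, rng):
    """Pick one category at random, then two distinct paths from it."""
    eligible = [name for name, paths in categories.items() if len(paths) >= 2]
    category = rng.choice(eligible)
    path_a, path_b = rng.sample(categories[category], 2)
    return path_a, path_b, 1


def make_negative_sample(categories, rng):
    """Pick two different categories at random, then one path from each."""
    category_a, category_b = rng.sample(sorted(categories), 2)
    return rng.choice(categories[category_a]), rng.choice(categories[category_b]), 0


def build_training_set(categories, n_positive, n_negative, seed=0):
    """Return a shuffled list of (path, path, label) training samples."""
    rng = random.Random(seed)
    samples = [make_positive_sample(categories, rng) for _ in range(n_positive)]
    samples += [make_negative_sample(categories, rng) for _ in range(n_negative)]
    rng.shuffle(samples)
    return samples


samples = build_training_set(CATEGORIES, n_positive=5, n_negative=5, seed=1)
```

Because the pairs are drawn from the classified paths, no manual labeling of individual sentence pairs is needed.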

[0103] Step 3: The server inputs the training sample set into the intent feature extraction model to be trained, and obtains the trained intent feature extraction model.

[0104] In this step, the intent feature extraction model can be a Siamese network model. A Siamese network typically includes two sub-networks with identical architectures and shared weights. In the intent feature extraction model, the input training samples are data pairs. The two sub-networks of the intent feature extraction model can map the two paths in the input data pair to a new space to obtain a first intent vector u and a first intent vector v. The server can calculate the similarity between u and v using the similarity calculation module of the intent feature extraction model. Specifically, the server can concatenate and combine u and v and input the result into the similarity calculation network and activation function to calculate the similarity between the two paths. For example, the similarity calculation network can use the Manhattan distance formula. That is, the server can measure the similarity between the two first intent vectors by calculating their Manhattan distance. The Manhattan distance measures distance in a multidimensional data space and, in this example, takes values from 0 to 1. The smaller the Manhattan distance, the greater the similarity between the two vectors. The formula is expressed as follows:

[0105] $$d(\mathrm{vec1}, \mathrm{vec2}) = \sum_{i} \lvert x_i - y_i \rvert$$

[0106] Here, vec1 and vec2 represent the two paths in a training sample, x and y represent the two first intent vectors extracted from them, and i indexes the dimensions of x and y.

[0107] The server can set a similarity threshold. Based on this threshold and the Manhattan distance calculated for each training sample, the server determines the prediction result for each training sample. When the Manhattan distance of a training sample is greater than the threshold, the prediction result is low similarity. When the Manhattan distance is less than or equal to the threshold, the prediction result is high similarity. The server can also obtain the label of each training sample. When the label is positive, the training sample is a positive sample and has high similarity. When the label is negative, the training sample is a negative sample and has low similarity. Based on the prediction result and the label, the server calculates the model loss of the intent feature extraction model. The server can use this model loss to train the intent feature extraction model and optimize its feature extraction performance.
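The distance, the threshold-based prediction, and a loss computed from a pair's distance and label can be sketched as below. The threshold is compared against the Manhattan distance, so a smaller value means higher similarity. The text does not name the loss function; the contrastive loss used here, its `margin` parameter, and the 1/0 labels are assumptions chosen only to make the sketch concrete.

```python
def manhattan_distance(x, y):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(a - b) for a, b in zip(x, y))


def predict_similar(x, y, threshold):
    """Predict high similarity (True) when the distance does not exceed the threshold."""
    return manhattan_distance(x, y) <= threshold


def contrastive_loss(x, y, label, margin=1.0):
    """Assumed loss: pull positive pairs together, push negative pairs past `margin`."""
    distance = manhattan_distance(x, y)
    if label == 1:                            # positive sample: any distance is penalized
        return distance ** 2
    return max(0.0, margin - distance) ** 2   # negative sample: penalized only if too close


u = [0.25, 0.5]
v = [0.5, 0.25]
is_similar = predict_similar(u, v, threshold=0.75)   # distance is 0.5
loss_if_negative = contrastive_loss(u, v, label=0)   # (1.0 - 0.5) ** 2
```

In an actual Siamese network this loss would be minimized by gradient descent over the shared sub-network weights; the sketch only shows how a single pair is scored.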

[0108] In one implementation, the server can compare the model loss with a loss threshold. When the model loss is less than or equal to the loss threshold, the server can determine that the intention feature extraction model has completed training.

[0109] In another implementation, the server can set a maximum number of iterations. When the number of iterations of the intent feature extraction model reaches this maximum number of iterations, the server can determine that the intent feature extraction model has completed training.
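The two stopping criteria can be sketched together as follows. The text presents them as alternative implementations; combining them in one loop, and the names `train_until_done` and `train_one_iteration`, are assumptions. `train_one_iteration` stands for any callable that performs one training iteration and returns the resulting model loss.

```python
def train_until_done(train_one_iteration, loss_threshold, max_iterations):
    """Iterate until the loss is at or below the threshold or the iteration cap is hit.

    Returns the number of iterations performed and the last observed loss.
    """
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        loss = train_one_iteration()
        if loss <= loss_threshold:
            return iteration, loss      # loss criterion met
    return max_iterations, loss         # iteration cap reached


# Stand-in for real training: replay a fixed sequence of losses.
fake_losses = iter([0.9, 0.5, 0.05, 0.01])
iterations_run, final_loss = train_until_done(
    lambda: next(fake_losses), loss_threshold=0.1, max_iterations=10
)
```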

[0110] S104. Input multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain the first intent vectors corresponding to the multiple initial corpora, and the intent vector space composed of the first intent vectors corresponding to the multiple initial corpora.

[0111] In this embodiment, after training the intent feature extraction model, the server can use it to extract features from the initial corpora. The server can input each initial corpus into the intent feature extraction model, map it to a vector space, and obtain a first intent vector corresponding to each initial corpus. The server can form an intent vector space based on the intent category and the first intent vector corresponding to each initial corpus in the initial corpus set. If the intent vector space contained one vector per initial corpus, then after obtaining a statement to be recognized, the server would need to match the first intent vector of that statement against every vector in the intent vector space to determine its intent category. When the initial corpus set is large, this approach is computationally expensive. To address this, this application combines the first intent vectors of the initial corpora with the intent categories determined in step S102 to reduce the number of vectors constituting the intent vector space.

[0112] In one example, the specific steps by which the server constructs an intent vector space based on the first intent vector corresponding to the initial corpus may include:

[0113] Step 1: The server uses an intent feature extraction model to extract features from the initial corpus, obtaining the first intent vector corresponding to each initial corpus.

[0114] Step 2: The server calculates the mean of all first intent vectors in each intent category and uses the mean as the second intent vector corresponding to the intent category.

[0115] In this step, after determining the intent category, the server can determine the intent category corresponding to each initial corpus. The server can calculate the mean of the first intent vectors corresponding to all initial corpora in an intent category. The server can use this mean as the second intent vector for that intent category. Each intent category can correspond to one second intent vector.

[0116] Step 3: The server uses the second intent vectors of each intent category to form an intent vector space.

[0117] In this step, the server can construct an intent vector space using the second intent vectors corresponding to each intent category. This intent vector space will be applied to model prediction. When the server obtains a statement to be identified, it can match the first intent vector corresponding to the statement to be identified with each second intent vector in the intent vector space to determine the intent category corresponding to the statement to be identified.
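Steps 2 and 3 can be sketched as below: the first intent vectors are grouped by intent category and averaged component by component. The function names and the dictionary representation of the intent vector space are assumptions, and the vectors are made-up values rather than model outputs.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


def build_intent_vector_space(first_intent_vectors, intent_categories):
    """Map each intent category to the mean of its first intent vectors."""
    grouped = {}
    for vector, category in zip(first_intent_vectors, intent_categories):
        grouped.setdefault(category, []).append(vector)
    return {category: mean_vector(vectors) for category, vectors in grouped.items()}


# Illustrative values only.
first_vectors = [[0.0, 2.0], [2.0, 4.0], [1.0, 1.0]]
labels = ["transfer", "transfer", "processing"]
intent_vector_space = build_intent_vector_space(first_vectors, labels)
```

The resulting space holds one second intent vector per category, so matching a new statement costs one comparison per category instead of one per initial corpus.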

[0118] In one implementation, when new initial corpus is added to the initial corpus set, the server can recalculate the second intent vector corresponding to each intent category based on the updated initial corpus set, thereby optimizing the intent vector space based on the second intent vector.

[0119] In one example, the above model training process can also be as shown in Figure 4. The server can collect initial corpus data from a corpus. The server can then construct an intent knowledge graph based on this initial corpus. The server can also construct a training sample set based on this intent knowledge graph. This training sample set is the Siamese network data. The server can further use this training sample set to train an intent feature extraction model. This intent feature extraction model is the feature extraction part of the Siamese network model. The server can use this intent feature extraction model to map the initial corpus data in the initial corpus set, thereby obtaining a first intent vector corresponding to each initial corpus. The server can then calculate a second intent vector for each intent category based on the intent category and the first intent vector. This second intent vector is the intent representation. The server can also output the intent feature extraction model and the intent representation, and apply them to prediction.

[0120] The intent recognition method provided in this application allows the server to collect initial data from a corpus to form an initial corpus set. The server can extract keywords and their relationships from each initial corpus. The server can use these keywords as nodes in an intent knowledge graph to construct the graph. The server can also add edges between two related keywords to connect their corresponding nodes. The server can construct the intent knowledge graph based on all the initial data in the initial corpus set. The server can classify the paths in the intent knowledge graph to obtain multiple intent categories. Each intent category can indicate a type of intent. After determining multiple intent categories, the server can determine the paths included in each category. The server can generate a training sample set based on the paths of each intent category. The server can use the training sample set to iteratively train the intent feature extraction model to obtain a trained intent feature extraction model. The server uses the trained intent feature extraction model to extract features from the initial data, obtaining a first intent vector corresponding to each initial corpus. The server can then construct an intent vector space based on the first intent vector corresponding to each initial corpus in the initial corpus set. In this application, intent category classification is achieved by constructing an intent knowledge graph from the initial corpus set. Furthermore, because the intent knowledge graph covers multiple intent categories, training sample data can be generated from it automatically, which enables rapid training of the intent feature extraction model and rapid construction of the intent vector space. Moreover, when the initial corpus set is updated, this application can rapidly optimize the intent feature extraction model and the intent vector space, thereby improving recognition accuracy.

[0121] The above describes the intent feature extraction model training method provided in this application embodiment. The following section uses a specific application scenario to detail how the method is applied. Taking the intelligent customer service scenario as an example, and in combination with the embodiment shown in Figure 2, the process of generating the intent feature extraction model and the intent vector space is as follows:

[0122] The server retrieves user-generated queries from the historical database of the intelligent customer service system. It then filters these queries to obtain an initial corpus set. This filtering process may include removing meaningless phrases such as "hello," "human customer service," and "query." The server extracts keywords from each initial query in the corpus set. These keywords form the nodes in the intent knowledge graph. While extracting keywords from the initial corpus, the server also obtains the relationships between these keywords. Based on these relationships, the server connects two adjacent nodes in the intent knowledge graph, forming an edge. The server uses these nodes and edges to construct the intent knowledge graph. A path in this intent knowledge graph represents the keywords of an initial query and the relationships between them. The server classifies the paths in the intent knowledge graph based on each path and its corresponding intent information to obtain multiple intent categories. Each intent category may include a cluster center, which corresponds to a node. The server can classify paths that include the node corresponding to the cluster center into that intent category. The server generates a training sample set based on multiple intent categories and at least one path corresponding to each intent category. The server can use the training sample set to iteratively train the intent feature extraction model to obtain the trained intent feature extraction model. The server can input multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain a first intent vector corresponding to each initial corpus. The server can use the average of all first intent vectors in an intent category as the second intent vector of that category. The server can use the second intent vectors from all intent categories to construct an intent vector space.

[0123] As the server continuously acquires user queries from the intelligent customer service system, it can update the initial corpus. Based on the updated corpus, the server can update the intent knowledge graph. The server can then update the intent categories and their corresponding paths based on the updated intent knowledge graph. The server can regenerate a training sample set based on the updated intent categories and their corresponding paths. This regenerated training sample set can be used to further train the intent feature extraction model, thereby optimizing it. The optimized intent feature extraction model can then be used to re-extract features from the initial corpus, yielding a new first intent vector. Based on this new first intent vector and the new intent categories, the server can recalculate the second intent vector for each intent category. Finally, the server can reconstruct the intent vector space using the new second intent vectors, thus optimizing the intent vector space.

[0124] Figure 5 shows a flowchart of an intent recognition method according to an embodiment of this application. Based on the embodiments shown in Figures 1 to 4, and as shown in Figure 5, with the server as the execution entity, the method in this embodiment may include the following steps:

[0125] S201. Input the statement to be recognized into the trained intent feature extraction model of the embodiment shown in Figure 2 to obtain a first intent vector of the statement to be recognized. The statement to be recognized is a statement that includes user intent information. The intent feature extraction model is used to extract features from the statement to be recognized to obtain the first intent vector. The first intent vector is used to determine the intent of the statement to be recognized.

[0126] In this embodiment, the server can obtain a query to be identified provided by other applications. This query is a question entered by the user on that application for consultation. The server can input this query into a trained intent feature extraction model. The intent feature extraction model can extract features from the query to obtain a first intent vector corresponding to it. In subsequent steps, the server can use this first intent vector to match within the intent vector space to obtain the corresponding intent category. It should be noted that when the server obtains multiple queries to be identified from one application, or multiple queries to be identified from multiple applications, the server can input these queries one by one into the intent feature extraction model for feature extraction.

[0127] In one example, the specific process by which the server uses an intent feature extraction model to extract the first intent vector of the statement to be identified may include:

[0128] Step 1: The server extracts at least one keyword and the association between at least one keyword in the statement to be identified;

[0129] In this step, the server can pre-store a keyword table. This keyword table includes at least one keyword. The server can extract keywords from the statement to be identified based on this keyword table and treat these keywords as entities. The server can determine the relationship between these entities based on the order of the extracted keywords from the statement to be identified. For example, when the statement to be identified is "How to apply for a Beijing household registration", the extracted keywords could be "household registration", "Beijing", and "apply". These three keywords, as entities, can have the connection relationship "household registration → Beijing → apply".

[0130] Step 2: The server encodes the keywords in the sentence to be identified sequentially according to the relationship between the keywords, and obtains the encoding vector.

[0131] In this step, the keyword table stored on the server can include the code corresponding to each keyword. The server can determine the code corresponding to each keyword in the statement to be recognized based on this keyword table. The server can then replace the keywords in the statement with their codes according to the order of the keywords, obtaining a code vector. For example, when the code for "household registration" is 11, the code for "Beijing" is 12, and the code for "apply" is 13, the code vector for the statement "How to apply for a Beijing household registration" is 111213.

[0132] Step 3: The server inputs the encoded vector into the trained intent feature extraction model to obtain the first intent vector of the sentence to be recognized.

[0133] In this step, the server can input the encoded vector into the trained intent feature extraction model. This model will then perform feature extraction on the encoded vector to obtain the first intent vector corresponding to the statement to be recognized. The dimension of this first intent vector can be determined based on the intent feature extraction model, which can be a Siamese network model.
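Steps 1 and 2 (keyword extraction and sequential encoding) can be sketched as below. The codes 11, 12, and 13 come from the example above; the function name `encode_statement`, the substring-based keyword lookup, and the string concatenation of the codes are assumptions. The example statement is phrased so that the keywords appear in the order "household registration → Beijing → apply".

```python
# Hypothetical keyword table with the codes used in the example above.
KEYWORD_CODES = {"household registration": 11, "Beijing": 12, "apply": 13}


def encode_statement(statement, keyword_codes):
    """Return the keyword codes in order of appearance, plus their concatenation."""
    lowered = statement.lower()
    hits = sorted(
        (lowered.find(keyword.lower()), keyword)
        for keyword in keyword_codes
        if keyword.lower() in lowered
    )
    codes = [keyword_codes[keyword] for _, keyword in hits]
    return codes, "".join(str(code) for code in codes)


codes, code_vector = encode_statement(
    "Household registration in Beijing: how to apply", KEYWORD_CODES
)
```

The encoded form is what Step 3 feeds into the trained intent feature extraction model.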

[0134] S202. Project the first intent vector into the intent vector space of the embodiment shown in Figure 2, and calculate the matching degree between the first intent vector and each second intent vector in the intent vector space. The intent vector space includes multiple second intent vectors, each second intent vector corresponds to an intent category, and each intent category corresponds to an intent keyword.

[0135] In this embodiment, the intent vector space consists of multiple second intent vectors. Each second intent vector corresponds to an intent category. Each intent category corresponds to an intent keyword. For example, in a household registration application scenario, the intent categories may include six items: "migration in," "migration out," "processing," "transfer," "advantages and disadvantages," and "intra-city migration." The intent keywords corresponding to these six intent categories can be "migration in," "migration out," "processing," "transfer," "advantages and disadvantages," and "intra-city migration," respectively. These intent keywords are nodes in the intent knowledge graph.

[0136] The server can calculate the matching degree between the first intent vector and each second intent vector. Since this is essentially a matching degree between two vectors, it can be obtained through a vector space similarity calculation. For example, the server can calculate metrics such as the Pearson Correlation Coefficient, Tanimoto Coefficient, Cosine Similarity, Euclidean Distance, Manhattan Distance, Mahalanobis Distance, Lance Williams Distance, Chebyshev Distance, and Hausdorff Distance between the first and second intent vectors. The server can determine the matching degree between the first and second intent vectors based on these metrics. For distance metrics, a smaller value indicates a better match, so the server can convert the value, for example by taking its reciprocal, so that a larger matching degree always indicates a better match. Among all second intent vectors, the one with the highest matching degree is the one most similar to the first intent vector.

[0137] In one example, the matching degree calculation process may specifically include the following steps:

[0138] Step 1: The server calculates the Manhattan distance between the first intent vector and every second intent vector in the intent vector space. The Manhattan distance is used to measure distance in a multidimensional data space. The smaller the Manhattan distance value, the higher the similarity between the two vectors.

[0139] Step 2: The server determines the matching degree between the first intent vector and each second intent vector based on the Manhattan distance. The server can use the reciprocal of the Manhattan distance as the matching degree.

[0140] S203. The intent keyword of the intent category corresponding to the second intent vector with the highest matching degree is used as the recognition result. The recognition result is used to indicate the intent of the statement to be recognized.

[0141] In this embodiment, the server can determine which second intent vector has the highest matching degree with the first intent vector. The server can take the intent category corresponding to that second intent vector as the intent category of the statement to be identified, and use the intent keyword corresponding to that intent category as the recognition result for the statement to be identified. The server can output the recognition result, which is used to indicate the intent of the statement to be identified. For example, in an intelligent customer service application scenario, the above process can be executed as shown in Figure 5. The server receives a question sent by another smart terminal. This question is the statement to be recognized. The server can input this question into a trained intent feature extraction model to predict the first intent vector. This intent feature extraction model is a Siamese network model. The intent vector space can include multiple second intent vectors. The server can calculate the similarity between the first intent vector and each second intent vector. The server can determine the intent category corresponding to the second intent vector with the highest similarity, which is the predicted intent. The server can generate a recognition result based on the intent category. The server can output this recognition result.

[0142] In one example, after obtaining the recognition result of the statement to be recognized, the server can also retrieve the intelligent response content of the statement to be recognized from a preset database based on the recognition result.

[0143] The intent recognition method provided in this application allows the server to obtain a statement to be recognized from other applications. The server can input this statement into a trained intent feature extraction model. This model can predict a first intent vector corresponding to the statement. The server can then match this first intent vector with each second intent vector in the intent vector space to obtain a matching degree between the first and second intent vectors. The server can sort these matching degrees to obtain the second intent vector with the highest matching degree. The server can then determine the recognition result based on the intent category corresponding to this second intent vector. This application, by using the intent feature extraction model and the intent vector space, achieves the prediction of the recognition result for the statement to be recognized, thus improving recognition accuracy and efficiency.

[0144] Taking intelligent customer service as an example, and in combination with the embodiment shown in Figure 5, the process of recognizing the statement to be recognized may include the following specific steps:

[0145] The server retrieves the user's current question from the intelligent customer service system. This question is the statement to be identified. The server inputs this statement into a trained intent feature extraction model to obtain a first intent vector. The server projects this first intent vector into an intent vector space to determine the intent category corresponding to it. During this projection process, the server calculates the matching degree between the first intent vector and each second intent vector in the intent vector space. Each second intent vector corresponds to an intent category. Each intent category corresponds to a cluster center. The keywords displayed by the nodes corresponding to the cluster center are the intent keywords for that intent category. The server uses the intent keywords of the intent category corresponding to the second intent vector with the highest matching degree to the first intent vector as the recognition result. The server outputs this recognition result, which indicates the intent of the statement to be identified. The server can retrieve the response content from the intelligent customer service's question-and-answer database based on this recognition result. The server can then output this response content through the intelligent customer service system for the user to view.
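The customer service flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the first intent vector is assumed to have already been produced by the trained model, the intent vector space is represented as a dictionary from intent keyword to second intent vector, and `intent_space`, `qa_database`, and their contents are hypothetical toy data.

```python
from typing import Dict, Sequence, Tuple


def recognize_intent(
    first_vector: Sequence[float],
    intent_space: Dict[str, Sequence[float]],
    eps: float = 1e-9,
) -> Tuple[str, float]:
    """Return the intent keyword whose second intent vector matches best.

    The matching degree is the reciprocal of the Manhattan distance;
    eps (an assumption) avoids division by zero for identical vectors.
    """
    if not intent_space:
        raise ValueError("intent vector space is empty")
    best_keyword, best_degree = "", float("-inf")
    for intent_keyword, second_vector in intent_space.items():
        distance = sum(abs(x - y) for x, y in zip(first_vector, second_vector))
        degree = 1.0 / (distance + eps)
        if degree > best_degree:
            best_keyword, best_degree = intent_keyword, degree
    return best_keyword, best_degree


# Hypothetical toy data: two intent categories and placeholder answers.
intent_space = {"apply": [0.9, 0.1], "cancel": [0.1, 0.8]}
qa_database = {
    "apply": "Placeholder answer about applying.",
    "cancel": "Placeholder answer about cancelling.",
}

keyword, degree = recognize_intent([0.8, 0.2], intent_space)
response = qa_database[keyword]  # response content looked up by recognition result
```

Storing one vector per intent category keeps recognition to a single pass over the categories rather than over the whole corpus.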

[0146] Figure 7 shows a flowchart of an intent recognition method according to an embodiment of this application. Based on the embodiments shown in Figures 1 to 6, and as shown in Figure 7, with the server as the execution entity, the method in this embodiment may include the following steps:

[0147] S301. Generate key search information based on the recognition results and the keywords of the statement to be recognized.

[0148] In this embodiment, similar to the construction of traditional knowledge graphs, edges in the intent knowledge graph are used to represent associations between entities, entity attributes, hierarchical relationships between entities and concepts, and actions performed by a certain entity. In the intent knowledge graph, a "hyperedge" is a set representing an entity or term. This "hyperedge" represents the key information to be retrieved. Hyperedges between intents can represent equivalence relationships between intents. Therefore, after obtaining the keywords and intent keywords in the statement to be identified, the server can determine the associations between entities in the statement. For example, for the statement to be identified, "How to apply for a Beijing household registration," the server can obtain three keywords: "household registration," "Beijing," and "apply." Its edges can be represented as "household registration → Beijing → apply." The intent keyword is "apply." Based on this intent keyword and the keyword, the intent corresponding to its hyperedge is "household registration application." That is, the key information to be retrieved is "household registration application."

[0149] S302. Use the key search information to search a preset database to obtain search results. The preset database includes intelligent response content for each intent.

[0150] In this embodiment, the server can perform a search in a preset database based on the key search information obtained in the previous step, and obtain search results. The preset database includes answers to various questions that the user may ask. These search results can be the intelligent response content corresponding to the user's question in an intelligent customer service scenario.

[0151] In one example, since the key search information typically does not include all the keywords of the statement to be identified, the server can further refine the search results by searching on the keywords of the statement to be identified, thus improving the relevance of the response. For instance, when the key search information is "household registration application," the server can perform an initial search in the preset database based on "household registration application." This search retrieves multiple highly relevant responses as search results. When the keywords include "household registration," "Beijing," and "apply," the server can then refine the search based on these three keywords to narrow down the responses in the search results.

[0152] In one example, when the search results include multiple responses, the server can sort them according to their relevance to the statement to be identified, and select the response with the highest relevance as the search result.

[0153] In one example, each response may correspond to a short sentence or a paragraph. The server can also select a preset number of responses as search results after retrieving multiple responses.
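The retrieval and refinement described in S302 can be sketched as below. Everything concrete here is an assumption made for illustration: the preset database is modeled as an in-memory dictionary keyed by key search information, relevance is approximated by counting how many keywords of the statement appear in a response, and the function name `search_responses` and the sample responses are hypothetical.

```python
from typing import Dict, List


def search_responses(
    key_info: str,
    keywords: List[str],
    database: Dict[str, List[str]],
    top_n: int = 1,
) -> List[str]:
    """Initial search by key search information, then keyword-based ranking."""
    # Initial search: every response filed under the key search information.
    candidates = database.get(key_info, [])

    # Refinement: count how many statement keywords each response mentions.
    def relevance(response: str) -> int:
        text = response.lower()
        return sum(1 for kw in keywords if kw.lower() in text)

    # Sort by relevance (highest first) and keep a preset number of responses.
    ranked = sorted(candidates, key=relevance, reverse=True)
    return ranked[:top_n]


# Hypothetical preset database with two candidate responses.
database = {
    "household registration application": [
        "To apply for household registration in Shanghai, contact the local office.",
        "To apply for household registration in Beijing, contact the local office.",
    ]
}
results = search_responses(
    "household registration application",
    ["household registration", "Beijing", "apply"],
    database,
)
```

A production system would more likely rely on a search engine or an embedding index; the keyword count is only the simplest stand-in for "relevance to the statement to be identified."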

[0154] S303. Determine the intelligent response content corresponding to the statement to be identified based on the search results. The intelligent response content is used to indicate the result of the user's inquiry.

[0155] In this embodiment, the server can determine the intelligent response content corresponding to the statement to be identified based on the search results. This intelligent response content is used to indicate the answer to the user's question.

[0156] In one example, the search results may include multiple short phrases. The server can combine and refine these phrases to obtain the intelligent response.

[0157] With the intent recognition method provided in this application, after obtaining the recognition result, the server can generate key search information based on it and retrieve the intelligent response content using this key search information. By using the recognition result predicted in the above embodiments, this application can improve the accuracy of intelligent responses, thus enhancing the user experience.

[0158] Figure 8 shows a schematic diagram of the structure of an intent feature extraction model training device according to an embodiment of this application. As shown in Figure 8, the intent feature extraction model training device 10 of this embodiment is used to implement the operation corresponding to the server in any of the above method embodiments. The intent feature extraction model training device 10 of this embodiment includes:

[0159] The knowledge graph construction module 11 is used to obtain an initial corpus set and construct an intent knowledge graph based on the initial corpus set. Each node in the intent knowledge graph is a keyword of each initial corpus in the initial corpus set. Each edge in the intent knowledge graph is used to indicate the association between two adjacent nodes. Each path composed of nodes and edges is used to indicate the intent information of each initial corpus.

[0160] The intent classification module 12 is used to classify each path in the intent knowledge graph according to its corresponding intent information, so as to obtain multiple intent categories and at least one path corresponding to each intent category.

[0161] The model training module 13 is used to generate a training sample set based on multiple intent categories and at least one path corresponding to each intent category, and to use the training sample set to iteratively train the intent feature extraction model to be trained, so as to obtain the trained intent feature extraction model.

[0162] The space construction module 14 is used to input multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain the first intent vectors corresponding to the multiple initial corpora, and the intent vector space composed of the first intent vectors corresponding to the multiple initial corpora.

[0163] In one example, model training module 13 is specifically used for:

[0164] Obtain multiple paths in the intent knowledge graph and the intent category corresponding to each path.

[0165] The training sample set includes both positive and negative sample sets. Paths with the same intent category from multiple paths are combined in pairs to obtain a positive sample set, and paths with different intent categories from multiple paths are combined in pairs to obtain a negative sample set.

[0166] The training sample set is input into the intent feature extraction model to be trained, and the trained intent feature extraction model is obtained.
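The pairwise construction of positive and negative sample sets described above can be sketched as follows. The representation of a path as a tuple of keywords and the function name `build_pair_samples` are assumptions for illustration. The Siamese network training itself is not shown.

```python
from itertools import combinations
from typing import List, Tuple

Path = Tuple[str, ...]


def build_pair_samples(
    labelled_paths: List[Tuple[Path, str]],
) -> Tuple[List[Tuple[Path, Path]], List[Tuple[Path, Path]]]:
    """Pair up paths: same intent category -> positive, different -> negative."""
    positives: List[Tuple[Path, Path]] = []
    negatives: List[Tuple[Path, Path]] = []
    for (path_a, cat_a), (path_b, cat_b) in combinations(labelled_paths, 2):
        pair = (path_a, path_b)
        if cat_a == cat_b:
            positives.append(pair)
        else:
            negatives.append(pair)
    return positives, negatives


# Hypothetical labelled paths: two in category "apply", one in "cancel".
labelled = [
    (("household registration", "Beijing", "apply"), "apply"),
    (("passport", "apply"), "apply"),
    (("account", "cancel"), "cancel"),
]
positive_set, negative_set = build_pair_samples(labelled)
```

Because every unordered pair of paths is enumerated, n labelled paths yield n(n-1)/2 samples, which is how a small corpus can still produce a sizeable training set.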

[0167] In one example, space building module 14 is specifically used for:

[0168] Features are extracted from each initial corpus using the trained intent feature extraction model to obtain the first intent vector corresponding to each initial corpus.

[0169] Calculate the mean of all first intent vectors in each intent category, and use the mean as the second intent vector corresponding to the intent category.

[0170] The intent vector space is composed of the second intent vectors of each intent category.
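The construction of the intent vector space from per-category means can be sketched as below. It assumes the first intent vectors have already been produced by the trained model and labelled with their intent category. The function name and the dictionary representation of the vector space are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_intent_vector_space(
    first_vectors: List[Tuple[str, Sequence[float]]],
) -> Dict[str, List[float]]:
    """Average the first intent vectors of each category into one second intent vector."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for category, vector in first_vectors:
        grouped[category].append(vector)

    space: Dict[str, List[float]] = {}
    for category, vectors in grouped.items():
        dim = len(vectors[0])
        # Component-wise mean over all vectors in this category.
        space[category] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return space


# Hypothetical first intent vectors for two intent categories.
space = build_intent_vector_space(
    [("apply", [1.0, 3.0]), ("apply", [3.0, 5.0]), ("cancel", [0.0, 2.0])]
)
```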

[0171] In one example, knowledge graph construction module 11 is specifically used for:

[0172] Obtain the initial corpus set.

[0173] Extract at least one keyword and the relationship between keywords from each initial corpus in the initial corpus set.

[0174] Based on the keywords of each initial corpus in the initial corpus set and the relationships between keywords, an intent knowledge graph is constructed.
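A minimal sketch of the graph construction is given below. It assumes keyword extraction has already been performed and that the relationships between keywords are expressed by their order, as in the "household registration → Beijing → apply" example given earlier. The adjacency-dictionary representation and the name `build_intent_graph` are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple


def build_intent_graph(
    corpora_keywords: List[List[str]],
) -> Tuple[Dict[str, Set[str]], List[Tuple[str, ...]]]:
    """Build nodes, directed edges, and one path per initial corpus."""
    adjacency: Dict[str, Set[str]] = {}
    paths: List[Tuple[str, ...]] = []
    for keywords in corpora_keywords:
        # Every keyword becomes a node, even if it has no outgoing edge.
        for keyword in keywords:
            adjacency.setdefault(keyword, set())
        # Consecutive keywords are linked by an edge.
        for source, target in zip(keywords, keywords[1:]):
            adjacency[source].add(target)
        # The ordered keywords of a corpus form its path.
        paths.append(tuple(keywords))
    return adjacency, paths


# Hypothetical keyword lists extracted from two initial corpora.
graph, graph_paths = build_intent_graph(
    [
        ["household registration", "Beijing", "apply"],
        ["household registration", "Shanghai", "apply"],
    ]
)
```

Shared keywords such as "household registration" and "apply" become shared nodes, so paths from different corpora meet at them.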

[0175] In one example, the intent classification module 12 is specifically used for:

[0176] Clustering the intent knowledge graph yields multiple cluster centers, each corresponding to a node in the intent knowledge graph. Each cluster center corresponds to an intent category, and the keywords indicated by the nodes corresponding to the cluster centers are the intent keywords of the intent category.

[0177] The path that includes the cluster center is classified into an intent category.
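The path classification step can be sketched as follows. The clustering algorithm is not specified here, so the cluster centers are taken as given. Assigning a path that contains several cluster centers to the first matching center, and leaving paths with no center unassigned, are assumptions made only to keep the example deterministic.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]


def classify_paths(paths: List[Path], cluster_centers: List[str]) -> Dict[str, List[Path]]:
    """Group paths by the cluster center (intent keyword) they contain."""
    categories: Dict[str, List[Path]] = {center: [] for center in cluster_centers}
    for path in paths:
        for center in cluster_centers:
            if center in path:
                # Assumption: a path goes to the first center it contains.
                categories[center].append(path)
                break
    return categories


# Hypothetical paths and cluster centers.
categories = classify_paths(
    [
        ("household registration", "Beijing", "apply"),
        ("passport", "apply"),
        ("account", "cancel"),
    ],
    ["apply", "cancel"],
)
```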

[0178] The intent feature extraction model training device 10 provided in this application embodiment can execute the above method embodiment. Its specific implementation principle and technical effects can be found in the above method embodiment, and will not be repeated here.

[0179] Figure 9 shows a schematic diagram of the structure of an intent recognition device according to an embodiment of this application. As shown in Figure 9, the intent recognition device 20 of this embodiment is used to implement the operation corresponding to the server in any of the above method embodiments. The intent recognition device 20 of this embodiment includes:

[0180] Feature extraction module 21 is used to input the statement to be identified into the trained intent feature extraction model of the embodiment shown in Figure 2 to obtain a first intent vector of the statement to be identified. The statement to be identified is a statement that includes user intent information. The intent feature extraction model is used to extract features from the statement to be identified to obtain the first intent vector. The first intent vector is used to determine the intent of the statement to be identified.

[0181] Intent recognition module 22 is used to project the first intent vector into the intent vector space of the embodiment shown in Figure 2, and to calculate the matching degree between the first intent vector and each second intent vector in the intent vector space. The intent vector space includes multiple second intent vectors, each corresponding to an intent category, and each intent category corresponds to an intent keyword. The intent keyword of the intent category corresponding to the second intent vector with the highest matching degree is used as the recognition result, which is used to indicate the intent of the statement to be recognized.

[0182] In one example, feature extraction module 21 is specifically used for:

[0183] Extract at least one keyword and the relationships between keywords from the statement to be identified.

[0184] Based on the relationships between keywords, the keywords in the sentence to be identified are encoded sequentially to obtain the encoding vector.

[0185] The encoded vector is input into the trained intent feature extraction model to obtain the first intent vector of the sentence to be recognized.
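One possible reading of the sequential encoding step is sketched below. The encoding scheme is not specified here, so the whole approach is an assumption: keywords are taken in the order implied by their relationships, mapped to integer ids through a hypothetical vocabulary, and padded to a fixed length. The names `encode_keywords`, `pad_id`, and `unk_id` are illustrative.

```python
from typing import Dict, List


def encode_keywords(
    ordered_keywords: List[str],
    vocabulary: Dict[str, int],
    max_len: int = 8,
    pad_id: int = 0,
    unk_id: int = 1,
) -> List[int]:
    """Map ordered keywords to ids, truncating or padding to max_len."""
    ids = [vocabulary.get(keyword, unk_id) for keyword in ordered_keywords][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Hypothetical vocabulary; ids 0 and 1 are reserved for padding and unknown keywords.
vocabulary = {"household registration": 2, "Beijing": 3, "apply": 4}
encoding = encode_keywords(
    ["household registration", "Beijing", "apply"], vocabulary, max_len=5
)
```

The resulting fixed-length vector is what would then be passed to the trained intent feature extraction model.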

[0186] In one example, the intent recognition module 22 is specifically used for:

[0187] Calculate the Manhattan distance between the first intent vector and every second intent vector in the intent vector space.

[0188] The matching degree between the first intent vector and each second intent vector is determined based on the Manhattan distance.

[0189] In one example, the device further includes:

[0190] Based on the recognition results and the keywords of the statement to be recognized, key search information is generated.

[0191] The key search information is used to search a preset database to obtain search results. The preset database includes intelligent response content for each intent.

[0192] The intelligent response content corresponding to the statement to be identified is determined based on the search results. The intelligent response content is used to indicate the result of the user's inquiry.

[0193] The intent recognition device 20 provided in this application embodiment can execute the above method embodiment. Its specific implementation principle and technical effect can be found in the above method embodiment, and will not be repeated here.

[0194] Figure 10 shows a schematic diagram of the hardware structure of a server provided in an embodiment of this application. As shown in Figure 10, the server 30 is used to implement the operations corresponding to the server in any of the above method embodiments. The server 30 in this embodiment may include: a memory 31, a processor 32, and a communication interface 34.

[0195] The memory 31 is used to store computer programs. The memory 31 may include high-speed random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device, or a USB flash drive, external hard drive, read-only memory, disk or optical disc, etc.

[0196] The processor 32 is used to execute computer programs stored in the memory 31 to implement the intent recognition method in the above embodiments. For details, please refer to the relevant descriptions in the foregoing method embodiments. The processor 32 can be a Central Processing Unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc. A general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this application can be directly embodied as being executed by a hardware processor, or by a combination of hardware and software modules within the processor.

[0197] Alternatively, the memory 31 can be either standalone or integrated with the processor 32.

[0198] When the memory 31 is a device independent of the processor 32, the server 30 may also include a bus 33. This bus 33 is used to connect the memory 31 and the processor 32. The bus 33 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.

[0199] The communication interface 34 can be connected to the processor 32 via the bus 33. The communication interface 34 is used to acquire the statement to be recognized and to output the predicted recognition result.

[0200] The server provided in this embodiment can be used to execute the above-described intent recognition method. Its implementation and technical effects are similar, and will not be described again here.

[0201] This application also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, is used to implement the methods provided in the various embodiments described above.

[0202] The computer-readable storage medium can be a computer storage medium or a communication medium. A communication medium includes any medium that facilitates the transfer of a computer program from one location to another. A computer storage medium can be any available medium accessible to a general-purpose or special-purpose computer. For example, a computer-readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the computer-readable storage medium. Of course, the computer-readable storage medium can also be a component of the processor. The processor and the computer-readable storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the ASIC can reside in a user equipment. Of course, the processor and the computer-readable storage medium can also exist as discrete components in a communication device.

[0203] Specifically, the computer-readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random-Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The storage medium can be any available medium accessible to general-purpose or special-purpose computers.

[0204] This application also provides a computer program product comprising a computer program stored in a computer-readable storage medium. At least one processor of the device can read the computer program from the computer-readable storage medium, and the at least one processor executes the computer program to cause the device to implement the methods provided in the various embodiments described above.

[0205] This application also provides a chip including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device with the chip installed performs the methods described in the various possible implementations above.

[0206] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or modules may be electrical, mechanical, or other forms.

[0207] The modules can be physically separate, for example, installed in different locations within a single device, installed on different devices, distributed across multiple network units, or distributed across multiple processors. Alternatively, the modules can be integrated, for example, installed in the same device, or integrated into a single codebase. The modules can exist in hardware form, software form, or a combination of both. This application can select some or all of the modules to achieve the objectives of this embodiment based on actual needs.

[0208] When the various modules are implemented as integrated software functional modules, they can be stored in a computer-readable storage medium. The aforementioned software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.

[0209] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0210] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features. These modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.

Claims

1. A method for training an intent feature extraction model, characterized in that, The method includes: An initial corpus set is obtained, and an intent knowledge graph is constructed based on the initial corpus set. Each node in the intent knowledge graph is a keyword of each initial corpus in the initial corpus set, each edge in the intent knowledge graph is used to indicate the association between two adjacent nodes, and each path composed of nodes and edges is used to indicate the intent information of each initial corpus. Based on each path in the intent knowledge graph and its corresponding intent information, each path is classified to obtain multiple intent categories and at least one path corresponding to each intent category. Based on the multiple intent categories and at least one path corresponding to each intent category, a training sample set is generated, and the intent feature extraction model to be trained is iteratively trained using the training sample set to obtain the trained intent feature extraction model. Multiple initial corpora from the initial corpus set are input into the trained intent feature extraction model to obtain first intent vectors corresponding to the multiple initial corpora, and an intent vector space composed of the first intent vectors corresponding to the multiple initial corpora.

2. The method according to claim 1, characterized in that, The step of generating a training sample set based on the multiple intent categories and at least one path corresponding to each intent category, and then using the training sample set to iteratively train the intent feature extraction model to be trained, to obtain the trained intent feature extraction model, specifically includes: Obtain multiple paths in the intent knowledge graph and the intent category corresponding to each path; The training sample set includes a positive sample set and a negative sample set. Paths with the same intent category among the multiple paths are combined in pairs to obtain the positive sample set. Paths with different intent categories among the multiple paths are combined in pairs to obtain the negative sample set. The training sample set is input into the intent feature extraction model to be trained to obtain the trained intent feature extraction model.

3. The method according to claim 1, characterized in that, The step of inputting multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain first intent vectors corresponding to the multiple initial corpora, and the intent vector space composed of the first intent vectors corresponding to the multiple initial corpora, specifically includes: The intent feature extraction model is used to extract features from the initial corpus to obtain a first intent vector corresponding to each initial corpus. Calculate the mean of all first intent vectors in each intent category, and use the mean as the second intent vector corresponding to the intent category; The intent vector space is formed using the second intent vectors of each of the intent categories.

4. The method according to any one of claims 1-3, characterized in that, The step of obtaining an initial corpus set and constructing an intent knowledge graph based on the initial corpus set specifically includes: Obtain the initial corpus set; Extract at least one keyword and the association between the at least one keyword from each initial corpus in the initial corpus set; An intent knowledge graph is constructed based on the keywords of each initial corpus in the initial corpus set and the relationships between the keywords.

5. The method according to any one of claims 1-3, characterized in that, The step of classifying each path in the intent knowledge graph and its corresponding intent information to obtain multiple intent categories and at least one path corresponding to each intent category specifically includes: The intent knowledge graph is clustered to obtain multiple cluster centers. Each cluster center corresponds to a node in the intent knowledge graph, and each cluster center corresponds to an intent category. The keyword indicated by the node corresponding to the cluster center is the intent keyword of the intent category. The path including the cluster center is classified into an intent category.

6. An intent recognition method, characterized in that, The method includes: The sentence to be identified is input into the trained intent feature extraction model as described in any one of claims 1-5 to obtain a first intent vector corresponding to the sentence to be identified. The sentence to be identified is a sentence that includes user intent information. The intent feature extraction model is used to extract features from the sentence to be identified to obtain the first intent vector. The first intent vector is used to determine the intent of the sentence to be identified. Project the first intent vector corresponding to the statement to be identified onto the intent vector space as described in any one of claims 1-5, and calculate the matching degree between the first intent vector corresponding to the statement to be identified and each second intent vector in the intent vector space. The intent vector space includes a plurality of second intent vectors, each second intent vector corresponds to an intent category, and each intent category corresponds to an intent keyword. The intent keyword of the intent category corresponding to the second intent vector with the highest matching degree is used as the recognition result, and the recognition result is used to indicate the intent of the statement to be recognized.

7. The method according to claim 6, characterized in that, The step of inputting the statement to be identified into the trained intent feature extraction model as described in any one of claims 1-5 to obtain the first intent vector of the statement to be identified specifically includes: Extract at least one keyword from the statement to be identified and the association between the keywords; Based on the correlation between the keywords, the keywords in the sentence to be identified are encoded sequentially to obtain an encoding vector; The encoded vector is input into the trained intent feature extraction model to obtain the first intent vector of the statement to be recognized.

8. The method according to claim 6, characterized in that, The step of projecting the first intent vector onto the intent vector space as described in any one of claims 1-5, and calculating the matching degree between the first intent vector and each second intent vector in the intent vector space, specifically includes: Calculate the Manhattan distance between the first intent vector and every second intent vector in the intent vector space; The matching degree between the first intent vector and each of the second intent vectors is determined based on the Manhattan distance.

9. The method according to any one of claims 6-8, characterized in that, The method further includes: Based on the recognition results and the keywords of the statement to be recognized, key retrieval information is generated; The search results are obtained by using the key search information to search a preset database, which includes intelligent response content for each intent. Based on the search results, the intelligent response content corresponding to the statement to be identified is determined, and the intelligent response content is used to indicate the result of the user's inquiry.

10. A training device for an intent feature extraction model, characterized in that the device comprises: a knowledge graph construction module, used to obtain an initial corpus set and construct an intent knowledge graph based on the initial corpus set, wherein each node in the intent knowledge graph is a keyword of an initial corpus in the initial corpus set, each edge in the intent knowledge graph indicates the association between two adjacent nodes, and each path composed of nodes and edges indicates the intent information of an initial corpus; an intent classification module, used to classify the paths in the intent knowledge graph according to the intent information corresponding to each path, so as to obtain multiple intent categories and at least one path corresponding to each intent category; a model training module, used to generate a training sample set based on the multiple intent categories and the at least one path corresponding to each intent category, and to use the training sample set to iteratively train the intent feature extraction model to be trained, so as to obtain the trained intent feature extraction model; and a space construction module, used to input multiple initial corpora from the initial corpus set into the trained intent feature extraction model to obtain first intent vectors corresponding to the multiple initial corpora, and an intent vector space composed of those first intent vectors.
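
The sample-generation task of the model training module can be sketched as follows, on the assumption that the model is trained on labeled pairs in the usual Siamese fashion: two paths from the same intent category form a positive pair (label 1), and two paths from different categories form a negative pair (label 0). The claim itself does not specify the pairing scheme; `build_pairs` and the sample categories are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Path = Tuple[str, ...]  # a path is an ordered sequence of keyword nodes


def build_pairs(categories: Dict[str, List[Path]]) -> List[Tuple[Path, Path, int]]:
    """Pair every two paths; label 1 if they share an intent category, else 0."""
    labelled = [(path, cat) for cat, paths in categories.items() for path in paths]
    return [
        (p1, p2, int(c1 == c2))
        for (p1, c1), (p2, c2) in combinations(labelled, 2)
    ]


# Hypothetical output of the intent classification module.
CATEGORIES: Dict[str, List[Path]] = {
    "repayment": [("repay", "loan"), ("repay", "early")],
    "loan_limit": [("raise", "limit")],
}
pairs = build_pairs(CATEGORIES)
```

Because every pair of paths is enumerated, the number of samples grows quadratically with the number of paths; a larger graph would likely need sampling or balancing of positive and negative pairs.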

11. An intent recognition device, characterized in that the device comprises: a feature extraction module, used to input a statement to be recognized into the trained intent feature extraction model as described in any one of claims 1-5 to obtain a first intent vector corresponding to the statement to be recognized, wherein the statement to be recognized is a statement including user intent information, the intent feature extraction model is used to extract features from the statement to be recognized to obtain the first intent vector, and the first intent vector is used to determine the intent of the statement to be recognized; and an intent recognition module, used to project the first intent vector corresponding to the statement to be recognized onto the intent vector space as described in any one of claims 1-5, and calculate the matching degree between that first intent vector and each second intent vector in the intent vector space, wherein the intent vector space includes a plurality of second intent vectors, each second intent vector corresponds to an intent category, and each intent category corresponds to an intent keyword; and to take the intent keyword of the intent category corresponding to the second intent vector with the highest matching degree as the recognition result, wherein the recognition result indicates the intent of the statement to be recognized.

12. A server, characterized in that the server comprises a memory and a processor; the memory is used to store a computer program; and the processor is used to implement, according to the computer program stored in the memory, the intent feature extraction model training method as described in any one of claims 1-5, or the intent recognition method as described in any one of claims 6-9.

13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the intent feature extraction model training method as described in any one of claims 1-5, or the intent recognition method as described in any one of claims 6-9.

14. A computer program product, characterized in that the computer program product comprises a computer program that, when executed by a processor, implements the intent feature extraction model training method as described in any one of claims 1-5, or the intent recognition method as described in any one of claims 6-9.