Method and related device for discovering new financial intent based on dynamic granular ball knowledge base

By employing a dynamic particle knowledge base method, the accuracy and stability issues of new intent recognition in intelligent customer service systems in the financial sector were addressed. This method enables effective recognition of unknown intents and continuous updating of known intents, thereby enhancing the system's dynamic adaptability.

CN122242525A · Pending · Publication Date: 2026-06-19 · SOUTHWESTERN UNIV OF FINANCE & ECONOMICS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: SOUTHWESTERN UNIV OF FINANCE & ECONOMICS
Filing Date: 2026-05-22
Publication Date: 2026-06-19


Abstract

This invention discloses a method and related apparatus for discovering new financial intents based on a dynamic granular ball knowledge base, belonging to the field of natural language processing. The method includes: extracting semantic representations of samples, constructing a known intent granular ball knowledge base, and classifying input samples as known or unknown intents; performing granular ball clustering on the unknown intent samples to generate candidate new intent granular balls; calculating the internal shapeness, external separation, and structural support of each candidate granular ball, and calculating a temporal stability index based on multi-round data; and confirming a new intent category when all of the above indicators meet their threshold conditions. The invention jointly constrains the confirmation of new intents along two dimensions, structural quality and evolutionary stability. This avoids misjudgments caused by single-round data fluctuations and ensures the structural reliability of candidate intents, thereby achieving more accurate and stable new intent discovery in an open and dynamic environment.

Description

Technical Field

[0001] This invention relates to the field of natural language processing, and in particular to a method and related apparatus for discovering new financial intents based on a dynamic granular ball knowledge base.

Background Technology

[0002] With the widespread application of intelligent dialogue systems in scenarios such as intelligent customer service, online consultation, and human-computer interaction, the challenges of intent recognition and novel intent discovery in open environments have gradually become one of the key technologies for improving the intelligence level of systems. In practical applications, during the training phase, systems can typically only acquire a limited number of known intent category samples. However, during the testing or operation phase, users continuously input diverse query requests, which include a large number of unseen novel intents. As the system operates over a long period, these unknown intents accumulate and gradually evolve into new categories with stable semantic structures. Therefore, the system not only needs to have the ability to detect unknown intents but also needs to discover potential novel intents from unknown samples and gradually incorporate them into the existing intent system, realizing the continuous dynamic evolution of the intent space from known to unknown.

[0003] Existing methods for discovering new intents typically follow a basic paradigm of "known intent modeling—unknown intent identification—new intent discovery," namely: first, semantic representation learning is performed using known intent samples to construct a category discrimination model; then, a decision boundary or confidence discrimination mechanism is built to distinguish between known and unknown intents; finally, cluster analysis is performed on the identified unknown samples to uncover potential new intent categories. However, in practical applications, existing methods still have several shortcomings, limiting their effectiveness in complex and dynamic scenarios.

[0004] In the open intent recognition stage, decision boundary-based discrimination methods are one of the important technical approaches for distinguishing between known and unknown intents. These methods construct a discrimination boundary for known intents and combine the distance from samples to the center of each category with the boundary relationship for discrimination. Specifically, for example, the discrimination region can be determined by learning boundary-related parameters through a model, or the discrimination boundary for the corresponding category can be constructed using the statistical distance from samples to the category center (e.g., the average distance) as the radius based on the distribution characteristics of samples within the category. However, these methods often fail to fully utilize the structural information within the known intent category during boundary construction, making it difficult to accurately characterize the true distribution range of the category, thus affecting the accuracy of unknown intent recognition.

[0005] In the new intent discovery phase, existing methods typically use traditional clustering algorithms (such as K-means clustering) to divide unknown samples directly after identification, and pre-determine the number of clusters. However, in real-world scenarios, the number of new intents is usually unknown and dynamically changing, making it difficult for fixed cluster settings to accurately reflect the true distribution structure of the data. Furthermore, after completing clustering, existing methods often directly treat the clustering results as new intent categories, lacking an effective mechanism for evaluating the structural quality of the clustering results. This can easily introduce unrepresentative or unstable categories, thus affecting the performance of subsequent models.

[0006] In typical application scenarios such as intelligent customer service and user intent parsing in the financial sector, the problem of discovering new intents in open environments is equally prominent. When users interact with financial systems, their expressions are often highly professional, semantically diverse, and have complex intents. With the acceleration of financial product innovation, dynamic adjustments to market rules, and continuous changes in regulatory policies, users' inquiry intents exhibit a faster evolution speed and more complex semantic structures. However, existing general new intent discovery methods face unique challenges when dealing with the financial sector: on the one hand, the known intent categories in the financial sector are often incompletely covered, and a large number of new intents emerge in a short period of time after the launch of new financial products or services; on the other hand, the semantic boundaries in financial dialogues are more blurred, and different intents may have high similarity, making it difficult for traditional static clustering-based methods to accurately distinguish them. Therefore, applying new intent discovery technology in open environments to financial scenarios not only requires solving the intent recognition problem in general domains but also overcoming the technical challenges unique to the financial sector, such as terminology sparsity and rapid intent evolution.

[0007] Furthermore, in continuously operating intelligent dialogue systems, user intents exhibit dynamic evolution, with new intents constantly emerging and gradually evolving into stable known categories, leading to continuous changes in the overall data distribution. Therefore, the system not only needs the ability to discover new intents but also to continuously update and optimize the existing intent structure. However, existing methods generally lack a unified dynamic update mechanism. On the one hand, it is difficult to incrementally optimize known intents using new samples in a timely manner; on the other hand, unknown samples are often processed only once, lacking the ability for continuous accumulation and evolutionary modeling. This makes it difficult for the intent knowledge base to maintain stability and scalability during long-term operation.

Summary of the Invention

[0008] The purpose of this invention is to overcome the problems of the prior art and provide a method and related apparatus for discovering new financial intentions based on a dynamic particle knowledge base.

[0009] The objective of this invention is achieved through the following technical solution: a method for discovering new financial intents based on a dynamic granular ball knowledge base, comprising the following steps:
Extracting the semantic representation of each sample;
Constructing a known intent granular ball knowledge base based on the semantic representations of known intent samples;
Classifying the input samples, based on the known intent granular ball knowledge base and their semantic representations, as known intent samples or unknown intent samples;
Performing granular ball clustering on the unknown intent samples to generate one or more candidate new intent granular balls;
Calculating the internal shapeness, external separation, and structural support of each candidate new intent granular ball, where the internal shapeness characterizes the aggregation degree and distribution stability of the samples inside the ball, the external separation measures how distinct the candidate ball is from other balls, and the structural support measures the sample size of the candidate ball;
During multi-round data input, recording the centroid of each candidate new intent granular ball in each round, calculating a single-round stability index from the ratio of the centroid offset distance between adjacent rounds to the ball radius in the current round, and combining the single-round stability indices of multiple consecutive rounds to obtain the temporal stability index of the candidate new intent granular ball;
Confirming the candidate new intent granular ball as a new intent category when its internal shapeness, external separation, and structural support meet the corresponding preset threshold conditions and its temporal stability index meets the preset stability threshold condition.

[0010] In one example, constructing the known intent granular ball knowledge base based on the semantic representations of known intent samples includes:
Calculating the centroid, mean radius, and maximum radius of the granular ball from the semantic representations of the samples in each known intent category;
Calculating the granular ball density from the number of samples and the mean radius of each known intent category, and normalizing the density across all known intent categories;
Adaptively adjusting, based on the normalized density, between the mean radius and the maximum radius to obtain the decision boundary of each known intent category;
Associating and storing the centroid, decision boundary, and category label of each known intent category to form the known intent granular ball knowledge base.

[0011] In one example, the internal shape is determined based on the mean radius and standard deviation of the radius of the internal samples of the candidate novel intention sphere; the external separation is determined based on the average distance between the candidate novel intention sphere and the centroids of the nearest preset number of candidate spheres; and the structural support is determined based on the comparison of the number of candidate novel intention sphere samples among all candidate novel intention spheres.

[0012] In one example, before the candidate new intent granular ball is confirmed as a new intent category, the process further includes: calculating the maximum semantic similarity between the candidate new intent granular ball and all known intent granular balls; and confirming the candidate as a new intent category only when this maximum semantic similarity satisfies the preset semantic similarity threshold condition.
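As a non-limiting illustration, the semantic exclusion check described above could be sketched as follows. The source does not name the similarity measure or the direction of the threshold; this sketch assumes cosine similarity between centroids and assumes that a candidate passes when its highest similarity to any known centroid stays below a hypothetical threshold `max_similarity`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def passes_semantic_exclusion(candidate_centroid, known_centroids, max_similarity=0.8):
    """Return True when the candidate is not too similar to any known intent.

    max_similarity is a hypothetical threshold, not a value from the source.
    """
    highest = max(cosine_similarity(candidate_centroid, c) for c in known_centroids)
    return highest < max_similarity

# A candidate pointing away from every known centroid passes the check.
distinct = passes_semantic_exclusion([1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
# A candidate almost parallel to a known centroid is treated as a variant.
variant = passes_semantic_exclusion([1.0, 0.1], [[1.0, 0.0]])
```

Comparing against the maximum rather than the average similarity means a single close known intent is enough to reject the candidate, which matches the stated goal of not promoting local variants of existing intents.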

[0013] In one example, after the candidate new intent granular ball is confirmed as a new intent category, the process further includes: calculating the centroid and decision boundary of the confirmed granular ball; semantically summarizing the samples within the granular ball to generate a semantic label for the new intent category; and adding the granular ball, carrying its centroid, decision boundary, and semantic label, to the known intent granular ball knowledge base.

[0014] In one example, when a new sample is classified into a known intent category, the centroid and decision boundary of the particle sphere corresponding to the known intent category are updated based on the new sample.
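As a non-limiting illustration, the centroid part of this incremental update could be sketched with a running-mean rule. The source does not give the update formula; the running mean below is an assumption, and the function name `update_centroid` is hypothetical. How the decision boundary is refreshed is likewise not specified and is left out of the sketch.

```python
def update_centroid(centroid, count, new_vector):
    """Fold one new sample into a centroid without revisiting old samples.

    centroid: current centroid of the known intent granular ball
    count: number of samples already represented by that centroid
    Returns the updated centroid and the updated sample count.
    """
    new_count = count + 1
    new_centroid = [c + (x - c) / new_count for c, x in zip(centroid, new_vector)]
    return new_centroid, new_count

# A ball built from 3 samples at the origin absorbs a new sample at (4, 8).
updated_centroid, updated_count = update_centroid([0.0, 0.0], 3, [4.0, 8.0])
```

The running mean gives the same centroid as recomputing the average over all samples, so the knowledge base does not need to retain earlier samples for this step.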

[0015] In one example, when the internal shape, external separation, and structural support of the candidate novel intention sphere do not meet the corresponding preset threshold conditions, and / or the temporal stability index of the candidate novel intention sphere does not meet the preset stability threshold condition, the method further includes: In subsequent rounds of data input, the centroid, radius, internal shape, external separation, structural support, and temporal stability index of the candidate new intention particles are continuously updated. When the internal shape, external separation, and structural support of the candidate new intention particles meet the corresponding preset threshold conditions, and the temporal stability index of the candidate new intention particles meets the preset stability threshold conditions, the candidate new intention particles are confirmed as new intention categories.

[0016] It should be further noted that the technical features corresponding to the above examples can be combined or replaced to form new technical solutions.

[0017] The present invention also includes a computer program product comprising a computer program that, when executed by a processor, implements the steps of the financial novel intent discovery method based on a dynamic particle knowledge base formed by any or a combination of the above examples.

[0018] The present invention also includes a storage medium storing computer instructions that, when executed, perform the steps of the financial new intent discovery method based on a dynamic particle knowledge base formed by any or more of the above examples.

[0019] The present invention also includes a terminal comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor executes the steps of the financial novel intent discovery method based on dynamic particle knowledge base formed by any or more of the above examples when executing the computer instructions.

[0020] Compared with the prior art, the beneficial effects of the present invention are: 1. This invention, from the perspective of sphere structure, comprehensively characterizes the internal compactness, external distinguishability, and scale representativeness of each candidate new intent sphere by calculating its internal shape, external separation, and structural support. This allows for the selection of candidate new intents with good structural characteristics, improving the reliability of new intent modeling. Furthermore, a temporal evolution stability modeling mechanism is introduced. By characterizing the structural changes of candidate intent spheres in multiple consecutive rounds of data, the temporal stability index of candidate new intent spheres is calculated. This evaluates the evolutionary stability of candidate intents in the time dimension, effectively improving the stability of new intent recognition and avoiding the reduction in accuracy of new intent discrimination due to data fluctuations in single-round clustering results.

[0021] The method of this invention combines the above-mentioned candidate new intention modeling based on particle structure characteristics and candidate new intention verification based on temporal evolution stability. It can jointly constrain the confirmation of new intentions from two dimensions: structural quality and evolution stability. This avoids misjudgment caused by single-round data fluctuations and ensures the structural reliability of candidate intentions, thereby achieving more accurate and stable new intention discovery in an open and dynamic environment.

[0022] 2. This invention introduces a semantic exclusion mechanism from the perspective of semantic space. By calculating the semantic similarity between candidate intent particles and all known intents, global semantic constraints are imposed on candidate new intents to avoid misidentifying local variants or boundary extensions of existing intents as new intent categories, thereby enhancing the semantic distinguishability between new intents and known intents.

[0023] 3. When a new sample is classified into a known intent category, the centroid and decision boundary of the corresponding known intent granular ball are updated based on the new sample. Through this incremental update mechanism, the granular ball representation of known intents can adapt to changes in data distribution in a timely manner without rebuilding the entire knowledge base, keeping the centroid and decision boundary current and thereby maintaining classification accuracy in long-term operation.

Attached Figure Description

[0024] The specific embodiments of the present invention will be further described in detail below with reference to the accompanying drawings. The accompanying drawings are provided to provide a further understanding of the present application and constitute a part of the present application. The same reference numerals are used in these drawings to denote the same or similar parts. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of the present application.

[0025] Figure 1 is a flowchart of a method provided in an embodiment of the present invention; Figure 2 is a flowchart of the method provided in a preferred embodiment of the present invention.

Detailed Implementation

[0026] The technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0027] Furthermore, the technical features involved in the different embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.

[0028] To address the problem that existing open intent recognition methods fail to fully utilize the distribution information within categories during decision boundary construction, particularly lacking effective characterization of sample density, local aggregation states, and category range differences, leading to inaccurate construction of known intent decision boundaries, this invention proposes a novel financial intent discovery method based on a dynamic granular knowledge base. This method can be applied to scenarios such as intelligent customer service systems and semantic intent parsing in the financial field. The data samples processed are user-input financial consultation texts, which can be natural language queries containing financial entity terminology and intent expressions, such as inquiries about transaction rules or account operation guidance. This invention's method extracts the semantic representation of financial texts, constructs a granular knowledge base of known financial intents, and utilizes a temporal stability mechanism to accurately discover newly emerging intent categories in the financial field, thereby improving the dynamic adaptability of intelligent financial interaction systems. Furthermore, this method is also applicable to open environment intent discovery in non-financial fields.

[0029] Of course, the method of this invention can also be applied to other human-computer interaction scenarios. In e-commerce intelligent customer service scenarios, the processed data samples are user product inquiry texts, such as inquiries about how to extend the warranty period; in government service platform scenarios, the processed data samples are public policy Q&A texts, such as inquiries about the process of registering for medical insurance in other places and the application process for electronic certificates; in online medical consultation scenarios, the processed data samples are patient symptom description texts, such as inquiries about the process of online follow-up visits and prescriptions and inquiries about the need to interpret examination reports. The core of each of the above scenarios is to perform granular sphere modeling on known intent samples, use the centroid, radius, and related statistical characteristics of the spheres to represent the category distribution, and adjust the decision boundary according to the sphere structure attributes, thereby achieving a more accurate characterization of the true distribution range of different categories and effectively improving the accuracy and stability of unknown intent recognition in open environments.

[0030] In the embodiments of this invention, in an intelligent dialogue system in an open environment, the text data input by the user is characterized by continuous change and dynamic expansion. During the training phase, only a limited number of known intent category samples can typically be obtained, while new intents that have never appeared before are continuously received during actual operation. Therefore, the new intent discovery task of this invention is designed as follows: Under the premise of training only using known intent samples, it can realize the recognition of known intents and the detection of unknown intents on input samples, and further mine potential new intent categories from unknown samples, while supporting the continuous updating and dynamic evolution of intent knowledge.

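[0031] Specifically, let the training dataset be $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ denotes the input text and $y_i \in \mathcal{Y}_{known} = \{1, 2, \dots, K\}$ denotes a known intent category label, $K$ is the number of known categories, and $\mathcal{Y}_{known}$ is the set of known intent categories. For input samples during the testing or online phase, the label space is expanded to $\mathcal{Y} = \mathcal{Y}_{known} \cup \mathcal{Y}_{unknown}$, where $\mathcal{Y}_{unknown}$ represents the set of unknown intent categories that did not appear during the training phase.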

[0032] For any input sample, the objective of this invention is first to determine whether the sample belongs to a known intent category or an unknown intent category. Samples determined to be of unknown intent are then further aggregated and analyzed using structured modeling methods to discover potential new intent categories, which are gradually incorporated into the known intent system, thereby achieving dynamic expansion of the intent space.

[0033] In one embodiment, as shown in Figure 1, a method for discovering new financial intents based on a dynamic granular ball knowledge base includes the following steps:

S1: Extract the semantic representation of the sample.

[0034] In the open intent recognition task, to obtain a semantic representation with good discriminative power, this step employs a representation learning method based on a pre-trained language model to encode the input financial text and extract its semantic representation. Specifically, the input text sample $x$ is first tokenized and converted into the input format required by the model. Then, the processed sequence is input into a pre-trained language model for encoding, yielding the corresponding semantic representation vector: $z = f_{\theta}(x) \in \mathbb{R}^{d}$, where $f_{\theta}$ represents the pre-trained language model, $z$ is the semantic representation vector of the sample, and $d$ is the feature dimension.

[0035] Optionally, the pre-trained language model can be a mainstream language representation model, such as BERT, RoBERTa, ALBERT, or ELECTRA, and can be fine-tuned according to specific task requirements to improve the model's ability to express semantics in specific domains.

[0036] Furthermore, to enhance the discriminative power of semantic representation, a supervisory signal can be introduced into the model output layer for optimization. For example, a classification loss function based on category labels can be used to train the model, making samples of the same intent category more compact in the feature space and improving the discriminability between different categories.

[0037] S2: Construct a known intent particle knowledge base based on the semantic representation of known intent samples.

[0038] In step S2, after obtaining the semantic representation of all samples, a particle sphere is constructed for each known intent category, consisting of a centroid, decision boundary, and category label. The set of particle spheres for all known intent categories forms a known intent particle sphere knowledge base, which is used for subsequent intent classification and new intent discovery.

[0039] S3: Classify the semantic representation of the input samples based on the known intent particle knowledge base, and divide the input samples into known intent samples or unknown intent samples.

[0040] In step S3, the distance between the semantic representation of the input sample and the centroid of each particle in the known intent particle knowledge base is calculated to determine the nearest particle and whether the distance is less than the decision boundary of the nearest particle. If it is less than the decision boundary, the sample is determined to belong to the known intent category; otherwise, it is determined to be an unknown intent sample. The effective separation of known intent and unknown intent is achieved through classification.

[0041] S4: Perform particle clustering analysis on unknown intent samples to generate one or more candidate new intent particles.

[0042] In step S4, the semantic representations of all samples judged as having unknown intents are used as input, and an unsupervised granular clustering method is used for structured partitioning. That is, the clustering process automatically forms several granular structures based on the distribution of samples in the semantic space. Each granular structure corresponds to a group of semantically relatively clustered unknown samples, representing a potential candidate new intent, and can adaptively represent the intent structure inside the unknown sample.
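As a non-limiting illustration, one way to form granular balls without fixing the number of clusters in advance is to keep splitting any ball that is still too loose. The source does not specify the splitting rule or the stopping criterion; the sketch below assumes a split into two around the two most distant seed points and assumes hypothetical parameters `max_mean_radius` and `min_size`.

```python
import math

def centroid_of(vectors):
    count = len(vectors)
    return [sum(v[j] for v in vectors) / count for j in range(len(vectors[0]))]

def mean_radius(vectors):
    center = centroid_of(vectors)
    return sum(math.dist(v, center) for v in vectors) / len(vectors)

def split_in_two(vectors):
    """Assign each vector to the nearer of two far-apart seed points."""
    center = centroid_of(vectors)
    seed_a = max(vectors, key=lambda v: math.dist(v, center))
    seed_b = max(vectors, key=lambda v: math.dist(v, seed_a))
    left, right = [], []
    for v in vectors:
        (left if math.dist(v, seed_a) <= math.dist(v, seed_b) else right).append(v)
    return left, right

def granular_ball_clustering(vectors, max_mean_radius=1.0, min_size=2):
    """Split balls until each is compact enough or too small to split further."""
    min_size = max(1, min_size)
    balls = []
    pending = [list(vectors)] if vectors else []
    while pending:
        ball = pending.pop()
        if len(ball) < 2 * min_size or mean_radius(ball) <= max_mean_radius:
            balls.append(ball)
            continue
        left, right = split_in_two(ball)
        if len(left) < min_size or len(right) < min_size:
            balls.append(ball)  # refuse splits that would create tiny balls
        else:
            pending.extend([left, right])
    return balls

# Two well-separated groups of unknown intent samples (toy 2-D vectors).
unknown_samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
balls = granular_ball_clustering(unknown_samples)
```

In this toy input the number of resulting balls follows from the data and the compactness threshold rather than from a preset cluster count.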

[0043] S5: Calculate the internal shape, external separation, and structural support of each candidate novel intention sphere.

[0044] In step S5, the structural quality of each candidate novel intent particle is evaluated. Internal shapeness characterizes the aggregation and distribution stability of samples within the particle, external separation measures the degree of distinction between candidate particles and other particles, and structural support measures the sample size of candidate particles. These three indicators comprehensively represent the structural characteristics of the candidate novel intent from different dimensions.
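As a non-limiting illustration, the three structural indicators could be computed as sketched below. The source states only which quantities each indicator depends on, not the formulas; the specific expressions here (a reciprocal of mean radius plus radius standard deviation, an average distance to the `k` nearest other centroids, and a share of all candidate samples) are assumptions chosen so that larger values indicate better structure.

```python
import math
import statistics

def internal_shape(vectors, centroid):
    """Higher when samples sit close to the centroid with little spread (assumed form)."""
    radii = [math.dist(v, centroid) for v in vectors]
    return 1.0 / (1.0 + statistics.fmean(radii) + statistics.pstdev(radii))

def external_separation(centroid, other_centroids, k=2):
    """Average distance to the k nearest other candidate centroids."""
    distances = sorted(math.dist(centroid, c) for c in other_centroids)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) if nearest else math.inf

def structural_support(size, all_sizes):
    """Share of all candidate samples that fall in this ball (assumed form)."""
    return size / sum(all_sizes)

# Toy candidate: four samples at unit distance around the origin.
shape = internal_shape([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]], [0.0, 0.0])
separation = external_separation([0.0, 0.0], [[3.0, 4.0], [6.0, 8.0], [30.0, 40.0]])
support = structural_support(30, [30, 50, 20])
```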

[0045] S6: During the multi-round data input process, record the centroid of each candidate new intention particle in each round. Calculate the single-round stability index based on the ratio of the centroid offset distance between adjacent rounds to the particle radius in the current round. Then, perform a comprehensive calculation on the single-round stability index of multiple consecutive rounds to obtain the temporal stability index of the candidate new intention particle.

[0046] In step S6, a time dimension is introduced to model the evolutionary stability of candidate new intentions. Specifically, after the system executing the method of this invention runs for multiple rounds, each round generates candidate new intention particles and records the centroid position of the particle. By calculating the ratio of the centroid offset distance between adjacent rounds to the particle radius of the current round, a single-round stability index is obtained. Then, this index is comprehensively calculated (e.g., averaged or weighted) over multiple consecutive rounds to finally obtain a temporal stability index, which represents the degree of positional stability of candidate new intentions in the semantic space. The smaller the value, the more stable the structure.
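As a non-limiting illustration, the temporal stability index could be computed as sketched below, using a plain average as the combination over consecutive rounds (the text also allows weighting). The small constant `eps` that guards against a zero radius is an assumption not stated in the source.

```python
import math

def single_round_stability(prev_centroid, curr_centroid, curr_radius, eps=1e-8):
    """Centroid shift between adjacent rounds, relative to the current radius."""
    return math.dist(prev_centroid, curr_centroid) / (curr_radius + eps)

def temporal_stability(centroids, radii):
    """Average the single-round indices over consecutive rounds.

    centroids[t] and radii[t] describe the same candidate ball in round t.
    Smaller results indicate a more stable candidate.
    """
    if len(centroids) < 2:
        raise ValueError("at least two rounds are needed")
    indices = [
        single_round_stability(centroids[t - 1], centroids[t], radii[t])
        for t in range(1, len(centroids))
    ]
    return sum(indices) / len(indices)

# The candidate shifts by 0.5 in round 1 and stays put in round 2.
centroids = [[0.0, 0.0], [0.3, 0.4], [0.3, 0.4]]
radii = [1.0, 1.0, 1.0]
stability = temporal_stability(centroids, radii)  # approximately 0.25
```

Dividing the shift by the radius makes the index scale-free: a large, diffuse ball is allowed to drift further than a small, tight one before it counts as unstable.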

[0047] S7: When the internal shapeness, external separation and structural support of the candidate new intention particle meet the corresponding preset threshold conditions, and the temporal stability index of the candidate new intention particle meets the preset stability threshold condition, the candidate new intention particle is confirmed as a new intention category.

[0048] In step S7, the new intent is finally confirmed based on multidimensional joint constraints. Only when the internal shape of the candidate new intent particle is high enough, the external separation is large enough, the structural support is large enough, and the temporal stability index is small enough, is it converted from the candidate state to the formal new intent category. This effectively avoids misjudgment caused by single-round data fluctuations or poor structural quality, and ensures the accuracy and reliability of the new intent discovery results.
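As a non-limiting illustration, the joint confirmation rule could be expressed as a single predicate over the four indicators. The class names and all threshold values below are hypothetical placeholders; the source states only that preset thresholds exist and the direction in which each indicator is compared.

```python
from dataclasses import dataclass

@dataclass
class CandidateMetrics:
    internal_shape: float
    external_separation: float
    structural_support: float
    temporal_stability: float

@dataclass
class Thresholds:
    # Hypothetical values for illustration only.
    min_internal_shape: float = 0.4
    min_external_separation: float = 2.0
    min_structural_support: float = 0.1
    max_temporal_stability: float = 0.2

def confirm_new_intent(metrics, thresholds):
    """Confirm only when structure is good AND the candidate is stable over time."""
    structure_ok = (
        metrics.internal_shape >= thresholds.min_internal_shape
        and metrics.external_separation >= thresholds.min_external_separation
        and metrics.structural_support >= thresholds.min_structural_support
    )
    stable = metrics.temporal_stability <= thresholds.max_temporal_stability
    return structure_ok and stable

stable_candidate = CandidateMetrics(0.5, 7.5, 0.3, 0.05)
drifting_candidate = CandidateMetrics(0.5, 7.5, 0.3, 0.6)
confirmed = confirm_new_intent(stable_candidate, Thresholds())
rejected = confirm_new_intent(drifting_candidate, Thresholds())
```

A candidate that fails the predicate is not discarded; per the method, it stays in the candidate state and is re-evaluated as later rounds of data arrive.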

[0049] The method of this invention integrates the semantic representation capabilities of large models with the granular structure modeling mechanism to achieve effective identification of unknown intentions and accurate discovery of new intentions in an open environment, and supports continuous updating and dynamic evolution of intention knowledge, thereby improving the accuracy, stability and scalability of intelligent dialogue systems in complex scenarios.

[0050] In one embodiment, constructing the known intent granular ball knowledge base based on the semantic representation of known intent samples includes:

S21: Calculate the centroid, mean radius, and maximum radius of the granular ball based on the semantic representation of the samples in each known intent category.

[0051] Specifically, let the training sample set corresponding to the $k$-th known intent category be $\mathcal{D}_k = \{z_{k,i}\}_{i=1}^{n_k}$, where $n_k$ is the number of samples in that category and $z_{k,i}$ is the semantic representation of the $i$-th training sample in the set. A granular ball $GB_k$ is defined using all samples in the training sample set to represent the knowledge of category $k$. The centroid of the granular ball is defined as: $c_k = \frac{1}{n_k} \sum_{i=1}^{n_k} z_{k,i}$. For each training sample, the Euclidean distance from the centroid is: $d_{k,i} = \lVert z_{k,i} - c_k \rVert_2$. Based on this, the mean radius of the granular ball is: $r_k^{mean} = \frac{1}{n_k} \sum_{i=1}^{n_k} d_{k,i}$, and the maximum radius is: $r_k^{max} = \max_{1 \le i \le n_k} d_{k,i}$, where $d_{k,i}$ is the Euclidean distance from the $i$-th training sample of the $k$-th known intent category to the centroid $c_k$ of that category's granular ball.
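As a non-limiting illustration, the centroid, mean radius, and maximum radius of one known intent category could be computed as follows. The function name `ball_statistics` and the toy two-dimensional vectors are hypothetical; real semantic representations would have the encoder's feature dimension.

```python
import math

def ball_statistics(vectors):
    """Return (centroid, mean_radius, max_radius) for one intent category."""
    count = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[j] for v in vectors) / count for j in range(dim)]
    # Euclidean distance of every sample to the centroid.
    distances = [math.dist(v, centroid) for v in vectors]
    return centroid, sum(distances) / count, max(distances)

# Four samples on the corners of a square of side 2.
samples = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
centroid, mean_radius, max_radius = ball_statistics(samples)
```

The mean radius describes where samples typically lie, while the maximum radius is set by the single most distant sample, which is why the method later places the decision boundary between the two.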

[0052] S22: Calculate the sphere density based on the number of samples and mean radius of the known intent categories, and normalize the sphere density for all known intent categories.

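[0053] Specifically, to characterize the degree of aggregation of samples within a granular ball, the granular ball density $\rho_k$ is defined as: $\rho_k = \frac{n_k}{r_k^{mean} + \epsilon}$, where $n_k$ is the number of samples in the $k$-th known intent category, $r_k^{mean}$ is the mean radius of its granular ball, and $\epsilon$ is an extremely small constant that prevents a zero denominator. Further, to eliminate the influence of scale differences between categories, the density is normalized: $\tilde{\rho}_k = \frac{\rho_k - \rho_{min}}{\rho_{max} - \rho_{min}}$, where $\tilde{\rho}_k$ represents the normalized density, and $\rho_{min}$ and $\rho_{max}$ represent the minimum and maximum density over all known categories, respectively.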
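As a non-limiting illustration, the density and its normalization could be sketched as below. The text states only that the density depends on the sample count and the mean radius with a small constant guarding the denominator; the ratio form, the value of `eps`, and the handling of the degenerate case where all densities are equal are assumptions.

```python
def ball_density(num_samples, mean_radius, eps=1e-8):
    """Samples per unit of mean radius (assumed form of the density)."""
    return num_samples / (mean_radius + eps)

def normalize_densities(densities):
    """Min-max normalize densities across all known intent categories."""
    lowest, highest = min(densities), max(densities)
    if highest == lowest:
        # Assumption: with no density contrast, treat every category as 0.
        return [0.0 for _ in densities]
    return [(d - lowest) / (highest - lowest) for d in densities]

# Three hypothetical categories: (sample count, mean radius).
densities = [ball_density(100, 0.5), ball_density(40, 0.8), ball_density(10, 1.0)]
normalized = normalize_densities(densities)
```

After normalization the densest category maps to 1 and the sparsest to 0, so the value can be used directly as a mixing weight in the next step.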

[0054] S23: Based on the normalized particle density, adaptive adjustment is performed between the mean radius and the maximum radius to obtain the decision boundary for each known intent category.

[0055] Specifically, the decision boundary $\Delta_k$ of the $k$-th category is set, according to the normalized density $\tilde\rho_k$, to a value between the mean radius $\bar r_k$ and the maximum radius $r_k^{\max}$ of that category's sphere. This yields the decision boundary for each category. This invention thus proposes a decision boundary construction method based on sphere density adjustment: by characterizing the distribution density of the samples within a sphere, it adaptively adjusts the boundary between the mean radius and the maximum radius to construct a discriminative boundary that reflects the actual distribution of each category.
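One form consistent with step S23 is a linear interpolation between the two radii, weighted by the normalized density, so that a denser sphere receives a wider boundary. The linear form and the function name `decision_boundary` are assumptions made for illustration; the description only requires that the boundary lie between the mean radius and the maximum radius.

```python
def decision_boundary(mean_radius, max_radius, norm_density):
    """Interpolate between mean and maximum radius (assumed linear form).

    norm_density is expected in [0, 1]: 0 gives the mean radius,
    1 gives the maximum radius.
    """
    return mean_radius + norm_density * (max_radius - mean_radius)

for rho in (0.0, 0.5, 1.0):
    print(rho, decision_boundary(1.0, 2.0, rho))
```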

[0056] S24: Store the centroid, decision boundary, and category label of each known intent category as the known intent particle sphere knowledge base.

[0057] In one embodiment, classifying the input samples into known intent samples or unknown intent samples in step S2 (open intent classification) includes: during testing, for the semantic representation vector $z$ of a sample to be classified, first compute its distance to the centroid $c_k$ of each known intent category and select the category with the smallest distance: $k^* = \arg\min_k \lVert z - c_k \rVert_2$, where $k^*$ is the index of the known intent category nearest to the sample to be classified. If the condition $\lVert z - c_{k^*} \rVert_2 \le \Delta_{k^*}$ is met, the sample corresponding to $z$ is determined to belong to the $k^*$-th known intent category; otherwise, the sample is classified as having an unknown intent. Here $c_{k^*}$ is the centroid of the particle sphere of the known intent category nearest to the sample, and $\Delta_{k^*}$ is the decision boundary radius of that sphere.
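The open classification rule can be sketched in Python as follows. The dictionary layout of the knowledge base, the category names, and the use of `None` to mark an unknown intent are illustrative assumptions.

```python
import math

def classify_open(z, knowledge_base):
    """Return the nearest known intent label, or None for an unknown intent."""
    # Nearest known intent sphere by Euclidean distance to its centroid.
    label, entry = min(
        knowledge_base.items(),
        key=lambda kv: math.dist(z, kv[1]["centroid"]),
    )
    distance = math.dist(z, entry["centroid"])
    # Accept only if the sample falls inside that sphere's decision boundary.
    return label if distance <= entry["boundary"] else None

# Hypothetical knowledge base with two known intents.
kb = {
    "balance_query": {"centroid": (0.0, 0.0), "boundary": 1.0},
    "card_loss": {"centroid": (5.0, 5.0), "boundary": 1.5},
}
print(classify_open((0.3, 0.4), kb))
print(classify_open((3.0, 3.0), kb))
```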

[0058] This invention adjusts the decision boundary by introducing particle density, so that particles of higher density can obtain a wider discrimination range, while particles of lower density adopt a narrower boundary, thereby effectively improving the rationality and stability of the discrimination boundary between different categories, and thus improving the accuracy of unknown intent recognition in open environments.

[0059] In one embodiment, particle clustering analysis is performed on unknown intent samples to generate one or more candidate new intent particles, including: In the open intent recognition phase, samples belonging to unknown intents are identified from the input samples. These unknown samples are then subjected to further cluster analysis to uncover potential intent structures, thereby achieving the initial discovery of new intents.

[0060] Specifically, let the set of semantic representations of the samples determined to have unknown intent in the previous stage be $U = \{u_1, u_2, \dots, u_{N_u}\}$, where $N_u$ is the number of unknown samples. An unsupervised particle sphere clustering method is applied to $U$ to partition it structurally, organizing the unknown samples in the semantic space into several sphere structures with local clustering characteristics and thereby revealing the potential intent distribution within the unknown samples. Unlike traditional clustering methods that require the number of categories to be specified in advance (such as K-means), particle sphere clustering forms sphere structures automatically from the actual distribution of the data through an adaptive splitting and merging mechanism, so the number of new intents need not be predefined. The method can therefore depict the actual structural distribution of the unknown samples more naturally and avoids the clustering bias caused by an ill-chosen number of categories. It should be noted that the specific implementation of the particle sphere clustering process, as an unsupervised structural modeling method, does not constitute the innovative content of this invention; any clustering method that can achieve the above structural partitioning of unknown samples is applicable to this invention. This process finally yields a set of candidate new intent particle spheres $\mathcal{G} = \{G_1, G_2, \dots, G_J\}$, where $G_j$ denotes the $j$-th candidate new intent sphere and $J$ is the number of candidate spheres. Each sphere represents a group of unknown samples that are relatively clustered in the semantic space and can be regarded as the initial representation of one candidate new intent.
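Because any clustering method that does not fix the number of clusters in advance is stated to be applicable, the following Python sketch uses a simple distance-threshold (leader) clustering as a stand-in. It is not particle sphere clustering itself; the function name `leader_clustering` and the `radius` parameter are assumptions chosen for illustration.

```python
import math

def leader_clustering(samples, radius):
    """Group samples without a preset cluster count (stand-in method).

    A sample joins the nearest existing cluster if it lies within `radius`
    of that cluster's centroid; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [...]}
    for x in samples:
        best, best_d = None, None
        for c in clusters:
            d = math.dist(x, c["centroid"])
            if best_d is None or d < best_d:
                best, best_d = c, d
        if best is not None and best_d <= radius:
            best["members"].append(x)
            n = len(best["members"])
            best["centroid"] = [
                sum(m[j] for m in best["members"]) / n for j in range(len(x))
            ]
        else:
            clusters.append({"centroid": list(x), "members": [x]})
    return clusters

unknown = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.2)]
clusters = leader_clustering(unknown, radius=1.0)
print([len(c["members"]) for c in clusters])
```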

[0061] In one embodiment, after the candidate particle sphere set is obtained, in order to evaluate and screen the candidate new intents effectively, this invention models them from the perspective of sphere structure. Specifically, the structural properties of a candidate new intent are summarized in three aspects: internal shapeness, external separation, and structural support, thereby characterizing the candidate new intent comprehensively.

[0062] Let the unknown intent samples contained in the $j$-th candidate new intent sphere be $G_j = \{u_i^j\}_{i=1}^{n_j}$, where $n_j$ is the number of samples in the sphere, $u_i^j \in \mathbb{R}^d$ is the semantic representation vector of the $i$-th sample, $d$ is the feature dimension, and $\mathbb{R}$ is the real number field. The centroid of the sphere is defined as $\mu_j = \frac{1}{n_j}\sum_{i=1}^{n_j} u_i^j$. (1) Internal shapeness. Internal shapeness comprehensively characterizes the degree of aggregation and the distribution stability of the samples within a sphere. First, the mean radius of the sphere is defined as $\bar R_j = \frac{1}{n_j}\sum_{i=1}^{n_j} \lVert u_i^j - \mu_j \rVert_2$. Further, the standard deviation of the distances inside the sphere is defined as $\sigma_j = \sqrt{\frac{1}{n_j}\sum_{i=1}^{n_j} \left(\lVert u_i^j - \mu_j \rVert_2 - \bar R_j\right)^2}$. On this basis, the internal shapeness $I_j$ of the sphere is defined from $\bar R_j$ and $\sigma_j$, with a very small constant $\epsilon$ in the denominator to prevent division by zero.

[0063] The internal shapeness index of this invention takes into account both the overall size of the sphere (represented by the mean radius $\bar R_j$) and its internal consistency (represented by the distance standard deviation $\sigma_j$). The more compact and uniform the sample distribution inside the sphere, the larger the internal shapeness $I_j$, indicating that the candidate structure is closer to a stable new intent category.
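A hedged Python sketch of the internal shapeness computation follows. It assumes the reciprocal form $1 / (\bar R_j + \sigma_j + \epsilon)$, which grows as the sphere becomes smaller and more uniform; this exact form and the function name `internal_shapeness` are assumptions, since the description only states which quantities the index depends on and in which direction it moves.

```python
import math

def internal_shapeness(members, eps=1e-8):
    """Assumed form: 1 / (mean radius + std of distances + eps)."""
    n = len(members)
    dim = len(members[0])
    centroid = [sum(u[j] for u in members) / n for j in range(dim)]
    dists = [math.dist(u, centroid) for u in members]
    mean_radius = sum(dists) / n
    # Population standard deviation of the sample-to-centroid distances.
    std_radius = math.sqrt(sum((d - mean_radius) ** 2 for d in dists) / n)
    return 1.0 / (mean_radius + std_radius + eps)

tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
loose = [(0.0, 0.0), (3.0, 0.0), (0.0, 0.5), (4.0, 4.0)]
print(internal_shapeness(tight), internal_shapeness(loose))
```

Under this assumed form, the compact group scores higher than the scattered one, matching the stated behavior of the index.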

[0064] (2) External separation. External separation measures the degree to which a candidate new intent sphere is distinguished from other spheres. Let $N_K(j)$ denote the index set of the $K$ spheres nearest to sphere $G_j$ in the feature space, where $K$ is a preset number of neighboring spheres. For any sphere $G_l$ with $l \in N_K(j)$, let its centroid be $\mu_l$. The external separation $E_j$ of sphere $G_j$ is then defined from the distances $d(\mu_j, \mu_l)$, $l \in N_K(j)$, where $d(\mu_j, \mu_l)$ denotes the Euclidean distance between the centroids of spheres $G_j$ and $G_l$. A larger $E_j$ indicates that the sphere is well separated from its neighboring structures and is more likely to correspond to an independent new intent category; conversely, a smaller $E_j$ indicates that the sphere overlaps with or strongly resembles the surrounding structures.
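The external separation can be sketched in Python as follows, assuming it is the average centroid distance to the $K$ nearest neighboring spheres. The averaging, the function name `external_separation`, and the return value for a sphere with no neighbors are assumptions made for illustration.

```python
import math

def external_separation(index, centroids, k):
    """Assumed form: mean distance from one centroid to its k nearest neighbors."""
    target = centroids[index]
    dists = sorted(
        math.dist(target, c) for i, c in enumerate(centroids) if i != index
    )
    nearest = dists[:k]
    if not nearest:
        # Assumption: a sphere with no neighbors is treated as fully separated.
        return math.inf
    return sum(nearest) / len(nearest)

centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(external_separation(0, centroids, k=2))
print(external_separation(3, centroids, k=2))
```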

[0065] (3) Structural support. Structural support measures the number of samples contained in a candidate sphere and thus reflects its representativeness as an emerging intent. Let the minimum and maximum sample counts among all candidate spheres be $n_{\min} = \min_j n_j$ and $n_{\max} = \max_j n_j$, respectively. The normalized structural support of sphere $G_j$ is then defined as $S_j = \frac{n_j - n_{\min}}{n_{\max} - n_{\min}}$, where $n_j$ is the number of samples in $G_j$. This metric characterizes the scale of a sphere among the unknown samples. A larger $S_j$ indicates that the sphere contains more samples and that its semantic pattern occurs with higher frequency and stability in the unknown data, making it more likely to represent a real new intent category.
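A short Python sketch of the normalized structural support follows. The function name `structural_support` and the handling of the case in which all candidate spheres have the same size are assumptions that the description does not specify.

```python
def structural_support(sizes):
    """Min-max normalize candidate sphere sizes to the range [0, 1]."""
    lo, hi = min(sizes), max(sizes)
    if hi == lo:
        # Assumption: equally sized spheres all receive a support of 0.0.
        return [0.0] * len(sizes)
    return [(n - lo) / (hi - lo) for n in sizes]

# Hypothetical sample counts of three candidate spheres.
print(structural_support([5, 20, 35]))
```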

[0066] In real-world open environments, candidate new intentions are often not formed all at once, but rather evolve gradually into a stable structure over multiple rounds of data input. Judging based solely on the structural characteristics of a single round of data is susceptible to the influence of random sample disturbances, leading to misjudgments or instability of new intentions. Therefore, in one embodiment, this invention further introduces a temporal evolution stability modeling mechanism for candidate new intentions, characterizing the structural consistency of candidate particles from a temporal perspective.

[0067] Specifically, let the candidate sphere obtained in the $t$-th round of data processing be $G_j^{(t)}$, with centroid $\mu_j^{(t)}$ and mean radius $\bar R_j^{(t)}$, and let the corresponding sphere in the previous round be $G_j^{(t-1)}$, with centroid $\mu_j^{(t-1)}$. The single-round temporal stability index of the candidate sphere in the $t$-th round is defined as $s_j^{(t)} = \frac{\lVert \mu_j^{(t)} - \mu_j^{(t-1)} \rVert_2}{\bar R_j^{(t)} + \epsilon}$, where $\epsilon$ is a very small constant that prevents a zero denominator. This single-round index measures the degree of structural change of the candidate sphere during continuous data input. To reduce the impact of single-round data fluctuations on the judgment, this invention further averages the stability results of the most recent $W$ consecutive rounds to obtain the multi-round temporal stability index $T_j = \frac{1}{W}\sum_{\tau = t-W+1}^{t} s_j^{(\tau)}$, where $s_j^{(\tau)}$ is the single-round stability index of round $\tau$ and $W$ is the length of the sliding window, i.e., the number of consecutive rounds used in the calculation. A small $T_j$ indicates that the centroid of the candidate sphere has changed little over the recent rounds of data and is stable in the semantic space, making it more likely to correspond to a real new intent category; conversely, a large $T_j$ indicates that the sphere structure is still fluctuating and is not suitable for direct confirmation as a new intent.
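The two stability indices can be sketched in Python as follows. The function names, the per-round history lists, and the error raised when fewer than two rounds are available are illustrative assumptions.

```python
import math

def single_round_stability(prev_centroid, curr_centroid, curr_mean_radius, eps=1e-8):
    """Centroid shift between adjacent rounds, relative to the current mean radius."""
    return math.dist(prev_centroid, curr_centroid) / (curr_mean_radius + eps)

def windowed_stability(centroids, mean_radii, window, eps=1e-8):
    """Average the single-round index over the most recent `window` rounds."""
    scores = [
        single_round_stability(centroids[t - 1], centroids[t], mean_radii[t], eps)
        for t in range(1, len(centroids))
    ]
    recent = scores[-window:]
    if not recent:
        raise ValueError("at least two rounds are needed")
    return sum(recent) / len(recent)

# Hypothetical per-round history of one candidate sphere.
centroids = [(0.0, 0.0), (1.0, 0.0), (1.1, 0.0), (1.1, 0.1)]
mean_radii = [1.0, 1.0, 1.0, 1.0]
print(windowed_stability(centroids, mean_radii, window=2))
print(windowed_stability(centroids, mean_radii, window=3))
```

In this toy history the large shift in the first transition falls outside a window of two rounds, so the shorter window reports a more stable sphere than the longer one.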

[0068] Optionally, the stability of candidate particles can be further characterized by statistical analysis of the sample overlap ratio or member consistency in multiple consecutive rounds, thereby improving the reliability of new intent determination.

[0069] This invention, by introducing temporal evolution stability constraints, realizes the transformation from single-round structure determination to multi-round evolution verification, effectively reducing the risk of misjudgment caused by instantaneous data fluctuations.

[0070] Based on the above structural characteristics and temporal evolution stability modeling, in order to further improve the accuracy of new intent determination, in one embodiment, a semantic rejection mechanism for candidate new intents (a candidate new intent rejection mechanism based on global semantics) is introduced from the perspective of global semantic space to avoid misidentifying local variants or boundary extensions of existing intents as new intent categories.

[0071] In this embodiment, before confirming the candidate new intent particle as a new intent category, the method further includes: Calculate the maximum semantic similarity between the candidate new intent particle and all known intent particles; When the maximum semantic similarity meets the preset semantic similarity threshold, the candidate new intent particle is confirmed as a new intent category.

[0072] Specifically, let the centroid of a candidate sphere $G_j$ be $\mu_j$, and let the set of centroids of the known intent spheres be $\{c_k\}_{k=1}^{C}$, where $C$ is the number of known intents. The maximum semantic similarity between the candidate sphere and all known intents is defined as $\mathrm{Sim}_j^{\max} = \max_{1 \le k \le C} \mathrm{sim}(\mu_j, c_k)$, where $\mathrm{sim}(\cdot, \cdot)$ is a similarity metric between vectors, which can be cosine similarity or another semantic similarity function. This metric characterizes the semantic closeness between the candidate sphere and the existing intent categories. A large $\mathrm{Sim}_j^{\max}$ (the maximum semantic similarity does not meet the preset semantic similarity threshold) indicates that the candidate sphere is highly similar to some known intent in the semantic space and is more likely to be a local variant or boundary extension of an existing intent than a new independent intent category; conversely, a small $\mathrm{Sim}_j^{\max}$ (the maximum semantic similarity meets the preset semantic similarity threshold) indicates that the candidate sphere is well distinguished from the existing intents in the semantic space and is more likely to correspond to a new intent category.
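Taking cosine similarity as the similarity function, which is one of the options named above, the maximum semantic similarity can be sketched in Python as follows. The function names and the toy centroids are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def max_semantic_similarity(candidate_centroid, known_centroids):
    """Largest similarity between a candidate centroid and any known centroid."""
    return max(cosine_similarity(candidate_centroid, c) for c in known_centroids)

known = [(1.0, 0.0), (0.0, 1.0)]
print(max_semantic_similarity((1.0, 1.0), known))
print(max_semantic_similarity((2.0, 0.0), known))
```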

[0073] In this embodiment, by setting a semantic similarity threshold, a global semantic rejection judgment is made on candidate new intentions. Only when the semantic similarity between a candidate particle and all known intentions is lower than the preset threshold is it considered to have the qualifications of a candidate new intention.

[0074] By jointly modeling the aforementioned structural characteristics, temporal evolution stability, and global semantic relationships, this invention can comprehensively characterize candidate novel intentions from multiple dimensions, including internal structural quality, external structural relationships, sample size, evolutionary stability, and semantic discriminability. Specifically, internal shape reflects the compactness and stability of the particle's interior; external separation reflects the degree of distinction between the candidate particle and its neighboring structures; structural support reflects the sample representativeness of the candidate particle; temporal stability characterizes the consistency of the candidate particle's structural evolution across multiple rounds of data; and semantic similarity measures the semantic proximity between the candidate particle and existing intentions. Based on these multidimensional structural characterization results, this invention further constructs a novel intention determination and confirmation strategy based on multidimensional constraints, achieving effective screening and confirmation of candidate particles.

[0075] Specifically, for a candidate sphere $G_j$, let its structural indicators be the internal shapeness $I_j$, the external separation $E_j$, and the structural support $S_j$; let its temporal stability index be $T_j$; and let its semantic similarity index be $\mathrm{Sim}_j^{\max}$. This invention uses a joint decision mechanism based on multiple constraints to confirm whether the candidate sphere is a new intent. To ensure the structural soundness and reliability of the candidate new intents while also accounting for their evolutionary stability and semantic distinguishability, a decision threshold is set for each indicator: $I_j \ge \tau_I$, $E_j \ge \tau_E$, $S_j \ge \tau_S$, $T_j \le \tau_T$, and $\mathrm{Sim}_j^{\max} < \tau_{\mathrm{sim}}$, where $\tau_I$ is the internal shapeness threshold, $\tau_E$ is the external separation threshold, $\tau_S$ is the structural support threshold, $\tau_T$ is the temporal stability threshold, and $\tau_{\mathrm{sim}}$ is the semantic similarity threshold.

[0076] When a candidate sphere satisfies all of the above conditions, it is determined to be a valid new intent category; otherwise, it is regarded as a candidate whose structure is unstable or whose semantics overlap with existing intents, and it is not confirmed as a new intent.
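The joint decision can be sketched in Python as follows. The class names, field names, and threshold values are hypothetical, and the direction of each comparison follows the description: the three structural indicators must be high, while the stability index and the maximum semantic similarity must be low.

```python
from dataclasses import dataclass

@dataclass
class Indicators:
    shapeness: float       # internal shapeness
    separation: float      # external separation
    support: float         # normalized structural support
    stability: float       # multi-round temporal stability index
    max_similarity: float  # maximum similarity to any known intent

@dataclass
class Thresholds:
    shapeness: float
    separation: float
    support: float
    stability: float
    max_similarity: float

def is_new_intent(ind, th):
    """Confirm a candidate only if every constraint holds at the same time."""
    return (
        ind.shapeness >= th.shapeness
        and ind.separation >= th.separation
        and ind.support >= th.support
        and ind.stability <= th.stability
        and ind.max_similarity < th.max_similarity
    )

# Hypothetical threshold values, chosen only for this example.
th = Thresholds(shapeness=1.0, separation=2.0, support=0.3, stability=0.2, max_similarity=0.8)
stable = Indicators(2.5, 4.0, 0.6, 0.05, 0.4)
drifting = Indicators(2.5, 4.0, 0.6, 0.5, 0.4)
variant = Indicators(2.5, 4.0, 0.6, 0.05, 0.93)
print(is_new_intent(stable, th), is_new_intent(drifting, th), is_new_intent(variant, th))
```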

[0077] Furthermore, for candidate particles identified as new intentions, this invention transforms their category attribute from unknown intention to a confirmed new intention category and incorporates them into a known intention particle knowledge base for unified management. For candidate particles that do not meet the determination criteria, this invention continues to retain their sample set and structural information as candidate new intentions, accumulating and evolving their structure in subsequent online phases. When they gradually meet the above determination criteria in subsequent rounds of data, they are then transformed into new intention categories.

[0078] Through the aforementioned multi-dimensional constraint determination mechanism, this invention achieves automatic confirmation and dynamic transformation from unknown intents to new intent categories, elevating new intent recognition from a single structural determination to a joint structural-temporal-semantic determination process, thereby effectively improving the accuracy and stability of new intent discovery and providing a reliable foundation for the dynamic updating of the subsequent intent knowledge base.

[0079] In one embodiment, after a candidate new intent sphere is confirmed as a new intent category, the method further includes incorporating and labeling the new intent sphere. A sphere confirmed as a new intent category during the determination phase is labeled so that the candidate intent is formally integrated into the known intent system.

[0080] In this embodiment, the inclusion and annotation process of new intent particles includes: calculating the centroid and decision boundary of candidate new intent particles that are confirmed as new intent categories, performing semantic summarization on samples within the candidate new intent particles, generating semantic labels for new intent categories, and adding particles carrying centroids, decision boundaries, and semantic labels to the known intent particle knowledge base.

[0081] Specifically, for a confirmed sphere $G_j$, its centroid $\mu_j$ and decision boundary $\Delta_j$ are calculated as in the above embodiments, a new label is assigned to it, and the sphere is added to the existing known intent particle sphere knowledge base.

[0082] Furthermore, to assign interpretable semantic labels to new intent categories, this invention uses a large model to perform semantic summarization and label generation on the samples within the sphere. Specifically, several representative samples (e.g., samples close to the centroid or high-frequency semantic expressions) are selected from the sphere, an input sequence is constructed from them and fed to a pre-trained language model (such as BERT or another large language model), and the model generates the intent name or semantic description corresponding to the sphere.
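The selection of representative samples and the construction of the model input can be sketched in Python as follows. Only the "samples close to the centroid" option is shown. The function names, the prompt wording, and the example utterances are hypothetical, and the call to the language model itself is deliberately omitted.

```python
import math

def representative_samples(texts, vectors, top_m=3):
    """Return the texts whose vectors lie closest to the sphere centroid."""
    n = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[j] for v in vectors) / n for j in range(dim)]
    order = sorted(range(n), key=lambda i: math.dist(vectors[i], centroid))
    return [texts[i] for i in order[:top_m]]

def build_label_prompt(examples):
    """Assemble a plain-text input sequence (hypothetical wording)."""
    lines = ["The following customer utterances share one intent:"]
    lines += [f"- {t}" for t in examples]
    lines.append("Give a short name for this intent.")
    return "\n".join(lines)

texts = ["How do I freeze my card?", "Freeze my card please",
         "Can I lock my debit card", "Is the branch open on Sunday"]
vectors = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.1), (3.0, 3.0)]
chosen = representative_samples(texts, vectors, top_m=2)
print(build_label_prompt(chosen))
```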

[0083] Through the above-described inclusion and annotation process, this invention achieves the transformation from candidate new intents to known intent categories, enabling new intents to participate in subsequent identification and classification processes in the form of standard categories, thereby continuously expanding and improving the intent knowledge base.

[0084] This invention also constructs a dynamic update and evolution mechanism for the intent knowledge base. In practical applications, human-computer interaction systems such as intelligent dialogue systems continuously receive new user input data. To enable the system to adapt to the dynamic environment, continuously accumulate unknown intents, and gradually identify and transform unknown intents into known intents, this invention constructs a dynamic update and evolution mechanism for the intent knowledge base, realizing the updating of the known intent knowledge base and the updating of the candidate new intent knowledge base.

[0085] This invention also constructs a known intent particle sphere incremental update mechanism. During continuous system operation, after a new round of test samples are input, some samples will be identified as belonging to a certain known intent category. For these samples identified as having known intents, this invention incrementally updates the particle sphere structure of the corresponding category, so that the centroid position and decision boundary of the known intent can promptly reflect the distribution changes of newly arriving data. That is: In one embodiment, when a new sample is classified as a known intent category, the centroid and decision boundary of the particle sphere corresponding to the known intent category are updated based on the new sample.

[0086] Specifically, let the particle sphere corresponding to the $k$-th known intent category be $B_k$, with original sample set $D_k = \{x_i^k\}_{i=1}^{n_k}$, where $n_k$ is the number of samples currently contained in the sphere, and let the centroid of the sphere be $c_k$. Suppose that in the current round of processing, $m_k$ new samples are determined to belong to the $k$-th known intent category, and denote their representation set by $D_k^{\mathrm{new}} = \{\hat x_l^k\}_{l=1}^{m_k}$, where $m_k$ is the total number of new samples assigned to the $k$-th category in the current round and $\hat x_l^k$ is the semantic representation vector of the $l$-th newly added sample of that category. The updated sample set of the sphere is $D_k' = D_k \cup D_k^{\mathrm{new}}$. Accordingly, the updated number of samples in the sphere is $n_k' = n_k + m_k$. The updated centroid can be expressed as the weighted mean of the original sphere samples and the newly added samples, that is, $c_k' = \frac{n_k c_k + \sum_{l=1}^{m_k} \hat x_l^k}{n_k + m_k}$. After the new centroid is obtained, the mean radius and maximum radius of the sphere are recalculated from the updated sample set $D_k'$. Let the updated mean radius be $\bar r_k'$ and the updated maximum radius be $r_k^{\max\prime}$; both are obtained from the statistics of the distances from the updated samples to the new centroid $c_k'$.
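The incremental update can be sketched in Python as follows. The function name `update_sphere` and the toy vectors are illustrative assumptions. The centroid is updated from the old centroid and the new samples alone, whereas the radii are recomputed over the merged sample set, as described above.

```python
import math

def update_sphere(centroid, samples, new_samples):
    """Merge new samples into a sphere and refresh its statistics."""
    n, m = len(samples), len(new_samples)
    dim = len(centroid)
    # Weighted mean of the old centroid (weight n) and the m new samples.
    new_centroid = [
        (n * centroid[j] + sum(x[j] for x in new_samples)) / (n + m)
        for j in range(dim)
    ]
    merged = list(samples) + list(new_samples)
    dists = [math.dist(x, new_centroid) for x in merged]
    return new_centroid, merged, sum(dists) / len(dists), max(dists)

old_samples = [(0.0, 0.0), (2.0, 0.0)]
old_centroid = [1.0, 0.0]
new_centroid, merged, mean_radius, max_radius = update_sphere(
    old_centroid, old_samples, [(4.0, 0.0), (2.0, 4.0)]
)
print(new_centroid, mean_radius, max_radius)
```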

[0087] On this basis, the density, normalized density, and decision boundary radius of the known intent sphere are recalculated according to the density-adjusted decision boundary construction formula of the above embodiments, yielding the updated boundary range.

[0088] Through the above-mentioned update mechanism, the present invention can incorporate newly added known samples in the current round into the corresponding spheres in a timely manner without rebuilding the entire known intent knowledge base, and simultaneously update their centroid positions and decision boundaries, so that the known intent structure can continuously adapt to changes in data distribution in online scenarios.

[0089] This invention also constructs a continuous evolution mechanism for candidate novel intentions. During the continuous operation of the system, in addition to known intentions, some input samples are still judged as unknown intentions. For these unknown samples, this invention constructs a continuous evolution mechanism for candidate novel intentions based on the existing candidate particle structure, so as to realize the dynamic accumulation and gradual formation of candidate novel intentions.

[0090] Specifically, let the set of samples judged as having unknown intent in the current round be $U^{(t)}$. For each sample $u \in U^{(t)}$, first compute the distance from $u$ to the centroid $\mu_j$ of each existing candidate sphere and determine the nearest sphere: $j^* = \arg\min_j \lVert u - \mu_j \rVert_2$, where $j^*$ is the index of the candidate new intent sphere nearest to the unknown intent sample $u$. If the acceptance condition of that candidate sphere is met (for example, $u$ falls within the decision boundary $\Delta_{j^*}$ of the $j^*$-th candidate new intent sphere), the sample is assigned to the $j^*$-th candidate new intent sphere: $G_{j^*} \leftarrow G_{j^*} \cup \{u\}$. Otherwise, the sample initializes a new candidate sphere: $G_{\mathrm{new}} = \{u\}$. After the sample assignment is updated, the structural properties of the affected candidate spheres are updated. Specifically, the sphere centroid, mean radius, and maximum radius are updated in the same way as in the incremental update of the known intent spheres in the above embodiments. Based on the update results, the internal shapeness $I_j$, external separation $E_j$, structural support $S_j$, temporal stability index $T_j$, and semantic similarity index $\mathrm{Sim}_j^{\max}$ are recalculated.
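The assignment of an unknown sample to a candidate sphere can be sketched in Python as follows. The dictionary layout, the function name `assign_unknown`, and the `initial_boundary` given to a freshly created sphere are assumptions; the description does not state how the boundary of a single-sample sphere is initialized, and recomputing the boundary after each assignment is omitted for brevity.

```python
import math

def assign_unknown(u, candidates, initial_boundary=0.5):
    """Assign u to the nearest candidate sphere or open a new one.

    Returns the index of the sphere that received the sample.
    """
    if candidates:
        j = min(
            range(len(candidates)),
            key=lambda i: math.dist(u, candidates[i]["centroid"]),
        )
        sphere = candidates[j]
        if math.dist(u, sphere["centroid"]) <= sphere["boundary"]:
            sphere["members"].append(u)
            n = len(sphere["members"])
            sphere["centroid"] = [
                sum(m[d] for m in sphere["members"]) / n for d in range(len(u))
            ]
            return j
    # No sphere accepted the sample: start a new candidate sphere from it.
    candidates.append(
        {"centroid": list(u), "boundary": initial_boundary, "members": [u]}
    )
    return len(candidates) - 1

candidates = []
first = assign_unknown((0.0, 0.0), candidates)
second = assign_unknown((0.3, 0.0), candidates)
third = assign_unknown((5.0, 5.0), candidates)
print(first, second, third, len(candidates))
```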

[0091] As unknown samples are continuously added, the sample size and structural characteristics of the particles corresponding to the candidate new intentions will gradually change. When a candidate particle's structural indicators meet the criteria for determining a new intention during its evolution, it is confirmed as a new intention category, completing the dynamic evolution process from unknown to known.

[0092] For candidate particles that have not yet met the judgment criteria, their candidate state is retained, and their structural features are continuously updated as subsequent data arrives, so as to achieve continuous tracking and modeling of potential new intentions.

[0093] To enhance the system's stability in dynamic environments, this invention utilizes the aforementioned evolutionary mechanism to transform the candidate intent formation process from a single-round judgment to a progressive confirmation process based on multi-round data accumulation, thereby effectively reducing the risk of misjudgment caused by occasional sample fluctuations. Simultaneously, through adaptive adjustment of the granular structure, the stability and scalability of the overall intent structure are maintained, enabling continuous evolution and efficient updating of the knowledge base.

[0094] Furthermore, to enable continuous operation of intent recognition and new intent discovery in an open environment, this invention constructs an online new intent discovery process. As shown in Figure 2, the process consists of an offline training phase and an online testing phase. The semantic representation model is constructed during the training phase and is used only for feature extraction in the online phase, without parameter updates.

[0095] Specifically, during the training phase, the pre-trained language model is first fine-tuned and its representation learned using training data to obtain a semantic representation model f. Subsequently, the training samples are mapped to the semantic space to obtain known intent representations, and a known intent particle knowledge base is constructed based on these representations to obtain the particle structure and statistical features corresponding to each known intent.

[0096] During the online testing phase, when a new round of test samples is input, the semantic representation vector (intent representation) is first extracted using the pre-trained representation model f. Then, based on the centroid and decision boundary of each particle in the known intent particle knowledge base, open classification based on the particle boundary is performed to divide the input samples into two categories: known intent and unknown intent.

[0097] For samples identified as having known intent, they are assigned to the corresponding known intent particle sphere, and the sample set of that particle sphere is updated. Simultaneously, the centroid of the particle sphere and its decision boundary are incrementally updated, thereby achieving dynamic optimization of the known intent structure.

[0098] For samples identified as having unknown intents, they are input into a candidate new intent particle library for processing. Specifically, based on the relationship between the unknown intent sample and existing candidate particles, they are assigned or newly created, thereby constructing and dynamically updating candidate particles. Subsequently, the structural features of the candidate particles are continuously accumulated and updated, and their internal shape, external separation, and structural support are structurally modeled. Based on this, temporal evolution stability analysis is further introduced to characterize the structural changes of candidate particles in consecutive rounds of data, evaluating their stability over time. Simultaneously, combined with a global semantic rejection mechanism, the semantic similarity between candidate particles and existing intents is analyzed to perform semantic-level screening of candidate particles. Based on the above multi-dimensional modeling, candidate particles are uniformly judged and confirmed according to a multi-dimensional joint judgment strategy. When a candidate particle simultaneously meets the judgment conditions of structural characteristics, temporal stability, and semantic similarity, it is transformed into a new intent category, and a corresponding semantic label is generated for it using a large model. Finally, the new intent particle and its semantic label are incorporated into the known intent particle knowledge base, realizing the dynamic expansion and updating of the intent category set. Candidate particles that do not meet the criteria are retained in the candidate new intent particle library and undergo continuous structural updates and evolution as subsequent data arrives until the new intent confirmation criteria are met.

[0099] By cyclically executing the above process, this invention achieves a closed-loop processing mechanism from input sample to intent recognition, candidate new intent generation, new intent confirmation, and dynamic updating of the knowledge base, thereby enabling continuous discovery of new intents and continuous improvement of the intent system in an open environment.

[0100] The present invention also provides a computer program product comprising a computer program that, when executed by a processor, implements the steps of the financial new intent discovery method based on a dynamic particle sphere knowledge base formed by any one or a combination of the above examples. The processor may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the present invention.

[0101] The present invention also provides a storage medium having the same inventive concept as the financial new intent discovery method based on dynamic particle knowledge base formed by any or more of the above examples, wherein computer instructions are stored thereon, and the computer instructions, when executed, perform the steps of the financial new intent discovery method based on dynamic particle knowledge base formed by any or more of the above examples.

[0102] Based on this understanding, the technical solution of this embodiment, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0103] This invention also provides a terminal that shares the same inventive concept as the method for discovering new financial intents based on a dynamic particle sphere knowledge base formed by any one or a combination of the above examples. The terminal includes a memory and a processor. The memory stores computer instructions executable on the processor. When the processor executes the computer instructions, it performs the steps of the aforementioned method. The processor may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement this invention.

[0104] In one example, the terminal, i.e., the electronic device, is represented in the form of a general-purpose computing device. The components of the electronic device may include, but are not limited to: at least one processing unit (processor) mentioned above, at least one storage unit mentioned above, and a bus connecting different system components (including storage units and processing units).

[0105] The storage unit stores program code that can be executed by the processing unit, causing the processing unit to perform the steps described in the "Exemplary Methods" section above, based on various exemplary embodiments of the present invention. For example, the processing unit can execute the aforementioned method for discovering new financial intentions based on a dynamic particle knowledge base.

[0106] The storage unit may include readable media in the form of volatile storage units, such as random access memory (RAM) and / or cache storage units, and may further include read-only memory (ROM).

[0107] The storage unit may also include a program / utility having a set (at least one) of program modules, including but not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.

[0108] The bus can represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor bus or local bus that uses any of a variety of bus architectures.

[0109] Through the above description, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solution according to this exemplary embodiment can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, terminal device, or network device, etc.) to execute the method of the exemplary embodiment of this application.

[0110] The above detailed embodiments describe the present invention, and the specific embodiments of the present invention should not be regarded as limited to these descriptions. Those skilled in the art can make simple deductions and substitutions without departing from the concept of the present invention, and all of these should be considered to fall within the protection scope of the present invention.

Claims

1. A method for discovering new financial intents based on a dynamic particle sphere knowledge base, characterized in that the method comprises the following steps:
extracting semantic representations of samples;
constructing a known intent particle sphere knowledge base based on the semantic representations of known intent samples;
classifying, based on the known intent particle sphere knowledge base, each input sample as a known intent sample or an unknown intent sample according to its semantic representation;
performing particle sphere clustering on the unknown intent samples to generate one or more candidate new intent particle spheres;
calculating the internal shapeness, external separation, and structural support of each candidate new intent particle sphere, wherein the internal shapeness represents the degree of aggregation and the distribution stability of the samples inside the particle sphere, the external separation measures the degree of distinction between the candidate particle sphere and other particle spheres, and the structural support measures the sample size of the candidate particle sphere;
recording, during multi-round data input, the centroid of each candidate new intent particle sphere in each round; calculating a single-round stability index based on the ratio of the centroid offset distance between adjacent rounds to the particle sphere radius in the current round; and combining the single-round stability indices of multiple consecutive rounds to obtain a temporal stability index of the candidate new intent particle sphere; and
confirming the candidate new intent particle sphere as a new intent category when its internal shapeness, external separation, and structural support meet the corresponding preset threshold conditions and its temporal stability index meets a preset stability threshold condition.
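The temporal stability step of claim 1 can be illustrated with a minimal Python sketch. The claim fixes only the inputs (centroid offset between adjacent rounds, radius in the current round) and that several consecutive single-round indices are combined. The specific form `1 - offset / radius` clipped to [0, 1], the averaging over a fixed window, and the names `single_round_stability`, `temporal_stability`, and `window` are assumptions made for illustration, not the claimed formula.

```python
import math


def single_round_stability(prev_centroid, curr_centroid, curr_radius):
    """Single-round stability index (assumed form).

    Uses the ratio named in claim 1 (centroid offset between adjacent
    rounds divided by the current-round radius); mapping it to
    1 - ratio, clipped to [0, 1], is an illustrative assumption.
    """
    if curr_radius <= 0:
        return 0.0
    offset = math.dist(prev_centroid, curr_centroid)
    return max(0.0, 1.0 - offset / curr_radius)


def temporal_stability(centroids, radii, window=3):
    """Combine the last `window` single-round indices by averaging (assumed).

    Returns None while fewer than `window` adjacent-round pairs exist,
    so a candidate cannot be confirmed from a single round of data.
    """
    if len(centroids) != len(radii):
        raise ValueError("one radius is required per recorded centroid")
    scores = [
        single_round_stability(centroids[t - 1], centroids[t], radii[t])
        for t in range(1, len(centroids))
    ]
    if len(scores) < window:
        return None
    recent = scores[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    history = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1), (0.1, 0.1)]
    print(temporal_stability(history, [1.0, 1.0, 1.0, 1.0]))
```

Returning `None` until enough rounds have accumulated is one way to express the claim's intent that confirmation should not rest on a single round of data.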

2. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that constructing the known intent particle sphere knowledge base based on the semantic representations of known intent samples comprises: calculating the centroid, mean radius, and maximum radius of the particle sphere based on the semantic representations of the samples in each known intent category; calculating a particle sphere density based on the number of samples and the mean radius of each known intent category, and normalizing the particle sphere densities of all known intent categories; adaptively adjusting, based on the normalized particle sphere density, between the mean radius and the maximum radius to obtain the decision boundary of each known intent category; and associating and storing the centroid, decision boundary, and category label of each known intent category to form the known intent particle sphere knowledge base.
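A minimal Python sketch of the construction in claim 2 follows. The claim states which quantities are used but not the formulas. The sketch therefore assumes density = sample count / mean radius, min-max normalization across categories, and a linear interpolation in which denser spheres receive a boundary closer to the mean radius. The rule that a sample is "known" when it falls inside some decision boundary is also an assumption, as are all function names.

```python
import math


def centroid_of(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def build_knowledge_base(samples_by_label):
    """Build {label: {"centroid", "boundary"}} from labelled embeddings."""
    stats = {}
    for label, vectors in samples_by_label.items():
        center = centroid_of(vectors)
        dists = [math.dist(center, v) for v in vectors]
        mean_r = sum(dists) / len(dists)
        max_r = max(dists)
        # Assumed density: sample count per unit of mean radius.
        density = len(vectors) / (mean_r + 1e-12)
        stats[label] = (center, mean_r, max_r, density)

    lo = min(s[3] for s in stats.values())
    hi = max(s[3] for s in stats.values())
    kb = {}
    for label, (center, mean_r, max_r, density) in stats.items():
        # Assumed min-max normalization; 0.5 when all densities are equal.
        rho = (density - lo) / (hi - lo) if hi > lo else 0.5
        # Assumed direction: denser spheres get a tighter boundary.
        boundary = mean_r + (1.0 - rho) * (max_r - mean_r)
        kb[label] = {"centroid": center, "boundary": boundary}
    return kb


def classify(kb, vector):
    """Return the best-fitting known label, or None for an unknown intent."""
    best_label, best_margin = None, None
    for label, ball in kb.items():
        margin = math.dist(ball["centroid"], vector) - ball["boundary"]
        if margin <= 0 and (best_margin is None or margin < best_margin):
            best_label, best_margin = label, margin
    return best_label
```

For example, with two well-separated categories, a point near one centroid is returned with that label, while a point between the two spheres yields `None` and would be routed to the unknown intent clustering step.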

3. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that the internal shapeness is determined based on the mean and the standard deviation of the radii of the samples inside the candidate new intent particle sphere; the external separation is determined based on the average distance between the centroid of the candidate new intent particle sphere and the centroids of a preset number of nearest candidate particle spheres; and the structural support is determined based on a comparison of the number of samples in the candidate new intent particle sphere with the numbers of samples in all candidate new intent particle spheres.
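The three structural indicators of claim 3 might be computed along the following lines. Only the inputs are taken from the claim. The closed forms are assumptions for illustration: shapeness as `1 / (1 + mean radius + radius standard deviation)`, separation as the plain mean of the `k` nearest centroid distances, and support as the sample count divided by the largest candidate's count. The function names and the parameter `k` are hypothetical.

```python
import math
import statistics


def internal_shapeness(centroid, vectors):
    """Higher when samples sit close to the centroid at similar distances.

    Assumed form: 1 / (1 + mean radius + population std of radii).
    """
    radii = [math.dist(centroid, v) for v in vectors]
    return 1.0 / (1.0 + statistics.fmean(radii) + statistics.pstdev(radii))


def external_separation(index, centroids, k=2):
    """Mean distance from candidate `index` to its k nearest other candidates."""
    dists = sorted(
        math.dist(centroids[index], c)
        for j, c in enumerate(centroids)
        if j != index
    )
    nearest = dists[:k]
    # A lone candidate has no neighbours; treat it as maximally separated.
    return sum(nearest) / len(nearest) if nearest else float("inf")


def structural_support(index, sizes):
    """Sample count relative to the largest candidate (assumed comparison)."""
    return sizes[index] / max(sizes)


if __name__ == "__main__":
    cents = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
    print(external_separation(0, cents, k=2))  # mean of distances 5 and 10
    print(structural_support(1, [10, 5, 20]))
```

Each indicator is oriented so that larger values are better, which keeps the later threshold comparison uniform.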

4. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that, before the candidate new intent particle sphere is confirmed as a new intent category, the method further comprises: calculating the maximum semantic similarity between the candidate new intent particle sphere and all known intent particle spheres; and confirming the candidate new intent particle sphere as a new intent category when the maximum semantic similarity meets a preset semantic similarity threshold condition.
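The check in claim 4 can be sketched as follows. The claim does not say how similarity is measured or in which direction the threshold applies. This sketch assumes cosine similarity between centroids and assumes the condition is met when the maximum similarity does not exceed the threshold, so that a candidate close to an existing intent is not registered as new. The names and the default `threshold=0.8` are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_similarity_to_known(candidate_centroid, known_centroids):
    """Largest similarity between the candidate and any known intent sphere."""
    return max(
        (cosine(candidate_centroid, c) for c in known_centroids),
        default=0.0,
    )


def passes_novelty_check(candidate_centroid, known_centroids, threshold=0.8):
    """Assumed direction: the candidate is novel if it is not too similar."""
    return max_similarity_to_known(candidate_centroid, known_centroids) <= threshold
```

With known centroids `(1, 0)` and `(0, 1)`, a candidate at `(1, 1)` has maximum similarity of about 0.707 and passes, whereas a candidate at `(2, 0)` points in the same direction as a known intent and is rejected.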

5. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that, after the candidate new intent particle sphere is confirmed as a new intent category, the method further comprises: calculating the centroid and decision boundary of the candidate new intent particle sphere confirmed as a new intent category; semantically summarizing the samples within the candidate new intent particle sphere to generate a semantic label for the new intent category; and adding the particle sphere carrying the centroid, decision boundary, and semantic label to the known intent particle sphere knowledge base.

6. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that, when a new sample is classified into a known intent category, the centroid and decision boundary of the particle sphere corresponding to that known intent category are updated based on the new sample.
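One possible way to realize the incremental update in claim 6 without re-reading all stored samples is a running-mean update, sketched below. The claim does not prescribe an update rule. The running mean radius here is an approximation (each distance is measured against the centroid at the time the sample arrives), and the fixed interpolation weight `alpha` stands in for the density-based adjustment of claim 2. The class `Ball` and its fields are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Ball:
    """Running statistics of one known intent particle sphere (illustrative)."""

    centroid: list
    count: int
    mean_radius: float
    max_radius: float
    alpha: float = 0.5  # assumed stand-in for the density-based weight

    @property
    def boundary(self):
        # Decision boundary interpolated between mean and maximum radius.
        return self.mean_radius + self.alpha * (self.max_radius - self.mean_radius)

    def update(self, vector):
        """Fold one newly classified sample into the sphere."""
        self.count += 1
        # Running mean: c_new = c + (x - c) / n.
        self.centroid = [
            c + (x - c) / self.count for c, x in zip(self.centroid, vector)
        ]
        # Approximation: distance is taken to the already-updated centroid.
        radius = math.dist(self.centroid, vector)
        self.mean_radius += (radius - self.mean_radius) / self.count
        self.max_radius = max(self.max_radius, radius)


if __name__ == "__main__":
    ball = Ball(centroid=[0.0, 0.0], count=1, mean_radius=0.0, max_radius=0.0)
    ball.update([2.0, 0.0])
    print(ball.centroid, ball.boundary)
```

An exact recomputation over all stored samples would avoid the approximation at the cost of keeping every embedding; which trade-off is appropriate depends on the deployment.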

7. The method for discovering new financial intents based on a dynamic particle sphere knowledge base according to claim 1, characterized in that, when the internal shapeness, external separation, and structural support of the candidate new intent particle sphere do not meet the corresponding preset threshold conditions, and / or the temporal stability index of the candidate new intent particle sphere does not meet the preset stability threshold condition, the method further comprises: continuously updating, in subsequent rounds of data input, the centroid, radius, internal shapeness, external separation, structural support, and temporal stability index of the candidate new intent particle sphere; and confirming the candidate new intent particle sphere as a new intent category when its internal shapeness, external separation, and structural support meet the corresponding preset threshold conditions and its temporal stability index meets the preset stability threshold condition.
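The deferred confirmation described in claim 7 amounts to re-evaluating a joint gate after every round. A minimal Python sketch follows, assuming that "meets the threshold condition" means "greater than or equal to" for all four indicators and that a candidate without a temporal stability index yet stays pending. The threshold values and the names `Thresholds`, `is_confirmed`, and `first_confirmed_round` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Illustrative preset thresholds; real values would be tuned."""

    shapeness: float = 0.5
    separation: float = 1.0
    support: float = 0.3
    stability: float = 0.8


def is_confirmed(metrics, th=Thresholds()):
    """Joint gate: all structural indicators and the stability index must pass."""
    stability = metrics.get("stability")
    if stability is None:  # not enough rounds observed yet
        return False
    return (
        metrics["shapeness"] >= th.shapeness
        and metrics["separation"] >= th.separation
        and metrics["support"] >= th.support
        and stability >= th.stability
    )


def first_confirmed_round(metrics_per_round, th=Thresholds()):
    """Return the 1-based round in which the candidate is first confirmed.

    Returns None if the candidate is still pending after all rounds.
    """
    for round_no, metrics in enumerate(metrics_per_round, start=1):
        if is_confirmed(metrics, th):
            return round_no
    return None
```

A candidate whose structural indicators are already adequate but whose stability rises from undefined to 0.7 to 0.9 over three rounds would, under these assumed thresholds, be confirmed in round 3.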

8. A computer program product comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method for discovering new financial intents based on a dynamic particle sphere knowledge base according to any one of claims 1-7.

9. A storage medium storing computer instructions thereon, characterized in that, when the computer instructions are executed, they perform the steps of the method for discovering new financial intents based on a dynamic particle sphere knowledge base according to any one of claims 1-7.

10. A terminal comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, characterized in that, when the processor executes the computer instructions, it performs the steps of the method for discovering new financial intents based on a dynamic particle sphere knowledge base according to any one of claims 1-7.