Personalized dialogue method and system based on personality description and conversation history
By building a personalized dialogue system and utilizing the Transformer architecture and latent variable generator to process user personality traits and dialogue history, the problems of insufficient information and high computational load in existing systems are solved, achieving efficient generation and improved accuracy of personalized responses.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- UNIV OF JINAN
- Filing Date
- 2024-07-15
- Publication Date
- 2026-06-26

Figure CN118916454B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of personalized dialogue technology, and in particular relates to a personalized dialogue method and system based on personality description and conversation history.
Background Technology
[0002] The statements in this section are merely background information related to the present invention and do not necessarily constitute prior art.
[0003] Personalized dialogue systems generate responses by considering the dialogue history and analyzing the user's individual characteristics. This allows the generated responses to be more closely aligned with the user's personality, significantly enhancing the user's dialogue experience. To achieve responses consistent with the user's personality, personalized dialogue models fully consider user role information during the design and training process. This information can be explicit, such as the user's age, gender, and occupation, or it can be implicit, requiring analysis and extraction from the user's dialogue content. In this way, personalized dialogue models can gain a more comprehensive understanding of the user, thereby generating responses that better match their personality. Furthermore, in practical applications, personalized dialogue systems have demonstrated enormous potential. They not only provide users with more considerate and personalized services but also continuously optimize their performance based on user feedback, further improving user satisfaction.
[0004] Based on past explorations and practices, the construction of personalized dialogue systems has mainly followed two paths.
[0005] The first approach is to directly model user personality from predefined role descriptions or user attributes. The core of this method lies in using pre-defined user personality characteristics to construct a personalized dialogue model.
[0006] However, the above methods face several challenges in practical applications. First, the amount of information carried by predefined role descriptions or user attributes is often limited and cannot fully reflect the user's personality traits. Second, this method involves user privacy issues because it requires the collection and use of users' personal information. These problems significantly limit the model's capabilities, resulting in generated responses that often lack subtle personalized differences.
[0007] Another approach is to extract features from the user's conversation history and implicitly model the user's personality. This method does not require pre-setting user personality traits; instead, it extracts user personality traits by analyzing the user's historical conversation data and builds a personalized conversation model accordingly.
[0008] This approach offers greater flexibility and is better able to adapt to the individual needs of different users. By deeply analyzing users' conversation history, the model can learn information such as users' language habits, interests, and emotional attitudes, thereby generating responses that are more tailored to the user's personality.
[0009] However, early work primarily focused on modeling explicit personas, using predefined user personality traits. While this approach achieved personalized dialogue to some extent, its limitations prevented the generation of more nuanced personalized responses. Subsequent research discovered a new method for modeling from the context of user dialogue history, a breakthrough that revolutionized the construction of personalized dialogue systems. This historical context contains rich personalized information, including not only the user's own responses but also subtle, unintentionally revealed characteristics. By deeply mining and analyzing this information, researchers can more accurately understand users' personality traits and generate more personalized responses that better meet their needs.
[0010] First, a user's historical responses can often reflect information about their language style, interests, and preferences. For example, some users may prefer humorous language, while others prefer concise expression. By analyzing these historical responses, personalized dialogue systems can capture users' language habits, thereby generating responses that are more closely aligned with their individual personalities.
[0011] Furthermore, researchers can capture conversational styles between the current user and other users from dialogue history response pairs. These stylistic differences are crucial for generating responses that match the user's personality. When faced with new input, personalized dialogue systems can also look back at historical data, examine how users have responded to similar questions before, and draw on similar interactions to generate responses. However, this approach also faces some challenges. Because the user's dialogue history context contains a large amount of historical dialogue data, this data is often too large to be fully loaded into the model, significantly increasing the computational burden.
Summary of the Invention
[0012] To overcome the shortcomings of the prior art, the present invention provides a personalized dialogue method based on personality description and conversation history. The answers generated by the technical solution of the present invention effectively utilize personality feature information combined with the dialogue environment, thereby inferring more diverse and coherent answers.
[0013] To achieve the above objectives, one or more embodiments of the present invention provide the following technical solutions:
[0014] Firstly, it discloses a personalized dialogue method based on personality descriptions and conversation history, including:
[0015] Construct a complex text dataset that includes user personality traits, dialogue context information, and responses;
[0016] Encode text in complex text datasets to convert user personality features, dialogue context information, and responses into a unified vector representation;
[0017] The encoded user personality feature vector is enhanced to divide it into different groups of information;
[0018] The enhanced information and the encoded dialogue context information vector are input into the personality recognition network of the latent variable generator to obtain the latent role variables sampled from the role distribution.
[0019] The dialogue context information vector and response vector are input into the response recognition network of the latent variable generator to obtain response latent variables from the latent response distribution.
[0020] The role latent variables and the response latent variables are passed through the feature classifier and then fed into prompt learning for deep processing. They are then passed to the decoder to generate the response corresponding to the dialogue.
[0021] As a further technical solution, when encoding text in complex text datasets, a pre-trained encoder is used. The encoder adopts a Transformer architecture and a self-attention mechanism to capture semantic and contextual information in the text and integrate data from different sources. The last hidden vector of the last layer of the encoder is used as the representation of the information.
[0022] As a further technical solution, the encoded user personality feature vector is enhanced, specifically including:
[0023] The personalized information in the encoded user personality feature vector is expanded into multiple personality feature groups. By adding different orthogonal vectors to each personalized information part, the personalized information is divided into different groups.
[0024] As a further technical solution, the latent variable generator also includes a personalized prior network and a response prior network;
[0025] When the enhanced information and the encoded dialogue context information vector are input into the personality recognition network of the latent variable generator, the dialogue context information vector will be used as the input to the personality prior network.
[0026] The dialogue context information vector and response vector are input into the response recognition network of the latent variable generator, and the dialogue context information vector will be used as the input to the response prior network.
[0027] As a further technical solution, during training the latent variables $z_p$ and $z_y$ are sampled from the role and response recognition networks of the latent variable generator, and during inference the latent variables $z_p$ and $z_y$ are sampled from the role and response prior networks.
[0028] As a further technical solution, the input sequence of the feature classifier is constructed as follows:
[0029] Markers are inserted between context information and feature information to help the model identify the boundaries between context and feature information, thus constructing k input sequences.
[0030] As a further technical solution, a query step is also included, specifically:
[0031] Input the content and dialogue history into the BERT model separately;
[0032] The hidden states of the CLS tags output by BERT are taken as their embedding representations Q to capture their intrinsic features;
[0033] Calculate the similarity between the query embedding Q and each dialogue history embedding;
[0034] The Top-K selection operation is used, which sorts the data by similarity from high to low and selects the top K dialogue history embeddings that are most similar to the query.
[0035] Compare the current query embedding Q with each selected similar dialogue history embedding, using cosine similarity to quantify the degree of similarity between them;
[0036] Calculate the average similarity across all selected dialogue histories. If the calculated average similarity $C_{filter}$ is below the threshold τ set according to the experiment, then this part of the dialogue history is considered irrelevant to the topic of the current query and is filtered out.
[0037] Secondly, a personalized dialogue system based on personality descriptions and conversation history was disclosed, including:
[0038] The complex text dataset building module is configured to: build a complex text dataset containing user personality traits, dialogue context information, and responses;
[0039] The encoder module is configured to encode text in complex text datasets, converting user personality features, dialogue context information, and responses into a unified vector representation.
[0040] The information enhancement module is configured to enhance the encoded user personality feature vector, dividing it into different groups of information.
[0041] The latent variable generator is configured to input augmented information and encoded dialogue context information vectors into its personality recognition network to obtain latent role variables sampled from the role distribution.
[0042] The dialogue context information vector and response vector are input into the response recognition network of the latent variable generator to obtain response latent variables from the latent response distribution.
[0043] The response generation module is configured to: pass the role latent variables and the response latent variables through the feature classifier, feed them into prompt learning for deep processing, and then pass them to the decoder to generate the response corresponding to the dialogue.
[0044] As a further technical solution, it also includes: a historical information selector, configured as follows:
[0045] Get the current query request;
[0046] Filter out content that is irrelevant to the current query;
[0047] Obtain the filtered personality features and the refined conversation history information.
[0048] As a further technical solution, the response generation module also includes a feature prompt learning module, configured as follows:
[0049] After filtering by the feature classifier, several relevant features that best match the query are selected;
[0050] These features are combined with the context to form an input sequence, which is then fed into a pre-trained model for deep processing.
[0051] The above one or more technical solutions have the following beneficial effects:
[0052] This invention's technical solution is based on personalized description information and dialogue history. It uses an information augmentation algorithm to cluster the personalized description text into multiple fine-grained sparse categories, and a feature classifier selects the features most relevant to the current dialogue context based on the input query. Simultaneously, a historical information selector filters the dialogue history, retaining the parts valuable for generating a response. The processed personalized description texts are combined with the filtered historical context information and, after deep processing using a feature prompt learning strategy, are fed into the decoder to generate personalized responses.
[0053] The model proposed in this invention integrates the proposed algorithm, classifier, selector, and feature prompt learning strategy to construct a complete dialogue system. Experiments on two datasets show that the model proposed in this invention outperforms existing models in terms of consistency and coherence.
[0054] Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Attached Figure Description
[0055] The accompanying drawings, which form part of this invention, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an improper limitation of the invention.
[0056] Figure 1 is a diagram illustrating the overall model architecture of an embodiment of the present invention;
[0057] Figure 2 is a detailed structural diagram of the feature classifier according to an embodiment of the present invention;
[0058] Figure 3 is a schematic diagram illustrating the specific working principle of the historical information selector in an embodiment of the present invention;
[0059] Figure 4 is a schematic diagram illustrating the specific structure of the feature prompt learning strategy in an embodiment of the present invention.
Detailed Implementation
[0060] It should be noted that the following detailed descriptions are exemplary and intended to provide further illustration of the invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains.
[0061] It should be noted that the terminology used herein is for the purpose of describing particular implementations only and is not intended to limit the exemplary implementations of the present invention.
[0062] Where there is no conflict, the embodiments and features in the embodiments of the present invention can be combined with each other.
[0063] Example 1
[0064] This embodiment discloses a personalized dialogue system based on personality description and conversation history, including:
[0065] The complex text dataset building module is configured to: build a complex text dataset containing user personality traits, dialogue context information, and responses;
[0066] The encoder module is configured to encode text in complex text datasets, converting user personality features, dialogue context information, and responses into a unified vector representation.
[0067] The information enhancement module is configured to enhance the encoded user personality feature vector, dividing it into different groups of information.
[0068] The latent variable generator is configured to input augmented information and encoded dialogue context information vectors into its personality recognition network to obtain latent role variables sampled from the role distribution.
[0069] The dialogue context information vector and response vector are input into the response recognition network of the latent variable generator to obtain response latent variables from the latent response distribution.
[0070] The response generation module is configured to: pass the role latent variables and the response latent variables through the feature classifier, feed them into prompt learning for deep processing, and then pass them to the decoder to generate the response corresponding to the dialogue.
[0071] In this implementation example, a historical information selector is also included, configured as follows:
[0072] Get the current query request;
[0073] Filter out content that is irrelevant to the current query;
[0074] Obtain the filtered personality features and the refined conversation history information.
[0075] In this embodiment, the response generation module further includes a feature prompt learning module, configured as follows:
[0076] After filtering by the feature classifier, several relevant features that best match the query are selected;
[0077] These features are combined with the context to form an input sequence, which is then fed into a pre-trained model for deep processing.
[0078] In this implementation example, the technical solution can extract information from noisy persona description text while ensuring accurate use of information from the dialogue history context. The task can therefore be formally defined over a dialogue corpus $C = (C_i, P_i, Y_i)$, where $C_i$ is the historical context containing multiple utterances, $P_i$ is a text profile containing personalized information about the users in the conversation, and $Y_i$ is a response. The goal is to generate different responses $Y = (Y_1, Y_2, \dots, Y_n)$ by learning the latent dependencies between P, C, and Y in the corpus. The goal of response generation is thus to generate a response given the dialogue context and personality feature information, i.e., to estimate the conditional probability, as shown in the following formula:
[0079]
[0080] Where φ represents the trainable parameters of the pre-trained Transformer model, and N and xi.
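The formula of paragraph [0079] does not appear in this text. As a hedged sketch only, a conditional probability of this kind is commonly factorized autoregressively over the response tokens. The reading of N as the response length and of $x_i$ as the i-th response token is an assumption, since the sentence defining them is truncated:

```latex
P(Y \mid C, P; \phi) = \prod_{i=1}^{N} P\left(x_i \mid x_{<i},\, C,\, P;\, \phi\right)
```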
[0081] The following is a detailed description of the modules in the system:
[0082] In one implementation example, regarding the encoder module:
[0083] Processing data requires dealing with complex text datasets containing user personality traits (P), dialogue context information (C), and responses (R), making the encoder crucial. It accepts raw text from the dataset as input, which includes not only user information and dialogue context records but also potentially useless information such as dialogue sentences lacking personality traits. The encoder's primary task is to effectively encode this complex text data so that subsequent models can understand and utilize it.
[0084] In this embodiment, a pre-trained GPT-2 (Language models are unsupervised multitask learners) is used as the encoder, which can effectively convert user personality features, dialogue context information, and responses into a unified vector representation. The GPT-2 encoder employs a Transformer architecture and a self-attention mechanism, enabling it to capture semantic and contextual information in the text and effectively integrate data from different sources. The last hidden vector of the encoder's last layer is used as the representation of the information.
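A minimal sketch of taking "the last hidden vector of the encoder's last layer" as the representation. It works on a plain array standing in for the encoder output; the function name, the array layout `(num_layers, seq_len, hidden_dim)`, and the right-padding convention are assumptions for illustration, not part of the disclosed system.

```python
import numpy as np


def last_token_representation(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Return the last real token's vector from the last layer.

    hidden_states: (num_layers, seq_len, hidden_dim), a hypothetical encoder output.
    attention_mask: (seq_len,), 1 for real tokens and 0 for right-side padding.
    """
    num_real = int(attention_mask.sum())
    if num_real == 0:
        raise ValueError("sequence has no real tokens")
    # Index -1 selects the last layer; num_real - 1 is the last non-padding position.
    return hidden_states[-1, num_real - 1]


# Toy example: 2 layers, 4 positions (the last one is padding), hidden size 3.
states = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([1, 1, 1, 0])
vector = last_token_representation(states, mask)
```

Skipping padded positions matters when sequences are batched to a common length; otherwise the selected vector would belong to a padding token.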
[0085]
[0086] In one implementation example, regarding the information enhancement module:
[0087] This module primarily processes user personality traits P to make the information of each personality trait clearer while ensuring the diversity between different data. First, the dense character text description information needs to be clustered into sparse categories. To better cluster different personality traits, Algorithm 1 is designed to process it using information augmentation (see Algorithm 1) to divide it into different groups of information.
[0088]
[0089] Before training begins, each piece of personalized information $p_i$ is expanded into multiple personality feature groups. By adding different orthogonal vectors $x = (x_1, x_2, \dots, x_n)$ to each piece of personalized information $p_i$, the personalized information can be divided into different groups, for example $[(p_i + x_1), (p_i + x_2), \dots, (p_i + x_n)]$. If different pieces of personalized information $p_i$ and $p_j$ have the same $x_i$ added, they belong to the same group. In this way, all personality traits within the same group maintain a certain relationship.
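The expansion described above can be sketched as follows. Algorithm 1 itself is not reproduced in this text, so the way the orthogonal vectors are produced (QR decomposition of a random matrix) and all function names are assumptions chosen for illustration.

```python
import numpy as np


def make_orthogonal_vectors(num_groups: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return num_groups mutually orthogonal unit vectors of size dim, one per row."""
    if num_groups > dim:
        raise ValueError("cannot have more orthogonal vectors than dimensions")
    rng = np.random.default_rng(seed)
    # The Q factor of a reduced QR decomposition has orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_groups)))
    return q.T


def expand_persona(persona_vectors: np.ndarray, offsets: np.ndarray) -> np.ndarray:
    """Expand each p_i into [(p_i + x_1), ..., (p_i + x_n)].

    persona_vectors: (num_personas, dim); offsets: (num_groups, dim).
    Returns an array of shape (num_personas, num_groups, dim).
    """
    return persona_vectors[:, None, :] + offsets[None, :, :]


rng = np.random.default_rng(1)
personas = rng.standard_normal((4, 8))      # four encoded persona sentences
offsets = make_orthogonal_vectors(3, 8)     # x_1, x_2, x_3
expanded = expand_persona(personas, offsets)
```

Entries `expanded[i, g]` and `expanded[j, g]` share the same offset `x_g`, which is the sense in which they fall into the same group.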
[0090] In one implementation example, regarding the latent variable generator
[0091] In this implementation example, the CVAE model is primarily used for training. The Conditional Variational Autoencoder (CVAE) effectively increases the diversity and information content of responses in open-ended dialogue generation tasks by enriching the context vector with sampled latent variables. Another reason for using CVAE is its simplicity and ease of training.
[0092] In this implementation example, several important assumptions are made for the CVAE model. The CVAE model is represented using four random variables: dialogue context C, response Y, persona description set P, and latent variable Z. Among these, C consists of the historical context, that is, the messages and responses formed by all previous utterances in the dialogue history, and can be represented as $C = \{(a_1, b_1), (a_2, b_2), \dots, (a_n, b_n)\}$. Similarly, P consists of persona description text: $P = \{p_1, p_2, \dots, p_k\}$. Y represents the target user's response. Z is used to capture the latent distribution of valid responses.
[0093] The CVAE has a recognition network and a prior network, which can be represented as $q_\phi(z \mid \cdot)$ and $p_\theta(z \mid \cdot)$. These two networks are used to approximate the true posterior distribution $q(z \mid \cdot)$ and the prior distribution $p(z \mid \cdot)$, where $\phi$ and $\theta$ are the parameters of the recognition network and the prior network, respectively. These distributions are assumed to be Gaussian, for example $\mathcal{N}(\mu, \sigma^2)$, where $\mu$ is the mean and $\sigma^2$ is the variance.
[0094] Subsequently, the enhanced personality information and the encoded dialogue context vector are processed by the CVAE model to obtain the role latent variable $z_p$ sampled from the role distribution. The dialogue context vector and the response vector are processed by the same CVAE model to obtain the response latent variable $z_y$ from the latent response distribution. Here, the enhanced personality information and the dialogue context vector are used as input to the personality recognition network, while the dialogue context vector is used as input to the personality prior network PriorNet $p_\theta(z \mid c)$. Likewise, the dialogue context vector and the response vector are used as input to the response recognition network, while the dialogue context vector is used as input to the response prior network PriorNet $p_\theta(z \mid c)$.
[0095] Assuming that $z_p$ and $z_y$ are independent latent variables, the generation process can be represented by the following conditional distribution: $p(y, z_p, z_y \mid c) = p(y \mid c, z_p, z_y)\, p(z_p \mid c)\, p(z_y \mid c)$.
[0096] Following previous work, $p(y \mid c, z_p, z_y)$ is called the response generator, and $p(z_p \mid c)$ and $p(z_y \mid c)$ are called the prior networks.
[0097] Since the latent variables $z_p$ and $z_y$ are assumed to follow isotropic Gaussian distributions, there are two recognition networks, $q_\phi(z_p \mid c, p)$ and $q_\phi(z_y \mid c, y)$, and two prior networks, $p_\theta(z_p \mid c)$ and $p_\theta(z_y \mid c)$. During training, $z_p$ and $z_y$ are sampled from the prior and recognition networks, and to ensure end-to-end differentiability a reparameterization technique is used to make the sampling operation differentiable. The role and response prior networks produce the means $\mu_p$, $\mu_y$ and variances $\sigma_p^2$, $\sigma_y^2$; the role and response recognition networks produce the corresponding means $\mu'_p$, $\mu'_y$ and variances $\sigma'^2_p$, $\sigma'^2_y$. We have:
[0098]
[0099] Here, $W_p$ and $W_y$ are the weight vectors, and $b_p$, $b_y$ are the bias vectors corresponding to the role and response prior networks. The KL divergence between the prior network and the recognition network will be included in the final loss function.
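The formula of paragraph [0098] does not appear in this text, so the following is only a sketch of the standard pieces it refers to: a linear layer that outputs a mean and log-variance, the reparameterization step, and the closed-form KL divergence between two diagonal Gaussians. Parameterizing by log-variance, the layer shapes, and all names are assumptions.

```python
import numpy as np


def gaussian_params(h: np.ndarray, weight: np.ndarray, bias: np.ndarray):
    """Assumed form [mu, logvar] = W h + b, split into two halves."""
    mu, logvar = np.split(weight @ h + bias, 2)
    return mu, logvar


def reparameterize(mu: np.ndarray, logvar: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping mu and sigma differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p) -> float:
    """KL(N(mu_q, var_q) || N(mu_p, var_p)) for diagonal covariances."""
    terms = logvar_p - logvar_q + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p) - 1.0
    return float(0.5 * np.sum(terms))


rng = np.random.default_rng(0)
h_context = rng.standard_normal(6)                 # stand-in for the encoded context
w_prior, b_prior = rng.standard_normal((8, 6)), np.zeros(8)
mu_p, logvar_p = gaussian_params(h_context, w_prior, b_prior)
z_p = reparameterize(mu_p, logvar_p, rng)          # latent sample of size 4
```

In training, the KL term would compare the recognition network's output (the first argument pair) against the prior network's output (the second pair).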
[0100] CVAEs are trained by maximizing the conditional log-likelihood, but this involves marginalizing over the latent variables, which is a challenging process. Previous work has shown that training CVAEs using stochastic gradient variational Bayes (SGVB), by maximizing the variational lower bound of the conditional log-likelihood, can be effective. It is further assumed that the latent variables $z_p$ and $z_y$ follow multivariate Gaussian distributions with diagonal covariance matrices. In addition, a bag-of-words loss is incorporated into the loss function to address the vanishing latent variable problem. Therefore, the variational lower bound can be written as:
[0101]
[0102] Where KL(·) refers to the KL divergence, and $y_{bow}$ denotes the bag-of-words representation of the response.
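The formula of paragraph [0101] does not appear in this text. Under the independence assumption on $z_p$ and $z_y$ stated earlier, a variational lower bound with a bag-of-words term would typically take the following form. This is an assumed reconstruction from the surrounding definitions, not the patent's exact expression, and the conditioning of each recognition network follows the inputs described for it:

```latex
\mathcal{L}(\theta, \phi) =
  \mathbb{E}_{q_\phi(z_p \mid c, p)\, q_\phi(z_y \mid c, y)}
    \left[ \log p(y \mid c, z_p, z_y) \right]
  - \mathrm{KL}\left( q_\phi(z_p \mid c, p) \,\|\, p_\theta(z_p \mid c) \right)
  - \mathrm{KL}\left( q_\phi(z_y \mid c, y) \,\|\, p_\theta(z_y \mid c) \right)
  + \mathbb{E}_{q_\phi(z_p \mid c, p)\, q_\phi(z_y \mid c, y)}
    \left[ \log p(y_{bow} \mid c, z_p, z_y) \right]
```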
[0103] During training, the latent variables $z_p$ and $z_y$ are sampled from the role and response recognition networks. During inference, however, $z_p$ and $z_y$ are sampled from the role and response prior networks. After sampling, $z_p$ and $z_y$ are fed into their respective linear layers, then passed through a classifier, and finally fed into a prompt learning layer for deep processing before being passed to a decoder. This constitutes the language model.
[0104] Feature classifier:
[0105] To effectively process personality feature information and ensure that the model can accurately select appropriate personality descriptions, the following method is proposed. Latent variables are first generated from the contextual information and the personality description text using a prior network. They are obtained by sampling from a set of distributions constructed separately for each vector, which can capture the latent structural and semantic information in the text and helps the model better understand and select personality features. Next, a feature classifier is trained to further improve the recognition of personality features (see Figure 2). The model then selects relevant personality traits from these latent variables. However, directly teaching the classifier to select from latent variables generated by sampling feature information from implicitly clustered role distributions is quite challenging. Therefore, pseudo-labels are introduced to guide the classifier's learning process. By assigning pseudo-labels to latent personality traits, the model can learn how to distinguish and select different personality descriptions based on these labels. In the input processing of the feature classifier, particular attention is paid to the integration of contextual and feature information. To help the model distinguish these two types of information, the special marker <|endoftext|> is inserted between them. These markers help the model identify the boundary between context and feature information, so that it can extract and integrate the relevant information more accurately, and k input sequences are constructed in this way.
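A minimal sketch of constructing the k classifier input sequences, one per candidate feature. It works on raw strings for readability; placing the context before the feature and the function name are assumptions, since the exact sequence layout is not reproduced in this text.

```python
SEPARATOR = "<|endoftext|>"  # the boundary marker named in the description


def build_classifier_inputs(context: str, features: list[str]) -> list[str]:
    """Build one input sequence per feature: context, marker, feature."""
    return [f"{context}{SEPARATOR}{feature}" for feature in features]


context = "Do you have any plans for the weekend?"
features = ["I love hiking.", "I work as a nurse.", "I have two cats."]
sequences = build_classifier_inputs(context, features)  # k = 3 sequences
```

Each sequence would then be scored by the classifier, so k candidate features yield k scores.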
[0106]
[0107] These input sequences preserve the integrity of the original information, and the special markers make it easier for the model to capture the correlation between context and features. The feature classifier itself is a multilayer perceptron, which provides strong learning and recognition capability while preserving the model's performance in language reasoning. By processing the input sequences, it identifies the personality features most relevant to the current task. Finally, the score for each role is input into a softmax layer to obtain a normalized probability distribution.
[0108]
[0109] In this way, the personality description $P_{adapted}$ most suitable for the current task can be determined based on the character with the highest score. This method not only effectively processes individual characteristic information but also ensures the accuracy and reliability of the model in selecting personality descriptions. This is of great significance for building personalized and intelligent dialogue systems.
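The selection step can be sketched as a softmax over the per-role scores followed by picking the highest-probability entry. The scores below are made-up placeholder values and the function names are hypothetical.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D score vector."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def select_persona(scores, descriptions):
    """Return the description with the highest normalized probability."""
    probs = softmax(np.asarray(scores, dtype=float))
    best = int(np.argmax(probs))
    return descriptions[best], probs


descriptions = ["I love hiking.", "I work as a nurse.", "I have two cats."]
selected, probabilities = select_persona([0.2, 1.5, -0.3], descriptions)
```

Taking the top few probabilities instead of only the maximum would give the "several relevant features" variant mentioned for the prompt learning step.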
[0110] Historical Information Selector
[0111] Based on the input query information, a selection mechanism combining dialogue history and personality traits is designed. Considering that the large user dialogue history database may contain valuable content similar to the current user's personality and past dialogues, an efficient history selector is constructed (see Figure 3). The selector first uses the pre-trained BERT model to embed the input content and the dialogue history. Specifically, the input content q and each dialogue history $c_j$ are fed into the BERT model separately, and the hidden state of the CLS token output by BERT is taken as the embedding representation of each, to capture their intrinsic features.
[0112] $Q = \mathrm{BERT}_{CLS}(q)$
[0113] Next, in order to find dialogue history that closely matches the current query information, the similarity between the query embedding Q and each dialogue history embedding is calculated. However, even similar conversation histories may contain content unrelated to the current query. To filter accurately, a Top-K selection operation is used, that is, the embeddings are sorted by similarity from high to low, and the top K conversation history embeddings most similar to the query are selected.
[0114]
[0115] Here, the result denotes the embeddings of the similar dialogue histories after selection.
[0116] After obtaining these similar dialogue history embeddings, another key problem needs to be solved: how to remove content irrelevant to the current query. To address this, a filtering mechanism is designed. During filtering, the current query embedding Q is compared with each similar dialogue history embedding, using cosine similarity to quantify the degree of similarity between them. Then, the average similarity $C_{filter}$ of all selected dialogue histories is calculated:
[0117]
[0118] Where K is the number of selected dialogue histories, and the cosine similarity is computed between the query embedding Q and the embedding of the j-th filtered dialogue history. If the calculated average similarity $C_{filter}$ is below the threshold τ set according to the experiment, then this part of the dialogue history can be considered irrelevant to the topic of the current query and should be filtered out.
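The selector's Top-K and threshold steps can be sketched on precomputed embeddings as follows. The BERT embedding step is omitted, vectors are assumed to be non-zero, and the function names and toy values are illustrative assumptions.

```python
import numpy as np


def cosine_similarity(query: np.ndarray, candidates: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of candidates."""
    query_unit = query / np.linalg.norm(query)
    cand_unit = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return cand_unit @ query_unit


def select_history(query: np.ndarray, history: np.ndarray, top_k: int, tau: float):
    """Keep the top_k most similar histories unless their average similarity is below tau."""
    sims = cosine_similarity(query, history)
    top_idx = np.argsort(-sims)[:top_k]            # indices sorted from high to low
    average = float(np.mean(sims[top_idx]))
    kept = top_idx if average >= tau else np.array([], dtype=int)
    return kept, average


query = np.array([1.0, 0.0])
history = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
kept, average = select_history(query, history, top_k=2, tau=0.5)
dropped, _ = select_history(query, history, top_k=2, tau=0.9)
```

With these toy vectors the two best matches are rows 0 and 2; raising τ to 0.9 pushes their average below the threshold, so nothing is kept.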
[0119] This step effectively filters out content irrelevant to the current query, improving the model's efficiency and accuracy while reducing noise interference. Finally, combining the historical information required for the current query, the filtered personalized feature information $P_{adapted}$ and the refined dialogue history information $C_{filter}$ serve as input to the generator. They provide the generator with rich background information and support its ability to generate personalized responses. In this way, it is possible to capture the user's personality traits more accurately and generate responses that better meet their expectations.
[0120] Feature suggestion learning module:
[0121] After processing by the above modules, the required feature information and historical context information have been extracted. By setting the number of feature items, suitable personalized features can be selected for the input query, laying the foundation for the next step, namely the feature prompt learning strategy. After filtering by the feature classifier, the several relevant features that best match the query are selected. These features are then combined with the context to form an input sequence, which is fed into a pre-trained model for deep processing; see Figure 4. This process ensures more precise and efficient operation and supports subsequent decision-making and applications.
[0122] The input sequence is fed into the pre-trained GPT-2 model as follows. First, the input sequence $X = [P; C]$ is constructed from the prompts $P_{adapted} = [p_1, \ldots, p_n]$ with k tags (k is adjustable) and the refined dialogue context information $C_{filter} = [c_1, \ldots, c_m]$. Here, $p_1, \ldots, p_n$ is the discrete prompt information composed of features, and $c_1, \ldots, c_m$ is the word sequence of the dialogue context, where n and m are the numbers of feature items and context items, respectively.
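The construction of X = [P; C] amounts to placing the prompt items before the context items. A minimal sketch, assuming both are already available as Python lists; the function name and the example strings are hypothetical and stand in for the selected features and the filtered context.

```python
def build_input_sequence(prompts, context):
    """Concatenate prompt items and context items into X = [P; C]."""
    return list(prompts) + list(context)

# Hypothetical feature prompts p_1..p_n and context words c_1..c_m.
prompts = ["likes hiking", "owns a dog"]
context = ["what", "did", "you", "do", "today", "?"]

x = build_input_sequence(prompts, context)
n, m = len(prompts), len(context)
```

The resulting sequence has length n + m, with the first n positions holding the prompt items.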
[0123] For the data filtered by the historical information selector, the corresponding $c_m$ is obtained. The word sequence $c_m$ is first preprocessed to obtain the corresponding hidden state:
[0124] $h_m = \text{GPT-2}_{pretrain}(e(c_m))$
[0125] To process the prompt feature information, a fully connected prompt neural network is designed. The network first transforms the prompt features into embedding vectors using an embedding function, and then processes these embedding vectors with the fully connected neural network $f_\theta$ to obtain the hidden state sequence $s_1, \ldots, s_n$:
[0126] $s_1, \ldots, s_n = f_\theta(e(p_1), \ldots, e(p_n))$
[0127] where e is the embedding of the feature label and $f_\theta$ is a fully connected neural network with trainable parameters θ.
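A fully connected prompt network of this kind can be sketched in plain Python. The depth (two layers), the tanh activation, the per-item application, and all parameter values below are assumptions made for illustration; the text specifies only that $f_\theta$ is fully connected and trainable.

```python
import math

def linear(x, weights, bias):
    """Compute y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def prompt_network(embeddings, w1, b1, w2, b2):
    """Map each prompt embedding e(p_i) to a hidden state s_i
    using a two-layer fully connected network with tanh."""
    states = []
    for e in embeddings:
        hidden = [math.tanh(v) for v in linear(e, w1, b1)]
        states.append(linear(hidden, w2, b2))
    return states

# Toy parameters theta = (w1, b1, w2, b2); in practice these are trained.
w1 = [[1.0, 0.0], [0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, 1.0]]
b2 = [0.5]

# Two hypothetical prompt embeddings e(p_1), e(p_2).
embeddings = [[0.0, 0.0], [1.0, -1.0]]
states = prompt_network(embeddings, w1, b1, w2, b2)
```

One hidden state is produced per prompt item, so the output sequence has the same length n as the prompt sequence.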
[0128] Next, these hidden state sequences, along with the dialogue context information sequences, are fed into the pre-trained GPT-2 model:
[0129] $U_z = \text{GPT-2}_{pretrain}(s_n, h_m)$
[0130] This yields the processed representation $U_z = [u_{z,1}, \ldots, u_{z,x}]$. Finally, to obtain the output probability distribution for response generation from these hidden states, a linear layer is applied. The linear layer converts the hidden states $U_z$ output by GPT-2 into logits vectors whose size equals that of the vocabulary, |V|. A softmax function then converts these logits into a probability distribution over the words in the vocabulary, enabling response generation:
[0131] $r_i = \mathrm{Softmax}(W_v u_z + b_v)$
[0132] where the vector $r_i$ is the probability distribution over V, $W_v \in \mathbb{R}^{|V| \times d}$ and $b_v \in \mathbb{R}^{|V|}$ are the weight parameters, and Softmax(·) is the softmax function.
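The output layer can be illustrated with a small numeric example. This is a sketch under assumed toy sizes (|V| = 3, d = 2) with hypothetical parameter values; the function names are not from the source.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def token_distribution(u, w_v, b_v):
    """Apply the linear layer W_v u + b_v, then softmax, to one hidden state."""
    logits = [sum(w * x for w, x in zip(row, u)) + b for row, b in zip(w_v, b_v)]
    return softmax(logits)

# Toy hidden state (d = 2) and output layer for a 3-word vocabulary.
u = [1.0, 2.0]
w_v = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b_v = [0.0, 0.0, 0.0]

r = token_distribution(u, w_v, b_v)
```

The result is a vector of |V| non-negative values that sum to one; here the second vocabulary entry receives the highest probability because it has the largest logit.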
[0133] The solution of this embodiment is a personalized dialogue generator that uses both explicit and implicit user personality to enhance existing dialogue models. To this end, an information augmentation technique is designed to subdivide and clarify the personality traits of a character within each message. This includes using a feature classifier to locate specific personality traits and a historical information selector to filter the context. Personalized responses are then generated through a prompt generation operation, in which functional prompts are obtained. Experimental results show that the responses generated by this solution make effective use of personality trait information combined with the dialogue context, yielding more diverse and coherent answers.
[0134] Example 2
[0135] This embodiment discloses a personalized dialogue method based on personality description and conversation history, including:
[0136] constructing a complex text dataset that includes user personality traits, dialogue context information, and responses;
[0137] encoding the text in the complex text dataset to convert the user personality features, dialogue context information, and responses into a unified vector representation;
[0138] enhancing the encoded user personality feature vector to divide it into different groups of information;
[0139] inputting the enhanced information and the encoded dialogue context information vector into the personality recognition network of the latent variable generator to obtain latent role variables sampled from the role distribution;
[0140] inputting the dialogue context information vector and the response vector into the response recognition network of the latent variable generator to obtain response latent variables from the latent response distribution;
[0141] passing the latent role variables and the response latent variables through the feature classifier, feeding them into prompt learning for deep processing, and then passing them to the decoder to generate the response corresponding to the dialogue.
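The latent-variable steps above can be illustrated with a small sampling sketch. It assumes a diagonal Gaussian latent distribution with reparameterized sampling, which is common for conditional variational autoencoders but is not stated in this section; the switch between recognition and prior networks follows the training and inference behavior stated in claim 5. All names and values are hypothetical.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, 1).
    Assumption: the latent distribution is a diagonal Gaussian."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def choose_distribution(recognition, prior, training):
    """Use the recognition network's parameters during training and
    the prior network's parameters during inference."""
    return recognition if training else prior

rng = random.Random(0)

# Hypothetical (mean, log-variance) outputs of the two networks.
recognition = ([0.5, -0.5], [-2.0, -2.0])
prior = ([0.0, 0.0], [0.0, 0.0])

mu, log_var = choose_distribution(recognition, prior, training=True)
z_role = sample_latent(mu, log_var, rng)
```

The same two functions would apply to the response latent variable, with the response recognition and response prior networks supplying the parameters.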
[0142] The solution of this embodiment improves the personalized dialogue system by combining the advantages of explicit role modeling and implicit feature extraction. A dialogue method based on explicit and implicit user personality is designed. This method considers not only the user's dialogue history context but also the user's personalized descriptive information, thereby generating more personalized and accurate responses.
[0143] First, the data is processed. Then an information augmentation module is applied to the character description text to expand its information content and make it more distinct from other texts. Next, the dialogue context and the character description text are jointly modeled with a conditional variational autoencoder; in parallel, the dialogue context and the response go through the same process. The modeling therefore considers both the user's personality characteristics and the contextual information of the dialogue, enabling the model to better understand the user's intent and needs. To better handle personality description information, a feature classifier is designed, whose main function is to select features based on the input query. In addition, a historical information selector is designed: the user's dialogue history contains a large amount of information, not all of which is relevant to the current dialogue, so the selector filters the historical context and retains only the information relevant to the current dialogue. Finally, a feature prompt learning module is designed to model the character using the selected feature information and the filtered dialogue context, supporting the generation of personalized responses. The overall block diagram is shown in Figure 1.
[0144] For details on the implementation of the autoencoder, information augmentation module, latent variable generator, feature classifier, historical information selector, and feature prompt learning module used in this method, please refer to Example 1.
[0145] Example 3
[0146] The purpose of this embodiment is to provide a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the above-described method.
[0147] Example 4
[0148] The purpose of this embodiment is to provide a computer-readable storage medium.
[0149] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the steps of the above-described method.
[0150] Example 5
[0151] The purpose of this embodiment is to provide a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods and functions involved in any of the above embodiments.
[0152] The steps and methods involved in the apparatus of the above embodiments correspond to those in the method embodiments. For specific implementation details, please refer to the relevant description sections of the embodiments. The term "computer-readable storage medium" should be understood as a single medium or multiple media including one or more instruction sets; it should also be understood as including any medium capable of storing, encoding, or carrying an instruction set for execution by a processor and enabling the processor to perform any of the methods in this invention.
[0153] Those skilled in the art will understand that the modules or steps of the present invention described above can be implemented using general-purpose computer devices. Optionally, they can be implemented using computer-executable program code, thereby allowing them to be stored in a storage device for execution by a computer device, or they can be fabricated as separate integrated circuit modules, or multiple modules or steps can be fabricated as a single integrated circuit module. The present invention is not limited to any particular combination of hardware and software.
[0154] While the specific embodiments of the present invention have been described above in conjunction with the accompanying drawings, this is not intended to limit the scope of protection of the present invention. Those skilled in the art should understand that modifications or variations made without creative effort on the basis of the technical solutions of the present invention remain within the scope of protection of the present invention.
Claims
1. A personalized dialogue method based on personality description and conversation history, characterized by comprising:
constructing a complex text dataset that includes user personality traits, dialogue context information, and responses;
encoding the text in the complex text dataset to convert the user personality features, dialogue context information, and responses into a unified vector representation;
enhancing the encoded user personality feature vector to divide it into different groups of information;
inputting the enhanced information and the encoded dialogue context information vector into the personality recognition network of the latent variable generator to obtain latent role variables sampled from the role distribution;
inputting the dialogue context information vector and the response vector into the response recognition network of the latent variable generator to obtain response latent variables from the latent response distribution;
passing the latent role variables and the response latent variables through the feature classifier, feeding them into prompt learning for deep processing, and then passing them to the decoder to generate the response corresponding to the dialogue;
wherein the input sequence of the feature classifier is constructed as follows: markers are inserted between the context information and the feature information to help the model identify the boundary between them, thereby constructing k input sequences;
the method further comprising a query step, specifically including:
inputting the content and the dialogue history into the BERT model separately;
taking the hidden states of the CLS tokens output by BERT as their query embedding representations Q to capture their intrinsic features;
computing the similarity between the query embedding representation Q and each dialogue history embedding;
applying a Top-K selection operation, which sorts the dialogue history embeddings from high to low similarity and selects the K embeddings most similar to the query embedding representation Q;
comparing the current query embedding representation Q with each similar dialogue history embedding, using cosine similarity to quantify the degree of similarity between them;
calculating the average similarity across all selected dialogue histories; and
if the calculated average similarity is below the threshold τ set according to experiment, considering this part of the dialogue history irrelevant to the topic of the current query and filtering it out.
2. The personalized dialogue method based on personality description and conversation history as described in claim 1, characterized in that, when encoding the text in the complex text dataset, a pre-trained encoder is used; the encoder employs a Transformer architecture and a self-attention mechanism to capture semantic and contextual information in the text and integrate data from different sources; and the last hidden vector of the last layer of the encoder is used as the representation of the information.
3. The personalized dialogue method based on personality description and conversation history as described in claim 1, characterized in that enhancing the encoded user personality feature vector specifically includes: expanding the personalized information in the encoded user personality feature vector into multiple personality feature groups, wherein the personalized information is divided into different groups by adding a different orthogonal vector to each part of the personalized information.
4. The personalized dialogue method based on personality description and conversation history as described in claim 1, characterized in that the latent variable generator also includes a personality prior network and a response prior network; when the enhanced information and the encoded dialogue context information vector are input into the personality recognition network of the latent variable generator, the dialogue context information vector is used as the input to the personality prior network; and when the dialogue context information vector and the response vector are input into the response recognition network of the latent variable generator, the dialogue context information vector is used as the input to the response prior network.
5. The personalized dialogue method based on personality description and conversation history as described in claim 1, characterized in that, during training, the role and response latent variables of the latent variable generator are sampled from the role and response recognition networks, respectively, and during inference, the role and response latent variables are sampled from the role and response prior networks, respectively.
6. A personalized dialogue system based on personality description and conversation history, employing the personalized dialogue method based on personality description and conversation history as described in any one of claims 1-5, characterized by comprising:
a complex text dataset building module, configured to build a complex text dataset containing user personality traits, dialogue context information, and responses;
an encoder module, configured to encode the text in the complex text dataset, converting the user personality features, dialogue context information, and responses into a unified vector representation;
an information enhancement module, configured to enhance the encoded user personality feature vector, dividing it into different groups of information;
a latent variable generator, configured to input the enhanced information and the encoded dialogue context information vector into its personality recognition network to obtain latent role variables sampled from the role distribution, and to input the dialogue context information vector and the response vector into its response recognition network to obtain response latent variables from the latent response distribution;
a response generation module, configured to pass the latent role variables and the response latent variables through the feature classifier, feed them into prompt learning for deep processing, and then pass them to the decoder to generate a response corresponding to the dialogue;
and further comprising a history information selector, configured to: obtain the current query request; filter out content that is irrelevant to the current query; and obtain the filtered personality features and the refined dialogue history information;
wherein the response generation module further includes a feature prompt learning module, configured such that: after filtering by the feature classifier, the several relevant features that best match the query are selected; and these features are combined with the context to form an input sequence, which is then fed into a pre-trained model for deep processing.
7. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the method of any one of claims 1-5.
8. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the steps of the method described in any one of claims 1-5.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the steps of the method described in any one of claims 1-5.