Training method and device for a mind-mindedness hierarchical sentiment classification model
A hierarchical sentiment classification model for mind-mindedness is built from a pre-trained large language model and a hierarchical task classification module. The model evaluates parent-child dialogue texts automatically, which removes the time-consuming and laborious manual annotation step and allows mind-mindedness to be assessed efficiently and accurately.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: TSINGHUA UNIVERSITY
- Filing Date: 2026-04-17
- Publication Date: 2026-06-23
AI Technical Summary
In existing technologies, the assessment of caregivers' mind-mindedness relies on manual labeling, which is time-consuming, labor-intensive, and highly subjective, making it difficult to apply on a large scale.
Parent-child dialogue texts carrying hierarchical labels are acquired, and a mind-mindedness hierarchical sentiment classification model is constructed from a pre-trained large language model and a hierarchical task classification module, so that parent-child dialogue texts are classified automatically.
The method improves the efficiency and accuracy of mind-mindedness assessment, enhances the interpretability and credibility of the model in practical use, and is applicable to real clinical and educational intervention scenarios.
Figure CN122045964B_ABST
Abstract
Description
TECHNICAL FIELD
[0001] Embodiments of the present application relate to the technical field of natural language processing, in particular, to a training method and device of a mind-mindedness hierarchical sentiment classification model.
BACKGROUND
[0002] “Mind-mindedness” refers to the tendency of caregivers to view infants as individuals with independent psychological states when interacting with them, rather than just satisfying their immediate physiological needs, and emphasizes the ability of caregivers to sensitively perceive the psychological state of infants. Mind-mindedness has a profound impact on the development of infants' language and learning ability, empathy and social cognitive ability. Therefore, in real clinical psychological assessment and educational intervention scenarios, it is of great practical significance to focus on the mind-mindedness of caregivers.
[0003] In related technologies, the mind-mindedness of caregivers in interactions with infants is usually assessed through manual annotation. For example, a free-play video of a caregiver and an infant is transcribed word by word, and the utterances that appropriately reflect the internal state of the infant are then marked by hand. Although this method is rigorous and meticulous, it depends heavily on human labor, is time-consuming and labor-intensive, and is highly subjective, making it difficult to extend to large-scale practical applications.
SUMMARY
[0004] Embodiments of the present application provide a training method and device of a mind-mindedness hierarchical sentiment classification model, aiming to overcome the above problems or at least partially solve the above problems.
[0005] The first aspect of the embodiments of the present application provides a training method of a mind-mindedness hierarchical sentiment classification model, comprising:
[0006] Obtaining parent-child dialogue text carrying hierarchical labels, the parent-child dialogue text including parent speech text and child speech text, the root layer label in the hierarchical labels representing whether the parent-child dialogue text has mind-mindedness, the second layer label in the hierarchical labels representing a speech category of the parent-child dialogue text without mind-mindedness, and the third layer label in the hierarchical labels representing an emotional category of the parent-child dialogue text with mind-mindedness or an emotional category of the parent-child dialogue text without mind-mindedness;
[0007] Connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, to construct a hierarchical task classification module to be trained;
[0008] Processing the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, to obtain a hierarchical task classification prediction result of the parent-child dialogue text.
[0009] Based on the hierarchical task classification prediction results and the hierarchical labels carried by the parent-child dialogue text, the model parameters of the hierarchical task classification module to be trained are updated to obtain the trained hierarchical task classification module.
[0010] The trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on a target parent-child dialogue text: identifying whether the target parent-child dialogue text has mind-mindedness, identifying the discourse category of the target parent-child dialogue text when it does not have mind-mindedness, and identifying the emotion category of the target parent-child dialogue text.
[0011] A second aspect of this application provides a training device of a mind-mindedness hierarchical sentiment classification model, comprising:
[0012] The acquisition module is used to acquire parent-child dialogue texts carrying hierarchical labels. The parent-child dialogue texts include parent speech texts and child speech texts. The root-layer label in the hierarchical labels indicates whether the parent-child dialogue text has mind-mindedness. The second-layer label indicates the discourse category of a parent-child dialogue text that does not have mind-mindedness. The third-layer label indicates the emotion category of a parent-child dialogue text, whether or not it has mind-mindedness.
[0013] The construction module is used to connect multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, and construct a hierarchical task classification module to be trained.
[0014] The processing module is used to process the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, so as to obtain the hierarchical task classification prediction result of the parent-child dialogue text.
[0015] The update module is used to update the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction results and the hierarchical labels carried by the parent-child dialogue text, so as to obtain the trained hierarchical task classification module.
[0016] The trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on a target parent-child dialogue text: identifying whether the target parent-child dialogue text has mind-mindedness, identifying the discourse category of the target parent-child dialogue text when it does not have mind-mindedness, and identifying the emotion category of the target parent-child dialogue text.
[0017] A third aspect of this application provides an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor. When executed by the processor, the program or instructions implement the steps of the training method of the mind-mindedness hierarchical sentiment classification model according to the first aspect of this application.
[0018] A fourth aspect of this application provides a readable storage medium storing a program or instructions that, when executed by a processor, implement the steps of the training method of the mind-mindedness hierarchical sentiment classification model according to the first aspect of this application.
[0019] In summary, the training method of the mind-mindedness hierarchical sentiment classification model uses a pre-trained large language model to process parent-child dialogue texts, drawing on its semantic understanding and language modeling capabilities to capture the semantic features of parent-child dialogue. In addition, the classification heads are connected according to the hierarchical relationship of the hierarchical labels, so that the classification process of the model follows the research logic of mind-mindedness in psychology. This improves the interpretability and practical reliability of the classification predictions while maintaining model accuracy. The method addresses the low efficiency, insufficient generalization, and weak interpretability of traditional methods, is better suited to real-world clinical and educational intervention scenarios, and strengthens auxiliary analysis in psychological assessment and treatment.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0021] Figure 1 is a flowchart of the training method of the mind-mindedness hierarchical sentiment classification model proposed in an embodiment of this application;
[0022] Figure 2 is a schematic diagram of the hierarchical label structure of the mind-mindedness task in the training method of the mind-mindedness hierarchical sentiment classification model proposed in an embodiment of this application;
[0023] Figure 3 is a schematic diagram of the framework of the mind-mindedness hierarchical sentiment classification model in the training method proposed in an embodiment of this application;
[0024] Figure 4 is a schematic diagram of an electronic device according to an embodiment of this application.
DETAILED DESCRIPTION
[0025] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0026] Referring to Figure 1, which is a flowchart of the steps of the training method of the mind-mindedness hierarchical sentiment classification model proposed in an embodiment of this application, the method specifically includes the following steps S11~S14:
[0027] Step S11: Obtain parent-child dialogue texts carrying hierarchical labels. The parent-child dialogue texts include parent speech texts and child speech texts. The root-layer label in the hierarchical labels indicates whether the parent-child dialogue text has mind-mindedness. The second-layer label indicates the discourse category of a parent-child dialogue text that does not have mind-mindedness. The third-layer label indicates the emotion category of a parent-child dialogue text, whether or not it has mind-mindedness.
[0028] In this embodiment, a large amount of real parent-child interactive dialogue data, i.e., parent-child dialogue texts, is collected, including parent speech texts and child speech texts and covering multiple scenarios such as daily interaction, games, and communication. Considering that the mind-mindedness task has a clear hierarchical structure, this application adopts the label hierarchy shown in Figure 2. The hierarchy first distinguishes whether a mental state is involved (mental or non-mental), i.e., whether mind-mindedness is present. Only when the true label indicates no mind-mindedness is the text further subdivided by discourse category (physiological, general, or behavioral). Emotional polarity, i.e., the emotion category (positive, neutral, or negative), is predicted for all samples. Psychology experts manually annotate the parent-child dialogue texts according to this hierarchical label structure to obtain parent-child dialogue texts carrying hierarchical labels. Because the three-layer hierarchical label system is built on the logic of psychological research, the labels align closely with the professional analytical dimensions of mind-mindedness. This provides labeled data that fits practical application scenarios for the subsequent training of the model and keeps the model's learning objectives professionally grounded and accurate.
[0029] The hierarchical labels are divided into three levels: the root-layer label $y^{(\mathrm{root})}$, the second-layer label $y^{(\mathrm{type})}$, and the third-layer label $y^{(\mathrm{sent})}$. The root-layer label indicates whether the parent-child dialogue text has mind-mindedness; the second-layer label indicates the discourse category of a parent-child dialogue text that does not have mind-mindedness; the third-layer label indicates the emotion category of a parent-child dialogue text, whether or not it has mind-mindedness.
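As an illustration of the three-level label structure, the following minimal Python sketch models one annotated sample. The class name `LabeledDialogue`, the field names, the lowercase label strings, and the example utterances are assumptions made for this example and do not come from the application; only the rule that a discourse category exists solely for non-mental samples is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Label vocabularies (string spellings are illustrative assumptions).
ROOT_LABELS = ("mental", "non_mental")
TYPE_LABELS = ("physiological", "general", "behavioral")
SENT_LABELS = ("positive", "neutral", "negative")


@dataclass(frozen=True)
class LabeledDialogue:
    """One parent-child dialogue sample with its three-level label."""
    parent_text: str
    child_text: str
    root: str                        # root-layer label
    sent: str                        # third-layer label, present for every sample
    discourse: Optional[str] = None  # second-layer label, non-mental samples only

    def __post_init__(self) -> None:
        if self.root not in ROOT_LABELS:
            raise ValueError(f"unknown root label: {self.root!r}")
        if self.sent not in SENT_LABELS:
            raise ValueError(f"unknown emotion label: {self.sent!r}")
        if self.root == "non_mental":
            if self.discourse not in TYPE_LABELS:
                raise ValueError("non-mental samples need a discourse category")
        elif self.discourse is not None:
            raise ValueError("mental samples carry no discourse category")


# Made-up example utterances, for illustration only.
sample = LabeledDialogue(
    parent_text="Let's put the blocks back in the box.",
    child_text="Okay.",
    root="non_mental",
    sent="neutral",
    discourse="behavioral",
)
```

Rejecting a discourse category on a mental sample at construction time keeps annotation errors out of the training set instead of letting them surface later as confusing loss values.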
[0030] Step S12: Connect multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels to construct a hierarchical task classification module to be trained.
[0031] In this embodiment, multiple classification heads to be trained are connected to one another according to the hierarchical relationship represented by the hierarchical labels. The function of each classification head corresponds one-to-one with a level of the hierarchical labels, thereby constructing the hierarchical task classification module to be trained. Because the classification heads are connected according to the hierarchical relationship of the labels, the reasoning logic of the hierarchical task classification module matches the hierarchical judgment logic of the labels. The classification process of the mind-mindedness hierarchical sentiment classification model therefore follows the analytical approach that psychological research takes to mind-mindedness, which improves the interpretability of the model's decisions.
[0032] In one implementation, step S12, "connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels to construct a hierarchical task classification module to be trained," specifically includes the following steps S121~S124:
[0033] Step S121: Obtain three classification task heads with the same structure, each of the three classification task heads including: a fully connected layer, a Dropout layer, and an output projection layer;
[0034] Step S122: Randomly initialize the model parameters of the three classification task heads to serve as the mental state classification task head to be trained, the non-mental discourse classification task head to be trained, and the emotion classification task head to be trained.
[0035] Step S123: Connect the output of the mental state classification task head to be trained to the input of the non-mental discourse classification task head to be trained and to the input of the emotion classification task head to be trained.
[0036] Step S124: Connect the output of the non-mental discourse classification task head to be trained to the input of the emotion classification task head to be trained.
[0037] In this embodiment, three classification task heads with the same structure are constructed, each consisting of a fully connected layer, a Dropout layer, and an output projection layer. The model parameters (weights, biases, etc.) of these three classification task heads are randomly initialized, and the heads serve as the mental state classification task head to be trained, the non-mental discourse classification task head to be trained, and the emotion classification task head to be trained, respectively. As shown in Figure 3, the output of the mental state classification task head to be trained (the main classification head) is connected to the input of the non-mental discourse classification task head to be trained (the sub-classification head) and to the input of the emotion classification task head to be trained. The output of the non-mental discourse classification task head to be trained is also connected to the input of the emotion classification task head to be trained, thereby constructing the hierarchical task classification module to be trained.
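The construction in steps S121~S124 can be sketched in plain Python as three parameter sets plus a record of which head output feeds which head input. This is a minimal sketch under stated assumptions: the names `HeadParams`, `init_head`, and `connections`, the hidden size of 8, the Gaussian initialization scale, the Dropout rate of 0.1, and the choice of a fully connected layer that keeps the hidden width are all hypothetical and are not specified in the application.

```python
import random
from dataclasses import dataclass
from typing import Dict, List

Matrix = List[List[float]]


@dataclass
class HeadParams:
    """One head: fully connected layer, Dropout layer, output projection layer."""
    fc_weight: Matrix       # hidden_dim x hidden_dim (width is an assumption)
    fc_bias: List[float]
    out_weight: Matrix      # num_classes x hidden_dim
    out_bias: List[float]
    dropout_p: float


def random_matrix(rows: int, cols: int, rng: random.Random, scale: float) -> Matrix:
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_head(hidden_dim: int, num_classes: int, dropout_p: float,
              rng: random.Random, scale: float = 0.02) -> HeadParams:
    """Randomly initialise weights and biases of one head (step S122)."""
    return HeadParams(
        fc_weight=random_matrix(hidden_dim, hidden_dim, rng, scale),
        fc_bias=[rng.gauss(0.0, scale) for _ in range(hidden_dim)],
        out_weight=random_matrix(num_classes, hidden_dim, rng, scale),
        out_bias=[rng.gauss(0.0, scale) for _ in range(num_classes)],
        dropout_p=dropout_p,
    )


rng = random.Random(0)
heads: Dict[str, HeadParams] = {
    "root": init_head(8, 2, 0.1, rng),  # mental / non-mental
    "type": init_head(8, 3, 0.1, rng),  # physiological / general / behavioral
    "sent": init_head(8, 3, 0.1, rng),  # positive / neutral / negative
}

# Steps S123 and S124: the root head feeds the other two heads,
# and the discourse head feeds the emotion head.
connections: Dict[str, List[str]] = {
    "root": ["type", "sent"],
    "type": ["sent"],
    "sent": [],
}
```

The heads share a layer layout but differ in the number of output classes, which is why `num_classes` is the only structural argument that changes between them.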
[0038] The task is divided into three levels. Mental state classification task head: this corresponds to the root-layer label of the hierarchical system and is responsible for classifying all parent-child dialogue texts into the "Mental" and "Non-Mental" categories, denoted as task k=root (root classification task). Non-mental discourse classification task head: this corresponds to the second-layer label of the hierarchical system and is responsible for classifying parent-child dialogue texts with the root-layer label "Non-Mental" into the "Physiological," "General," and "Behavioral" categories, denoted as task k=type (second-level classification task). Emotion classification task head: this corresponds to the third-layer label of the hierarchical system and is responsible for classifying all parent-child dialogue texts into the "Positive," "Neutral," and "Negative" emotion categories, denoted as task k=sent (third-level classification task).
[0039] This application considers that directly applying prior knowledge from other fields can enhance the interpretability and convergence speed of the mind-mindedness hierarchical sentiment classification model. Since research on mind-mindedness necessarily intersects with and is coupled to research on human emotion, this application can directly use an emotion classification model pre-trained on an external emotion dataset. This model consists of an encoder and an emotion classification head composed of a fully connected layer, a Dropout layer, and an output projection layer, and it learns stable emotion representations on general emotion tasks. Let the parameters of the emotion classification head of the pre-trained emotion classification model be $W_1^{\mathrm{pre}}, b_1^{\mathrm{pre}}, W_2^{\mathrm{pre}}, b_2^{\mathrm{pre}}$, i.e., the weight and bias of the fully connected layer and the weight and bias of the output projection layer. When training on the mind-mindedness parent-child dialogue texts in this application, these parameters can be used as the initialization parameters for task k=sent (emotion category classification task), that is, let $W_1^{(\mathrm{sent})} \leftarrow W_1^{\mathrm{pre}}$, $b_1^{(\mathrm{sent})} \leftarrow b_1^{\mathrm{pre}}$, $W_2^{(\mathrm{sent})} \leftarrow W_2^{\mathrm{pre}}$, $b_2^{(\mathrm{sent})} \leftarrow b_2^{\mathrm{pre}}$. This transfers emotional knowledge to the psychological scenario, accelerating convergence and improving the stability and interpretability of emotion prediction.
[0040] In one implementation, step S12, "connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels to construct a hierarchical task classification module to be trained," specifically includes the following steps S125-S127:
[0041] Step S125: Obtain a pre-trained sentiment classification model. According to the structure of the pre-trained sentiment classification head in the pre-trained sentiment classification model, obtain a first classification task head and a second classification task head with the same structure. The pre-trained sentiment classification head, the first classification task head and the second classification task head all include: a fully connected layer, a Dropout layer and an output projection layer.
[0042] Step S126: Randomly initialize the model parameters of the first classification task head and the second classification task head to serve as the mental state classification task head to be trained and the non-mental discourse classification task head to be trained.
[0043] Step S127: Initialize the emotion classification task head to be trained using the model parameters of the pre-trained emotion classification head in the pre-trained emotion classification model.
[0044] In this embodiment, the pre-trained sentiment classification model is a sentiment classification model pre-trained on an external sentiment dataset. It consists of an encoder and a pre-trained sentiment classification head composed of a fully connected layer, a Dropout layer, and an output projection layer. Following the structure of the pre-trained sentiment classification head, a first classification task head and a second classification task head with the same structure are constructed. The model parameters of the first and second classification task heads are randomly initialized to obtain the mental state classification task head to be trained and the non-mental discourse classification task head to be trained, respectively. As shown in the upper part of Figure 3, the model parameters of the pre-trained sentiment classification head, namely the weight $W_1^{\mathrm{pre}}$ and bias $b_1^{\mathrm{pre}}$ of the fully connected layer and the weight $W_2^{\mathrm{pre}}$ and bias $b_2^{\mathrm{pre}}$ of the output projection layer, are used to initialize the emotion classification task head to be trained, that is, let $W_1^{(\mathrm{sent})} \leftarrow W_1^{\mathrm{pre}}$, $b_1^{(\mathrm{sent})} \leftarrow b_1^{\mathrm{pre}}$, $W_2^{(\mathrm{sent})} \leftarrow W_2^{\mathrm{pre}}$, $b_2^{(\mathrm{sent})} \leftarrow b_2^{\mathrm{pre}}$. This enables the transfer and reuse of cross-domain sentiment knowledge, accelerating convergence and improving the stability and interpretability of sentiment prediction.
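The parameter transfer of step S127 amounts to copying four tensors from the pre-trained head into the new emotion head. The sketch below shows this with plain Python lists; the dictionary keys, the function name `init_emotion_head`, and the toy parameter values are assumptions for illustration, and the sketch presumes that the pre-trained head already has the same three output classes as the target task.

```python
import copy
from typing import Dict, List, Union

Params = Dict[str, Union[List[float], List[List[float]]]]

REQUIRED = ("fc_weight", "fc_bias", "out_weight", "out_bias")


def init_emotion_head(pretrained_head: Params) -> Params:
    """Initialise the emotion head from a pre-trained sentiment head (S127)."""
    missing = [name for name in REQUIRED if name not in pretrained_head]
    if missing:
        raise KeyError(f"pre-trained head lacks parameters: {missing}")
    # Deep copy so later training updates leave the source model untouched.
    return {name: copy.deepcopy(pretrained_head[name]) for name in REQUIRED}


# Toy pre-trained parameters (hidden size 2, three sentiment classes).
pretrained: Params = {
    "fc_weight": [[0.1, -0.2], [0.3, 0.4]],
    "fc_bias": [0.0, 0.1],
    "out_weight": [[0.5, 0.0], [0.0, 0.5], [-0.5, -0.5]],
    "out_bias": [0.0, 0.0, 0.0],
}

emotion_head = init_emotion_head(pretrained)
emotion_head["fc_bias"][0] = 9.9  # simulate a training update on the new head
```

Copying rather than sharing the lists is a design choice made here so that the external sentiment model can still be reused unchanged after fine-tuning.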
[0045] Step S13: Process the parent-child dialogue text using a pre-trained large language model and the hierarchical task classification module to be trained, and obtain the hierarchical task classification prediction result of the parent-child dialogue text.
[0046] In this embodiment, a pre-trained large language model can capture the contextual semantic features of parent-child dialogue text. First, the pre-trained large language model processes the parent-child dialogue text, outputting its semantic feature vector. Then, a hierarchical task classification module to be trained processes the semantic feature vector, outputting a hierarchical task classification prediction result for the parent-child dialogue text. The hierarchical classification prediction result includes several predictions: whether the parent-child dialogue text involves psychological states, the discourse category of the parent-child dialogue text, and the emotion category of the parent-child dialogue text.
[0047] In one implementation, step S13, "processing the parent-child dialogue text using a pre-trained large language model and the hierarchical task classification module to be trained, to obtain the hierarchical task classification prediction result of the parent-child dialogue text," specifically includes the following steps S131~S137:
[0048] Step S131: Input the parent-child dialogue text into a pre-trained large language model to obtain the semantic feature vector of the parent-child dialogue text;
[0049] Step S132: Process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the mental state classification task head to be trained, and obtain the mental state classification task features of the parent-child dialogue text.
[0050] Step S133: Process the mental state classification task features of the parent-child dialogue text through the output projection layer in the mental state classification task head to be trained, and obtain the mental state classification prediction result of the parent-child dialogue text.
[0051] Step S134: When the mental state classification prediction result of the parent-child dialogue text is that there is no mind-mindedness, process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the non-mental discourse classification task head to be trained, and obtain the non-mental discourse classification task features of the parent-child dialogue text.
[0052] Step S135: Process the non-mental discourse classification task features of the parent-child dialogue text through the output projection layer in the non-mental discourse classification task head to be trained, and obtain the non-mental discourse classification prediction result of the parent-child dialogue text.
[0053] Step S136: Process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the emotion classification task head to be trained, and obtain the emotion classification task features of the parent-child dialogue text.
[0054] Step S137: Process the emotion classification task features of the parent-child dialogue text through the output projection layer in the emotion classification task head to be trained, and obtain the emotion classification prediction result of the parent-child dialogue text.
[0055] In this embodiment, a pre-trained large language model with strong semantic understanding capabilities (such as RoBERTa, DistilBERT, or another Transformer-architecture model) is selected as the shared backbone network, i.e., the shared large language model backbone. The shared backbone network obtains the semantic representation of the entire sentence through a multi-layer self-attention mechanism and outputs the hidden vector at the [CLS] position as the global semantic feature vector of the sample. As shown in the left half of Figure 3, the parent-child dialogue text is first preprocessed by a tokenizer and a sub-word encoding module to obtain a sequence of token embeddings. The token embedding sequence is then input into the pre-trained large language model (the shared large language model backbone) to obtain the semantic feature vector of the parent-child dialogue text. The semantic feature vector is then input into the hierarchical task classification module for processing. The hierarchical task classification module includes: the mental state classification task head to be trained (main classification head), the non-mental discourse classification task head to be trained (sub-classification head), and the emotion classification task head to be trained.
[0056] The global semantic feature vector is input into the fully connected layer and Dropout layer of the mental state classification task head to be trained. There it is linearly transformed and passed through a GELU nonlinear activation, and Dropout with a rate tuned specifically for this head's task is applied, yielding the mental state classification task features of the parent-child dialogue text. These task features are input into the output projection layer of the mental state classification task head to be trained, which maps them linearly to the logits vector of the task. The logits vector is then input into the Softmax function to obtain the predicted probability of each category, and the category with the maximum predicted probability is selected as the mental state classification prediction result of the parent-child dialogue text: with mind-mindedness or without mind-mindedness.
[0057] When the mental state classification prediction result of the parent-child dialogue text is that mind-mindedness is present, the semantic feature vector of the parent-child dialogue text is processed by the emotion classification task head to be trained to obtain the emotion classification prediction result of the parent-child dialogue text: positive, neutral, or negative. The specific processing procedure is similar to that of the mental state classification task head to be trained described above.
[0058] Only when the mental state classification prediction result of the parent-child dialogue text is that mind-mindedness is absent is the semantic feature vector of the parent-child dialogue text processed by the non-mental discourse classification task head to be trained, to obtain the non-mental discourse classification prediction result of the parent-child dialogue text: physiological, general, or behavioral. The specific processing procedure is similar to that of the mental state classification task head to be trained described above.
[0059] After the non-mental discourse classification prediction result of the parent-child dialogue text is obtained, the semantic feature vector of the parent-child dialogue text is processed by the emotion classification task head to be trained to obtain the emotion classification prediction result of the parent-child dialogue text: positive, neutral, or negative. The specific processing procedure is similar to that of the mental state classification task head to be trained described above.
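The conditional routing described above (root head first, discourse head only on the non-mental branch, emotion head for every sample) can be sketched as follows. The function name `classify_hierarchically`, the label strings, and the stub heads are hypothetical stand-ins for trained classifiers and are used only to show the control flow.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Head = Callable[[Vector], str]


def classify_hierarchically(
    features: Vector, root_head: Head, type_head: Head, sent_head: Head
) -> Dict[str, Optional[str]]:
    """Route one semantic feature vector through the three heads."""
    root = root_head(features)
    # The discourse head runs only on the non-mental branch (S134, S135).
    discourse = type_head(features) if root == "non_mental" else None
    # The emotion head runs for every sample (S136, S137).
    emotion = sent_head(features)
    return {"root": root, "type": discourse, "sent": emotion}


# Stub heads with made-up behaviour, standing in for trained classifiers.
def stub_root(v: Vector) -> str:
    return "mental" if v[0] > 0 else "non_mental"


def stub_type(v: Vector) -> str:
    return "behavioral"


def stub_sent(v: Vector) -> str:
    return "positive"


mental_result = classify_hierarchically([1.0], stub_root, stub_type, stub_sent)
non_mental_result = classify_hierarchically([-1.0], stub_root, stub_type, stub_sent)
```

With these stubs, `mental_result` has no discourse category, while `non_mental_result` carries all three fields.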
[0060] The process of obtaining hierarchical task classification prediction results for parent-child dialogue text described above can be represented by the following steps:
[0061] 1) The parent-child dialogue texts serve as samples. There are N samples in total, indexed i=1,…,N. Each sample contains a text $x_i$ and three levels of labels: the root-layer label $y_i^{(\mathrm{root})}$, which distinguishes whether a mental state is involved, i.e., whether mind-mindedness is present; the non-mental sub-layer label $y_i^{(\mathrm{type})}$ (second-layer label), which distinguishes physiological, general, or behavioral discourse when mind-mindedness is absent; and the emotion-layer label $y_i^{(\mathrm{sent})}$ (third-layer label), which characterizes positive, neutral, or negative emotion. Correspondingly, the three labels belong to the finite sets $\{1,\dots,C_{\mathrm{root}}\}$, $\{1,\dots,C_{\mathrm{type}}\}$, and $\{1,\dots,C_{\mathrm{sent}}\}$.
[0062] 2) The text $x_i$ of sample i is passed through the encoder E(·) of the shared large language model backbone to obtain the [CLS] vector:
[0063] $h_i = E(x_i)_{[\mathrm{CLS}]} \in \mathbb{R}^{d}$;
[0064] where $h_i$ is the global semantic feature vector and $d$ is the hidden dimension.
[0065] 3) For any task k∈{root,type,sent}, the corresponding classification head first applies a fully connected transformation and a nonlinear activation to $h_i$, then applies task-specific, hyperparameter-tuned Dropout to obtain the task-level features:
[0066] $u_i^{(k)} = \mathrm{GELU}\left(W_1^{(k)} h_i + b_1^{(k)}\right)$;
[0067] $z_i^{(k)} = \mathrm{Dropout}_{p_k}\left(u_i^{(k)}\right)$.
[0068] 4) Subsequently, the logits vector for this task is obtained through linear mapping:
[0069] $o_i^{(k)} = W_2^{(k)} z_i^{(k)} + b_2^{(k)} \in \mathbb{R}^{C_k}$;
[0070] where $C_k$ is the number of categories for task k.
[0071] 5) For any task k, the logits $o_i^{(k)}$ are fed into the Softmax function to obtain the predicted probability of each class:
[0072] $p_i^{(k)} = \mathrm{softmax}\left(o_i^{(k)}\right)$;
[0073] where the true label is denoted $y_i^{(k)}$. The probability that the model assigns to the true class of sample i on task k is then $p_{i,\,y_i^{(k)}}^{(k)}$.
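To make the five numbered steps concrete (fully connected layer with GELU, Dropout, linear projection to logits, Softmax), here is a minimal pure-Python sketch of one head's forward pass on a given [CLS] vector. The function and variable names, the toy parameter values, and the use of inverted Dropout scaling (dividing kept activations by the keep probability) are assumptions of this sketch; the application does not state which Dropout variant is used.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(weight: Matrix, bias: Vector, x: Vector) -> Vector:
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def dropout(x: Vector, p: float, rng: random.Random, training: bool) -> Vector:
    """Inverted dropout (assumed variant); identity at inference. Requires p < 1."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]


def softmax(logits: Vector) -> Vector:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_forward(h: Vector, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector,
                 p: float, rng: random.Random, training: bool = False) -> Vector:
    u = [gelu(v) for v in linear(w1, b1, h)]  # step 3: FC + GELU
    z = dropout(u, p, rng, training)          # step 3: task-specific Dropout
    o = linear(w2, b2, z)                     # step 4: logits
    return softmax(o)                         # step 5: class probabilities


# Toy three-class head on a two-dimensional feature vector.
rng = random.Random(0)
w1 = [[1.0, 0.0], [0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]
b2 = [0.0, 0.0, 0.0]
probs = head_forward([1.0, -1.0], w1, b1, w2, b2, p=0.1, rng=rng)
predicted_class = probs.index(max(probs))
```

Passing `training=False` disables Dropout, which matches the usual convention of applying it only while the head parameters are being updated.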
[0074] Step S14: Based on the hierarchical task classification prediction results and the hierarchical labels carried by the parent-child dialogue text, update the model parameters of the hierarchical task classification module to be trained, and obtain the trained hierarchical task classification module.
[0075] The trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on a target parent-child dialogue text: identifying whether the target parent-child dialogue text has mind-mindedness, identifying the discourse category of the target parent-child dialogue text when it does not have mind-mindedness, and identifying the emotion category of the target parent-child dialogue text.
[0076] In this embodiment, the total loss is calculated based on the hierarchical labels carried by the parent-child dialogue text and the corresponding hierarchical task classification prediction results. The model parameters of the hierarchical task classification module to be trained are updated based on the total loss, resulting in a trained hierarchical task classification module. After training, the hierarchical task classification module, together with the pre-trained large language model, performs the following inference process on the target parent-child dialogue text: first, it determines whether mental focus is involved; then, if mental focus is not involved, it further subdivides the text into a discourse subcategory; finally, it predicts a unified emotional polarity. Specifically, the main classification head first outputs Mental / Non-Mental; if the output is Non-Mental, the sub-classification head outputs Physical, General, or Behavioral; and the emotion classification head outputs Positive, Neutral, or Negative based on the shared backbone representation. These three outputs combine to form a structured label that corresponds directly to the three semantic layers of "mental relevance + discourse type + emotional polarity" in psychological research. This improves the interpretability of the coding results for researchers and clinicians while maintaining model accuracy, which enhances the application potential of the model in real-world work and research scenarios.
[0077] In one implementation, step S14, "updating the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction results and the hierarchical labels carried by the parent-child dialogue text," specifically includes the following steps S141~S145:
[0078] Step S141: Determine the loss of the mental state classification task based on the difference between the mental state classification prediction result output by the mental state classification task head to be trained for the parent-child dialogue text and the root layer label in the hierarchical label carried by the parent-child dialogue text.
[0079] Step S142: Determine the non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the parent-child dialogue text and the second-level label in the hierarchical label carried by the parent-child dialogue text.
[0080] Step S143: Determine the emotion classification task loss based on the difference between the emotion classification prediction result output by the emotion classification task head to be trained for the parent-child dialogue text and the third layer label in the hierarchical label carried by the parent-child dialogue text.
[0081] Step S144: Obtain the total loss based on the loss of the psychological state classification task, the loss of the non-psychological discourse classification task, and the loss of the emotion classification task;
[0082] Step S145: Based on the total loss, update the model parameters of the fully connected layers, Dropout layers, and output projection layers of the mental state classification task head, the non-mental discourse classification task head, and the emotion classification task head to be trained.
[0083] In this embodiment, the single-sample psychological state classification task loss is calculated based on the difference between the psychological state classification prediction result output by the psychological state classification task head to be trained for each parent-child dialogue text and the root layer label in the hierarchical labeling carried by the parent-child dialogue text. The average of the single-sample psychological state classification task loss for all samples is then calculated to obtain the psychological state classification task loss. Similar to calculating the psychological state classification task loss, the non-psychological discourse classification task loss and the emotion classification task loss are obtained. These three task losses are weighted and fused using pre-set hierarchical coefficients to obtain the total loss. Considering the mutual influence between the root layer classification task, the second layer classification task, and the third layer classification task, the model parameters of the fully connected layer, Dropout layer, and output projection layer of the psychological state classification task head, the non-psychological discourse classification task head, and the emotion classification task head to be trained are updated according to the total loss.
[0084] Furthermore, this application takes into account that the training dataset is drawn from real data and suffers from significant class imbalance; for example, samples with negative emotion are far fewer than samples with positive or neutral emotion. To alleviate the imbalanced label distribution at the different levels, a Focal loss with class weights can be used for each task. Let the frequency of class c in the entire dataset be $f_c$. Then its normalized weight $w_c$ is defined as:
[0085] ;
[0086] For sample i and task k, the single-sample loss is defined as:
[0087] $\ell_i^{k} = -\,w_{y_i^{k}}\left(1 - p_{i,t}^{k}\right)^{\gamma}\log p_{i,t}^{k}$;
[0088] where $w_{y_i^{k}}$ is the class weight of the label of sample i, which is larger the lower the class frequency; $p_{i,t}^{k}$ is the probability that the model assigns to the true class of sample i on task k, with k∈{root,type,sent}; $y_i^{k}$ is the true label; γ≥0 is the focusing coefficient, and when γ=0 the loss degenerates into weighted cross-entropy; N is the number of parent-child dialogue texts (samples), indexed i=1,…,N.
[0089] The single-sample loss described above increases the attention that the hierarchical emotion classification model of mental attention tendencies pays to hard-to-distinguish samples and minority-class labels, and alleviates the long-tail problem of "mental attention tendency" data on multi-level labels.
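A minimal sketch of a class-weighted focal loss for one sample is given below. The exact normalization of the class weights is not recoverable from the text, so the inverse-frequency form used in `class_weights` is an assumption that merely satisfies the stated property (rarer classes receive larger weights). The class counts are hypothetical and the function names are illustrative.

```python
import math

def class_weights(counts):
    """Normalised inverse-frequency weights (assumed form): rarer class -> larger weight."""
    total = sum(counts)
    inv = [total / c for c in counts]  # 1 / f_c with f_c = c / total
    norm = sum(inv)
    return [v / norm for v in inv]

def focal_loss(probs, label, weights, gamma):
    """Class-weighted focal loss for one sample: -w_y * (1 - p_y)^gamma * log(p_y)."""
    p_true = probs[label]
    return -weights[label] * (1.0 - p_true) ** gamma * math.log(p_true)

counts = [500, 400, 100]   # hypothetical positive / neutral / negative sample counts
w = class_weights(counts)
probs = [0.2, 0.1, 0.7]    # hypothetical Softmax output for one sample

weighted_ce = focal_loss(probs, label=2, weights=w, gamma=0.0)  # gamma = 0: weighted cross-entropy
focal = focal_loss(probs, label=2, weights=w, gamma=2.0)        # gamma > 0: easy samples are down-weighted
```

With γ=0 the factor (1 - p)^γ equals 1, so the loss reduces to weighted cross-entropy, which matches the degenerate case stated above; for γ>0 a sample that is already classified with high confidence contributes less.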
[0090] To prevent the hierarchical emotion classification model of mental attention tendencies from learning illegal paths that contradict the psychological definitions (for example, "has mental focus tendency - physiological - positive"), and to ensure that the lower layer can still back-propagate with the true labels even when the upper layer mispredicts, this application constructs a loss mask $m_i^{k}$ from the true labels. An indicator function I(condition) is introduced, which takes the value 1 when the condition is true and 0 otherwise, and the masks are defined as follows:
[0091] $m_i^{root} = 1$;
[0092] $m_i^{type} = I\left(y_i^{root} = \text{Non-Mental}\right)$;
[0093] $m_i^{sent} = 1$;
[0094] Therefore, the effective sample set for task k is:
[0095] $S_k = \{\, i \mid m_i^{k} = 1 \,\}$;
[0096] The average of the single-sample losses $\ell_i^{k}$ on this set is calculated as:
[0097] $L_k = \frac{1}{|S_k|}\sum_{i \in S_k} \ell_i^{k}$;
[0098] Finally, the losses of the three tasks are weighted by the preset hierarchical coefficients $\lambda_{root}$, $\lambda_{type}$, and $\lambda_{sent}$ and combined to obtain the total loss:
[0099] $L = \lambda_{root} L_{root} + \lambda_{type} L_{type} + \lambda_{sent} L_{sent}$.
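The masked averaging and the weighted combination of the three task losses can be sketched as follows. The per-sample loss values, the mask values, and the hierarchical coefficients are hypothetical numbers chosen so the arithmetic is easy to follow; returning 0.0 for an empty effective set is a design choice of this sketch, not something stated in the text.

```python
def masked_mean(losses, mask):
    """Average the per-sample losses over the samples whose mask value is 1."""
    valid = [loss for loss, m in zip(losses, mask) if m == 1]
    return sum(valid) / len(valid) if valid else 0.0

def total_loss(per_task_losses, masks, coeffs):
    """Weighted sum of the masked average losses of all tasks."""
    return sum(coeffs[k] * masked_mean(per_task_losses[k], masks[k]) for k in coeffs)

# Four samples; samples 2 and 3 are non-mental, so only they count for the "type" task.
masks = {"root": [1, 1, 1, 1], "type": [0, 1, 1, 0], "sent": [1, 1, 1, 1]}
losses = {
    "root": [0.2, 0.4, 0.6, 0.8],
    "type": [9.9, 0.5, 1.5, 9.9],  # values at masked positions are ignored
    "sent": [0.1, 0.3, 0.5, 0.7],
}
coeffs = {"root": 1.0, "type": 0.8, "sent": 0.6}  # hypothetical hierarchical coefficients

L = total_loss(losses, masks, coeffs)  # 1.0 * 0.5 + 0.8 * 1.0 + 0.6 * 0.4
```

Because the masked positions never enter the average, a large loss on the discourse-type head for a sample whose true root label is "mental" has no effect on the gradient.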
[0100] In one implementation, the training method for the hierarchical affective classification model of mental attention tendencies further includes the following steps S15-S17, and the aforementioned step S142, "determining the non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the parent-child dialogue text and the second-level label in the hierarchical label carried by the parent-child dialogue text," further includes the following steps S1421-S1422:
[0101] Step S15: In the case that the first layer of the hierarchical labels carried by a parent-child dialogue text indicates that the parent-child dialogue text does not have a mental focus tendency, configure the loss mask value of the non-psychological discourse classification task to be 1 for the parent-child dialogue text.
[0102] Step S16: In the case that the first layer of the hierarchical labels carried by a parent-child dialogue text indicates that the parent-child dialogue text has a mental focus tendency, configure the loss mask value of the non-psychological discourse classification task to be 0 for the parent-child dialogue text.
[0103] Step S17: Identify parent-child dialogue texts with a loss mask value of 1 as valid parent-child dialogue texts for non-psychological discourse classification tasks;
[0104] Step S1421: Based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the effective parent-child dialogue text and the second-level label in the hierarchical label carried by the effective parent-child dialogue text, determine the non-psychological discourse classification task loss of the effective parent-child dialogue text.
[0105] Step S1422: Take the average value of the non-psychological discourse classification task loss of multiple valid parent-child dialogue texts as the non-psychological discourse classification task loss.
[0106] In this embodiment, considering that the loss mask primarily affects the loss of the non-psychological discourse classification task, all parent-child dialogue texts participating in the training are traversed. If the first-level label of a parent-child dialogue text indicates no mental focus tendency, then a loss mask value of 1 is assigned separately for the non-psychological discourse classification task of that sample. If the first-level label of a parent-child dialogue text indicates mental focus tendency, then a loss mask value of 0 is assigned separately for the non-psychological discourse classification task of that sample. Parent-child dialogue texts with a loss mask value of 1 are identified as valid parent-child dialogue texts for the non-psychological discourse classification task. When calculating the loss of the non-psychological discourse classification task, only valid parent-child dialogue texts for the non-psychological discourse classification task are considered.
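Steps S15 to S17 amount to a simple filter over the labeled samples, sketched below. The `Sample` dataclass, its field names, the string label values, and the three example utterances are all made up for illustration; only the masking rule itself comes from the description.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sample:
    text: str
    root: str                  # "mental" or "non_mental"
    discourse: Optional[str]   # "physical" / "general" / "behavioral"; None when root is "mental"
    sent: str                  # "positive" / "neutral" / "negative"

def type_loss_mask(sample):
    """S15/S16: the mask is 1 for non-mental samples and 0 for mental ones."""
    return 1 if sample.root == "non_mental" else 0

def valid_for_type_task(samples):
    """S17: keep only the samples whose mask value is 1."""
    return [s for s in samples if type_loss_mask(s) == 1]

# Invented example utterances, not taken from any dataset.
samples = [
    Sample("You look sad, do you miss grandma?", "mental", None, "negative"),
    Sample("Time to eat your lunch.", "non_mental", "physical", "neutral"),
    Sample("Put the blocks in the box.", "non_mental", "behavioral", "neutral"),
]
masks = [type_loss_mask(s) for s in samples]
valid = valid_for_type_task(samples)
```

Only the samples in `valid` would contribute to the non-psychological discourse classification loss in steps S1421 and S1422; the mask is derived from the true root label, not from the prediction of the root head.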
[0107] In one implementation, the training method for the hierarchical emotion classification model of mental attention orientation further includes the following steps S18-S22:
[0108] Step S18: Input the target parent-child dialogue text into the pre-trained large language model to obtain the semantic feature vector of the target parent-child dialogue text;
[0109] Step S19: Input the semantic feature vector of the target parent-child dialogue text into the trained hierarchical task classification module;
[0110] Step S20: The semantic feature vector of the target parent-child dialogue text is processed by the trained psychological state classification task head in the trained hierarchical task classification module to obtain the psychological state classification prediction result of the target parent-child dialogue text.
[0111] Step S21: When the psychological state classification prediction result of the target parent-child dialogue text indicates that the target parent-child dialogue text has a mental attention tendency, the semantic feature vector of the target parent-child dialogue text is processed by the trained emotion classification task head in the trained hierarchical task classification module to obtain the emotion classification prediction result of the target parent-child dialogue text.
[0112] Step S22: When the psychological state classification prediction result of the target parent-child dialogue text indicates that the target parent-child dialogue text does not have a mental focus tendency, the semantic feature vector of the target parent-child dialogue text is processed by the trained non-psychological discourse classification task head in the trained hierarchical task classification module to obtain the non-psychological discourse classification prediction result of the target parent-child dialogue text. Additionally, the semantic feature vector of the target parent-child dialogue text is processed by the trained emotion classification task head in the trained hierarchical task classification module to obtain the emotion classification prediction result of the target parent-child dialogue text.
[0113] In this embodiment, when the trained hierarchical emotion classification model of mental attention tendencies performs hierarchical task classification on an unlabeled target parent-child dialogue text in a real application scenario, the target parent-child dialogue text is first processed by the pre-trained large language model to obtain its semantic feature vector. The semantic feature vector is then processed by the trained psychological state classification task head to obtain the psychological state classification prediction result. If this result indicates that the text has a mental focus tendency, the trained emotion classification task head in the trained hierarchical task classification module processes the semantic feature vector to obtain the emotion classification prediction result. If the result indicates that the text does not have a mental focus tendency, the trained non-psychological discourse classification task head processes the semantic feature vector to obtain the non-psychological discourse classification prediction result, and the trained emotion classification task head also processes the semantic feature vector to obtain the emotion classification prediction result.
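The routing in steps S20 to S22 can be sketched as a small decoding function over the three probability vectors. The index order of the class names and the function names are assumptions; for simplicity the sketch receives all three probability vectors, whereas a real pipeline could skip the discourse-type head entirely on the mental branch.

```python
ROOT = ["mental", "non_mental"]
TYPE = ["physical", "general", "behavioral"]
SENT = ["positive", "neutral", "negative"]

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)

def decode(p_root, p_type, p_sent):
    """Combine the head outputs into one structured label following S20-S22."""
    root = ROOT[argmax(p_root)]
    return {
        "root": root,
        # The discourse-type head is consulted only for non-mental utterances.
        "type": TYPE[argmax(p_type)] if root == "non_mental" else None,
        # Emotion is predicted on both branches.
        "sent": SENT[argmax(p_sent)],
    }

mental_case = decode([0.9, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7])
non_mental_case = decode([0.3, 0.7], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1])
```

In `mental_case` the discourse type is left as `None` even though the type head produced probabilities, so the structured label can never contain an illegal path such as "mental" combined with a discourse subcategory.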
[0114] The training method for the hierarchical affective classification model of mental attention orientation proposed in this application achieves automated and standardized processing of psychological texts through data preprocessing and text semantic pre-segmentation, significantly improving coding efficiency and reducing reliance on expert manual annotation. The hierarchical classification mechanism based on a pre-trained large language model enhances the semantic understanding and generalization ability of the hierarchical affective classification model of mental attention orientation, ensuring high accuracy while making the model's decision-making process clearer and more traceable, thereby significantly improving interpretability and credibility in practical use. By addressing the problems of low efficiency, insufficient generalization, and weak interpretability of traditional methods, this application has stronger applicability and promotional value in real clinical and educational intervention scenarios, and can effectively improve the auxiliary analysis capabilities in the process of psychological assessment and treatment.
[0115] 1. A hierarchical classification method for "mental attention tendency" text based on a large language model is proposed for professional coding tasks in the field of psychology, achieving a balance between model accuracy, interpretability and practical applicability.
[0116] 2. Construct a data preprocessing and text semantic pre-segmentation mechanism specifically for psychological corpora. Through adjustable parameters, the mechanism enables the generation of structured and hierarchical samples from original parent-child interaction texts, thereby improving the generalization ability and adaptability of the hierarchical emotion classification model for mental attention tendencies to real-world scenarios.
[0117] 3. Based on the hierarchical classification structure design of pre-trained language models, the classification process is optimized by combining domain knowledge and semantic hierarchy relationships, thereby improving the ability to capture semantic features of mental attention tendencies and the classification accuracy.
[0118] 4. The design of a traceable and interpretable decision path supports backtracking from the final classification results to the semantic paragraph and theoretical label levels, improving the transparency and clinical credibility of the hierarchical emotion classification model of mental attention tendencies.
[0119] 5. An application mechanism oriented towards real-world clinical psychological assessment and educational intervention scenarios, overcoming the problems of existing methods relying on specific data structures and insufficient generalization ability, to achieve large-sample, multi-context, real-time automated analysis.
[0120] Furthermore, the following experiments demonstrate that the training method of the hierarchical emotional classification model of mental attention orientation proposed in this application has high interpretability and practical reliability.
[0121] This experiment statistically analyzed the average results across 10 independent tests, comparing the performance of three classification models based on large language models (ordinary classification model, multi-level classification model, and multi-level sentiment transfer learning classification model). The comparison results are shown in Table 1. The experimental setup was as follows:
[0122] 1. A foundation for large language models;
[0123] 2. Sentiment pre-trained classification model;
[0124] 3. Fine-tuning settings: full fine-tuning (all layers unfrozen); maximum padded text length of 128; learning rate of 1×10⁻⁵; optimizer weight decay of 0.01; scheduler parameters patience = 3 and factor = 0.5; dropout of 0.2; Focal loss with a parameter of 3.5 as the loss function; weight ratios for each layer (including the base) of 1.0, 0.8, 0.6, and 0.4; learning-rate ratios for each layer (including the base) of 1:2:4:8; batch size of 4 for both the training and validation sets; and 40 training epochs.
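The scheduler parameters patience = 3 and factor = 0.5 suggest a reduce-on-plateau learning-rate schedule, although the setup does not name the scheduler type; that reading is an assumption. The sketch below shows the general behavior of such a schedule with a hypothetical class `PlateauScheduler`, a placeholder base rate of 1.0, and invented validation losses.

```python
class PlateauScheduler:
    """Multiply the learning rate by `factor` once the validation loss has
    failed to improve for more than `patience` consecutive epochs."""

    def __init__(self, lr, patience=3, factor=0.5):
        self.lr = lr
        self.patience = patience
        self.factor = factor
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the current learning rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr

sched = PlateauScheduler(lr=1.0)  # placeholder base rate, not the experimental value
val_losses = [0.9, 0.8, 0.85, 0.85, 0.85, 0.85, 0.7]  # invented validation losses
history = [sched.step(v) for v in val_losses]
```

In this run the loss stops improving after the second epoch, and the rate is halved at the sixth epoch, when the fourth consecutive non-improving epoch exceeds the patience of 3.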
[0125] Table 1. Performance comparison results of various models
[0126]
[0127] As shown in Table 1, the accuracy of the ordinary classification model (single classifier head) is 67.665%, and the Macro F1 score is 55.726%, indicating weak interpretability. After introducing a multi-level classifier head and loss mask, the model (multi-level classification model) accuracy improves to 76.901%, and the Macro F1 score improves to 65.005%, demonstrating enhanced interpretability. By introducing sentiment pre-trained model transfer learning on top of the multi-level classification model, the model performance is further improved. The model (multi-level sentiment transfer learning classification) achieves an accuracy of 82.247% and a Macro F1 score of 81.672%, reaching a strong level of interpretability, while also maintaining a stable performance of 76.739% on the Micro F1 score.
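For readers comparing the accuracy and Macro F1 figures, the following sketch shows how the two metrics are computed and why they can diverge on imbalanced data. The labels and predictions are invented toy values, not experimental data, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example: the single "neg" sample is misclassified as "pos".
y_true = ["pos", "pos", "pos", "pos", "neu", "neg"]
y_pred = ["pos", "pos", "pos", "pos", "neu", "pos"]

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
```

Here five of six predictions are correct, yet the per-class F1 of the minority class is 0, which pulls Macro F1 well below accuracy. This is why Macro F1 is the more sensitive indicator when some classes, such as negative emotion, have few samples.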
[0128] The above results fully demonstrate that the hierarchical emotional classification model of mental attention tendency (multi-level emotional transfer learning classification model) trained according to the scheme of this application has high interpretability and practical reliability. It has stronger implementation ability and promotion value in real clinical and educational intervention scenarios, and can effectively improve the auxiliary analysis ability in the process of psychological assessment and treatment.
[0129] Based on the same inventive concept, one embodiment of this application provides a training device for a hierarchical emotion classification model of mental attention tendency, the device comprising:
[0130] The acquisition module is used to acquire parent-child dialogue texts carrying hierarchical tags. The parent-child dialogue texts include parent speech texts and child speech texts. The root tag in the hierarchical tags indicates whether the parent-child dialogue texts have a mental focus tendency. The second tag in the hierarchical tags indicates the speech category of parent-child dialogue texts that do not have a mental focus tendency. The third tag in the hierarchical tags indicates the emotional category of parent-child dialogue texts that have a mental focus tendency or the emotional category of parent-child dialogue texts that do not have a mental focus tendency.
[0131] The construction module is used to connect multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, and construct a hierarchical task classification module to be trained.
[0132] The processing module is used to process the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, so as to obtain the hierarchical task classification prediction result of the parent-child dialogue text.
[0133] The update module is used to update the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction results and the hierarchical labels carried by the parent-child dialogue text, so as to obtain the trained hierarchical task classification module.
[0134] The trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on the target parent-child dialogue text: identifying whether the target parent-child dialogue text has a mental focus tendency, identifying the discourse category of the target parent-child dialogue text when the target parent-child dialogue text does not have a mental focus tendency, and identifying the emotion category of the target parent-child dialogue text.
[0135] In one alternative implementation, the construction module includes:
[0136] The first acquisition submodule is used to obtain three classification task heads with the same structure. Each of the three classification task heads includes: a fully connected layer, a Dropout layer, and an output projection layer.
[0137] The first random initialization submodule is used to randomly initialize the model parameters of the three classification task heads as the psychological state classification task head to be trained, the non-psychological discourse classification task head to be trained, and the emotion classification task head to be trained.
[0138] The first connection submodule is used to connect the output of the mental state classification task head to be trained to the input of the non-mental discourse classification task head to be trained and to the input of the emotion classification task head to be trained.
[0139] The second connection submodule is used to connect the output of the non-psychological discourse classification task head to be trained to the input of the emotion classification task head to be trained.
[0140] In one alternative implementation, the construction module includes:
[0141] The second acquisition submodule is used to obtain a pre-trained sentiment classification model. According to the structure of the pre-trained sentiment classification head in the pre-trained sentiment classification model, a first classification task head and a second classification task head with the same structure are obtained. The pre-trained sentiment classification head, the first classification task head and the second classification task head all include: a fully connected layer, a Dropout layer and an output projection layer.
[0142] The second random initialization submodule is used to randomly initialize the model parameters of the first classification task head and the second classification task head, so as to serve as the mental state classification task head to be trained and the non-mental discourse classification task head to be trained.
[0143] The knowledge transfer submodule is used to initialize the emotion classification task head to be trained by utilizing the model parameters of the pre-trained emotion classification head in the pre-trained emotion classification model.
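The initialization described by these three submodules can be sketched as follows: two heads are randomly initialized, and the emotion head copies the parameters of a pre-trained head with the same structure. The dictionary layout, the Gaussian initialization, the toy dimensions, and the omission of bias terms are simplifications assumed for this sketch; the "pre-trained" head here is itself a random stand-in.

```python
import copy
import random

def random_head(d, hidden, num_classes, rng):
    """Head with a fully connected layer (d -> hidden) and a projection (hidden -> classes)."""
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]
    return {"fc": mat(hidden, d), "proj": mat(num_classes, hidden)}

def build_heads(pretrained_sent_head, d, hidden, rng):
    """Random init for the root and type heads; copied parameters for the emotion head."""
    return {
        "root": random_head(d, hidden, 2, rng),
        "type": random_head(d, hidden, 3, rng),
        "sent": copy.deepcopy(pretrained_sent_head),
    }

rng = random.Random(0)
pretrained = random_head(8, 4, 3, random.Random(42))  # stand-in for a real pre-trained head
heads = build_heads(pretrained, d=8, hidden=4, rng=rng)
```

The deep copy keeps the source head untouched, so later updates to `heads["sent"]` during training do not modify the pre-trained parameters they were initialized from.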
[0144] In one alternative implementation, the update module includes:
[0145] The first loss calculation submodule is used to determine the loss of the mental state classification task based on the difference between the mental state classification prediction result output by the mental state classification task head to be trained for the parent-child dialogue text and the root layer label in the hierarchical label carried by the parent-child dialogue text.
[0146] The second loss calculation submodule is used to determine the non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the parent-child dialogue text and the second layer label in the hierarchical label carried by the parent-child dialogue text.
[0147] The third loss calculation submodule is used to determine the emotion classification task loss based on the difference between the emotion classification prediction result output by the emotion classification task head to be trained for the parent-child dialogue text and the third layer label in the hierarchical label carried by the parent-child dialogue text.
[0148] The total loss calculation submodule is used to obtain the total loss based on the psychological state classification task loss, the non-psychological discourse classification task loss, and the emotion classification task loss.
[0149] The model parameter update submodule is used to update the model parameters of the fully connected layers, Dropout layers, and output projection layers of the mental state classification task head, the non-mental discourse classification task head, and the emotion classification task head to be trained, based on the total loss.
[0150] In one alternative implementation, the processing module includes:
[0151] The parent-child dialogue text processing submodule is used to input the parent-child dialogue text into a pre-trained large language model to obtain the semantic feature vector of the parent-child dialogue text.
[0152] The first task feature acquisition submodule is used to process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the mental state classification task head to be trained, so as to obtain the mental state classification task features of the parent-child dialogue text.
[0153] The first classification prediction submodule is used to process the psychological state classification task features of the parent-child dialogue text through the output projection layer in the psychological state classification task head to be trained, so as to obtain the psychological state classification prediction result of the parent-child dialogue text.
[0154] The second task feature acquisition submodule is used to process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the non-psychological discourse classification task head to be trained, when the psychological state classification prediction result of the parent-child dialogue text is that there is no mental attention tendency, so as to obtain the non-psychological discourse classification task features of the parent-child dialogue text.
[0155] The second classification prediction submodule is used to process the non-psychological discourse classification task features of the parent-child dialogue text through the output projection layer in the non-psychological discourse classification task head to be trained, so as to obtain the non-psychological discourse classification prediction result of the parent-child dialogue text.
[0156] The third task feature acquisition submodule is used to process the semantic feature vector of the parent-child dialogue text through the fully connected layer and Dropout layer in the emotion classification task head to be trained, so as to obtain the emotion classification task features of the parent-child dialogue text.
[0157] The third classification prediction submodule is used to process the sentiment classification task features of the parent-child dialogue text through the output projection layer in the sentiment classification task head to be trained, so as to obtain the sentiment classification prediction result of the parent-child dialogue text.
[0158] In one alternative embodiment, the device further includes:
[0159] The first configuration module is used to configure the loss mask value of the non-psychological discourse classification task to 1 for a parent-child dialogue text when the first layer of the hierarchical labels carried by the parent-child dialogue text indicates that the parent-child dialogue text does not have a mental focus tendency.
[0160] The second configuration module is used to configure the loss mask value of the non-psychological discourse classification task to 0 for a parent-child dialogue text when the first layer of the hierarchical labels carried by the parent-child dialogue text indicates that the parent-child dialogue text has a mental focus tendency.
[0161] The effective text identification module is used to identify parent-child dialogue texts with a loss mask value of 1 as effective parent-child dialogue texts for non-psychological discourse classification tasks.
[0162] The second loss calculation submodule includes:
[0163] The second single loss calculation subunit is used to determine the non-psychological discourse classification task loss of the effective parent-child dialogue text based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the effective parent-child dialogue text and the second layer label in the hierarchical label carried by the effective parent-child dialogue text.
[0164] The second averaging subunit is used to take the average of the non-psychological discourse classification task losses of multiple valid parent-child dialogue texts as the non-psychological discourse classification task loss.
[0165] In one alternative embodiment, the device further includes:
[0166] The target parent-child dialogue text processing module is used to input the target parent-child dialogue text into the pre-trained large language model to obtain the semantic feature vector of the target parent-child dialogue text.
[0167] The input module is used to input the semantic feature vector of the target parent-child dialogue text into the trained hierarchical task classification module;
[0168] The psychological state classification prediction module is used to process the semantic feature vector of the target parent-child dialogue text through the trained psychological state classification task head in the trained hierarchical task classification module to obtain the psychological state classification prediction result of the target parent-child dialogue text.
[0169] The sentiment classification prediction module is used to process the semantic feature vector of the target parent-child dialogue text by means of the trained sentiment classification task head in the trained hierarchical task classification module when the psychological state classification prediction result of the target parent-child dialogue text indicates that the target parent-child dialogue text has a mental attention tendency, so as to obtain the sentiment classification prediction result of the target parent-child dialogue text.
[0170] The non-psychological discourse and emotion classification prediction module is used to process the semantic feature vector of the target parent-child dialogue text by means of the trained non-psychological discourse classification task head in the trained hierarchical task classification module, when the psychological state classification prediction result of the target parent-child dialogue text indicates that the target parent-child dialogue text does not have a mental focus tendency, to obtain the non-psychological discourse classification prediction result of the target parent-child dialogue text; and to process the semantic feature vector of the target parent-child dialogue text by means of the trained emotion classification task head in the trained hierarchical task classification module, to obtain the emotion classification prediction result of the target parent-child dialogue text.
[0171] Based on the same inventive concept, an embodiment of this application provides an electronic device. Referring to Figure 4, which is a schematic diagram of an electronic device according to an embodiment of this application, the electronic device 100 includes a memory 110 and a processor 120. The memory 110 and the processor 120 are communicatively connected via a bus. The memory 110 stores a program or instructions that can be executed on the processor 120 to implement the steps in the training method of the hierarchical emotion classification model of mental attention tendency described in any of the above embodiments of this application.
[0172] Based on the same inventive concept, this disclosure also provides a readable storage medium storing a program or instructions that, when executed by a processor, implement the steps in the training method of the hierarchical emotion classification model of mental attention tendency described in any of the above embodiments of this application.
[0173] The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk.
[0174] Based on the same inventive concept, this disclosure also provides a computer program product, including a computer program that, when executed by a processor of a computer device, is capable of performing the steps in the training method of the hierarchical emotion classification model of mental attention tendency described in any of the above embodiments of this application.
[0175] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.
[0176] Those skilled in the art will understand that embodiments of this application can be provided as methods, apparatus, or computer program products. Therefore, embodiments of this application can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, embodiments of this application can take the form of computer program products implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0177] This application describes embodiments with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of this application. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing terminal device, produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0178] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0179] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device, causing a series of operational steps to be performed on the computer or other programmable terminal device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0180] Although preferred embodiments of the present application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present application.
[0181] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.
[0182] The training method and apparatus for a hierarchical emotion classification model of mental attention tendency provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A training method for a hierarchical emotion classification model of mental attention tendency, characterized in that the method comprises:
obtaining parent-child dialogue texts carrying hierarchical labels, wherein the parent-child dialogue texts include parent speech texts and child speech texts; the root-level label in the hierarchical labels indicates whether a parent-child dialogue text has a mental attention tendency; the second-level label in the hierarchical labels indicates the discourse category of a parent-child dialogue text that does not have a mental attention tendency; and the third-level label in the hierarchical labels indicates the emotion category of a parent-child dialogue text that has a mental attention tendency or the emotion category of a parent-child dialogue text that does not have a mental attention tendency;
connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, to construct a hierarchical task classification module to be trained;
processing the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, to obtain a hierarchical task classification prediction result of the parent-child dialogue text;
updating the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction result and the hierarchical labels carried by the parent-child dialogue text, to obtain a trained hierarchical task classification module;
wherein the trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on a target parent-child dialogue text: identifying whether the target parent-child dialogue text has a mental attention tendency, identifying the discourse category of the target parent-child dialogue text when it does not have a mental attention tendency, and identifying the emotion category of the target parent-child dialogue text.
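As an illustrative, non-limiting sketch, the three-level label carried by each parent-child dialogue text could be represented as below. The class name `HierarchicalLabel`, its field names, and the example category strings are hypothetical; the only constraint encoded is the one stated in the claim, namely that the second-level (discourse) label exists only for texts without a mental attention tendency, while the third-level (emotion) label exists on both branches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HierarchicalLabel:
    has_tendency: bool         # root-level label
    discourse: Optional[str]   # second-level label, only when has_tendency is False
    emotion: str               # third-level label, present on both branches

    def __post_init__(self) -> None:
        # Enforce the hierarchy: a discourse category is meaningful only
        # for texts that do not have a mental attention tendency.
        if self.has_tendency and self.discourse is not None:
            raise ValueError("discourse category applies only without a tendency")
        if not self.has_tendency and self.discourse is None:
            raise ValueError("texts without a tendency need a discourse category")


# Hypothetical category names, for illustration only.
with_tendency = HierarchicalLabel(has_tendency=True, discourse=None, emotion="positive")
without_tendency = HierarchicalLabel(has_tendency=False, discourse="instruction", emotion="neutral")
print(with_tendency)
print(without_tendency)
```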
2. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 1, characterized in that connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, to construct a hierarchical task classification module to be trained, comprises:
obtaining three classification task heads with identical structures, each of which includes a fully connected layer, a Dropout layer, and an output projection layer;
randomly initializing the model parameters of the three classification task heads, to serve respectively as the psychological state classification task head to be trained, the non-psychological discourse classification task head to be trained, and the emotion classification task head to be trained;
connecting the output of the psychological state classification task head to be trained to the input of the non-psychological discourse classification task head to be trained and to the input of the emotion classification task head to be trained;
connecting the output of the non-psychological discourse classification task head to be trained to the input of the emotion classification task head to be trained.
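As an illustrative, non-limiting sketch, a classification task head made of a fully connected layer, a Dropout layer, and an output projection layer could look as follows in plain Python. Several details are assumptions that the claim does not specify: the ReLU activation after the fully connected layer, the omission of bias terms, the Gaussian initialization scale, the dropout rate, and the numbers of discourse and emotion classes. The names `TaskHead` and `build_heads` are hypothetical, and "identical structure" is read here as the same layer types and hidden size, with the output width set per task.

```python
import random
from typing import Dict, List

Matrix = List[List[float]]


def init_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Random initialization (the scale 0.02 is an assumption)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class TaskHead:
    """Fully connected layer -> Dropout -> output projection layer."""

    def __init__(self, in_dim: int, hidden_dim: int, n_classes: int,
                 p_drop: float = 0.1, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.fc = init_matrix(hidden_dim, in_dim, self.rng)
        self.proj = init_matrix(n_classes, hidden_dim, self.rng)
        self.p_drop = p_drop

    def forward(self, x: List[float], training: bool = False) -> List[float]:
        # Fully connected layer; the ReLU activation is an assumption.
        h = [max(0.0, v) for v in matvec(self.fc, x)]
        if training:
            # Inverted dropout: zero units with probability p_drop, rescale the rest.
            keep = 1.0 - self.p_drop
            h = [v / keep if self.rng.random() < keep else 0.0 for v in h]
        # Output projection layer returns one logit per class.
        return matvec(self.proj, h)


def build_heads(in_dim: int, hidden_dim: int,
                n_discourse: int = 4, n_emotion: int = 3) -> Dict[str, TaskHead]:
    """Three randomly initialized heads; the class counts are hypothetical."""
    return {
        "psychological_state": TaskHead(in_dim, hidden_dim, 2, seed=1),
        "non_psychological_discourse": TaskHead(in_dim, hidden_dim, n_discourse, seed=2),
        "emotion": TaskHead(in_dim, hidden_dim, n_emotion, seed=3),
    }


heads = build_heads(in_dim=4, hidden_dim=8)
features = [0.1, -0.2, 0.3, 0.4]
print({name: len(head.forward(features)) for name, head in heads.items()})
```

Dropout is active only when `training=True`, so repeated evaluation-mode calls on the same input return the same logits.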
3. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 1, characterized in that connecting multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, to construct a hierarchical task classification module to be trained, comprises:
obtaining a pre-trained sentiment classification model;
obtaining, according to the structure of the pre-trained sentiment classification head in the pre-trained sentiment classification model, a first classification task head and a second classification task head with the same structure, wherein the pre-trained sentiment classification head, the first classification task head, and the second classification task head each include a fully connected layer, a Dropout layer, and an output projection layer;
randomly initializing the model parameters of the first classification task head and the second classification task head, to serve respectively as the psychological state classification task head to be trained and the non-psychological discourse classification task head to be trained;
initializing the emotion classification task head to be trained with the model parameters of the pre-trained sentiment classification head in the pre-trained sentiment classification model.
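As an illustrative, non-limiting sketch, the initialization described in this claim could be expressed as below: two heads receive random parameters, while the emotion head starts from a copy of the pre-trained sentiment head's parameters. The parameter layout (a dictionary with "fc" and "proj" weight matrices), the function names, the initialization scale, and the class counts are all hypothetical; the pre-trained head here is a random stand-in rather than a real model.

```python
import copy
import random
from typing import Dict, List

Params = Dict[str, List[List[float]]]  # assumed layout: {"fc": ..., "proj": ...}


def random_params(in_dim: int, hidden_dim: int, n_classes: int,
                  rng: random.Random) -> Params:
    """Randomly initialized head parameters (the scale 0.02 is an assumption)."""
    return {
        "fc": [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(hidden_dim)],
        "proj": [[rng.gauss(0.0, 0.02) for _ in range(hidden_dim)] for _ in range(n_classes)],
    }


def build_heads_from_pretrained(pretrained_head: Params, n_discourse: int,
                                rng: random.Random) -> Dict[str, Params]:
    # Mirror the layer sizes of the pre-trained sentiment classification head.
    hidden_dim = len(pretrained_head["fc"])
    in_dim = len(pretrained_head["fc"][0])
    return {
        "psychological_state": random_params(in_dim, hidden_dim, 2, rng),
        "non_psychological_discourse": random_params(in_dim, hidden_dim, n_discourse, rng),
        # Deep copy so that later updates do not modify the pre-trained model.
        "emotion": copy.deepcopy(pretrained_head),
    }


rng = random.Random(0)
pretrained = random_params(in_dim=4, hidden_dim=8, n_classes=3, rng=rng)  # stand-in
heads = build_heads_from_pretrained(pretrained, n_discourse=4, rng=rng)
print({name: (len(p["fc"]), len(p["proj"])) for name, p in heads.items()})
```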
4. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 2 or 3, characterized in that updating the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction result and the hierarchical labels carried by the parent-child dialogue text comprises:
determining a psychological state classification task loss based on the difference between the psychological state classification prediction result output by the psychological state classification task head to be trained for the parent-child dialogue text and the root-level label in the hierarchical labels carried by the parent-child dialogue text;
determining a non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the parent-child dialogue text and the second-level label in the hierarchical labels carried by the parent-child dialogue text;
determining an emotion classification task loss based on the difference between the emotion classification prediction result output by the emotion classification task head to be trained for the parent-child dialogue text and the third-level label in the hierarchical labels carried by the parent-child dialogue text;
obtaining a total loss based on the psychological state classification task loss, the non-psychological discourse classification task loss, and the emotion classification task loss;
updating, based on the total loss, the model parameters of the fully connected layers, Dropout layers, and output projection layers of the psychological state classification task head, the non-psychological discourse classification task head, and the emotion classification task head to be trained.
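As an illustrative, non-limiting sketch, the three task losses and their combination could be computed as below. The claim states only that each loss reflects the difference between a prediction and a label and that the total loss is obtained from the three task losses; the use of cross-entropy and of a weighted sum (with unit weights by default) are assumptions, and the function names and example logits are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability of the target class (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def total_loss(state_loss: float, discourse_loss: float, emotion_loss: float,
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three task losses (the weighting is an assumption)."""
    w_state, w_discourse, w_emotion = weights
    return w_state * state_loss + w_discourse * discourse_loss + w_emotion * emotion_loss


# Hypothetical logits for one text labeled: no tendency, discourse class 2, emotion class 0.
state = cross_entropy([0.3, 1.2], target=1)
discourse = cross_entropy([0.1, 0.0, 2.0, -1.0], target=2)
emotion = cross_entropy([1.5, 0.2, -0.3], target=0)
print(total_loss(state, discourse, emotion))
```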
5. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 4, characterized in that processing the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, to obtain a hierarchical task classification prediction result of the parent-child dialogue text, comprises:
inputting the parent-child dialogue text into the pre-trained large language model to obtain the semantic feature vector of the parent-child dialogue text;
processing the semantic feature vector of the parent-child dialogue text through the fully connected layer and the Dropout layer in the psychological state classification task head to be trained, to obtain the psychological state classification task features of the parent-child dialogue text;
processing the psychological state classification task features of the parent-child dialogue text through the output projection layer in the psychological state classification task head to be trained, to obtain the psychological state classification prediction result of the parent-child dialogue text;
when the psychological state classification prediction result of the parent-child dialogue text indicates no mental attention tendency, processing the semantic feature vector of the parent-child dialogue text through the fully connected layer and the Dropout layer in the non-psychological discourse classification task head to be trained, to obtain the non-psychological discourse classification task features of the parent-child dialogue text;
processing the non-psychological discourse classification task features of the parent-child dialogue text through the output projection layer in the non-psychological discourse classification task head to be trained, to obtain the non-psychological discourse classification prediction result of the parent-child dialogue text;
processing the semantic feature vector of the parent-child dialogue text through the fully connected layer and the Dropout layer in the emotion classification task head to be trained, to obtain the emotion classification task features of the parent-child dialogue text;
processing the emotion classification task features of the parent-child dialogue text through the output projection layer in the emotion classification task head to be trained, to obtain the emotion classification prediction result of the parent-child dialogue text.
6. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 4, characterized in that the method further comprises:
when the root-level label in the hierarchical labels carried by a parent-child dialogue text indicates that the parent-child dialogue text does not have a mental attention tendency, setting the loss mask value of the non-psychological discourse classification task for that parent-child dialogue text to 1;
when the root-level label in the hierarchical labels carried by a parent-child dialogue text indicates that the parent-child dialogue text has a mental attention tendency, setting the loss mask value of the non-psychological discourse classification task for that parent-child dialogue text to 0;
identifying parent-child dialogue texts with a loss mask value of 1 as valid parent-child dialogue texts for the non-psychological discourse classification task;
wherein determining the non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for the parent-child dialogue text and the second-level label in the hierarchical labels carried by the parent-child dialogue text comprises:
determining, for each valid parent-child dialogue text, its non-psychological discourse classification task loss based on the difference between the non-psychological discourse classification prediction result output by the non-psychological discourse classification task head to be trained for that valid parent-child dialogue text and the second-level label in the hierarchical labels carried by that valid parent-child dialogue text;
taking the average of the non-psychological discourse classification task losses of the multiple valid parent-child dialogue texts as the non-psychological discourse classification task loss.
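As an illustrative, non-limiting sketch, the loss mask and the averaging over valid parent-child dialogue texts could be computed as below. The function names are hypothetical, the per-text losses are made-up numbers, and returning 0.0 for a batch with no valid text is an assumption that the claim does not address.

```python
from typing import List, Sequence


def loss_mask(has_tendency: Sequence[bool]) -> List[int]:
    """Mask is 1 for texts without a mental attention tendency, else 0."""
    return [0 if tendency else 1 for tendency in has_tendency]


def masked_mean_loss(per_text_losses: Sequence[float], mask: Sequence[int]) -> float:
    """Average the discourse loss over valid texts only (mask value 1)."""
    valid = [loss for loss, m in zip(per_text_losses, mask) if m == 1]
    if not valid:
        # Assumption: a batch without any valid text contributes no discourse loss.
        return 0.0
    return sum(valid) / len(valid)


# Root-level labels for a batch of three texts: only the first has a tendency.
mask = loss_mask([True, False, False])
print(mask)
print(masked_mean_loss([5.0, 1.0, 3.0], mask))
```

In this example the first text is excluded, so the discourse loss is the mean of the remaining two values rather than of all three.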
7. The training method for the hierarchical emotion classification model of mental attention tendency according to claim 1, characterized in that the method further comprises:
inputting the target parent-child dialogue text into the pre-trained large language model to obtain the semantic feature vector of the target parent-child dialogue text;
inputting the semantic feature vector of the target parent-child dialogue text into the trained hierarchical task classification module;
processing the semantic feature vector of the target parent-child dialogue text through the trained psychological state classification task head in the trained hierarchical task classification module, to obtain the psychological state classification prediction result of the target parent-child dialogue text;
when the psychological state classification prediction result indicates that the target parent-child dialogue text has a mental attention tendency, processing the semantic feature vector of the target parent-child dialogue text through the trained emotion classification task head in the trained hierarchical task classification module, to obtain the emotion classification prediction result of the target parent-child dialogue text;
when the psychological state classification prediction result indicates that the target parent-child dialogue text does not have a mental attention tendency, processing the semantic feature vector of the target parent-child dialogue text through the trained non-psychological discourse classification task head in the trained hierarchical task classification module, to obtain the non-psychological discourse classification prediction result of the target parent-child dialogue text, and processing the semantic feature vector of the target parent-child dialogue text through the trained emotion classification task head in the trained hierarchical task classification module, to obtain the emotion classification prediction result of the target parent-child dialogue text.
8. A training device for a hierarchical emotion classification model of mental attention tendency, characterized in that the device comprises:
an acquisition module, used to acquire parent-child dialogue texts carrying hierarchical labels, wherein the parent-child dialogue texts include parent speech texts and child speech texts; the root-level label in the hierarchical labels indicates whether a parent-child dialogue text has a mental attention tendency; the second-level label in the hierarchical labels indicates the discourse category of a parent-child dialogue text that does not have a mental attention tendency; and the third-level label in the hierarchical labels indicates the emotion category of a parent-child dialogue text that has a mental attention tendency or the emotion category of a parent-child dialogue text that does not have a mental attention tendency;
a construction module, used to connect multiple classification heads to be trained according to the hierarchical relationship represented by the hierarchical labels, to construct a hierarchical task classification module to be trained;
a processing module, used to process the parent-child dialogue text through a pre-trained large language model and the hierarchical task classification module to be trained, to obtain a hierarchical task classification prediction result of the parent-child dialogue text;
an update module, used to update the model parameters of the hierarchical task classification module to be trained based on the hierarchical task classification prediction result and the hierarchical labels carried by the parent-child dialogue text, to obtain a trained hierarchical task classification module;
wherein the trained hierarchical task classification module and the pre-trained large language model are used to perform hierarchical task classification on a target parent-child dialogue text: identifying whether the target parent-child dialogue text has a mental attention tendency, identifying the discourse category of the target parent-child dialogue text when it does not have a mental attention tendency, and identifying the emotion category of the target parent-child dialogue text.
9. An electronic device, characterized in that the electronic device comprises a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the training method for the hierarchical emotion classification model of mental attention tendency according to any one of claims 1-7.
10. A readable storage medium, characterized in that a program or instructions are stored on the readable storage medium, and the program or instructions, when executed by a processor, implement the steps of the training method for the hierarchical emotion classification model of mental attention tendency according to any one of claims 1-7.