A social media-based mental state detection method, device and equipment

By constructing a directed graph structure and introducing causal constraints and time interval perception mechanisms, the problem of imprecise characterization of users' long-term psychological dynamics in existing technologies has been solved, and more accurate mental state detection has been achieved.

CN122201776A · Pending · Publication Date: 2026-06-12 · LANZHOU UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: LANZHOU UNIV
Filing Date: 2026-03-16
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing technologies struggle to accurately depict users' long-term psychological dynamics, neglecting the semantic interactions and dependencies between posts, and failing to adequately model the evolution of psychological states.

Method used

A directed graph structure is constructed, and post-level semantic embedding vectors are obtained through a pre-trained language model. Information is then propagated in the graph neural network, and causal constraints and time interval awareness mechanisms are introduced. A Transformer encoder is used for self-attention adjustment to output user-level semantic representations.

Benefits of technology

Explicitly characterizing the semantic interactions between posts and the structural relationships of user behavior enhances the ability to finely depict users' long-term psychological dynamics and improves the accuracy and reliability of mental state detection.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a social media-based mental state detection method, device and equipment, and belongs to the field of state prediction. The method comprises the following steps: obtaining social media data, generating a user post sequence and inputting the user post sequence into a pre-trained language model to obtain a corresponding post-level semantic embedding vector; constructing a directed graph structure for each user, taking each post as a graph node, and establishing a directed edge only in time sequence from a past node to a current node; generating an edge weight according to the semantic similarity and time interval between nodes; inputting the directed graph into a graph neural network to output a node-level evolution embedding sequence that fuses causal constraints and time semantics; inputting the node-level evolution embedding sequence into a time sequence encoder and introducing a causal attention mask and a time interval perception mechanism, and outputting a user-level semantic representation after pooling; and outputting a mental state prediction result through a classification layer based on the user-level semantic representation, so that the accuracy and reliability of the prediction are improved.

Description

Technical Field

[0001] This invention belongs to the field of state prediction, specifically relating to a method, apparatus, and device for detecting mental state based on social media.

Background Technology

[0002] In recent years, abnormal mental states have become a significant challenge in global public health. Traditional methods relying on clinical interviews and scale assessments are insufficient for large-scale, continuous screening. Meanwhile, social media has become a crucial platform for people to express emotions, record their lives, and engage in social interaction, generating a wealth of textual and behavioral data that contains rich clues about psychological states. Modeling and analyzing social media data holds promise for identifying individuals at risk of abnormal mental states in a non-invasive and low-cost manner, thus compensating for limitations caused by insufficient professional medical resources and low consultation rates. Research on detecting users' mental states from social media therefore has theoretical value, contributing to a deeper understanding of the relationship between mental states and language behavior, and practical value, providing technical support for early warning, precise intervention, and public mental health governance.

[0003] In the early stages of social media mental health research, limited by data scale and modeling capabilities, studies primarily employed statistical feature-based methods to model users' psychological states. These methods relied on manually designed language and behavioral features, such as the proportion of emotional words, frequency of negative emotion words, usage rate of first-person pronouns, syntactic complexity, and posting time distribution, combined with traditional machine learning models (such as SVM, logistic regression, and random forest) for classification. Their advantage lay in the intuitive semantics and strong interpretability of the features, making them suitable for small-scale data analysis. However, this paradigm heavily depended on human experience, struggled to capture implicit semantics and complex psychological expressions, and had limited generalization ability to linguistic diversity and cross-platform scenarios, gradually failing to meet the needs of refined modeling. With the rise of deep learning in natural language processing, research gradually shifted from manual feature creation to automatic representation learning, proposing methods based on long-sequence representations. These methods concatenate all posts published by a user over a period of time into a long text sequence and use models such as CNN, RNN, or Transformer for end-to-end encoding to directly learn user-level semantic representations. The core idea was to capture users' long-term language usage patterns through holistic modeling, reducing reliance on manual feature design. Although this method improves performance compared to traditional methods, it often requires text truncation due to model input length limitations, resulting in the loss of some key information. At the same time, the structural relationships and temporal characteristics between posts are weakened, making it difficult to depict the dynamic evolution of psychological states.

[0004] To overcome the limitations of long sequence modeling, existing research has gradually developed hierarchical semantic fusion methods, which have become the mainstream paradigm in the field of mental state recognition. These methods typically follow a hierarchical "post-level to user-level" modeling approach, first semantically encoding individual posts independently, and then fusing the post representations through pooling, attention, or temporal models to construct a user-level semantic representation. However, existing fusion strategies largely still exhibit "brute-force" aggregation, integrating information through simple weighting, concatenation, or sequential encoding, failing to explicitly characterize the semantic interactions and dependencies between different posts. Furthermore, most methods treat posts as independent sequential units, ignoring the potential structural connections and interaction patterns between user behaviors, making it difficult to fully model the complex behavioral relationships during the evolution of mental states. This, to some extent, limits the model's ability to finely characterize the long-term psychological dynamics of users.

Summary of the Invention

[0005] To address the problem that existing technologies lack the ability to accurately depict users' long-term psychological dynamics, this invention provides a method, apparatus, and device for detecting mental state based on social media.

[0006] To achieve the above objective, the present invention provides the following technical solution: a method for detecting mental state based on social media, the method comprising:
obtaining and preprocessing users' historical posting data on social media platforms to generate user post sequences;
inputting each post in the user post sequence into a pre-trained language model to obtain the corresponding post-level semantic embedding vector;
constructing a directed graph structure for each user, with each post-level semantic embedding vector as a graph node, and establishing directed edges only in time order, from an earlier node to a later node, to impose causal constraints; generating edge weights based on the semantic similarity between nodes and the time interval;
inputting the directed graph into a graph neural network, propagating information along the direction of the directed edges through the information propagation mechanism, aggregating the information of neighbor nodes, and outputting a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics;
inputting the node-level evolutionary embedding sequence into the temporal encoder in chronological order, wherein the temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism and, while attending only to historical information, adaptively adjusts the attention weights according to the time interval between posts; encoding the node-level evolutionary embedding sequence into a context-aware node embedding sequence; aggregating the context-aware node embedding sequence to output a user-level semantic representation;
outputting mental state categories through a classification layer based on the user-level semantic representation.

[0007] Optionally, the pre-trained language model is a MentalBERT model pre-trained on a corpus in the mental health domain, and is enhanced through unsupervised contrastive learning; the enhancement training includes: A portion of the corpus is extracted from user posts in each mental state category. For the same post, two post-level semantic embedding vectors are generated using two different Dropout random masks as positive sample pairs, and the remaining embeddings in the same batch are used as negative samples. The SimCSE-style contrastive loss function is used for optimization to improve the model's ability to discriminate semantics related to mental health.

[0008] Optionally, the edge weights of the directed graph are determined by the semantic similarity between nodes and the time interval: the larger the time interval, the smaller the weight, reflecting the stronger influence of recent behavior on psychological state. The edge weight $w_{ij}$ is calculated as follows:

$$w_{ij} = \mathrm{sim}(h_i, h_j) \cdot \exp(-\lambda \, \Delta t_{ij})$$

where $\mathrm{sim}(\cdot,\cdot)$ denotes the cosine similarity between semantic embeddings, $h_i$ and $h_j$ are the post-level semantic embedding vectors of posts $p_i$ and $p_j$, $\Delta t_{ij}$ is the time interval between the publication of the two posts, and $\lambda$ is a preset positive decay hyperparameter.

[0009] Optionally, the graph neural network is a graph attention network, which encodes the user behavior graph through two graph attention layers. In each layer, each node adaptively aggregates information from all its causal predecessor nodes and updates its own representation; after two layers of propagation, the output is a chronologically ordered sequence of node-level evolutionary embeddings $Z = [z_1, z_2, \dots, z_N]$, where $z_i$ is the final embedding of the $i$-th post and $N$ is the total number of user posts.

[0010] Optionally, the temporal encoder is a Transformer encoder, and two improvements are introduced into its self-attention mechanism to model the temporal causality and irregular time intervals of user behavior. The two improvements are:
Causal attention masking: a mask is applied to the attention score matrix so that each position can only attend to itself and its preceding positions, preventing future information from influencing historical representations. The mask matrix $M$ is defined as:

$$M_{ij} = \begin{cases} 0, & j \le i \\ -\infty, & j > i \end{cases}$$

Time interval awareness mechanism: the time interval $\Delta t_{ij}$ between any two posts is mapped to a learnable scalar bias $b_{ij}$, which is superimposed on the attention logits to explicitly characterize the moderating effect of temporal distance on the strength of behavioral association. The bias is calculated and superimposed as follows:

$$b_{ij} = \mathrm{MLP}(\Delta t_{ij})$$

$$\alpha_{ij} = \mathrm{softmax}_j\!\left(\frac{q_i^{\top} k_j}{\sqrt{d}} + b_{ij} + M_{ij}\right)$$

where MLP stands for multilayer perceptron, $\alpha_{ij}$ is the attention weight from position $i$ to position $j$, obtained by normalizing the adjusted attention score with the softmax function, $q_i$ and $k_j$ are the query vector and the key vector, respectively, $M_{ij}$ is the causal mask entry, and $d$ is the vector dimension.
The Transformer encoder encodes the input node-level evolutionary embedding sequence into a context-aware node embedding sequence $C = [c_1, c_2, \dots, c_N]$, where $c_i$ is the context-aware embedding of the $i$-th post.

[0011] Optionally, the user-level semantic representation is obtained by attention pooling of the context-aware node embedding sequence, including: for each context-aware node embedding $c_i$, learning an attention score that reflects the contribution of the corresponding post to the user's overall mental state; then performing a weighted summation of all embeddings to generate a single user-level representation vector. The attention score $a_i$ and the final user-level semantic representation $r$ are calculated as follows:

$$u_i = \tanh(W c_i + b)$$

$$a_i = \frac{\exp(u_i^{\top} v)}{\sum_{j=1}^{N} \exp(u_j^{\top} v)}$$

$$r = \sum_{i=1}^{N} a_i c_i$$

where $W$ and $b$ are learnable linear transformation parameters, $v$ is a learnable context vector, and $N$ is the total number of user posts.

[0012] Optionally, after outputting the mental state prediction results, the method further includes a joint optimization step: performing end-to-end optimization of the model by combining the cross-entropy loss with a supervised contrastive loss to enhance the semantic discriminativeness between different mental state types. The supervised contrastive loss is used to bring user representations of the same mental state category closer together and push user representations of different categories further apart in the user-level representation space, and is defined as follows:

$$\mathcal{L}_{\mathrm{SCL}} = -\sum_{i=1}^{B} \frac{1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(\mathrm{sim}(r_i, r_p)/\tau)}{\sum_{a \ne i} \exp(\mathrm{sim}(r_i, r_a)/\tau)}$$

where $\mathrm{sim}(\cdot,\cdot)$ represents cosine similarity, $\tau$ is a temperature hyperparameter, $B$ is the number of embeddings in a mini-batch, and $(r_i, r_p)$ is a positive sample pair, that is, $P(i)$ contains the other users in the batch that share the mental state category of user $i$.

[0013] A social media-based mental state detection device, the device comprising:
an acquisition module, used to acquire and preprocess users' historical posting data on social media platforms to generate user post sequences;
an extraction module, used to input each post in the user post sequence into a pre-trained language model to obtain the corresponding post-level semantic embedding vector;
a building module, used to construct a directed graph structure for each user, treating each post as a graph node, and establishing directed edges from past nodes to the current node only in chronological order to impose causal constraints, with edge weights generated based on the semantic similarity between nodes and the time interval;
an aggregation module, used to input the directed graph into the graph neural network, propagate information along the direction of the directed edges through the information propagation mechanism, aggregate the information of neighbor nodes, and output a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics;
an output module, used to input the node-level evolutionary embedding sequence into the temporal encoder in chronological order, wherein the temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism and, while attending only to historical information, adaptively adjusts the attention weights according to the time interval between posts; the module encodes the node-level evolutionary embedding sequence into a context-aware node embedding sequence and aggregates the context-aware node embedding sequence to output a user-level semantic representation;
an analysis module, used to output mental state categories through a classification layer based on the user-level semantic representation.

[0014] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the aforementioned social media-based mental state detection method.

[0015] A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the aforementioned social media-based mental state detection method.

[0016] The mental state detection method based on social media provided by this invention has the following beneficial effects: This invention acquires and preprocesses historical user posting data to generate post sequences, and utilizes a pre-trained language model to obtain post-level semantic embedding vectors, providing a foundation for subsequent analysis. A directed graph structure is constructed, establishing directed edges in chronological order and imposing causal constraints. Edge weights are generated based on semantic similarity and time intervals, explicitly characterizing the semantic interactions, dependencies, and potential structural associations and interaction patterns of user behavior between different posts, overcoming the limitations of "brute-force" aggregation and treating posts as independent units. A graph neural network propagates information along the directed edges, outputting a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics, fully modeling the complex behavioral associations in the process of psychological state evolution. A temporal encoder introduces a causal attention mask and a time interval awareness mechanism, constraining the model to focus on historical information and adaptively adjusting the impact of time distance on attention, enhancing the ability to finely characterize users' long-term psychological dynamics. Finally, based on user-level semantic representation, the system outputs mental state prediction results, improving the accuracy and reliability of predictions.

Description of the Drawings

[0017] To more clearly illustrate the embodiments and design schemes of the present invention, the accompanying drawings required for this embodiment will be briefly described below. The drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is a flowchart of a social media-based mental state detection method provided by the present invention according to an exemplary embodiment.

[0019] Figure 2 is a flowchart of a social media user mental state detection method provided by the present invention according to an exemplary embodiment.

[0020] Figure 3 is a block diagram of a social media-based mental state detection device provided by the present invention according to an exemplary embodiment.

Detailed Implementation

[0021] To enable those skilled in the art to better understand and implement the technical solutions of the present invention, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. The following embodiments are only used to more clearly illustrate the technical solutions of the present invention and should not be construed as limiting the scope of protection of the present invention.

[0022] The technical solutions provided by the various embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0023] First, this invention provides a method for detecting mental state based on social media. As shown in Figure 1, the method includes the following steps:

S101. Obtain and preprocess the user's historical posting data on social media platforms to generate a sequence of user posts.

[0024] In this stage, users' social media data is cleaned and standardized to remove noisy data and extract valid user information.

[0025] For example, one or more of the following preprocessing operations can be performed: Active user filtering: Remove users with fewer than 50 posts per year or more than 100,000 posts per year to avoid noise.

[0026] Text cleanup: Remove non-alphabetic symbols such as emojis, numbers, punctuation marks, and hyperlinks.

[0027] Spelling standardization: Correcting or removing irregular expressions and elongated words, such as "goooooood" and "helllllp".

[0028] Lowercase conversion and length filtering: Convert all words to lowercase, keeping only words with a length between three and fifteen characters.

[0029] Abbreviation expansion and stop word removal: Expand common abbreviations, such as "i'm" → "i am", "can't" → "can not", and remove frequent stop words.
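The preprocessing operations listed above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the applicant's implementation. The function names, the abbreviation table (limited to the two examples given), and the stop-word list are hypothetical. Two choices are assumptions: abbreviations are expanded before punctuation is stripped so that the apostrophe is still present, and elongated words are shortened by collapsing runs of three or more identical letters to two.

```python
import re

# Hypothetical tables: the text gives only two abbreviation examples and no stop-word list.
CONTRACTIONS = {"i'm": "i am", "can't": "can not"}
STOP_WORDS = {"the", "and", "for", "was", "are"}


def is_active_user(posts_per_year):
    """Keep users with at least 50 and at most 100,000 posts per year."""
    return 50 <= posts_per_year <= 100_000


def clean_post(text):
    """Apply the cleanup steps to one post and return the cleaned text."""
    text = text.lower()                                 # lowercase conversion
    text = re.sub(r"https?://\S+", " ", text)           # remove hyperlinks
    for short, full in CONTRACTIONS.items():            # expand abbreviations first
        text = text.replace(short, full)
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)      # collapse elongated letter runs
    text = re.sub(r"[^a-z\s]", " ", text)               # drop emojis, digits, punctuation
    words = [w for w in text.split()
             if 3 <= len(w) <= 15 and w not in STOP_WORDS]  # length and stop-word filter
    return " ".join(words)
```

The run-collapsing rule is only a heuristic: it maps "goooooood" to "good" but "helllllp" to "hellp", so a dictionary-based correction step would be needed to recover "help".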

[0030] S102. Input each post in the user's post sequence into the pre-trained language model to obtain the corresponding post-level semantic embedding vector.

[0031] The pre-trained language model is a MentalBERT model pre-trained on mental health corpora, and it is enhanced through unsupervised contrastive learning. The enhancement training includes: extracting a portion of the corpus from user posts of each mental state category; for the same post, generating two post-level semantic embedding vectors as positive sample pairs through two different Dropout random masks, and using the remaining embeddings in the same batch as negative samples; and optimizing the model with a SimCSE-style contrastive loss function to improve the model's ability to discriminate mental health-related semantics.

[0032] S103. Construct a directed graph structure for each user, using each post-level semantic embedding vector as a graph node, and establish directed edges from the previous node to the next node only in time order to apply causal constraints; generate edge weights based on the semantic similarity and time interval between nodes.

[0033] In this step, the edge weights of the directed graph are determined by the semantic similarity between nodes and the time interval. The larger the time interval, the smaller the weight, in order to reflect the stronger influence of recent behavior on psychological state.

[0034] The edge weight $w_{ij}$ is calculated as follows:

$$w_{ij} = \mathrm{sim}(h_i, h_j) \cdot \exp(-\lambda \, \Delta t_{ij})$$

where $\mathrm{sim}(\cdot,\cdot)$ denotes the cosine similarity between semantic embeddings, $h_i$ and $h_j$ are the post-level semantic embedding vectors of posts $p_i$ and $p_j$, $\Delta t_{ij}$ is the time interval between the publication of the two posts, and $\lambda$ is a preset positive decay hyperparameter.
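As an illustration of step S103, the following sketch builds the causally constrained edge set for one user. It assumes the weight is the product of cosine similarity and an exponential decay `exp(-decay * dt)`; the exact combination, the time unit, and the names `build_causal_edges` and `decay` are assumptions made for this example. Posts are assumed to be sorted by publication time.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_causal_edges(embeddings, timestamps, decay=0.1):
    """Return {(i, j): weight} with an edge i -> j only when post i precedes post j."""
    edges = {}
    for j in range(len(embeddings)):
        for i in range(j):                               # past posts only: causal constraint
            dt = timestamps[j] - timestamps[i]           # time interval between the two posts
            sim = cosine(embeddings[i], embeddings[j])
            edges[(i, j)] = sim * math.exp(-decay * dt)  # larger interval, smaller weight
    return edges
```

Because the inner loop only visits indices `i < j`, no edge ever points from a later post to an earlier one, which is the structural form of the causal constraint.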

[0035] S104. Input the directed graph into a graph neural network, and propagate information along the directed edge direction through the information propagation mechanism to aggregate neighbor node information and output a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics.

[0036] The graph neural network is a graph attention network, which encodes the user behavior graph through two graph attention layers. In each layer, each node adaptively aggregates information from all its causal predecessor nodes and updates its own representation. After two layers of propagation, the output is a chronologically ordered sequence of node-level evolutionary embeddings $Z = [z_1, z_2, \dots, z_N]$, where $z_i$ is the final embedding of the $i$-th post and $N$ is the total number of user posts.
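The propagation over causal predecessors can be sketched with a simplified single-head graph attention layer. This is an illustrative approximation and not the patented network: the scoring follows the standard graph attention form `LeakyReLU(a · [W h_i || W h_j])`, a self-loop is assumed so that the first post has something to attend to, and edge weights, multiple heads, and the output nonlinearity are omitted. The parameters `W` and `a` would be learned in practice; here they are plain arguments.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def causal_gat_layer(H, W, a):
    """One attention layer in which node i aggregates nodes 0..i (predecessors plus itself)."""
    Z = [matvec(W, h) for h in H]                        # shared linear projection
    out = []
    for i, zi in enumerate(Z):
        # Attention logits over causal predecessors j <= i.
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, zi + Z[j])))
                  for j in range(i + 1)]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]       # numerically stable softmax
        total = sum(exps)
        alphas = [e / total for e in exps]
        out.append([sum(al * Z[j][k] for j, al in enumerate(alphas))
                    for k in range(len(zi))])
    return out


def encode_graph(H, layers):
    """Apply the layers in order; `layers` is a list of (W, a) pairs, e.g. two of them."""
    for W, a in layers:
        H = causal_gat_layer(H, W, a)
    return H
```

Since node `i` never reads from nodes with a larger index, changing a later post cannot alter the embedding of an earlier one, whichever parameter values are used.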

[0037] S105. Input the node-level evolutionary embedding sequence into the temporal encoder in chronological order. The temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism; while attending only to historical information, it adaptively adjusts the attention weights according to the time interval between posts. The node-level evolutionary embedding sequence is encoded into a context-aware node embedding sequence, which is then aggregated to output the user-level semantic representation.

[0038] The temporal encoder is a Transformer encoder, and two improvements are introduced into its self-attention mechanism to model the temporal causality and irregular time intervals of user behavior. These two improvements are: Causal attention masking: a mask is applied to the attention score matrix so that each position can only attend to itself and its preceding positions, preventing future information from influencing historical representations. The mask matrix $M$ is defined as:

$$M_{ij} = \begin{cases} 0, & j \le i \\ -\infty, & j > i \end{cases}$$

[0039] By introducing an explicit causal constraint mechanism in the user behavior graph modeling process, this ensures that the information transmission between user behaviors strictly follows a unidirectional influence relationship from the past to the future, thus avoiding reverse interference of future behaviors on the representation of historical psychological states from a structural level.

[0040] Time interval awareness mechanism: the time interval $\Delta t_{ij}$ between any two posts is mapped to a learnable scalar bias $b_{ij}$, which is superimposed on the attention logits to explicitly characterize the moderating effect of temporal distance on the strength of behavioral association. The bias is calculated and superimposed as follows:

$$b_{ij} = \mathrm{MLP}(\Delta t_{ij})$$

$$\alpha_{ij} = \mathrm{softmax}_j\!\left(\frac{q_i^{\top} k_j}{\sqrt{d}} + b_{ij} + M_{ij}\right)$$

where MLP stands for multilayer perceptron, $\alpha_{ij}$ is the attention weight of the post at position $i$ to the post at position $j$, obtained by normalizing the adjusted attention score with the softmax function, $q_i$ and $k_j$ are the query vector and the key vector, respectively, $M_{ij}$ is the causal mask entry (0 for $j \le i$, $-\infty$ otherwise), and $d$ is the vector dimension.
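The two modifications to self-attention can be sketched together in a single-head form. The example below is a sketch under stated assumptions, not the applicant's encoder: the learned MLP that maps a time interval to a scalar bias is represented by an arbitrary callable `bias_fn`, and the causal mask is realized by simply not scoring positions `j > i`, which is equivalent to adding negative infinity before the softmax.

```python
import math


def causal_time_attention(Q, K, V, timestamps, bias_fn):
    """Single-head attention with a causal mask and an additive time-interval bias.

    bias_fn stands in for the learned MLP: it maps a time interval to a scalar.
    Returns (outputs, weights), where weights[i][j] is the attention from i to j.
    """
    n, d = len(Q), len(Q[0])
    outputs, weights = [], []
    for i in range(n):
        logits = []
        for j in range(i + 1):                           # positions j > i are masked out
            dot = sum(q * k for q, k in zip(Q[i], K[j]))
            dt = abs(timestamps[i] - timestamps[j])
            logits.append(dot / math.sqrt(d) + bias_fn(dt))
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]       # numerically stable softmax
        total = sum(exps)
        alpha = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(alpha)
        outputs.append([sum(alpha[j] * V[j][k] for j in range(n))
                        for k in range(len(V[0]))])
    return outputs, weights
```

With a bias function that decreases as the interval grows, for example `lambda dt: -0.1 * dt` as a hand-written stand-in, posts that are closer in time to the current post receive larger attention weights when their content scores are equal.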

[0041] By combining this time interval perception mechanism, the moderating effect of uneven time intervals between different behaviors on the intensity of psychological state is explicitly characterized during the semantic fusion process, making the learning process of user-level semantic representation more in line with the objective laws of emotional accumulation and decay in psychodynamics.

[0042] The Transformer encoder encodes the input node-level evolutionary embedding sequence into a context-aware node embedding sequence $C = [c_1, c_2, \dots, c_N]$, where $c_i$ is the context-aware embedding of the $i$-th post.

[0043] In constructing and learning efficient user-level semantic representations, this step explicitly introduces causal constraints on user behavior evolution to ensure that information transmission between behaviors strictly follows a unidirectional influence relationship from the past to the future. At the same time, it incorporates the uneven time intervals between user behaviors into the modeling process to characterize the differentiated impact of different behaviors on the evolution of psychological states. Furthermore, it enhances the discriminativeness and stability of user-level semantic representations through a contrastive learning mechanism, thereby achieving accurate modeling and reliable identification of the user's psychological state evolution process.

[0044] User-level semantic representations are obtained by attention pooling of the context-aware node embedding sequence.

[0045] For example, for each context-aware node embedding $c_i$, an attention score is learned that reflects the contribution of the corresponding post to the user's overall mental state; then, a weighted summation of all embeddings is performed to generate a single user-level representation vector. The attention score $a_i$ and the final user-level semantic representation $r$ are calculated as follows:

$$u_i = \tanh(W c_i + b)$$

$$a_i = \frac{\exp(u_i^{\top} v)}{\sum_{j=1}^{N} \exp(u_j^{\top} v)}$$

$$r = \sum_{i=1}^{N} a_i c_i$$

where $W$ and $b$ are learnable linear transformation parameters, $v$ is a learnable context vector, and $N$ is the total number of user posts.
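Attention pooling of this kind can be sketched as below. The sketch assumes the common form in which each embedding passes through a linear layer with a tanh activation before being scored against a context vector; the tanh and the function name `attention_pool` are assumptions, and `W`, `b`, and `v` would be learned in practice.

```python
import math


def attention_pool(C, W, b, v):
    """Pool context-aware embeddings C into one vector.

    W, b: linear transformation parameters; v: context vector.
    Returns (user_vector, attention_scores).
    """
    scores = []
    for c in C:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, c)) + bias)
                  for row, bias in zip(W, b)]
        scores.append(sum(h * vk for h, vk in zip(hidden, v)))
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]           # numerically stable softmax
    total = sum(exps)
    alphas = [e / total for e in exps]
    user_vector = [sum(al * c[k] for al, c in zip(alphas, C))
                   for k in range(len(C[0]))]
    return user_vector, alphas
```

The scores sum to one, so the user-level vector is a convex combination of the post embeddings, and a post whose transformed embedding aligns with the context vector contributes more.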

[0046] S106. Based on this user-level semantic representation, output the mental state category through the classification layer.

[0047] This mental state category can serve as supplementary data or a reference for relevant medical personnel when diagnosing mental illnesses.

[0048] After outputting the mental state category, the method also includes a joint optimization step: the model is optimized end-to-end by combining the cross-entropy loss with a supervised contrastive loss to enhance the semantic discriminativeness between different mental state types. This supervised contrastive loss is used to bring user representations of the same mental state category closer together and push user representations of different categories further apart in the user-level representation space, and is defined as:

$$\mathcal{L}_{\mathrm{SCL}} = -\sum_{i=1}^{B} \frac{1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(\mathrm{sim}(r_i, r_p)/\tau)}{\sum_{a \ne i} \exp(\mathrm{sim}(r_i, r_a)/\tau)}$$

where $\mathrm{sim}(\cdot,\cdot)$ represents cosine similarity, $\tau$ is a temperature hyperparameter, $B$ is the number of embeddings in a mini-batch, and $(r_i, r_p)$ is a positive sample pair, that is, $P(i)$ contains the other users in the batch that share the mental state category of user $i$.
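The joint objective can be illustrated as follows. This sketch assumes the widely used form of supervised contrastive loss, in which every other same-category user in the batch is a positive and the loss is averaged over anchors that have at least one positive; the mixing coefficient `weight` and all function names are hypothetical, since the text does not state how the two losses are balanced.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax over logits."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[label]


def sup_con_loss(reps, labels, tau=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    n, total, anchors = len(reps), 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        denom = sum(math.exp(cosine(reps[i], reps[a]) / tau) for a in range(n) if a != i)
        total -= sum(math.log(math.exp(cosine(reps[i], reps[p]) / tau) / denom)
                     for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def joint_loss(logits_batch, reps, labels, weight=0.5, tau=0.1):
    """Mean cross-entropy plus a weighted supervised contrastive term."""
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels)) / len(labels)
    return ce + weight * sup_con_loss(reps, labels, tau)
```

The contrastive term is small when users of the same category already have similar representations and large when same-category users are scattered among users of other categories.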

[0049] Using the above method, historical user posting data is acquired and preprocessed to generate post sequences. A pre-trained language model is then used to obtain post-level semantic embedding vectors, providing a foundation for subsequent analysis. A directed graph structure is constructed, establishing directed edges in chronological order and imposing causal constraints. Edge weights are generated based on semantic similarity and time intervals, explicitly characterizing the semantic interactions, dependencies, and potential structural connections and interaction patterns of user behavior between different posts, overcoming the limitations of "brute-force" aggregation and treating posts as independent units. A graph neural network propagates information along the directed edges, outputting a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics, fully modeling the complex behavioral connections in the evolution of psychological states. A temporal encoder introduces a causal attention mask and a time interval awareness mechanism, constraining the model to focus on historical information and adaptively adjusting the impact of time distance on attention, enhancing the ability to finely characterize users' long-term psychological dynamics. Finally, based on user-level semantic representations, mental state prediction results are output, improving the accuracy and reliability of predictions.

[0050] Based on the above steps, the present invention also provides an embodiment. As shown in Figure 2, the flowchart of the social media user mental state detection technology proposed in this invention mainly includes four stages: data preprocessing, user post embedding representation, user behavior sequence representation, and method optimization.

[0051] ① Data preprocessing: This stage involves cleaning and standardizing users' social media data.

[0052] ② User Post Embedding Representation: The user post embedding representation primarily focuses on semantic representation of user post sequences. The proposed solution employs a MentalBERT model pre-trained on a large number of social media posts related to mental health to capture implicit emotional and psychological state features within social media. Building upon this, the invention further introduces an unsupervised contrastive pre-training technique based on SimCSE to enhance MentalBERT's semantic discrimination ability in multi-task learning. This enhanced training, through an unsupervised contrastive learning mechanism, effectively optimizes semantic representation, enabling the pre-trained model to better adapt to downstream tasks, particularly in user behavior analysis and mental state detection tasks within the mental health field, significantly improving the accuracy and robustness of semantic representation.

[0053] Step 1: Unsupervised contrastive pre-training.

[0054] First, 30% of user posts were randomly selected from each mental state category in the mental state dataset and used as the corpus for contrastive pre-training. For each post text $p_i$, MentalBERT uses two different random dropout masks to generate two independent embeddings:

[0055] $z_i = f_\theta(p_i;\, m_i)$, $z_i^{+} = f_\theta(p_i;\, m_i')$; where $f_\theta$ denotes the MentalBERT encoder, $m_i$ and $m_i'$ are the two dropout masks, and $(z_i, z_i^{+})$ form a positive sample pair. For a given anchor embedding $z_i$, the remaining embeddings in the same batch are considered negative samples.

[0056] Then, an unsupervised contrastive loss function based on SimCSE is used as the optimization objective, maximizing the semantic similarity within positive pairs while separating the representations of different samples. For each anchor $z_i$, the contrastive loss is defined as:

[0057] $\ell_i = -\log \dfrac{\exp(\mathrm{sim}(z_i, z_i^{+})/\tau)}{\sum_{j=1}^{N} \exp(\mathrm{sim}(z_i, z_j^{+})/\tau)}$; where $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity, $\tau$ is a temperature hyperparameter, and $N$ is the number of embeddings in the mini-batch. The final contrastive loss is obtained by averaging $\ell_i$ over all samples in the batch.
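The SimCSE-style objective described above can be illustrated with a minimal pure-Python sketch. It assumes the usual in-batch form of the loss (each anchor's positive is the second dropout view of the same post, and the second views of the other posts act as negatives). The function names `cosine` and `simcse_loss` and the default temperature are illustrative choices, not taken from the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def simcse_loss(first_views, second_views, tau=0.05):
    """Mean in-batch contrastive loss (assumed SimCSE form).

    first_views[i] and second_views[i] are two embeddings of the same post
    obtained with different dropout masks; every other second view in the
    batch is treated as a negative for anchor i.
    """
    n = len(first_views)
    total = 0.0
    for i in range(n):
        logits = [cosine(first_views[i], second_views[j]) / tau for j in range(n)]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_denominator - logits[i]  # equals -log softmax at the positive
    return total / n

# Toy batch: matching views give a small loss, mismatched views a large one.
aligned = simcse_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = simcse_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In practice the two views would come from two forward passes of the encoder with dropout enabled; the toy vectors here only stand in for those outputs.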

[0058] Finally, a semantic similarity benchmark dataset was used to evaluate and guide the learning of sentence representations, with Spearman rank correlation coefficient and Pearson correlation coefficient as evaluation metrics.

[0059] Step 2: User post embedding representation.

[0060] This invention utilizes the enhanced pre-trained MentalBERT to obtain the initial contextualized embedding $x_i$ of each post $p_i$, represented as: $x_i = \mathrm{MentalBERT}(p_i) \in \mathbb{R}^{d}$; where $x_i$ is a $d$-dimensional embedding vector.

[0061] ③ User behavior sequence representation: Based on the user post embeddings obtained in the previous stage, this stage first constructs a causally constrained user behavior graph for each user by introducing causal constraints and temporal modeling mechanisms, and encodes it with a graph neural network to obtain the embedding representation of each node. A Transformer encoder then models the node sequence in order of publication time: a causal attention mechanism models the unidirectional influence of past behavior on future behavior, and a time interval awareness mechanism optimizes the semantic fusion process.

[0062] Step 1: Constructing a causally constrained temporal behavior graph.

[0063] This invention constructs a causally constrained temporal behavior graph for each user, where time order is explicitly imposed as a causal constraint to prevent future behaviors from interfering with historical behaviors, thereby formalizing the evolution of user behavior as a time-controlled dynamic system. For each user $u$, this invention creates a directed graph $G_u = (V_u, E_u)$, where $V_u$ and $E_u$ denote the node set and edge set, respectively. Specifically, each post is treated as a node, and directed edges point only from past posts to the current post to maintain temporal causality. The edge weights are determined by semantic similarity and an exponential decay function of the time interval between posts.

[0064] Node definition: Each post in the user's behavior sequence is treated as a node in the graph, and its features are initialized with the embedding produced by the pre-trained MentalBERT.

[0065] Definition of causal constraint edges: The edges of graph $G_u$ are designed to encode the semantic relevance and temporal dynamics of user behavior. For any two nodes $v_i$ and $v_j$ with publication times $t_i$ and $t_j$, a directed edge from $v_i$ to $v_j$ is established if and only if $t_i < t_j$, which ensures strict adherence to temporal causality. The initial weight $w_{ij}$ of an edge is defined by a hybrid function of semantic similarity and time decay:

[0066] $w_{ij} = \mathrm{sim}(x_i, x_j)\cdot\exp(-\lambda\,\Delta t_{ij})$; where $\mathrm{sim}(x_i, x_j)$ is the cosine similarity between the semantic embeddings of the two posts, $\Delta t_{ij} = t_j - t_i$ is the time elapsed between the two posts, and the hyperparameter $\lambda > 0$ controls the decay rate. This reflects the psychological hypothesis that recent behavior has a stronger impact on the current state, while earlier historical events gradually lose weight but are still retained.
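As an illustration of the graph construction described above, the following pure-Python sketch builds the causally constrained edge set for one user. It assumes the edge weight is the product of cosine similarity and an exponential time decay, which is one plausible reading of the description. The function name `build_causal_edges`, the time unit (days), and the default decay value are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def build_causal_edges(embeddings, timestamps, decay=0.1):
    """Return {(i, j): weight} with an edge i -> j only when post i precedes post j.

    Assumed weight: cosine similarity multiplied by exp(-decay * elapsed time).
    """
    edges = {}
    for i, t_i in enumerate(timestamps):
        for j, t_j in enumerate(timestamps):
            if t_i < t_j:  # causal constraint: past -> current only
                similarity = cosine(embeddings[i], embeddings[j])
                edges[(i, j)] = similarity * math.exp(-decay * (t_j - t_i))
    return edges

# Three posts with identical content, published on days 0, 1 and 10.
edges = build_causal_edges([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]], [0.0, 1.0, 10.0])
for (src, dst), weight in sorted(edges.items()):
    print(src, "->", dst, round(weight, 3))
```

With identical embeddings the weights depend only on the elapsed time, so the edge between the two closest posts is the strongest, and no edge ever points from a later post to an earlier one.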

[0067] Step 2: User behavior graph encoding based on graph neural networks.

[0068] This invention employs a two-layer graph attention network (GAT) as an encoder to capture complex dependencies and higher-order associations in the user behavior graph. The GAT architecture enables the model to adaptively assign different importance to different behaviors, capturing the long-term dependencies and temporal causality described above. Through the hierarchical information propagation of the two-layer GAT, each node $v_i$ aggregates semantic information from its causal history while preserving its own temporal characteristics. Specifically, for a user $u$ with $n$ posts, the final encoded behavior sequence $H_u$ is represented as:

[0069] $H_u = [h_1, h_2, \dots, h_n]$; where $h_i \in \mathbb{R}^{d'}$ is the $d'$-dimensional vector output by the second GAT layer for the $i$-th post, and $n$ is the number of the user's posts. The sequence $H_u$ captures the evolution of the user's psychological state and behavioral patterns.
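To make the causal aggregation concrete, here is a simplified single-head graph-attention-style layer in pure Python, in which each node attends only to itself and to earlier posts. This is a sketch under several assumptions: posts are already sorted by strictly increasing time, a self-loop is included, multi-head attention is omitted, and the edge weights are not used because this section does not state how they enter the GAT. All function and parameter names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def causal_gat_layer(features, weight, attn_target, attn_source):
    """One attention layer over chronologically ordered nodes.

    Node i aggregates projected features of nodes 0..i (its causal
    predecessors plus itself); later nodes never contribute.
    """
    projected = [matvec(weight, x) for x in features]
    outputs = []
    for i, z_i in enumerate(projected):
        neighbours = range(i + 1)
        scores = [leaky_relu(dot(attn_target, z_i) + dot(attn_source, projected[j]))
                  for j in neighbours]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        norm = sum(exps)
        outputs.append([sum(e / norm * projected[j][k] for e, j in zip(exps, neighbours))
                        for k in range(len(z_i))])
    return outputs

def encode_two_layers(features, layer1, layer2):
    """Stack two layers with an ELU non-linearity in between."""
    hidden = [[elu(v) for v in row] for row in causal_gat_layer(features, *layer1)]
    return causal_gat_layer(hidden, *layer2)

identity = [[1.0, 0.0], [0.0, 1.0]]
layer = (identity, [0.3, -0.2], [0.1, 0.4])  # toy parameters for illustration
posts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(encode_two_layers(posts, layer, layer))
```

Because node i only reads nodes 0..i, changing a later post leaves the encodings of all earlier posts untouched, which is the property the causal constraint is meant to guarantee.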

[0070] Step 3: User-level semantic representation based on Transformer.

[0071] To further aggregate these node embeddings into a global user representation, this invention uses a single-layer Transformer encoder for context aggregation and makes targeted modifications to the standard Transformer design to better capture the temporal dependencies and causal structure of user behavior.

[0072] Temporal interval-aware positional bias: Standard positional encoding only captures the token order, ignoring irregular time intervals between user actions. To explicitly model temporal dependencies between posts, this invention introduces temporal interval-aware positional bias into a self-attention mechanism.

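[0073] Given a sequence of posts sorted by time, the time interval between positions $i$ and $j$ is defined as: $\Delta t_{ij} = |t_i - t_j|$, where $t_i$ is the publication time of the $i$-th post; this interval is mapped to a learnable scalar bias using the following formula: $b_{ij} = \mathrm{MLP}(\Delta t_{ij})$; this bias captures the effect of temporal distance while mitigating the distortion caused by very long time intervals.

[0074] This bias is added to the attention logits under the causal mask: $s_{ij} = \dfrac{q_i^{\top} k_j}{\sqrt{d}} + b_{ij}$ for $j \le i$, where $q_i$ and $k_j$ are the query and key vectors and $d$ is their dimension; this design allows the Transformer to adaptively adjust the influence of each past post according to its time interval from the current post, without introducing additional temporal embeddings.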
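The combination of a causal mask and a time-interval bias on the attention logits can be sketched as follows. This is a minimal single-head illustration in pure Python, not the patented implementation: `bias_fn` is a hypothetical stand-in for the learnable mapping from a time interval to a scalar bias, and the logarithmic form used in the example call is an assumption chosen only to give a concrete, monotonically decreasing bias.

```python
import math

def causal_time_attention(queries, keys, timestamps, bias_fn):
    """Attention weights with a causal mask and an additive time-interval bias.

    Position i may attend only to positions j <= i; each allowed logit is
    q_i . k_j / sqrt(d) plus bias_fn(|t_i - t_j|).
    """
    n, d = len(queries), len(queries[0])
    weights = []
    for i in range(n):
        logits = []
        for j in range(n):
            if j > i:
                logits.append(float("-inf"))  # causal mask: future is hidden
            else:
                score = sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(d)
                logits.append(score + bias_fn(abs(timestamps[i] - timestamps[j])))
        peak = max(logits)  # finite, because j == i is always allowed
        exps = [math.exp(v - peak) for v in logits]  # exp(-inf) evaluates to 0.0
        norm = sum(exps)
        weights.append([e / norm for e in exps])
    return weights

# Identical content at days 0, 1 and 30; only the time gaps differ.
vectors = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
attn = causal_time_attention(vectors, vectors, [0.0, 1.0, 30.0],
                             bias_fn=lambda dt: -0.5 * math.log1p(dt))
for row in attn:
    print([round(w, 3) for w in row])
```

Since all three posts have the same content, any difference between the weights in the last row comes from the time bias alone, and the upper triangle of the weight matrix stays at zero.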

[0075] Causal attention mechanism: Besides modeling time intervals, preserving the causal order of user behavior is also crucial. Therefore, this invention employs a causal self-attention mechanism, which applies a causal mask to the attention score matrix so that the model cannot use information from future behaviors when representing past behaviors, thereby ensuring temporally consistent representation learning. The causal attention weights are calculated as:

[0076] $\alpha_{ij} = \mathrm{softmax}_j\left(s_{ij} + M_{ij}\right)$; where $s_{ij}$ is the attention logit from position $i$ to position $j$ (including the time-interval bias) and $M$ is the causal mask matrix, defined as: $M_{ij} = 0$ if $j \le i$, and $M_{ij} = -\infty$ otherwise.

User-level semantic representation: After further semantic encoding by the Transformer encoder, a series of context-aware node embeddings is obtained. Subsequently, this invention employs an attention-based pooling mechanism to aggregate the embedding sequence by weighted summation, obtaining a unified user-level representation. This method enables the model to adaptively assign different importance weights to each behavior, thereby highlighting the key posts that best reflect the user's overall psychological state.

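[0077] Specifically, for each node embedding $\tilde{h}_i$, this invention first uses a learnable context vector $v$ to calculate a hidden score $e_i$ and the corresponding attention weight $\beta_i$: $e_i = v^{\top}\tanh(W_a \tilde{h}_i + b_a)$; $\beta_i = \dfrac{\exp(e_i)}{\sum_{k=1}^{n}\exp(e_k)}$; where $W_a$ and $b_a$ are the weight matrix and bias term, respectively, and $n$ is the number of the user's posts. The normalized weight $\beta_i$ indicates the contribution of the $i$-th post to the final representation. The comprehensive user-level semantic embedding $r_u$ is calculated as the weighted sum of the node sequence:

[0078] $r_u = \sum_{i=1}^{n} \beta_i\, \tilde{h}_i$.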
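The attention-based pooling step can be sketched in a few lines of pure Python. The sketch assumes the common additive form with a tanh non-linearity before the context vector is applied; the patent text names a weight matrix, a bias term and a context vector but the exact non-linearity is an assumption here, as are the function and argument names.

```python
import math

def attention_pool(nodes, weight, bias, context):
    """Pool a sequence of node embeddings into one user-level vector.

    Assumed scoring: score_i = context . tanh(weight @ h_i + bias),
    followed by a softmax over posts and a weighted sum of the embeddings.
    """
    scores = []
    for h in nodes:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
                  for row, b in zip(weight, bias)]
        scores.append(sum(c * u for c, u in zip(context, hidden)))
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    pooled = [sum(w * h[k] for w, h in zip(weights, nodes))
              for k in range(len(nodes[0]))]
    return pooled, weights

nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(nodes, identity, [0.0, 0.0], context=[1.0, 0.0])
print([round(w, 3) for w in weights], [round(p, 3) for p in pooled])
```

With a zero context vector every post receives the same weight and the result reduces to mean pooling, which is a convenient sanity check when wiring this into a larger model.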

[0079] ④ Method Optimization and Mental State Classification: This invention employs a joint optimization of cross-entropy loss and supervised contrastive learning to improve the accuracy of mental state classification results. Cross-entropy loss is used for classification tasks to directly optimize the classification results, while supervised contrastive learning enhances the discriminativeness of user behavior representations, ensuring that similar behaviors are clustered into the same category and dissimilar behaviors are separated. The two loss functions work together to optimize the performance of the mental state detection model and improve the robustness and accuracy of the classification results.
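A joint objective of this kind can be sketched as follows. The sketch assumes a standard supervised contrastive loss over cosine similarities and a simple weighted sum with the cross-entropy term; the mixing coefficient `weight`, the default temperature, and all function names are hypothetical, since this section does not state how the two losses are combined.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def log_sum_exp(values):
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax."""
    return log_sum_exp(logits) - logits[label]

def supervised_contrastive(embeddings, labels, tau=0.1):
    """Pull same-label user representations together, push others apart."""
    losses = []
    n = len(embeddings)
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # anchors without a same-class partner are skipped
        sims = {a: cosine(embeddings[i], embeddings[a]) / tau
                for a in range(n) if a != i}
        log_denominator = log_sum_exp(list(sims.values()))
        losses.append(sum(log_denominator - sims[p] for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0

def joint_loss(logits_batch, embeddings, labels, weight=0.5, tau=0.1):
    """Assumed combination: mean cross-entropy + weight * contrastive term."""
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels)) / len(labels)
    return ce + weight * supervised_contrastive(embeddings, labels, tau)

users = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
logits = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
print(round(joint_loss(logits, users, [0, 0, 1, 1]), 4))
```

The contrastive term is small when users with the same label already lie close together in the representation space and grows when labels cut across the clusters, so it complements the cross-entropy term rather than duplicating it.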

[0080] Secondly, the present invention also provides a social media-based mental state detection device. As shown in Figure 3, the device includes: an acquisition module 201, used to acquire and preprocess the user's historical posting data on social media platforms to generate a sequence of user posts.

[0081] The extraction module 202 is used to input each post in the user's post sequence into a pre-trained language model to obtain the corresponding post-level semantic embedding vector.

[0082] The building module 203 is used to construct a directed graph structure for each user, taking each post as a graph node and establishing directed edges from past nodes to the current node only in chronological order to impose causal constraints; edge weights are generated based on the semantic similarity and time interval between nodes.

[0083] The aggregation module 204 is used to input the directed graph into the graph neural network, propagate information along the directed edge direction through the information propagation mechanism, aggregate the information of neighbor nodes, and output a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics.

[0084] The output module 205 is used to input the node-level evolutionary embedding sequence into the temporal encoder in chronological order. The temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism so that, while attending only to historical information, it adaptively adjusts the attention weights according to the time interval between posts. The node-level evolutionary embedding sequence is encoded into a context-aware node embedding sequence, which is then aggregated to output a user-level semantic representation.

[0085] The analysis module 206 is used to output mental state categories through a classification layer based on the user-level semantic representation.

[0086] The present invention also provides a computer-readable storage medium storing a computer program that can be used to execute the steps of the social media-based mental state detection method provided in Figure 1 above.

[0087] This invention also provides a computer device. At the hardware level, the computer device includes a processor, an internal bus, a network interface, memory, and non-volatile memory, and may also include other hardware required for various operations. The processor reads the corresponding computer program from the non-volatile memory into memory and then executes it to implement the steps of the social media-based mental state detection method provided in Figure 1 above.

[0088] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0089] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, as well as combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0090] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0091] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

[0092] It should be noted that the specific embodiments described above enable those skilled in the art to more fully understand the present invention, but do not limit the present invention in any way. Therefore, although the present invention has been described in detail in this specification, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the present invention; and all technical solutions and improvements that do not depart from the spirit and scope of the present invention are covered within the protection scope of the patent of the present invention. No reference numerals in the claims should be construed as limiting the scope of the claims.

Claims

1. A method for detecting mental state based on social media, characterized in that the method includes: obtaining and preprocessing users' historical posting data on social media platforms to generate user post sequences; inputting each post in the user post sequence into a pre-trained language model to obtain the corresponding post-level semantic embedding vector; constructing a directed graph structure for users, with each post-level semantic embedding vector as a graph node, and establishing directed edges only in time order from the earlier node to the later node to impose causal constraints, the edge weights being generated based on the semantic similarity between nodes and the time interval; inputting the directed graph into a graph neural network, propagating information along the direction of the directed edges through the information propagation mechanism, aggregating the information of neighbor nodes, and outputting a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics; inputting the node-level evolutionary embedding sequence into a temporal encoder in chronological order, wherein the temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism so that, while attending only to historical information, it adaptively adjusts the attention weights according to the time interval between posts, encodes the node-level evolutionary embedding sequence into a context-aware node embedding sequence, and aggregates the context-aware node embedding sequence to output a user-level semantic representation; and outputting mental state categories through a classification layer based on the user-level semantic representation.

2. The method according to claim 1, characterized in that, The pre-trained language model is a MentalBERT model pre-trained on a corpus in the mental health domain, and it is enhanced through unsupervised contrastive learning; the enhancement training includes: A portion of the corpus is extracted from user posts in each mental state category. For the same post, two post-level semantic embedding vectors are generated using two different Dropout random masks as positive sample pairs, and the remaining embeddings in the same batch are used as negative samples. The SimCSE-style contrastive loss function is used for optimization to improve the model's ability to discriminate semantics related to mental health.

3. The method according to claim 1, characterized in that the edge weights of the directed graph are determined by the semantic similarity between nodes and the time interval; the longer the time interval, the smaller the weight, in order to reflect the stronger influence of recent behavior on psychological state. The edge weight $w_{ij}$ is calculated as follows: $w_{ij} = \mathrm{sim}(x_i, x_j)\cdot\exp(-\lambda\,\Delta t_{ij})$; where $\mathrm{sim}(x_i, x_j)$ is the cosine similarity between semantic embeddings, $x_i$ and $x_j$ are the post-level semantic embedding vectors of posts $p_i$ and $p_j$, $\Delta t_{ij}$ is the time interval between the publication of the two posts, and $\lambda$ is the preset positive decay hyperparameter.

4. The method according to claim 1, characterized in that the graph neural network is a graph attention network, which encodes the user behavior graph through two graph attention layers; in each layer, each node adaptively aggregates information from all its causal predecessor nodes and updates its own representation; after two layers of propagation, the output is a chronologically ordered sequence of node-level evolutionary embeddings $H = [h_1, h_2, \dots, h_n]$; where $h_i$ is the final embedding of the $i$-th post and $n$ is the total number of user posts.

5. The method according to claim 1, characterized in that the temporal encoder is a Transformer encoder, and two improvements are introduced into its self-attention mechanism to model the temporal causality and irregular time intervals of user behavior; the two improvements include: Causal attention masking: applying a mask to the attention score matrix so that each position can only attend to itself and its preceding positions, preventing future information from influencing historical representations; the mask matrix $M$ is defined as: $M_{ij} = 0$ if $j \le i$, and $M_{ij} = -\infty$ otherwise; Time interval awareness mechanism: the time interval $\Delta t_{ij}$ between any two posts is mapped to a learnable scalar bias $b_{ij}$, which is superimposed on the attention logits to explicitly characterize the moderating effect of temporal distance on the strength of behavioral association; the bias is calculated and superimposed as follows: $b_{ij} = \mathrm{MLP}(\Delta t_{ij})$; $\alpha_{ij} = \mathrm{softmax}_j\left(\tilde{s}_{ij} + M_{ij}\right)$ with $\tilde{s}_{ij} = \dfrac{q_i^{\top} k_j}{\sqrt{d}} + b_{ij}$; where MLP denotes a multilayer perceptron, $\tilde{s}_{ij}$ is the adjusted attention score, $\alpha_{ij}$ is the attention weight from position $i$ to position $j$ after softmax normalization, $q_i$ and $k_j$ are the query vector and the key vector, respectively, and $d$ is the vector dimension; the Transformer encoder encodes the input node-level evolutionary embedding sequence into a context-aware node embedding sequence $\tilde{H} = [\tilde{h}_1, \tilde{h}_2, \dots, \tilde{h}_n]$, where $\tilde{h}_i$ is the context-aware embedding of the $i$-th post.

6. The method according to claim 5, characterized in that the user-level semantic representation is obtained by attention pooling of the context-aware node embedding sequence, including: learning, for each context-aware node embedding $\tilde{h}_i$, an attention score that reflects the contribution of the corresponding post to the user's overall mental state; then performing a weighted summation of all embeddings to generate a single user-level representation vector. The attention score $\beta_i$ and the final user-level semantic representation $r_u$ are calculated as follows: $e_i = v^{\top}\tanh(W \tilde{h}_i + b)$; $\beta_i = \dfrac{\exp(e_i)}{\sum_{k=1}^{n}\exp(e_k)}$; $r_u = \sum_{i=1}^{n}\beta_i\,\tilde{h}_i$; where $W$ and $b$ are learnable linear transformation parameters, $v$ is a learnable context vector, and $n$ is the total number of user posts.

7. The method according to claim 1, characterized in that, after outputting the mental state prediction results, the method also includes a joint optimization step: performing end-to-end optimization of the model by jointly combining cross-entropy loss and supervised contrastive loss to enhance the semantic discriminativeness between different mental state types; the supervised contrastive loss is used to bring user representations of the same mental state category closer together and push user representations of different categories further apart in the user-level representation space, and is defined as follows: $\mathcal{L}_{\mathrm{SCL}} = -\dfrac{1}{N}\sum_{i=1}^{N}\dfrac{1}{|P(i)|}\sum_{p \in P(i)}\log\dfrac{\exp(\mathrm{sim}(z_i, z_p)/\tau)}{\sum_{a=1,\,a \ne i}^{N}\exp(\mathrm{sim}(z_i, z_a)/\tau)}$; where $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity, $\tau$ is a temperature hyperparameter, $N$ is the number of embeddings in a mini-batch, $P(i)$ is the set of samples in the batch with the same category as sample $i$, and $(z_i, z_p)$ is a positive sample pair.

8. A mental state detection device based on social media, characterized in that the device includes: an acquisition module, used to acquire and preprocess users' historical posting data on social media platforms to generate user post sequences; an extraction module, used to input each post in the user post sequence into a pre-trained language model to obtain the corresponding post-level semantic embedding vector; a building module, used to construct a directed graph structure for each user, treating each post as a graph node and establishing directed edges from past nodes to the current node only in chronological order to impose causal constraints, the edge weights being generated based on the semantic similarity between nodes and the time interval; an aggregation module, used to input the directed graph into the graph neural network, propagate information along the direction of the directed edges through the information propagation mechanism, aggregate the information of neighbor nodes, and output a node-level evolutionary embedding sequence that integrates causal constraints and temporal semantics; an output module, used to input the node-level evolutionary embedding sequence into the temporal encoder in chronological order, wherein the temporal encoder introduces a causal attention mask and a time interval awareness mechanism into the self-attention mechanism so that, while attending only to historical information, it adaptively adjusts the attention weights according to the time interval between posts, and the module encodes the node-level evolutionary embedding sequence into a context-aware node embedding sequence and aggregates the context-aware node embedding sequence to output a user-level semantic representation; and an analysis module, used to output mental state categories through a classification layer based on the user-level semantic representation.

9. A computer-readable storage medium, characterized in that, The storage medium stores a computer program, which, when executed by a processor, implements the method described in any one of claims 1 to 7.

10. A computer device, characterized in that the computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method described in any one of claims 1 to 7.