Method and system for sentiment classification of review text based on hybrid information mining

By employing a hybrid information mining method that combines graph convolutional neural networks, recurrent neural networks, and feedforward neural networks, semantic and interactive features of comment texts are extracted. This addresses the problem of insufficient accuracy in sentiment classification in existing technologies and achieves more efficient sentiment classification of comment texts.

CN116127070B · Active · Publication Date: 2026-06-26 · NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI
Filing Date
2023-01-30
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing text sentiment classification methods mainly rely on the correlation between text semantics and target category, resulting in poor sentiment classification accuracy.

Method used

The method adopts a hybrid information mining approach: it obtains the latent feature vectors of users and products, extracts the semantic information of the comment text with a recurrent neural network and the interaction features with graph convolutional neural networks, derives high-order interaction information with a feedforward neural network, and fuses the two with a gated neural network to predict the sentiment category.

Benefits of technology

It significantly improves the performance and accuracy of sentiment classification for comment texts, enabling accurate judgment of the sentiment tendency of comment texts.

✦ Generated by Eureka AI based on patent content.

Figure CN116127070B_ABST

Abstract

The application discloses a method and system for sentiment classification of review text based on hybrid information mining, comprising: determining the user and product corresponding to the review text, and obtaining the latent feature vectors of the user and the product; determining the semantic information of the review text according to the latent feature vectors and the review text; obtaining the products that have interacted with the user and the corresponding sentiment categories, and processing the user of the review text, those products, and those sentiment categories to obtain user behavior features; obtaining the users that have interacted with the product and the corresponding sentiment categories, and processing the product of the review text, those users, and those sentiment categories to obtain product attribute features; processing the user behavior features and product attribute features to obtain high-order interaction information; and fusing the semantic information and the high-order interaction information to determine the sentiment category of the review text. The application fully mines and fuses the semantic information of the review text and the user-product interaction information, improving the performance and accuracy of review text sentiment classification.

Description

Technical Field

[0001] This invention relates to the field of Internet information classification technology, and in particular to a method and system for sentiment classification of comment text based on hybrid information mining.

Background Technology

[0002] Text sentiment classification refers to the process of analyzing, processing, summarizing, and reasoning about subjective texts with emotional overtones to classify the emotional tendency expressed in the text. The emotional tendency of a text refers to the positive (e.g., joy, happiness, praise) or negative (e.g., sorrow, anger, criticism) inclination reflected in the text and the intensity of this emotional tendency. Current research suggests that text sentiment categories are typically divided into two (positive, negative), three (positive, negative, neutral), or a range of positive integers (e.g., [1, 2, 3, 4, 5]).

[0003] With the rise and rapid development of social media on the internet, text sentiment classification technology has been applied in many aspects of real life, including online product reviews, film and television reviews, social sentiment analysis, and personalized recommendations.

[0004] Existing text sentiment classification methods generally involve the following steps: acquisition and preprocessing of training data, establishment of a text representation model, selection of text features, selection of a classification method, and performance evaluation. In use, existing text sentiment classification methods typically build a model based solely on the relationship between text semantics and the target category. The performance and accuracy of sentiment classification are limited by the correlation between the semantics of the comment text and the target category, resulting in relatively poor sentiment classification accuracy.

Summary of the Invention

[0005] To address some or all of the technical problems existing in the prior art, this invention provides a method and system for sentiment classification of comment text based on hybrid information mining.

[0006] The technical solution of the present invention is as follows:

[0007] In a first aspect, a sentiment classification method for comment text based on hybrid information mining is provided, the method comprising:

[0008] Obtain the comment text to be classified, determine the user and product corresponding to the comment text, and obtain the latent feature vectors of the user and product;

[0009] Based on the latent feature vector and the comment text, the semantic information of the comment text is determined;

[0010] Based on the user corresponding to the comment text, obtain the products that have interacted with the user and their corresponding sentiment categories. Then, use a pre-trained first graph convolutional neural network to process the user corresponding to the comment text, the products that have interacted with the user, and their corresponding sentiment categories to obtain the user behavior features of the comment text.

[0011] Based on the product corresponding to the comment text, the user who interacted with the product and the corresponding sentiment category are obtained. The product corresponding to the comment text, the user who interacted with the product and the corresponding sentiment category are processed using a pre-trained second graph convolutional neural network to obtain the product attribute features of the comment text.

[0012] The user behavior features and product attribute features are processed using a pre-trained feedforward neural network to obtain high-order interaction information between the user and the product;

[0013] The sentiment category of the comment text is determined by integrating the semantic information of the comment text and the higher-order interaction information.

[0014] In some possible implementations, an embedding retrieval operation is used to obtain the latent feature vectors of the user and the product corresponding to the comment text.

[0015] In some possible implementations, the semantic information of the comment text is determined based on the latent feature vector and the comment text, including:

[0016] The comment text is processed using a pre-trained recurrent neural network to extract word-level semantic information from the comment text;

[0017] Based on the latent feature vector, the word-level semantic information of the comment text is fused using an attention mechanism to obtain the sentence-level semantic information of the comment text;

[0018] Based on the latent feature vector, an attention mechanism is used to fuse the sentence-level semantic information of the comment text to obtain the semantic information of the comment text.

[0019] In some possible implementations, the comment text D is defined as a sequence of n sentences, $D = \{S_1, \ldots, S_n\}$, and the i-th sentence $S_i$ is a sequence of $l_i$ words, running from its first word to its $l_i$-th word;

[0020] The word-level semantic information of the comment text is represented as follows:

[0021]

[0022] The sentence-level semantic information of the comment text can be obtained using the following formula:

[0023]

[0024] The semantic information of the comment text can be obtained using the following formula:

[0025]

[0026] Here, the word-level term denotes the context-aware semantic information of the j-th word of the i-th sentence; LSTM() denotes a recurrent neural network; $s_i$ denotes the sentence-level semantic information of the i-th sentence; Att denotes the attention mechanism; u denotes the latent feature vector of the user; p denotes the latent feature vector of the product; and d denotes the semantic information of the comment text.

[0027] In some possible implementations, the user behavior characteristics of the comment text are obtained using the following formula:

[0028]

[0029] The product attribute features of the review text are obtained using the following formula:

[0030]

[0031] Here, the output of the first formula denotes the user behavior features of the comment text; W1 denotes the first graph convolutional neural network; U denotes the user; P denotes the product; $R_U$ denotes the set of products that have interacted with user U; $M_{r_{UP}}$ denotes the parameterized matrix corresponding to the sentiment category $r_{UP}$; $r_{UP}$ denotes user U's sentiment category towards product P; the output of the second formula denotes the product attribute features of the review text; W2 denotes the second graph convolutional neural network; and $R_P$ denotes the set of users who have interacted with product P.

[0032] In some possible implementations, the higher-order interaction information is obtained using the following formula:

[0033]

[0034] Here, z represents the higher-order interaction information, f() represents a feedforward neural network, ⊙ represents the vector dot product operation, and the remaining operator in the formula represents vector concatenation.

[0035] In some possible implementations, the semantic information of the comment text and the higher-order interaction information are fused to determine the sentiment category of the comment text, including:

[0036] The semantic information and the higher-order interaction information of the comment text are fused using a pre-trained gated neural network to obtain mixed information of the semantic information and the higher-order interaction information;

[0037] The mixed information is processed using a pre-trained single-layer neural network to determine the sentiment category of the comment text.

[0038] In some possible implementations, the mixed information is obtained using the following formula:

[0039] v = gate(d, z)

[0040] The sentiment category of the comment text is determined using the following formula:

[0041]

[0042]

[0043] Here, v represents the mixed information and gate() represents a gated neural network. The first formula yields the prediction vector over the sentiment categories of the comment text, where softmax represents the normalized exponential function and W and b represent the parameters of the single-layer neural network; the second formula yields the predicted sentiment category of the comment text.

[0044] In some possible implementations, the neural network is trained through the following steps:

[0045] Obtain the training dataset, which includes comment texts, the users corresponding to the comment texts, the products corresponding to the comment texts, and the actual sentiment categories corresponding to the comment texts.

[0046] The neural network is trained using the backpropagation algorithm based on a preset objective function, with the comment text, corresponding user, and product as input and the actual sentiment category of the comment text as output.

[0047] The objective function is set as follows:

[0048] $L = L_1 + \alpha L_2$

[0049] $L_1 = L_u + L_p + L_r$

[0050] $L_u = -\log \sigma\left(u^T M_r p - (u')^T M_r p\right)$

[0051] $L_p = -\log \sigma\left(u^T M_r p - u^T M_r p'\right)$

[0052] $L_r = -\log \sigma\left(u^T M_r p - u^T M_{r'} p\right)$

[0053]

[0054] Here, L represents the objective function; α is a hyperparameter; X represents the training dataset; C represents the total number of sentiment categories of the comment text; the predicted term represents the value of the c-th dimension of the sentiment category prediction vector output by the neural network, and $y_c$ represents the value of the c-th dimension of the actual sentiment category vector of the comment text; σ represents the sigmoid function, $\sigma(x) = 1 / (1 + \exp(-x))$; $u^T M_r p$ represents the similarity of the user, product, and sentiment category observed in the input training data; $(u')^T M_r p$, $u^T M_r p'$, and $u^T M_{r'} p$ represent the similarities of the unobserved user, product, and sentiment category combinations corresponding to the input training data; u′ represents the latent feature vector of the unobserved user; p′ represents the latent feature vector of the unobserved product; r′ represents the unobserved sentiment category; $M_r$ represents the parameterized matrix corresponding to the sentiment category r; and $M_{r'}$ represents the parameterized matrix corresponding to the sentiment category r′.

[0055] In a second aspect, a sentiment classification system for comment text based on hybrid information mining is also provided. The system uses the aforementioned sentiment classification method for comment text based on hybrid information mining to classify the sentiment of comment text, and includes:

[0056] The feature vector acquisition module is used to obtain the latent feature vectors of the user and the product corresponding to the comment text;

[0057] The text semantic mining module is used to determine the semantic information of the comment text based on the latent feature vector and the comment text.

[0058] The local interaction mining module is used to obtain interaction information of users, products and sentiment categories, and to determine the user behavior characteristics and product attribute characteristics of the comment text;

[0059] The higher-order interaction mining module is used to obtain higher-order interaction information between users and products based on the user behavior features and product attribute features of the comment text;

[0060] The sentiment category prediction module is used to fuse semantic information and higher-order interaction information of the comment text to determine and output the sentiment category of the comment text.

[0061] The main advantages of the technical solution of this invention are as follows:

[0062] The sentiment classification method and system for comment text based on hybrid information mining of the present invention can fully mine and integrate the semantic information of comment text and the interaction information between users and products, which can significantly improve the performance and accuracy of sentiment classification of comment text and obtain accurate sentiment classification results of comment text.

Description of the Drawings

[0063] The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and constitute a part of this invention, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:

[0064] Figure 1 is a flowchart of a comment text sentiment classification method based on hybrid information mining according to an embodiment of the present invention.

Detailed Implementation

[0065] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this invention, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0066] The technical solutions provided by the embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0067] Referring to Figure 1, in a first aspect, an embodiment of the present invention provides a sentiment classification method for comment text based on hybrid information mining, the method comprising the following steps S1-S6:

[0068] Step S1: Obtain the comment text to be classified, determine the user and product corresponding to the comment text, and obtain the latent feature vectors of the user and product.

[0069] Assume that the input for a review text is represented by a tuple <U, P, D>, where U represents the user, P represents the product, and D represents the review text. The output sentiment category of the review text is represented by a vector whose dimension equals the number of sentiment categories. This number is set according to the actual situation and can be, for example, 5 or 10. When the number of sentiment categories is 5 (e.g., category 1 represents negative reviews, category 5 represents positive reviews, and the intermediate categories follow in order), the output vector has dimension 5. In one embodiment of the present invention, the dimension of the output vector with the largest value is taken as the corresponding sentiment category; for example, when the output vector corresponding to a comment text is (0.1, 0.1, 0.1, 0.1, 0.6), the sentiment category corresponding to the comment text is category 5.
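The rule for reading a category off the output vector can be illustrated with a short sketch. The function name `predict_category` is hypothetical and is not taken from the patent; categories are assumed to be numbered from 1 as in the example above.

```python
def predict_category(scores):
    """Return the 1-based index of the largest entry of the output vector."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1  # categories are numbered 1..C in the text


# The worked example from the paragraph above.
category = predict_category([0.1, 0.1, 0.1, 0.1, 0.6])
print(category)  # 5
```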

[0070] In one embodiment of the present invention, for a user and a product corresponding to a comment text, an embedding lookup operation is used to obtain the latent feature vectors of the user and the product corresponding to the comment text, respectively. Embedding lookup is a one-to-one mapping, assigning a unique embedding vector to each user and a unique embedding vector to each product. Embedding vectors can be extracted from various explicit information sources. When no explicit information source is available, they are learned from the user's interaction records with the product. Therefore, the embedding vectors are model parameters, determined through automatic learning and updating during model training.
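A minimal sketch of such an embedding lookup is given below. The class `EmbeddingTable`, the identifiers, the dimension, and the random initialization range are all illustrative assumptions; the patent only states that each user and each product maps to one learned vector.

```python
import random


class EmbeddingTable:
    """One trainable vector per identifier (hypothetical helper)."""

    def __init__(self, ids, dim, seed=0):
        rng = random.Random(seed)
        # Small random starting values; training would update them in place.
        self.vectors = {i: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for i in ids}

    def lookup(self, identifier):
        # One-to-one mapping: the same identifier always returns the same vector.
        return self.vectors[identifier]


user_table = EmbeddingTable(["user_a", "user_b"], dim=4, seed=1)
product_table = EmbeddingTable(["prod_x", "prod_y"], dim=4, seed=2)
u = user_table.lookup("user_a")      # latent feature vector of the user
p = product_table.lookup("prod_x")   # latent feature vector of the product
```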

[0071] Step S2: Determine the semantic information of the comment text based on the latent feature vector and the comment text.

[0072] The semantic information and the sentiment polarity of a comment text are highly correlated. For example, a comment text containing expressions such as "like," "good quality," "excellent service," or "fast logistics" is very likely a positive review (positive sentiment polarity); conversely, one containing expressions such as "disappointed" or "poor quality" is very likely a negative review (negative sentiment polarity). Therefore, to classify the sentiment of comment texts and improve classification accuracy, the semantic information of the comment text must be extracted and the sentiment classified based on it.

[0073] In one embodiment of the present invention, the semantic information of the comment text is determined based on the latent feature vector and the comment text, further including the following steps S21-S23:

[0074] Step S21: Use a pre-trained recurrent neural network to process the comment text and extract the word-level semantic information of the comment text;

[0075] Step S22: Based on the latent feature vector, the attention mechanism is used to fuse the word-level semantic information of the comment text to obtain the sentence-level semantic information of the comment text;

[0076] Step S23: Based on the latent feature vector, use an attention mechanism to fuse the sentence-level semantic information of the comment text to obtain the semantic information of the comment text.

[0077] Specifically, the comment text D is defined as a sequence of n sentences, $D = \{S_1, \ldots, S_n\}$, and the i-th sentence $S_i$ is a sequence of $l_i$ words, running from its first word to its $l_i$-th word;

[0078] The word-level semantic information of the comment text is represented as follows:

[0079]

[0080] Here, the output of the formula denotes the context-aware word-level semantic information of the j-th word in the i-th sentence, and LSTM() denotes a recurrent neural network. The parameters of the recurrent neural network are automatically learned and updated during model training.

[0081] Specifically, the sentence-level semantic information of the comment text is obtained using the following formula:

[0082]

[0083] The semantic information of the comment text can be obtained using the following formula:

[0084]

[0085] Here, $s_i$ represents the sentence-level semantic information of the i-th sentence, Att represents the attention mechanism, u represents the user's latent feature vector, p represents the product's latent feature vector, and d represents the semantic information of the review text. The parameters corresponding to the attention mechanism are automatically learned and updated during model training.

[0086] In one embodiment of the present invention, by processing the comment text in the above manner, the semantic information of the comment text can be captured very well.
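The two-stage fusion of steps S22 and S23 can be sketched as follows. The exact attention formulas are not reproduced in this text, so the scoring rule here (dot product of each vector with the query u + p) is an assumption chosen only for illustration, and the recurrent network of step S21 is replaced by ready-made word vectors. The helper names `attend` and `document_vector` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors, u, p):
    """Weighted average of `vectors`; the weights depend on u and p."""
    query = [a + b for a, b in zip(u, p)]  # assumed query: u + p
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def document_vector(doc_word_vectors, u, p):
    # doc_word_vectors[i][j]: context-aware vector of word j in sentence i
    sentence_vectors = [attend(words, u, p) for words in doc_word_vectors]  # S22
    return attend(sentence_vectors, u, p)                                   # S23


doc = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
d = document_vector(doc, u=[1.0, 0.0], p=[0.0, 0.0])
```

Because the same user and product vectors steer both stages, two users writing identical text could in principle receive different document representations, which is the point of conditioning the attention on u and p.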

[0087] Step S3: Based on the user corresponding to the comment text, obtain the products that have interacted with the user and their corresponding sentiment categories. Use a pre-trained first graph convolutional neural network to process the user corresponding to the comment text, the products that have interacted with the user, and their corresponding sentiment categories to obtain the user behavior features of the comment text.

[0088] Specifically, for user U in the comment text tuple <U,P,D>, obtain the products that the user has interacted with and their corresponding sentiment categories. Here, a product that the user has interacted with is one for which the user already has a specific sentiment category.

[0089] After obtaining the products that the user has interacted with and their corresponding sentiment categories, the first graph convolutional neural network is used to process the user corresponding to the comment text, the products that the user has interacted with, and their corresponding sentiment categories to obtain the user behavior features of the comment text.

[0090] In one embodiment of the present invention, the user behavior characteristics of the comment text are obtained using the following formula:

[0091]

[0092] Here, the output of the formula denotes the user behavior features of the comment text; W1 denotes the first graph convolutional neural network; U denotes the user; P denotes the product; $R_U$ denotes the set of products that have interacted with user U; $M_{r_{UP}}$ denotes the parameterized matrix corresponding to the sentiment category $r_{UP}$; $r_{UP}$ denotes user U's sentiment category towards product P; and p denotes the latent feature vector of the product. The parameters of the first graph convolutional neural network are determined through automatic learning and updating during model training.

[0093] Based on the above assumptions and definitions, the set $R_U$ of products that have interacted with user U is expressed as $R_U = \{P \mid r_{UP} \neq 0\}$, where $r_{UP} \neq 0$ indicates that user U's sentiment category towards product P is not 0, meaning that the user already has a specific sentiment category for the corresponding product.

[0094] Step S4: Based on the product corresponding to the comment text, obtain the users who interacted with the product and their corresponding sentiment categories. Use a pre-trained second graph convolutional neural network to process the product corresponding to the comment text, the users who interacted with the product, and their corresponding sentiment categories to obtain the product attribute features of the comment text.

[0095] Specifically, for product P in the comment text tuple <U,P,D>, obtain the users who have interacted with the product and their corresponding sentiment categories. Here, users who have interacted with the product indicate that the corresponding users have a specific sentiment category for the product.

[0096] After obtaining the users who have interacted with the product and their corresponding sentiment categories, the second graph convolutional neural network is used to process the product corresponding to the comment text, the users who have interacted with the product, and their corresponding sentiment categories to obtain the product attribute features of the comment text.

[0097] In one embodiment of the present invention, the product attribute features of the review text are obtained using the following formula:

[0098]

[0099] Here, the output of the formula denotes the product attribute features of the review text; W2 denotes the second graph convolutional neural network; P denotes the product; U denotes the user; $R_P$ denotes the set of users who have interacted with product P; $M_{r_{UP}}$ denotes the parameterized matrix corresponding to the sentiment category $r_{UP}$; $r_{UP}$ denotes user U's sentiment category towards product P; and u denotes the user's latent feature vector. The parameters of the second graph convolutional neural network are determined through automatic learning and updating during model training.

[0100] Based on the above assumptions and definitions, the set $R_P$ of users who have interacted with product P is expressed as $R_P = \{U \mid r_{UP} \neq 0\}$, where $r_{UP} \neq 0$ indicates that user U's sentiment category towards product P is not 0, meaning that the user already has a specific sentiment category for the product.

[0101] In one embodiment of the present invention, the above processing can fully mine the interaction information between users and products, so as to improve the performance and accuracy of subsequent sentiment classification.
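Steps S3 and S4 can be sketched together. Building the neighbor sets follows the definitions $R_U = \{P \mid r_{UP} \neq 0\}$ and $R_P = \{U \mid r_{UP} \neq 0\}$ directly. The aggregation formula itself is not reproduced in this text, so the form used in `aggregate` (mean of the category-transformed neighbor vectors, followed by a weight matrix and ReLU) is an assumption, and all function and variable names are hypothetical.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def neighbor_sets(ratings):
    """ratings[(user, product)] = sentiment category; 0 means no interaction."""
    R_U, R_P = {}, {}
    for (user, product), r in ratings.items():
        if r != 0:
            R_U.setdefault(user, set()).add(product)
            R_P.setdefault(product, set()).add(user)
    return R_U, R_P


def aggregate(neighbors, categories, latent, M, W):
    """Assumed form: ReLU(W @ mean_k(M[r_k] @ latent[k]))."""
    msgs = [matvec(M[r], latent[k]) for k, r in zip(neighbors, categories)]
    mean = [sum(col) / len(msgs) for col in zip(*msgs)]
    return [max(0.0, x) for x in matvec(W, mean)]


ratings = {("u1", "p1"): 5, ("u1", "p2"): 1, ("u2", "p1"): 0}
R_U, R_P = neighbor_sets(ratings)

latent_products = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
M = {1: [[1.0, 0.0], [0.0, 1.0]], 5: [[2.0, 0.0], [0.0, 2.0]]}  # one matrix per category
W1 = [[1.0, 0.0], [0.0, 1.0]]

neighbors = sorted(R_U["u1"])
categories = [ratings[("u1", k)] for k in neighbors]
user_features = aggregate(neighbors, categories, latent_products, M, W1)
print(user_features)  # [1.0, 0.5]
```

The product attribute features would be computed the same way, iterating over `R_P[product]` with the users' latent vectors and a second weight matrix.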

[0102] Step S5: Use a pre-trained feedforward neural network to process user behavior features and product attribute features to obtain high-order interaction information between users and products.

[0103] Specifically, in one embodiment of the present invention, higher-order interaction information is obtained using the following formula:

[0104]

[0105] Here, z represents the higher-order interaction information, f() represents a feedforward neural network, ⊙ represents the vector dot product operation, and the remaining operator in the formula represents vector concatenation. The parameters of the feedforward neural network are determined through automatic learning and updating during model training.
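One way to combine the two operators named above is sketched below. Since the formula is not reproduced in this text, both the element-wise reading of ⊙ and the choice to concatenate the product with the two input vectors are assumptions; a single dense layer with ReLU stands in for f(). The function names are hypothetical.

```python
def feedforward(x, W, b):
    """One dense layer with ReLU; a stand-in for f()."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def high_order_interaction(user_feat, product_feat, W, b):
    # Element-wise reading of the ⊙ operator (assumption).
    product = [x * y for x, y in zip(user_feat, product_feat)]
    # Concatenate the interaction term with both inputs (assumption).
    joined = product + user_feat + product_feat
    return feedforward(joined, W, b)


user_feat = [1.0, 2.0]
product_feat = [3.0, -1.0]
W = [[1, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0]]
b = [0.0, 0.0]
z = high_order_interaction(user_feat, product_feat, W, b)
print(z)  # [3.0, 0.0]
```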

[0106] Step S6: Integrate the semantic information and higher-order interaction information of the comment text to determine the sentiment category of the comment text.

[0107] In one embodiment of the present invention, the semantic information and higher-order interaction information of the comment text are fused to determine the sentiment category of the comment text, further including the following steps S61-S62:

[0108] Step S61: Use a pre-trained gated neural network to fuse the semantic information and higher-order interaction information of the comment text to obtain mixed information of semantic information and higher-order interaction information;

[0109] Step S62: Use a pre-trained single-layer neural network to process the mixed information and determine the sentiment category of the comment text.

[0110] Specifically, in one embodiment of the present invention, the mixed information is obtained using the following formula:

[0111] v = gate(d, z)

[0112] The sentiment category of a comment text is determined using the following formula:

[0113]

[0114]

[0115] Here, v represents the mixed information, gate() represents a gated neural network, d represents the semantic information of the comment text, and z represents the higher-order interaction information of the comment text. The first formula yields the prediction vector over the sentiment categories of the comment text, where softmax represents the normalized exponential function and W and b represent the parameters of the single-layer neural network; the second formula yields the predicted sentiment category by taking the dimension of the prediction vector with the largest value. The parameters of the gated neural network and the single-layer neural network are determined through automatic learning and updating during model training.
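Steps S61 and S62 can be sketched as follows. The internal form of gate() is not given in this text, so the sigmoid gate that interpolates between d and z is an assumption; the softmax layer and the largest-value rule follow the description above. The names `gate`, `classify`, `Wg`, and `bg` are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(d, z, Wg, bg):
    """Assumed gate: g = sigmoid(Wg [d; z] + bg), v = g * d + (1 - g) * z."""
    joined = d + z
    g = [sigmoid(sum(w * x for w, x in zip(row, joined)) + b)
         for row, b in zip(Wg, bg)]
    return [gi * di + (1.0 - gi) * zi for gi, di, zi in zip(g, d, z)]


def classify(v, W, b):
    """Single-layer network with softmax, then pick the largest dimension."""
    logits = [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs, probs.index(max(probs)) + 1  # 1-based category


d = [1.0, 0.0]                      # semantic information
z = [0.0, 1.0]                      # higher-order interaction information
Wg = [[0.0] * 4, [0.0] * 4]         # zero weights give g = 0.5 everywhere
bg = [0.0, 0.0]
v = gate(d, z, Wg, bg)              # [0.5, 0.5]

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
probs, category = classify(v, W, b)
print(category)  # 3
```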

[0116] By employing the above processing, the sentiment classification method for comment text based on hybrid information mining provided in one embodiment of the present invention can fully mine and integrate the semantic information of comment text and the interaction information between users and products, which can significantly improve the performance and accuracy of sentiment classification of comment text and obtain accurate sentiment classification results of comment text.

[0117] Furthermore, in one embodiment of the present invention, the embedding vectors, attention mechanism parameters, and parameters of each neural network mentioned above are all updated and optimized through model training.

[0118] Specifically, in one embodiment of the present invention, the embedding vector, attention mechanism, and various neural networks are trained through the following steps:

[0119] Obtain the training dataset, which includes comment texts, the users corresponding to the comment texts, the products corresponding to the comment texts, and the actual sentiment categories corresponding to the comment texts.

[0120] The input consists of the comment text, the corresponding user and product, and the output consists of the actual sentiment category corresponding to the comment text. Based on the preset objective function, the embedding vector, attention mechanism and neural network are trained using the backpropagation algorithm.

[0121] Specifically, during model training, the embedding vector, attention mechanism, and neural network are all initialized with parameters. Based on these initialization parameters and the aforementioned specific data processing procedures and formulas, a corresponding predicted sentiment category can be obtained for each input training data. An objective function is calculated based on the predicted sentiment category. Using the objective function, the backpropagation algorithm is used to continuously update the various model parameters until a set iteration stopping condition is met, thus completing model training and determining the embedding vector, attention mechanism, and neural network. The model includes the embedding vector, attention mechanism, and neural network.

[0122] In one embodiment of the present invention, the objective function is set as follows during model training:

[0123] $L = L_1 + \alpha L_2$

[0124] The $L_1$ loss component is used to fully mine local interaction information of users and products, specifically represented as follows:

[0125] $L_1 = L_u + L_p + L_r$

[0126] $L_u = -\log \sigma\left(u^T M_r p - (u')^T M_r p\right)$

[0127] $L_p = -\log \sigma\left(u^T M_r p - u^T M_r p'\right)$

[0128] $L_r = -\log \sigma\left(u^T M_r p - u^T M_{r'} p\right)$

[0129] The L2 loss component is used to represent the difference between the model's predicted sentiment category and the actual sentiment category, specifically expressed as:

[0130]

[0131] Here, L represents the objective function; α is a hyperparameter; X represents the training dataset; C represents the total number of sentiment categories of the comment text; the predicted term represents the value of the c-th dimension of the sentiment category prediction vector output by the neural network, and $y_c$ represents the value of the c-th dimension of the actual sentiment category vector of the comment text; σ represents the sigmoid function, $\sigma(x) = 1 / (1 + \exp(-x))$; $u^T M_r p$ represents the similarity of the user, product, and sentiment category observed in the input training data; $(u')^T M_r p$, $u^T M_r p'$, and $u^T M_{r'} p$ represent the similarities of the unobserved user, product, and sentiment category combinations corresponding to the input training data; u′ represents the latent feature vector of the unobserved user; p′ represents the latent feature vector of the unobserved product; r′ represents the unobserved sentiment category; $M_r$ represents the parameterized matrix corresponding to the sentiment category r; and $M_{r'}$ represents the parameterized matrix corresponding to the sentiment category r′.
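Given the symbols defined above (a sum over the training dataset X, C categories, and per-dimension predicted and actual values), one form of the prediction loss that matches them is the standard cross-entropy. This is an assumption rather than the patent's stated formula, and $\hat{y}_c$ is a hypothetical name for the predicted value of the c-th dimension:

```latex
L_2 = -\sum_{X} \sum_{c=1}^{C} y_c \log \hat{y}_c
```

With a one-hot actual vector, only the term for the true category survives, so the loss reduces to the negative log of the predicted probability of the true category.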

[0132] Minimizing $L_1 = L_u + L_p + L_r$ maximizes the gap between the similarity of the user, product, and sentiment category observed in the training data and the similarities of the corresponding unobserved user, product, and sentiment category triples.
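The three ranking terms of the L1 loss component can be illustrated with a short NumPy sketch. The function names, the vector dimension, and the random toy inputs are assumptions made for illustration only; they are not taken from the embodiment. The sketch uses the identity -log σ(x) = log(1 + exp(-x)) for numerical stability.

```python
import numpy as np

def neg_log_sigmoid(x):
    # -log(sigmoid(x)) = log(1 + exp(-x)), computed stably.
    return float(np.logaddexp(0.0, -x))

def score(u, M, p):
    # Bilinear similarity u^T M p of a user, item, and sentiment category.
    return float(u @ M @ p)

def local_interaction_loss(u, p, M_r, u_neg, p_neg, M_r_neg):
    """L1 = L_u + L_p + L_r for one observed triple and its three
    unobserved counterparts (hypothetical helper, illustration only)."""
    pos = score(u, M_r, p)
    L_u = neg_log_sigmoid(pos - score(u_neg, M_r, p))   # unobserved user u'
    L_p = neg_log_sigmoid(pos - score(u, M_r, p_neg))   # unobserved item p'
    L_r = neg_log_sigmoid(pos - score(u, M_r_neg, p))   # unobserved category r'
    return L_u + L_p + L_r

# Toy inputs: dimension 4 is an arbitrary choice.
rng = np.random.default_rng(0)
d = 4
u, p, u_neg, p_neg = (rng.normal(size=d) for _ in range(4))
M_r, M_r_neg = rng.normal(size=(d, d)), rng.normal(size=(d, d))
loss = local_interaction_loss(u, p, M_r, u_neg, p_neg, M_r_neg)
```

Each term shrinks toward zero as the observed similarity exceeds the corresponding unobserved similarity by a larger margin, which is the behavior the ranking loss is meant to encourage.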

[0133] Furthermore, in one embodiment of the present invention, the unobserved user, product, and sentiment categories corresponding to the input training data are determined in the following manner:

[0134] For the training data <U,P,D,r>, randomly sample a user U′ from all users in the training dataset to form an unobserved user, item, and sentiment category triple (U′,P,r);

[0135] Randomly sample an item P′ from all items in the training dataset to form an unobserved user, item, and sentiment category triple (U, P′, r);

[0136] Randomly sample a sentiment category r′ from all sentiment categories in the training dataset to form an unobserved user, item, and sentiment category triple (U, P, r′).
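The sampling of the three unobserved triples can be sketched as follows. The helper names and the toy identifiers are hypothetical. The text only states that each element is randomly sampled; skipping the observed element itself, so that the sampled triple actually differs from the observed one, is an added assumption.

```python
import random

def _draw(pool, observed, rng):
    # Assumption (not stated in the text): avoid re-drawing the observed
    # element; fall back to the full pool if nothing else is available.
    candidates = [x for x in pool if x != observed] or list(pool)
    return rng.choice(candidates)

def sample_unobserved(sample, users, products, categories, rng):
    """Return the triples (U', P, r), (U, P', r), (U, P, r') for one
    training sample <U, P, D, r>."""
    U, P, _D, r = sample
    return (
        (_draw(users, U, rng), P, r),
        (U, _draw(products, P, rng), r),
        (U, P, _draw(categories, r, rng)),
    )

# Toy data; all identifiers are made up for illustration.
users = ["u1", "u2", "u3"]
products = ["p1", "p2"]
categories = ["negative", "neutral", "positive"]
rng = random.Random(0)
neg_u, neg_p, neg_r = sample_unobserved(
    ("u1", "p1", "Great product.", "positive"), users, products, categories, rng
)
```

This sketch does not check whether a sampled triple happens to appear elsewhere in the training data; whether such a check is needed is left open here.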

[0137] In one embodiment of the present invention, constructing the above objective function for model training makes it possible to fully mine the local interaction information of users and products and to improve the prediction accuracy of the trained model.
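A minimal sketch of how the two loss components might be combined is given below. It assumes that the L2 component is a standard cross-entropy between the actual and predicted sentiment category vectors; this form is an assumption based on the symbol definitions and is not confirmed by the text reproduced here. The function names and the value of α are likewise illustrative.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Assumed form of L2: -sum over samples and categories of y_c * log(y_hat_c).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(-np.sum(y_true * np.log(y_pred + eps)))

def objective(l1, y_true, y_pred, alpha):
    # L = L1 + alpha * L2, with alpha a preset hyperparameter.
    return l1 + alpha * cross_entropy(y_true, y_pred)

# One training sample with C = 3 categories; the actual category is the second.
y_true = [0.0, 1.0, 0.0]
y_pred = [0.2, 0.5, 0.3]
total = objective(l1=1.0, y_true=y_true, y_pred=y_pred, alpha=0.5)
```

Here α balances the ranking signal from the local interactions against the classification error; a larger α gives more weight to the classification term.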

[0138] Furthermore, in one embodiment of the present invention, the parameters of the model are updated using the following formula:

[0139]

[0140] Where $\Theta_{t+1}$ represents the set of all model parameters at iteration t+1; $\Theta_{t}$ represents the set of all model parameters at iteration t; $\Delta[\cdot]$ represents the optimizer; $\eta$ represents the learning rate; and $\Theta$ represents the set of model parameters. The optimizer can be, for example, Adam or SGD, and the learning rate must be preset to control the speed of parameter updates.
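As an illustration of an iterative parameter update with a stopping condition, the sketch below applies plain gradient descent to a toy quadratic objective. It assumes the simplest case, in which the optimizer passes the gradient through unchanged (plain SGD). The toy objective, the tolerance, and all names are assumptions and stand in for the actual model and objective function.

```python
import numpy as np

def sgd_step(params, grads, lr):
    # One update: new parameters = old parameters - learning rate * gradient.
    return {name: value - lr * grads[name] for name, value in params.items()}

# Toy objective L(theta) = ||theta - target||^2 with gradient 2 * (theta - target).
target = np.array([1.0, -2.0])
params = {"theta": np.zeros(2)}
lr, max_iters, tol = 0.1, 1000, 1e-8

for t in range(max_iters):
    grads = {"theta": 2.0 * (params["theta"] - target)}
    # Stopping condition: gradient norm below a preset tolerance.
    if np.linalg.norm(grads["theta"]) < tol:
        break
    params = sgd_step(params, grads, lr)
```

An adaptive optimizer such as Adam would replace the raw gradient in `sgd_step` with a rescaled estimate, while the role of the learning rate stays the same.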

[0141] Secondly, an embodiment of the present invention also provides a sentiment classification system for comment text based on hybrid information mining, the system comprising:

[0142] The feature vector acquisition module is used to obtain the latent feature vectors of the user and the product corresponding to the comment text;

[0143] The text semantic mining module is used to determine the semantic information of the comment text based on the latent feature vector and the comment text.

[0144] The local interaction mining module is used to obtain interaction information of users, products and sentiment categories, and to determine the user behavior characteristics and product attribute characteristics of the comment text;

[0145] The high-order interaction mining module is used to obtain high-order interaction information between users and products based on the user behavior characteristics and product attribute characteristics of the comment text;

[0146] The sentiment category prediction module is used to fuse semantic information and higher-order interaction information of the comment text to determine and output the sentiment category of the comment text.

[0147] The modules described above are devices corresponding to the steps of the methods described above. The specific working principle and beneficial effects of each module can be found in the above-mentioned sentiment classification method for comment text, and will not be repeated here.
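The data flow between the five modules can be sketched as a simple composition of callables. The class name, field names, and stub implementations below are hypothetical placeholders chosen for illustration; they only show which module consumes the output of which, not how any module is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SentimentPipeline:
    # Each field stands in for one module of the system (names are assumptions).
    get_latent_vectors: Callable[[Any, Any], tuple]
    mine_text_semantics: Callable[[Any, Any, str], Any]
    mine_local_interactions: Callable[[Any, Any], tuple]
    mine_high_order: Callable[[Any, Any], Any]
    predict_category: Callable[[Any, Any], Any]

    def classify(self, user, product, text):
        u, p = self.get_latent_vectors(user, product)
        semantics = self.mine_text_semantics(u, p, text)
        behavior, attributes = self.mine_local_interactions(user, product)
        high_order = self.mine_high_order(behavior, attributes)
        return self.predict_category(semantics, high_order)

# Stub modules that only demonstrate the wiring.
pipeline = SentimentPipeline(
    get_latent_vectors=lambda user, product: (f"u[{user}]", f"p[{product}]"),
    mine_text_semantics=lambda u, p, text: ("semantics", u, p, text),
    mine_local_interactions=lambda user, product: ("behavior", "attributes"),
    mine_high_order=lambda b, a: ("high_order", b, a),
    predict_category=lambda s, h: "positive" if s and h else "unknown",
)
result = pipeline.classify("alice", "book-1", "Great read.")
```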

[0148] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Additionally, the terms "front," "back," "left," "right," "upper," and "lower" in this document refer to the placement shown in the accompanying drawings.

[0149] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A sentiment classification method for comment text based on hybrid information mining, characterized in that it comprises: obtaining the comment text to be classified, determining the user and product corresponding to the comment text, and obtaining the latent feature vectors of the user and the product; determining the semantic information of the comment text based on the latent feature vectors and the comment text; obtaining, based on the user corresponding to the comment text, the products that have interacted with the user and the corresponding sentiment categories, and processing the user corresponding to the comment text, the products that have interacted with the user, and the corresponding sentiment categories using a pre-trained first graph convolutional neural network to obtain the user behavior features of the comment text; obtaining, based on the product corresponding to the comment text, the users that have interacted with the product and the corresponding sentiment categories, and processing the product corresponding to the comment text, the users that have interacted with the product, and the corresponding sentiment categories using a pre-trained second graph convolutional neural network to obtain the product attribute features of the comment text; processing the user behavior features and the product attribute features using a pre-trained feedforward neural network to obtain high-order interaction information between the user and the product; and fusing the semantic information of the comment text with the high-order interaction information to determine the sentiment category of the comment text; wherein the user behavior features of the comment text are obtained using the following formula: ; the product attribute features of the comment text are obtained using the following formula: ; wherein the symbols in the formulas respectively denote: the user behavior features of the comment text; the first graph convolutional neural network; the user; the product; the set of products that have interacted with the user; the parameterized matrix corresponding to a sentiment category; the sentiment category of the user toward the product; the latent feature vector of a product; the product attribute features of the comment text; the second graph convolutional neural network; the set of users that have interacted with the product; and the latent feature vector of a user.

2. The sentiment classification method for comment text based on hybrid information mining according to claim 1, characterized in that the latent feature vectors of the user and the product corresponding to the comment text are obtained by an embedding lookup operation.

3. The sentiment classification method for comment text based on hybrid information mining according to claim 1, characterized in that determining the semantic information of the comment text based on the latent feature vectors and the comment text comprises: processing the comment text using a pre-trained recurrent neural network to extract word-level semantic information of the comment text; fusing, based on the latent feature vectors, the word-level semantic information of the comment text using an attention mechanism to obtain sentence-level semantic information of the comment text; and fusing, based on the latent feature vectors, the sentence-level semantic information of the comment text using an attention mechanism to obtain the semantic information of the comment text.

4. The sentiment classification method for comment text based on hybrid information mining according to claim 3, characterized in that the comment text D is represented as n sentences, the i-th sentence is represented as a sequence of words, and a symbol denotes each word of the i-th sentence; the word-level semantic information of the comment text is represented as follows: ; the sentence-level semantic information of the comment text is obtained using the following formula: ; the semantic information of the comment text is obtained using the following formula: ; wherein the symbols in the formulas respectively denote: the word-level semantic information of the j-th word in the i-th sentence; the recurrent neural network; the sentence-level semantic information of the i-th sentence; the attention mechanism; and the semantic information of the comment text.

5. The sentiment classification method for comment text based on hybrid information mining according to claim 4, characterized in that the high-order interaction information is obtained using the following formula: ; wherein the symbols in the formula respectively denote: the high-order interaction information; the feedforward neural network; the vector dot product operation; and the vector concatenation operation.

6. The sentiment classification method for comment text based on hybrid information mining according to claim 5, characterized in that fusing the semantic information of the comment text with the high-order interaction information to determine the sentiment category of the comment text comprises: fusing the semantic information of the comment text and the high-order interaction information using a pre-trained gated neural network to obtain mixed information; and processing the mixed information using a pre-trained single-layer neural network to determine the sentiment category of the comment text.

7. The sentiment classification method for comment text based on hybrid information mining according to claim 6, characterized in that the mixed information is obtained using the following formula: ; the sentiment category of the comment text is determined using the following formulas: ; ; wherein the symbols in the formulas respectively denote: the mixed information; the gated neural network; the predicted vector of the sentiment category of the comment text; the normalized exponential function (softmax); the parameters of the single-layer neural network; and the predicted sentiment category of the comment text.

8. The sentiment classification method for comment text based on hybrid information mining according to claim 7, characterized in that the neural network is trained through the following steps: obtaining a training dataset, which includes comment texts, the users corresponding to the comment texts, the products corresponding to the comment texts, and the actual sentiment categories corresponding to the comment texts; and training the neural network using the backpropagation algorithm based on a preset objective function, with the comment text and the corresponding user and product as input and the actual sentiment category of the comment text as output; wherein the objective function is set as follows: $L = L_1 + \alpha L_2$; $L_1 = L_u + L_p + L_r$; $L_u = -\log \sigma\left(u^{T} M_r p - (u')^{T} M_r p\right)$; $L_p = -\log \sigma\left(u^{T} M_r p - u^{T} M_r p'\right)$; $L_r = -\log \sigma\left(u^{T} M_r p - u^{T} M_{r'} p\right)$; ; wherein $L$ represents the objective function; $\alpha$ represents a hyperparameter; $X$ represents the training dataset; $C$ represents the total number of sentiment categories of the comment text; $\hat{y}_c$ represents the value of the c-th dimension of the predicted sentiment category vector of the comment text output by the neural network; $y_c$ represents the value of the c-th dimension of the actual sentiment category vector of the comment text; $\sigma$ represents the sigmoid function; $u^{T} M_r p$ represents the similarity of the user, product, and sentiment category in the input training data; $(u')^{T} M_r p$, $u^{T} M_r p'$, and $u^{T} M_{r'} p$ represent the similarities of the unobserved user, product, and sentiment category triples corresponding to the input training data; $u'$ represents the latent feature vector of the unobserved user; $p'$ represents the latent feature vector of the unobserved product; $r'$ represents the unobserved sentiment category; $M_r$ represents the parameterized matrix corresponding to sentiment category $r$; $M_{r'}$ represents the parameterized matrix corresponding to sentiment category $r'$; the $L_1$ loss component is used to fully mine the local interaction information of users and products; and the $L_2$ loss component represents the gap between the predicted sentiment category and the actual sentiment category.

9. A sentiment classification system for comment text based on hybrid information mining, characterized in that the system performs sentiment classification of comment text using the sentiment classification method for comment text based on hybrid information mining according to any one of claims 1-8, and comprises: a feature vector acquisition module, used to obtain the latent feature vectors of the user and the product corresponding to the comment text; a text semantic mining module, used to determine the semantic information of the comment text based on the latent feature vectors and the comment text; a local interaction mining module, used to obtain interaction information of users, products, and sentiment categories, and to determine the user behavior characteristics and product attribute characteristics of the comment text; a high-order interaction mining module, used to obtain high-order interaction information between users and products based on the user behavior characteristics and product attribute characteristics of the comment text; and a sentiment category prediction module, used to fuse the semantic information and the high-order interaction information of the comment text to determine and output the sentiment category of the comment text.