A deep learning-based domain expert classification recommendation method
By combining the BTM topic model and the Attention mechanism, the problems of insufficient concurrency, interpretability, stability and accuracy of deep learning text classification models are solved, and efficient text classification and domain expert recommendation are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAIYIN INSTITUTE OF TECHNOLOGY
- Filing Date
- 2023-05-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing deep learning text classification models are inadequate in terms of concurrency, interpretability, stability, training speed, and accuracy, making it difficult to effectively handle large-scale online text data.
The method is a deep learning-based domain expert classification and recommendation method that combines topic mining, natural language processing, and domain knowledge. It crawls and cleans data, constructs word vectors and a feature vocabulary using the BTM topic model and an attention mechanism, and combines convolutional neural networks and a Transformer for feature extraction and classification.
It improves the accuracy and interpretability of text classification, enhances the model's ability to handle long texts, and improves the model's transferability.
Figure CN116595180B_ABST
Abstract
Description
Technical Field
[0001] This specification relates to the field of computing technology, and in particular to a domain expert classification and recommendation method based on deep learning.
Background Technology
[0002] Topic modeling is a technique for text data analysis based on probabilistic statistical methods. It views text data as a combination of multiple topics, each composed of multiple words. In topic modeling, each piece of text data can be represented as a combination of multiple topics, and each topic is composed of the probability distribution of some words. The basic idea of topic modeling is to model text data, representing it as a probability distribution of topics and words. During model training, by learning the model parameters, the distribution of each piece of text data across different topics and the word distribution of each topic can be obtained. The advantage of topic modeling is that it does not rely on a manually set list of keywords, but rather mines topics from within the data itself, making it more objective, flexible, and applicable to various types of text data.
[0003] The rapid increase in text volume has placed higher demands on text processing. Compared with traditional data, text data on the internet has many new characteristics, such as large volume, high repetition, and high redundancy. Relying entirely on manual processing of this information is too costly. Text classification is one of the most fundamental tasks in text processing. Using computers to quickly and efficiently perform text classification can help alleviate the information processing problems brought about by the rapid growth of information.
[0004] Deep learning text classification models are currently mainly based on artificial neural networks, convolutional neural networks, and recurrent neural networks. These networks are considered black-box models, with limited parameter interpretability, which hinders network optimization and practical application. Furthermore, text classification models based on traditional neural networks still have room for improvement in terms of concurrency, stability, training speed, and accuracy.
[0005] It is evident that there is an urgent need for a deep learning text classification method based on topic models that offers higher concurrency, interpretability, stability, training speed, and accuracy. The advantage of a deep learning-based domain expert classification and recommendation method lies in its ability to combine topic mining, natural language processing, and domain knowledge, and automatically select important features for text classification, thereby improving classifier performance. Furthermore, because it can automatically extract topics and features, it significantly reduces the complexity of the classifier and training time.
Summary of the Invention
[0006] In view of this, the present invention provides a domain expert classification recommendation method based on deep learning, which performs the following steps 1 to 5 to obtain a classification model of the sample text, and performs the following steps A to C to achieve classification of the target sample text.
[0007] The technical solution adopted in this invention is: a domain expert classification and recommendation method based on deep learning, comprising:
[0008] Step 1: Crawl and clean text data samples from technical experts, create an expert database, generate relevant word vectors, and divide them into training, validation, and test sets according to proportions.
[0009] Step 2: Use the word vectors from Step 1 to encode the sample labels and text, and use the BTM topic model to build an expert vocabulary.
[0010] Step 3: Input the dataset processed in Step 1 into the topic matching-feature extraction model, and use the expert vocabulary from Step 2 for auxiliary classification.
[0011] Step 4: Iteratively train and calculate the cross-entropy loss. Update the parameters through backpropagation using the Adam optimizer. After each parameter update, calculate the value of the loss function on the validation set.
[0012] Step 5: Train the model and adjust parameters such as the learning rate and the number of hidden layers to optimize F1.
[0013] Step A: Obtain the sample text as in Step 1.
[0014] Step B: Follow Step 3 to obtain the domain expert category of the sample text from Step A.
[0015] Step C: Based on the domain expert categories in Step B, recommend relevant domain experts from the expert database built in Step 1.
[0016] Step 6: For new documents, first use a topic model to obtain their topic distribution, and then use a classification model to classify and recommend documents.
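Step 5 tunes the model against F1 without stating how the score is averaged across categories. The following is a minimal sketch that assumes macro-averaging (the unweighted mean of per-label F1); the function name `macro_f1` is illustrative and does not come from the patent.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Compare two hypothetical hyperparameter settings on the validation set.
val_labels = [0, 0, 1, 1]
print(macro_f1(val_labels, [0, 1, 1, 1]))
print(macro_f1(val_labels, [0, 0, 1, 1]))
```

A hyperparameter search over the learning rate and the number of hidden layers would keep the setting with the highest validation score.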
[0017] Step 1 includes:
[0018] Step 1.1: Create word vectors for all samples using the Stanford GloVe open-source word vector code. Add special symbols such as "#PAD#", "#UNK#", "#CLS#", "#SEP#", "#MASK#", and "#NUM#" as needed to assist in classification.
[0019] Step 1.2: The format of each data point is [label, content]. Visualize the length of each sample to obtain the maximum sentence length for model processing, and divide the training set, test set, and validation set according to a 6:2:2 ratio.
[0020] Step 1.3: Create an expert database.
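The 6:2:2 division and the maximum-length check from Step 1.2 can be sketched as follows. This is an illustration only: the shuffling seed, whitespace tokenization, and the function names `split_622` and `max_length` are assumptions that do not appear in the source.

```python
import random


def split_622(samples, seed=0):
    """Shuffle samples and split them into train/test/validation at 6:2:2."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10
    n_test = len(items) * 2 // 10
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]  # the remainder goes to validation
    return train, test, val


def max_length(samples):
    """Longest content length in tokens over [label, content] samples."""
    return max(len(content.split()) for _, content in samples)


# Each sample follows the [label, content] format described in Step 1.2.
samples = [[i % 3, "token " * (i + 1)] for i in range(10)]
train, test, val = split_622(samples)
print(len(train), len(test), len(val), max_length(samples))
```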
[0021] Step 2 includes:
[0022] Step 2.1: Encode all samples using the word vectors from Step 1.1, and then proceed to Step 2.2;
[0023] Step 2.2: Using the encoded samples from Step 2.1, calculate the self-attention coefficient for each word in the sentence, and then proceed to Step 2.3;
[0024] Step 2.3: Using the encoded samples from Step 2.1, perform word frequency statistics to filter out noisy words for each sample, calculate the semantic similarity of each word, and then proceed to Step 2.4;
[0025] Step 2.4: Based on the vocabulary self-attention coefficients obtained in Step 2.2 and the vocabulary semantic similarity in Step 2.3, extract the topic words to obtain the topic words for each sample, and construct the feature word entries of the labels according to the corresponding labels, and then proceed to Step 2.5;
[0026] Step 2.5: Based on the method described in Step 2.4, train on samples with known labels to construct a feature word vocabulary;
[0027] As a preferred embodiment of the present invention, step 3 includes the following steps 3.1 to 3.2:
[0028] Step 3.1: Input the domain expert data processed in Step 1.2 into the feature extraction module. First, encode the input text using the word vectors created in Step 1.1, and then perform positional encoding to obtain the vectorized sample text. Then proceed to Step 3.2.
[0029] Step 3.2: Use convolutional neural networks and Transformer as feature extraction modules to learn the semantic information of the vectorized sample text obtained in Step 3.1, and obtain the semantic information of the sample text.
[0030] Step 4: Iteratively train and calculate the cross-entropy loss. Update the parameters through backpropagation using the Adam optimizer. After each parameter update, calculate the value of the loss function on the validation set.
[0031] Step 5: Input the semantic information of the sample text from Step 3.2 into a fully connected neural network in the output layer as a classifier for classification. Update the parameters through backpropagation using the Adam optimizer. Calculate the value of the loss function on the validation set after each parameter update.
[0032] More specifically,
[0033] Step 1 includes:
[0034] Step 1.1: Clean the crawled data D1, perform word segmentation, and then perform word segmentation processing based on a specific domain to obtain V1;
[0035] Step 1.2: Use the Stanford GloVe open-source code to create word vectors for data D1, and add special symbols such as "#PAD#", "#UNK#", "#CLS#", "#SEP#", "#MASK#", and "#NUM#" as needed to assist in classification.
[0036] In practical applications, since computers are ineffective at classifying and recognizing numbers, a symbol "#NUM#" is defined to replace numbers encountered during the classification process. The dictionary range 0-19 is reserved for user-defined special symbols, with the actual dictionary encoding starting from 20. The dictionary and defined symbols are constructed through code.
[0037] Step 1.3: The format of each data point is [label, content]. Visualize the length of each sample to obtain the maximum sentence length for model processing, and divide the training set, test set, and validation set according to a 6:2:2 ratio.
[0038] Step 1.4: Create an expert database, where each entry represents an expert's name, research field, publications, and patents.
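The dictionary convention described above (indices 0-19 reserved for special symbols, real words starting at 20, numbers replaced by "#NUM#") could look like the sketch below. The specific index assigned to each special symbol, the frequency-based ordering of words, and the helper names are assumptions made for illustration.

```python
import re
from collections import Counter

SPECIAL = ["#PAD#", "#UNK#", "#CLS#", "#SEP#", "#MASK#", "#NUM#"]
RESERVED = 20  # indices 0-19 are reserved; real words start at 20


def normalize(tokens):
    """Replace purely numeric tokens with the #NUM# symbol."""
    return ["#NUM#" if re.fullmatch(r"\d+(\.\d+)?", t) else t for t in tokens]


def build_vocab(tokenized_docs):
    """Map special symbols to low indices and corpus words to 20 and above."""
    vocab = {sym: i for i, sym in enumerate(SPECIAL)}
    counts = Counter(t for doc in tokenized_docs for t in normalize(doc))
    next_id = RESERVED
    for tok, _ in counts.most_common():
        if tok not in vocab:
            vocab[tok] = next_id
            next_id += 1
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or padding to max_len."""
    ids = [vocab.get(t, vocab["#UNK#"]) for t in normalize(tokens)][:max_len]
    return ids + [vocab["#PAD#"]] * (max_len - len(ids))


vocab = build_vocab([["catalyst", "at", "300", "K"], ["catalyst", "yield"]])
print(encode(["catalyst", "42", "unseen"], vocab, 5))
```

In the patent the vectors themselves come from GloVe; this sketch covers only the index dictionary.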
[0039] Step 2 includes:
[0040] Step 2.1: Encode D1 using the word vectors from Step 1.1, then proceed to Step 2.2;
[0041] Step 2.2: Using the encoded samples from Step 2.1, calculate the self-attention coefficient for each word in the sentence, and then proceed to Step 2.3;
[0042] Step 2.2.1: First, perform a linear transformation on the word vectors of the text to obtain three matrices: Q, K, and V. Multiply matrices Q and K to calculate the relevance matrix A. Where Q, K, and V are...
[0043] Step 2.2.2: Normalize the correlation matrix A using Softmax to obtain A', and multiply A' by V to obtain the final weight matrix Y.
[0044]
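Steps 2.2.1 and 2.2.2 can be sketched in plain Python as follows. The projection matrices `wq`, `wk`, and `wv` are hypothetical learned parameters, and multiplying Q by the transpose of K is the usual reading of "multiply Q and K". The widely used scaled variant also divides A by the square root of the key dimension; the source does not mention that, so it is omitted here.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(x, wq, wk, wv):
    """Return (A', Y) for word vectors x of shape [seq_length, dim]."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    a = matmul(q, transpose(k))        # relevance matrix A
    a_norm = softmax_rows(a)           # A' = Softmax(A)
    return a_norm, matmul(a_norm, v)   # Y = A' V


identity = [[1.0, 0.0], [0.0, 1.0]]
coeffs, y = self_attention([[1.0, 0.0], [0.0, 1.0]], identity, identity, identity)
print(coeffs)
```

Each row of A' sums to 1 and gives the attention coefficients of one word over all words in the sentence.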
[0045] Step 2.3: Using the encoded samples from Step 2.1, perform word frequency statistics to filter out noisy words for each sample, calculate the semantic similarity of each word, and then proceed to Step 2.4;
[0046]
[0047] $\chi_i = \text{TF-IDF} \cdot \lambda_{i,j}$
[0048] $\chi_j = \text{TF-IDF} \cdot \lambda_{i,j}$
[0049] Here $n_{i,j}$ is the number of times the word $t_i$ appears in document $d_j$, $\sum_k n_{k,j}$ is the total number of occurrences of all words in that document, $|D|$ is the total number of texts in the corpus, and $|\{j : t_i \in d_j\}|$ is the number of texts containing the word $t_i$. These statistics incorporate vocabulary information as prior knowledge when extracting word pairs.
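The quantities defined above match the common TF-IDF definition, so a minimal sketch under that assumption is given here: term frequency is the word count divided by the total words in the document, and inverse document frequency is the logarithm of the corpus size over the number of texts containing the word. The weighting factor λ is not defined in the surviving text and is left out.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Number of texts that contain each word.
    doc_freq = Counter(t for doc in docs for t in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            t: (c / total) * math.log(n_docs / doc_freq[t])
            for t, c in counts.items()
        })
    return scores


docs = [["polymer", "catalyst", "polymer"], ["polymer", "membrane"]]
for row in tf_idf(docs):
    print(row)
```

Words that occur in every text score zero, which is one way such statistics help filter noisy words before word pairs are extracted.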
[0050] Step 2.4: Based on the vocabulary self-attention coefficients obtained in Step 2.2 and the vocabulary semantic similarity in Step 2.3, extract the topic words to obtain the topic words for each sample, and construct the feature word entries of the labels according to the corresponding labels, and then proceed to Step 2.5;
[0051] Step 2.4.1: The calculation process in the BTM model:
[0052] Step 2.4.2: The input to the model is: hyperparameters α and β, word pair set B, and number of topics K;
[0053] Step 2.4.3: Model output: document-topic distribution θ and topic-vocabulary distribution φ;
[0054] Step 2.4.4: The document production process for the BTM model is as follows:
[0055] Step 2.4.5: For the entire corpus, generate topic distributions using the Dirichlet distribution.
[0056] Step 2.4.6: For each topic, generate a topic-vocabulary distribution using the Dirichlet distribution.
[0057] Step 2.4.7: Iterate through each word pair in the word pair set and execute:
[0058] Step 2.4.8: For word pair $C_n = (w_i, w_j)$, assign a topic $Z_n$:
[0059] Step 2.4.9: From topic $Z_n$, extract the word $w_i$;
[0060] Step 2.4.10: From topic $Z_n$, extract the word $w_j$.
[0061] Step 2.4.11: Based on the modeling and calculation process, the Gibbs sampling algorithm is used to solve for the target distributions: text-topic distribution θ and topic-vocabulary distribution.
[0062] Step 2.4.12: Perform Gibbs sampling on the results:
[0063]
[0064]
[0065] Step 2.4.13: Based on the sampling results, derive the probability formula.
[0066]
[0067]
[0068] Step 2.5: Based on the method described in Step 2.4, train on samples with known labels to construct a feature word vocabulary;
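The sampling formulas for Steps 2.4.12 and 2.4.13 did not survive in the text. As a point of reference, the sketch below implements collapsed Gibbs sampling for the standard biterm topic model, without the TF-IDF prior or attention coefficients that the patent adds. The update rule and the estimates for θ and φ follow the usual BTM formulation and are assumptions relative to this document; words are represented by integer ids.

```python
import random
from itertools import combinations


def extract_biterms(docs):
    """All unordered word pairs that co-occur within one short text."""
    return [pair for doc in docs for pair in combinations(doc, 2)]


def btm_gibbs(biterms, vocab_size, K, alpha, beta, iters=50, seed=0):
    """Return (theta, phi) estimated by collapsed Gibbs sampling."""
    rng = random.Random(seed)
    n_k = [0] * K                                 # word pairs per topic
    n_wk = [[0] * vocab_size for _ in range(K)]   # word counts per topic
    z = []
    for wi, wj in biterms:                        # random initial topics
        k = rng.randrange(K)
        z.append(k)
        n_k[k] += 1
        n_wk[k][wi] += 1
        n_wk[k][wj] += 1
    for _ in range(iters):
        for b, (wi, wj) in enumerate(biterms):
            k = z[b]                              # remove current assignment
            n_k[k] -= 1
            n_wk[k][wi] -= 1
            n_wk[k][wj] -= 1
            weights = []
            for t in range(K):
                denom = 2 * n_k[t] + vocab_size * beta
                weights.append((n_k[t] + alpha)
                               * (n_wk[t][wi] + beta)
                               * (n_wk[t][wj] + beta) / denom ** 2)
            k = rng.choices(range(K), weights=weights)[0]
            z[b] = k                              # record the new topic
            n_k[k] += 1
            n_wk[k][wi] += 1
            n_wk[k][wj] += 1
    theta = [(n_k[t] + alpha) / (len(biterms) + K * alpha) for t in range(K)]
    phi = [[(n_wk[t][w] + beta) / (2 * n_k[t] + vocab_size * beta)
            for w in range(vocab_size)] for t in range(K)]
    return theta, phi


docs = [[0, 1, 2], [0, 1], [3, 4], [3, 4, 5]]
theta, phi = btm_gibbs(extract_biterms(docs), vocab_size=6, K=2, alpha=1.0, beta=0.01)
print(theta)
```

The highest-probability words in each row of φ are candidates for the topic words that Step 2.4 collects into the feature word vocabulary.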
[0069] As a preferred embodiment of the present invention, step 3 includes the following steps 3.1 to 3.2:
[0070] Step 3.1: Input the dataset processed in Step 1.2 into the feature extraction module to learn the semantic information of the sample text. First, encode the result after word encoding in Step 1.2 using sin and cos functions according to the following formula:
[0071] $PE(pos, 2i) = \sin\left(pos / 10000^{2i / embedding\_dim}\right)$
[0072] $PE(pos, 2i+1) = \cos\left(pos / 10000^{2i / embedding\_dim}\right)$
[0073] In the above formulas, pos is the position of a word in the sentence, 2i and 2i+1 are the even and odd dimension indices of the encoding, and embedding_dim is the dimension of the word vector. The resulting encoded vector $X_{embed}$ has dimension [seq_length, embedding_dim]. Then proceed to Step 3.2.
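The sinusoidal encoding above can be computed directly. The sketch below builds the [seq_length, embedding_dim] table; how it is combined with the word vectors (element-wise addition in the standard Transformer) is an assumption, since the text only says the encoded words are position-encoded.

```python
import math


def positional_encoding(seq_length, embedding_dim):
    """Return a [seq_length, embedding_dim] table of sin/cos position codes."""
    pe = [[0.0] * embedding_dim for _ in range(seq_length)]
    for pos in range(seq_length):
        for j in range(embedding_dim):
            i = j // 2  # dimensions 2i and 2i+1 share one frequency
            angle = pos / (10000 ** (2 * i / embedding_dim))
            pe[pos][j] = math.sin(angle) if j % 2 == 0 else math.cos(angle)
    return pe


def add_position(word_vectors):
    """Add the position table to word vectors of the same shape (assumed)."""
    pe = positional_encoding(len(word_vectors), len(word_vectors[0]))
    return [[w + p for w, p in zip(wrow, prow)]
            for wrow, prow in zip(word_vectors, pe)]


x_embed = add_position([[0.1, 0.2, 0.3, 0.4] for _ in range(3)])
print(len(x_embed), len(x_embed[0]))
```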
[0074] Step 3.2: Input the encoded vector obtained in Step 3.1 into the Transformer feature extractor model to learn semantic information, and obtain the semantic vector $X_{Text}$, whose dimension is [seq_length, embedding_dim].
[0075] Step 3.3: Convolve the data using convolution kernels with widths of 2, 3, and 4 and length equal to the sentence length, respectively. Perform max pooling and full connection on the results to obtain the dimension [batch_size, embedding_dim]. After changing the dimension through global average pooling, obtain the dimension [batch_size, seq_length, embedding_dim]. Finally, perform residual connection and layer normalization on the results, and then proceed to step 4.1.
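The multi-width convolution with max pooling in Step 3.3 can be illustrated for a single sentence as below. The sketch assumes each kernel spans the full embedding dimension and slides along the sequence, as in common text CNNs, and uses one kernel per width. The fully connected layer, residual connection, and layer normalization are not shown, and the kernel values are placeholders rather than trained weights.

```python
def conv_max_pool(x, kernel):
    """Slide a [width, embedding_dim] kernel over x and keep the maximum response."""
    width = len(kernel)
    responses = []
    for start in range(len(x) - width + 1):
        window = x[start:start + width]
        responses.append(sum(a * b
                             for row, krow in zip(window, kernel)
                             for a, b in zip(row, krow)))
    return max(responses)


def multi_width_features(x, kernels):
    """One pooled feature per kernel width; kernels maps width -> kernel."""
    return [conv_max_pool(x, kernels[w]) for w in sorted(kernels)]


# x has shape [seq_length, embedding_dim] = [5, 2].
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 0.0]]
kernels = {w: [[1.0, 1.0] for _ in range(w)] for w in (2, 3, 4)}
print(multi_width_features(x, kernels))
```

In practice many kernels per width are used, so each width contributes a vector of pooled features instead of a single number.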
[0076] In practical applications, step 4 includes the following steps 4.1 to 4.2:
[0077] Step 4.1: Calculate the similarity between the semantic vector $X_{Text}$ of the sample text obtained in Step 3.2 and the feature words of each label in the feature word vocabulary obtained in Step 2.4, as follows:
[0078]
[0079] Step 4.2: Normalize the one-hot encoded label results obtained in Step 4.1 using the sigmoid function. The calculation formula is shown below:
[0080]
[0081] Thus, the final label $Y_{Label}$ is obtained.
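The formulas for Steps 4.1 and 4.2 are missing from the text, so the following is only one plausible reading: cosine similarity between a text vector and one feature vector per label, squashed with the sigmoid function. The choice of cosine similarity, the use of a single vector per label, and the function names are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def label_scores(text_vec, label_feature_vecs):
    """Sigmoid-normalized similarity between the text and each label."""
    return [sigmoid(cosine(text_vec, f)) for f in label_feature_vecs]


# Hypothetical 2-dimensional vectors for one text and two labels.
print(label_scores([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```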
[0082] Step 5: In the output layer, the semantic vector $X_{Text}$ from Step 3.2 is input into a single-layer fully connected neural network, which serves as the classifier and produces the predicted label $Y_{Text}$ as the output of the model classification. Its dimension is [n_l], and its calculation formula is as follows:
[0083]
[0084] The cross-entropy loss is calculated between the predicted label $Y_{Text}$ output by the classifier and the mixed learning label $Y_{Label}$ obtained in Step 4.2. Its calculation formula is as follows:
[0085]
[0086] The parameters are updated via backpropagation using the Adam optimizer, and the value of the loss function on the validation set is calculated after each parameter update.
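The classifier and loss formulas are not reproduced in the text. A minimal sketch of a single fully connected layer followed by softmax and cross-entropy is shown here; applying softmax to the layer output and the small epsilon inside the logarithm are assumptions, and the weights are placeholder values.

```python
import math


def linear(x, weights, bias):
    """Single fully connected layer: one output per label."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """target is a one-hot or soft probability vector over the n_l labels."""
    eps = 1e-12  # avoids log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


x_text = [1.0, 2.0]                      # pooled semantic vector (placeholder)
weights = [[1.0, 0.0], [0.0, 1.0]]       # shape [n_l, dim]
y_text = softmax(linear(x_text, weights, [0.0, 0.0]))
print(cross_entropy(y_text, [0.0, 1.0]))
```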
[0087] Compared with the prior art, the significant advantages of the present invention are:
[0088] (1) A topic model that integrates TF-IDF prior knowledge is proposed to extract topic words and construct a feature word vocabulary.
[0089] This method integrates TF-IDF prior knowledge into the BTM topic model to extract feature words, effectively improving text clustering performance by enriching the information of each label with feature words. Utilizing TF-IDF prior knowledge allows for a better understanding of topic distribution, identification and interpretation of clustering results, and elimination of similarities between labels. Furthermore, by comparing clustering results under different TF-IDF weights, performance evaluation and parameter adjustment are possible.
[0090] (2) Introducing the Attention mechanism into the BTM model can make the model pay more attention to the more important parts or keywords in the text, thereby improving the clustering or classification effect of the model.
[0091] Improving model accuracy: By introducing the Attention mechanism, the model can focus more on important content in the text, such as specific topics or keywords, thereby enabling more accurate classification and clustering of the text.
[0092] Enhanced processing of long texts: For long text data, the Attention mechanism helps the model process key information more precisely, improving its ability to handle long texts.
Improved model interpretability: With the Attention mechanism, the model focuses on the words or phrases that most influence its output. Including these attention weights alongside the output makes the model's decision criteria easier to understand.
[0093] Improved model transferability: With an attention mechanism, the model can focus on different features in different datasets, which makes it more general and transferable.
Description of the Drawings
[0094] Figure 1 is a flowchart of the domain expert classification and recommendation method based on deep learning described in an embodiment of the present invention.
[0095] Figure 2 is a schematic diagram of the Attention structure used in the present invention.
[0096] Figure 3 is a schematic diagram of the topic extraction module described in the present invention.
[0097] Figure 4 is a schematic diagram of the convolutional neural network used in the present invention.
Detailed Implementation
[0098] The following description is intended to disclose the present invention and enable those skilled in the art to implement it. The preferred embodiments described below are merely examples, and other obvious variations will occur to those skilled in the art. The basic principles of the invention defined in the following description can be applied to other embodiments, modifications, improvements, equivalents, and other technical solutions that do not depart from the spirit and scope of the invention.
[0099] A domain expert classification and recommendation method based on deep learning includes:
[0100] Step 1: Crawl and clean text data samples from experts in the technical field, create an expert database, generate relevant word vectors, and divide them into training, validation, and test sets according to proportions.
[0101] In practical applications, step 1 includes the following steps:
[0102] Step 1.1: Clean the crawled data D1, perform word segmentation, and then perform word segmentation processing based on a specific domain to obtain V1;
[0103] Step 1.2: Use the Stanford GloVe open-source code to create word vectors for data D1, and add special symbols such as "#PAD#", "#UNK#", "#CLS#", "#SEP#", "#MASK#", and "#NUM#" as needed to assist in classification.
[0104] In practical applications, since computers are ineffective at classifying and recognizing numbers, a symbol "#NUM#" is defined to replace numbers encountered during the classification process. The dictionary range 0-19 is reserved for these custom symbols, with the actual dictionary encoding starting from 20. The dictionary is constructed using code, and the defined symbols are shown in Table 1 below.
[0105] Table 1. Explanation of Special Symbols
[0106]
[0107]
[0108] Step 1.3: The format of each data point is [label, content]. Visualize the length of each sample to obtain the maximum sentence length for model processing. Divide the training set, test set, and validation set according to a 6:2:2 ratio. Here, we take the field of chemical science as an example.
[0109] Table 2 Category Labels - Research Fields
[0110]
[0111] Step 1.4: Create an expert database. Each data entry represents an expert's name, papers, patents, and research projects, as shown in Table 2.
[0112] Table 2 Expert Information Table
[0113] Serial number | Name | Patent | Paper | Research projects
[0114] Step 2: Use the word vectors from Step 1 to encode the sample labels and text, and use the topic extraction module to build an expert vocabulary.
[0115] Step 2.1: Encode D1 using the word vectors from Step 1.1, then proceed to Step 2.2;
[0116] Step 2.2: Using the encoded samples from Step 2.1, calculate the self-attention coefficient for each word in the sentence, and then proceed to Step 2.3;
[0117] Step 2.2.1: First, perform a linear transformation on the word vectors of the text to obtain three matrices: Q, K, and V. Multiply matrices Q and K to calculate the relevance matrix A. Where Q, K, and V are...
[0118] Step 2.2.2: Normalize the correlation matrix A using Softmax to obtain A', and multiply A' by V to obtain the final weight matrix Y.
[0119]
[0120] Step 2.3: Using the encoded samples from Step 2.1, perform word frequency statistics to filter out noisy words for each sample, calculate the semantic similarity of each word, and then proceed to Step 2.4;
[0121]
[0122] $\chi_i = \text{TF-IDF} \cdot \lambda_{i,j}$
[0123] $\chi_j = \text{TF-IDF} \cdot \lambda_{i,j}$
[0124] Here $n_{i,j}$ is the number of times the word $t_i$ appears in document $d_j$, $\sum_k n_{k,j}$ is the total number of occurrences of all words in that document, $|D|$ is the total number of texts in the corpus, and $|\{j : t_i \in d_j\}|$ is the number of texts containing the word $t_i$. These statistics incorporate vocabulary information as prior knowledge when extracting word pairs.
[0125] Step 2.4: Based on the vocabulary self-attention coefficients obtained in Step 2.2 and the vocabulary semantic similarity in Step 2.3, extract the topic words to obtain the topic words for each sample, and construct the feature word entries of the labels according to the corresponding labels, and then proceed to Step 2.5;
[0126] Step 2.4.1: The calculation process in the topic extraction module:
[0127] Step 2.4.2: The input to the model is: hyperparameters α and β, word pair set B, and number of topics K;
[0128] Step 2.4.3: Model output: document-topic distribution θ and topic-vocabulary distribution φ;
[0129] Step 2.4.4: The document production process for the BTM model is as follows:
[0130] Step 2.4.5: For the entire corpus, generate topic distributions using the Dirichlet distribution.
[0131] Step 2.4.6: For each topic, generate a topic-vocabulary distribution using the Dirichlet distribution.
[0132] Step 2.4.7: Iterate through each word pair in the word pair set and execute:
[0133] Step 2.4.8: For word pair $C_n = (w_i, w_j)$, assign a topic $Z_n$:
[0134] Step 2.4.9: From topic $Z_n$, extract the word $w_i$;
[0135] Step 2.4.10: From topic $Z_n$, extract the word $w_j$.
[0136] Step 2.4.11: Based on the modeling and calculation process, the Gibbs sampling algorithm is used to solve for the target distributions: text-topic distribution θ and topic-vocabulary distribution.
[0137] Step 2.4.12: Perform Gibbs sampling on the results:
[0138]
[0139]
[0140] Step 2.4.13: Based on the sampling results, derive the probability formula.
[0141]
[0142]
[0143] Step 2.5: Based on the method described in Step 2.4, train on samples with known labels to construct a feature word vocabulary.
[0144] Step 3: Input the dataset processed in Step 1 into the domain expert classification module, and perform auxiliary classification by combining it with the expert vocabulary in Step 2.
[0145] Step 3.1: Input the dataset processed in Step 1.2 into the feature extraction module to learn the semantic information of the sample text. First, encode the result after word encoding in Step 1.2 using sin and cos functions according to the following formula:
[0146] $PE(pos, 2i) = \sin\left(pos / 10000^{2i / embedding\_dim}\right)$
[0147] $PE(pos, 2i+1) = \cos\left(pos / 10000^{2i / embedding\_dim}\right)$
[0148] In the above formulas, pos is the position of a word in the sentence, 2i and 2i+1 are the even and odd dimension indices of the encoding, and embedding_dim is the dimension of the word vector. The resulting encoded vector $X_{embed}$ has dimension [seq_length, embedding_dim]. Then proceed to Step 3.2.
[0149] Step 3.2: Input the encoded vector obtained in Step 3.1 into the Transformer feature extractor model to learn semantic information, and obtain the semantic vector $X_{Text}$, whose dimension is [seq_length, embedding_dim].
[0150] Step 3.3: Convolve the data using convolution kernels with widths of 2, 3, and 4 and length equal to the sentence length, respectively. Perform max pooling and full connection on the results to obtain the dimension [batch_size, embedding_dim]. After changing the dimension through global average pooling, obtain the dimension [batch_size, seq_length, embedding_dim]. Finally, perform residual connection and layer normalization on the results, and then proceed to step 4.1.
[0151] Step 4: Iteratively train and calculate the cross-entropy loss. Update the parameters through backpropagation using the Adam optimizer. After each parameter update, calculate the value of the loss function on the validation set.
[0152] Step 4.1: Calculate the similarity between the semantic vector $X_{Text}$ of the sample text obtained in Step 3.2 and the feature words of each label in the feature word vocabulary obtained in Step 2.4, as follows:
[0153]
[0154] Step 4.2: Normalize the one-hot encoded label results obtained in Step 4.1 using the sigmoid function. The calculation formula is shown below:
[0155]
[0156] Thus, the final label $Y_{Label}$ is obtained.
[0157] Step 5: Train the model and adjust parameters such as the learning rate and the number of hidden layers to optimize F1.
[0158] Step A: Obtain the sample text as in Step 1.
[0159] Step B: Follow Step 3 to obtain the domain expert category of the sample text from Step A.
[0160] Step C: Based on the domain expert categories in Step B, recommend relevant domain experts from the expert database built in Step 1.
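Steps A to C amount to a lookup in the expert database once the category is predicted. The sketch below shows one hypothetical form of that lookup. The record fields follow the expert database description, but the `Expert` class, the exact-match filter on research field, and ranking by number of papers and patents are illustrative assumptions, not the patent's recommendation rule.

```python
from dataclasses import dataclass, field


@dataclass
class Expert:
    name: str
    research_field: str
    papers: list = field(default_factory=list)
    patents: list = field(default_factory=list)


def recommend(category, experts, top_n=3):
    """Return up to top_n experts whose research field matches the category."""
    matches = [e for e in experts if e.research_field == category]
    # Ranking by output volume is a placeholder heuristic.
    matches.sort(key=lambda e: len(e.papers) + len(e.patents), reverse=True)
    return matches[:top_n]


# Placeholder records; names and fields are invented for the example.
database = [
    Expert("Expert A", "catalysis", ["paper 1", "paper 2"], ["patent 1"]),
    Expert("Expert B", "polymers", ["paper 3"]),
    Expert("Expert C", "catalysis", ["paper 4"]),
]
predicted_category = "catalysis"  # would come from the classifier in Step B
print([e.name for e in recommend(predicted_category, database)])
```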
[0161] Step 5: In the output layer, the semantic vector $X_{Text}$ from Step 3.2 is input into a single-layer fully connected neural network, which serves as the classifier and produces the predicted label $Y_{Text}$ as the output of the model classification. Its dimension is [n_l], and its calculation formula is as follows:
[0162]
[0163] The cross-entropy loss is calculated between the predicted label $Y_{Text}$ output by the classifier and the one-hot encoded label $Y_{Label}$ obtained in Step 4.2. Its calculation formula is as follows:
[0164]
[0165] The parameters are updated via backpropagation using the Adam optimizer, and the value of the loss function on the validation set is calculated after each parameter update.
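For readers unfamiliar with the optimizer named above, one Adam update can be written out as below. The hyperparameter defaults are the commonly used ones and are assumptions here, since the patent does not state its settings. The toy objective is unrelated to the classification model and only shows the update loop.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Apply one Adam update; m and v are running moments updated in place."""
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = b1 * m[i] + (1 - b1) * g          # first-moment estimate
        v[i] = b2 * v[i] + (1 - b2) * g * g      # second-moment estimate
        m_hat = m[i] / (1 - b1 ** t)             # bias correction
        v_hat = v[i] / (1 - b2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


# Toy example: minimize (x - 3)^2, whose gradient is 2 * (x - 3).
params, m, v = [0.0], [0.0], [0.0]
for step in range(1, 11):
    grads = [2 * (params[0] - 3.0)]
    params = adam_step(params, grads, m, v, step, lr=0.1)
print(params)
```

In training, the loss on the validation set would be evaluated after each such update, as the paragraph above describes.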
[0166] Table 3 Variable Definitions
[0167]
[0168]
[0169] The embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the above embodiments. Within the scope of knowledge possessed by those skilled in the art, various changes can be made without departing from the spirit of the present invention.
Claims
1. A deep learning-based domain expert classification and recommendation method, characterized in that it comprises:
Step 1: Scrape and clean text data samples from technical experts, create an expert database, generate relevant word vectors, and divide the data into training, validation, and test sets according to a set ratio;
Step 2: Use the word vectors from Step 1 to encode the sample labels and text, and use the topic keyword extraction module to build an expert vocabulary;
Step 3: Input the dataset processed in Step 1 into the topic extraction module, and use the expert vocabulary from Step 2 for auxiliary classification;
Step 4: Iteratively train and calculate the cross-entropy loss; update the parameters through backpropagation using the Adam optimizer; after each parameter update, calculate the value of the loss function on the validation set;
Step 5: Train the model and adjust the learning rate and the number of hidden layers to achieve the optimal F1 score;
Step A: Obtain the sample text as in Step 1;
Step B: Following Step 3, obtain the domain expert category of the sample text from Step A;
Step C: Based on the expert research categories in Step B, recommend experts in relevant research fields from the expert database constructed in Step 1;
Step 6: For expert requests, first use a classification model to classify the document to obtain the domain category, then calculate the similarity with domain experts in the expert database to obtain the final recommendation result;
wherein Step 1 includes:
Step 1.1: Create word vectors for all samples using the Stanford GloVe open-source word vector code, and add special symbols such as "#PAD#", "#UNK#", "#CLS#", "#SEP#", "#MASK#", and "#NUM#" as needed to assist in classification;
Step 1.2: The format of each data point is [label, content]; visualize the length of each sample to obtain the maximum sentence length that the model can process, and divide the training set, test set, and validation set in a 6:2:2 ratio;
Step 1.3: Create an expert database;
and Step 3 includes:
Step 3.1: Input the dataset processed in Step 1.2 into the feature extraction module to learn the semantic information of the sample text; first, the result after word encoding in Step 1.2 is positionally encoded using sin and cos functions according to the following formulas:
$PE(pos, 2i) = \sin\left(pos / 10000^{2i / embedding\_dim}\right)$
$PE(pos, 2i+1) = \cos\left(pos / 10000^{2i / embedding\_dim}\right)$
In the above formulas, pos refers to the position of a word in the sentence, 2i and 2i+1 refer to the even and odd positions, and embedding_dim is the dimension of the word vector, resulting in the encoded vector $X_{embed}$ of dimension [seq_length, embedding_dim]; then proceed to Step 3.2;
Step 3.2: The encoded vector obtained in Step 3.1 is input into the Transformer feature extractor model to learn semantic information, and a semantic vector $X_{Text}$ of dimension [seq_length, embedding_dim] is obtained;
Step 3.3: Convolve the data using convolution kernels with widths of 2, 3, and 4, and length equal to the sentence length. Perform max pooling and full connection on the results to obtain the dimension [batch_size, embedding_dim]. After changing the dimension through global average pooling, obtain the dimension [batch_size, seq_length, embedding_dim]. Finally, perform residual connection and layer normalization on the results.
2. The domain expert classification and recommendation method based on deep learning as described in claim 1, characterized in that Step 2 includes:
Step 2.1: Encode all samples using the word vectors from Step 1.1, then proceed to Step 2.2;
Step 2.2: Using the encoded samples from Step 2.1, calculate the self-attention coefficient for each word in the sentence, and then proceed to Step 2.3;
Step 2.3: Using the encoded samples from Step 2.1, perform word frequency statistics to filter out noisy words for each sample, calculate the semantic similarity of each word, and then proceed to Step 2.4;
Step 2.4: Based on the vocabulary self-attention coefficients obtained in Step 2.2 and the vocabulary semantic similarity in Step 2.3, extract the topic words to obtain the topic words for each sample, and construct the feature word entries of the labels according to the corresponding labels, and then proceed to Step 2.5;
Step 2.5: Based on the sample topic words and label feature word entries constructed in Step 2.4, train on the samples with known labels to construct a feature word vocabulary.
3. The domain expert classification and recommendation method based on deep learning as described in claim 2, characterized in that Step 2.2 includes:
Step 2.2.1: First, the word vectors of the text are linearly transformed to obtain three matrices Q, K, and V. The correlation matrix A is calculated by multiplying the Q and K matrices.
Step 2.2.2: Normalize the correlation matrix A with Softmax to obtain A', then multiply A' by V to obtain the final weight matrix Y.
4. The domain expert classification and recommendation method based on deep learning as described in claim 1, characterized in that, in Step 2.4:
Step 2.4.1: The calculation process in the BTM model is as follows.
Step 2.4.2: Model input: hyperparameters α and β, word pair set B, and number of topics k.
Step 2.4.3: Model output: text-topic distribution θ and topic-vocabulary distribution φ.
Step 2.4.4: The document generation process of the BTM model is as follows.
Step 2.4.5: For the entire corpus, generate a text-topic distribution θ from the Dirichlet distribution with hyperparameter α.
Step 2.4.6: For each topic, generate a topic-vocabulary distribution φ from the Dirichlet distribution with hyperparameter β.
Step 2.4.7: Traverse each word pair in the word pair set B and execute Steps 2.4.8 to 2.4.10.
Step 2.4.8: Assign a topic to the word pair.
Step 2.4.9: Draw the first word of the pair from that topic.
Step 2.4.10: Draw the second word of the pair from that topic.
Step 2.4.11: The model is solved with the Gibbs sampling algorithm, whose target distributions are the text-topic distribution θ and the topic-vocabulary distribution φ.
Step 2.4.12: Perform Gibbs sampling on the results.
Step 2.4.13: Based on the sampling results, derive the probability formula.
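The BTM procedure of Step 2.4 can be sketched as a small collapsed Gibbs sampler. This is a minimal illustration that assumes the update rule commonly given for BTM, in which the probability of assigning a word pair to a topic is proportional to (pairs in the topic + α) times the smoothed frequencies of both words in that topic. The helper names `extract_word_pairs` and `btm_gibbs`, the toy corpus, and the hyperparameter values are hypothetical and are not taken from the claims.

```python
from itertools import combinations

import numpy as np


def extract_word_pairs(documents):
    """Build the word pair set B: every unordered pair of word ids within a text."""
    return [pair for doc in documents for pair in combinations(doc, 2)]


def btm_gibbs(word_pairs, vocab_size, k, alpha, beta, iterations=50, seed=0):
    rng = np.random.default_rng(seed)
    topic_counts = np.zeros(k)               # word pairs assigned to each topic
    word_counts = np.zeros((k, vocab_size))  # word occurrences within each topic
    assignments = rng.integers(k, size=len(word_pairs))  # random initial topics
    for (w1, w2), z in zip(word_pairs, assignments):
        topic_counts[z] += 1
        word_counts[z, w1] += 1
        word_counts[z, w2] += 1

    for _ in range(iterations):
        for idx, (w1, w2) in enumerate(word_pairs):
            z = assignments[idx]
            # Remove the current pair from the counts before resampling its topic.
            topic_counts[z] -= 1
            word_counts[z, w1] -= 1
            word_counts[z, w2] -= 1
            totals = word_counts.sum(axis=1) + vocab_size * beta
            p = ((topic_counts + alpha)
                 * (word_counts[:, w1] + beta)
                 * (word_counts[:, w2] + beta) / totals ** 2)
            z = rng.choice(k, p=p / p.sum())
            assignments[idx] = z
            topic_counts[z] += 1
            word_counts[z, w1] += 1
            word_counts[z, w2] += 1

    # Point estimates of the two target distributions from the final counts.
    theta = (topic_counts + alpha) / (len(word_pairs) + k * alpha)
    phi = (word_counts + beta) / (word_counts.sum(axis=1, keepdims=True) + vocab_size * beta)
    return theta, phi


documents = [[0, 1, 2], [0, 1], [3, 4, 5], [3, 4]]  # toy texts as lists of word ids
word_pairs = extract_word_pairs(documents)
theta, phi = btm_gibbs(word_pairs, vocab_size=6, k=2, alpha=1.0, beta=0.01)
```

Here `theta` has one entry per topic and each row of `phi` is a distribution over the vocabulary, so the highest-probability words of a row can serve as candidate topic words.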
5. The domain expert classification and recommendation method based on deep learning as described in claim 4, characterized in that Step 3 further includes:
Step 3.1: Input the domain expert data processed in Step 1.2 into the feature extraction module. First encode the input text using the word vectors created in Step 1.1, then perform positional encoding to obtain the vectorized sample text, then proceed to Step 3.2.
Step 3.2: Use a convolutional neural network and a Transformer as the feature extraction modules to learn from the vectorized sample text obtained in Step 3.1 and obtain the semantic information of the sample text.
6. The domain expert classification and recommendation method based on deep learning as described in claim 1, characterized in that Step 4 includes:
Step 4.1: Calculate the similarity between the semantic vector of the sample text obtained in Step 3.2 and each label feature word entry in the feature word vocabulary obtained in Step 2.4.
Step 4.2: The one-hot encoded label results obtained from Step 4.1 are normalized using the sigmoid function, $\sigma(x) = 1 / (1 + e^{-x})$, thus obtaining the final label.
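Steps 4.1 and 4.2 can be illustrated with a short sketch. The similarity measure is not recoverable from the text, so cosine similarity is used here purely as an assumption; the sigmoid normalization follows Step 4.2. The vectors and the names `cosine_similarity`, `label_entries`, and `label_scores` are hypothetical.

```python
import numpy as np


def cosine_similarity(semantic_vector, entry_vectors):
    """Similarity of one semantic vector to each row of entry_vectors (assumed measure)."""
    dots = entry_vectors @ semantic_vector
    norms = np.linalg.norm(entry_vectors, axis=1) * np.linalg.norm(semantic_vector)
    return dots / np.maximum(norms, 1e-12)  # guard against zero vectors


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


semantic_vector = np.array([1.0, 0.0, 1.0])  # hypothetical semantic vector of a sample
label_entries = np.array([[1.0, 0.0, 1.0],   # one hypothetical vector per label entry
                          [0.0, 1.0, 0.0],
                          [1.0, 1.0, 1.0]])
similarities = cosine_similarity(semantic_vector, label_entries)
label_scores = sigmoid(similarities)  # normalized scores in (0, 1)
```

The sigmoid treats each label independently, so the scores need not sum to 1 across labels.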
7. The domain expert classification and recommendation method based on deep learning as described in claim 6, characterized in that, at the output layer, the semantic vector from Step 3.2 is input into a one-layer fully connected neural network that serves as the classifier, and the resulting predicted label is the classification output of the model. The cross-entropy loss between the predicted label from the classifier and the mixed learning label obtained in Step 4.2 is then calculated. The parameters are updated via backpropagation using the Adam optimizer, and the value of the loss function on the validation set is calculated after each parameter update.
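The classifier and loss of claim 7 can be sketched as a single fully connected layer followed by a cross-entropy computation. This is a minimal illustration: applying Softmax to the layer output is an assumption, the array sizes and one-hot targets are hypothetical stand-ins for the mixed learning labels, and the Adam update itself is not shown.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def classify(semantic_vectors, weight, bias):
    """One fully connected layer followed by softmax over the labels."""
    return softmax(semantic_vectors @ weight + bias)


def cross_entropy(predicted, target, eps=1e-12):
    """Mean cross-entropy; target may be one-hot or a soft label vector."""
    return float(-(target * np.log(predicted + eps)).sum(axis=-1).mean())


rng = np.random.default_rng(0)
semantic_vectors = rng.normal(size=(4, 8))    # hypothetical [batch_size, embedding_dim]
weight, bias = np.zeros((8, 3)), np.zeros(3)  # 3 hypothetical labels, untrained layer
target = np.eye(3)[[0, 2, 1, 0]]              # stand-in for the labels from Step 4.2
predicted = classify(semantic_vectors, weight, bias)
loss = cross_entropy(predicted, target)
```

With an untrained layer every label is equally likely, so the loss starts at the logarithm of the number of labels; training would reduce it from there, and the same `cross_entropy` function can be evaluated on the validation set after each update.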
Citation Information
Patent Citations
Expert recommendation method based on big data
CN110909236A
Online question and answer community expert recommendation method based on multi-head self-attention mechanism
CN115408603A