A multi-label text classification method, device, computer equipment and storage medium
By constructing a multi-label text classification model, utilizing a combination of input, hidden, and output layers, and iteratively training with the confidence threshold of the hidden layers, the problems of data consistency and performance loss in multi-level multi-label text classification are solved, achieving more accurate text classification and a simplified training process.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA PING AN PROPERTY INSURANCE CO LTD
- Filing Date
- 2023-06-21
- Publication Date
- 2026-06-16
AI Technical Summary
Existing multi-level, multi-label text classification methods cannot guarantee the overall consistency of the data, resulting in performance loss.
A multi-label text classification method is adopted. By obtaining the input text and hierarchical labels, the initial classification model is used for vector transformation and feature extraction. Iterative training is then performed by combining the confidence threshold of the hidden layer to construct a multi-label text classification model.
It achieves more accurate multi-label text classification, simplifies the model training process, and ensures model accuracy.
Description
Technical Field
[0001] This application belongs to the field of artificial intelligence technology, specifically relating to a multi-label text classification method, apparatus, computer equipment, and storage medium.
Background Technology
[0002] Multi-level, multi-label text classification is a text classification problem involving multi-level label systems. Different levels within its label tree structure are dependent on each other, and the number of labels at each level may vary. For example, in the car insurance intent recognition requirement of a car owner service business, car insurance labels are divided into five levels. The first level includes two labels: "Compulsory Traffic Accident Liability Insurance" and "Commercial Insurance." At the second level, "Commercial Insurance" includes nine sub-labels such as "Vehicle Damage Insurance" and "Theft Insurance." Other levels also contain multiple sub-labels. Using traditional single-label classification methods, or methods that transform this multi-level, multi-label classification problem into multiple single-level label classification problems, may fail to guarantee overall data consistency and could lead to performance losses.
Summary of the Invention
[0003] The purpose of this application is to propose a multi-label text classification method, apparatus, computer device, and storage medium to solve the technical problem that existing multi-level, multi-label classification methods cannot guarantee the overall consistency of data and may lead to performance loss.
[0004] To address the aforementioned technical problems, this application provides a multi-label text classification method, employing the following technical solution:
[0005] A multi-label text classification method includes:
[0006] Obtain the input text and the hierarchical labels corresponding to the input text;
[0007] The input text and the hierarchical labels are imported into a preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers and an output layer;
[0008] The input text is transformed into an input vector through the input layer;
[0009] By extracting local features of the input text from the input vector through several hidden layers, a local feature relationship tree of the input text is obtained;
[0010] The local feature relationship tree of the input text is transformed by the output layer to obtain the multi-label classification result of the input text;
[0011] Based on the multi-label classification results of the input text, the hierarchical labels, and the preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, thus obtaining a multi-label text classification model.
[0012] Receive a text classification instruction and obtain the text to be classified corresponding to the text classification instruction;
[0013] The text to be classified is imported into the multi-label text classification model, and the multi-label classification result of the text to be classified is output.
[0014] Furthermore, the hidden layers are arranged in a hierarchical structure. The step of extracting local features of the input text from the input vector through each of the hidden layers to obtain a local feature relation tree of the input text specifically includes:
[0015] The hierarchical hidden layers each extract local features of the input text from the input vector.
[0016] The extracted local features of the input text are organized using a tree structure to obtain a local feature relationship tree of the input text.
[0017] Furthermore, the hierarchical hidden layers each extract local features of the input text from the input vector, specifically including:
[0018] The root node features corresponding to the local feature relationship tree are extracted through the first hidden layer;
[0019] Starting from the root node features, the hidden layers with a hierarchical structure are used to sequentially extract all leaf node features of the local feature relation tree from the input vector.
[0020] Further, the initial classification model is iterated based on the multi-label classification results of the input text, the hierarchical labels, and the preset hidden layer confidence threshold until the model is fitted, thereby obtaining a multi-label text classification model, specifically including:
[0021] The error between the multi-label classification result of the input text and the hierarchical labels is calculated to obtain the classification error.
[0022] Compare the classification error with a preset error threshold;
[0023] When the classification error exceeds a preset error threshold, the initial classification model is iteratively updated.
[0024] During the iterative update of the initial classification model, the confidence of several hidden layers is continuously monitored, and hidden layers whose confidence reaches the hidden layer confidence threshold are iteratively frozen.
[0025] When the classification error is less than or equal to a preset error threshold, the multi-label text classification model is obtained.
[0026] Furthermore, during the iterative update of the initial classification model, the confidence levels of several hidden layers are continuously monitored, and hidden layers whose confidence levels reach the hidden layer confidence threshold are iteratively frozen, specifically including:
[0027] During the iterative update of the initial classification model, the output results of the iterative update of the initial classification model are obtained;
[0028] Calculate the confidence level of each hidden layer based on the output results;
[0029] Compare the confidence level of each hidden layer with the confidence threshold of the hidden layer;
[0030] When the confidence level of the hidden layer reaches the confidence threshold of the hidden layer, the hidden layer is iteratively frozen.
[0031] Furthermore, after calculating the error between the multi-label classification result based on the input text and the hierarchical labels to obtain the classification error, the method further includes:
[0032] The classification error is propagated through the input layer, several hidden layers, and output layer of the initial classification model using the backpropagation algorithm, resulting in the input layer error, several hidden layer errors, and output layer error.
[0033] The comparison of the classification error with a preset error threshold specifically includes:
[0034] The input layer error, several hidden layer errors, and the output layer error are compared with a preset error threshold, and the error comparison result is output.
[0035] Furthermore, the step of importing the text to be classified into the multi-label text classification model and outputting the multi-label classification result of the text to be classified specifically includes:
[0036] Import the text to be classified into the multi-label text classification model;
[0037] The text to be classified is vectorized by the input layer of the multi-label text classification model to obtain the text vector to be classified.
[0038] The local features of the text to be classified are extracted from the text vector by several hidden layers of the multi-label text classification model, and the local feature relationship tree of the text to be classified is obtained.
[0039] The local feature relationship tree of the text to be classified is transformed by the output layer of the multi-label text classification model to obtain the multi-label classification result of the text to be classified.
[0040] To address the aforementioned technical problems, this application also provides a multi-label text classification device, which employs the following technical solution:
[0041] A multi-label text classification device includes:
[0042] The input text acquisition module is used to acquire the input text and the hierarchical labels corresponding to the input text;
[0043] The input text import module is used to import the input text and the hierarchical labels into a preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers and an output layer;
[0044] The vector conversion module is used to perform vector conversion on the input text through the input layer to obtain an input vector;
[0045] The feature extraction module is used to extract local features of the input text from the input vector through several hidden layers, and obtain a local feature relation tree of the input text.
[0046] The classification output module is used to transform the local feature relationship tree of the input text through the output layer to obtain the multi-label classification result of the input text;
[0047] The model iteration module is used to iterate the initial classification model based on the multi-label classification results of the input text, the hierarchical labels, and the preset hidden layer confidence threshold until the model is fitted, thereby obtaining a multi-label text classification model.
[0048] The classification instruction module is used to receive text classification instructions and obtain the text to be classified corresponding to the text classification instructions.
[0049] The text classification module is used to import the text to be classified into the multi-label text classification model and output the multi-label classification result of the text to be classified.
[0050] To address the aforementioned technical problems, this application also provides a computer device that employs the following technical solution:
[0051] A computer device includes a memory and a processor, the memory storing computer-readable instructions, the processor executing the computer-readable instructions to implement the steps of the multi-label text classification method as described in any of the preceding claims.
[0052] To address the aforementioned technical problems, this application also provides a computer-readable storage medium, employing the technical solution described below:
[0053] A computer-readable storage medium storing computer-readable instructions, which, when executed by a processor, implement the steps of the multi-label text classification method as described in any one of the preceding descriptions.
[0054] Compared with the prior art, the embodiments of this application have the following main advantages:
[0055] This application discloses a multi-label text classification method, apparatus, computer device, and storage medium, belonging to the field of artificial intelligence technology. By acquiring input text and its corresponding hierarchical labels, the input text and hierarchical labels are imported into a preset initial classification model. The initial classification model includes an input layer, several hidden layers, and an output layer. The input layer transforms the input text into a vector, obtaining an input vector. The hidden layers extract local features of the input text from the input vector, obtaining a local feature relation tree. The output layer transforms the local feature relation tree to obtain the multi-label classification result of the input text. Based on the multi-label classification result, hierarchical labels, and a preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model. The application receives a text classification instruction, acquires the text to be classified corresponding to the instruction, imports the text to be classified into the multi-label text classification model, and outputs the multi-label classification result of the text to be classified. This application, by training a multi-label text classification model, can extract multi-level features of text, thereby performing more accurate multi-label text classification. Furthermore, when training a multi-label text classification model, the model is iterated by combining the confidence threshold of the hidden layer, which makes the model training process simpler while ensuring the model accuracy.
Attached Figure Description
[0056] To more clearly illustrate the solutions in this application, the accompanying drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0057] Figure 1 shows an exemplary system architecture diagram to which this application can be applied;
[0058] Figure 2 shows a flowchart of an embodiment of the multi-label text classification method according to this application;
[0059] Figure 3 shows a schematic structural diagram of one embodiment of the multi-label text classification apparatus according to this application;
[0060] Figure 4 shows a schematic structural diagram of one embodiment of a computer device according to this application.
Detailed Implementation
[0061] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains; the terminology used herein in the specification of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having," and any variations thereof, in the specification, claims, and foregoing drawings of this application, are intended to cover non-exclusive inclusion. The terms "first," "second," etc., in the specification, claims, or foregoing drawings of this application are used to distinguish different objects, not to describe a particular order.
[0062] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.
[0063] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
[0064] As shown in Figure 1, system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. Network 104 serves as the medium for providing communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables, etc.
[0065] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send messages, etc. Various communication client applications can be installed on terminal devices 101, 102, and 103, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social media platform software, etc.
[0066] Terminal devices 101, 102, and 103 can be various electronic devices that have displays and support web browsing, including but not limited to smartphones, tablets, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptops, and desktop computers, etc.
[0067] Server 105 can be a server that provides various services, such as a backend server that supports the pages displayed on terminal devices 101, 102, and 103. The server can be a standalone server or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
[0068] It should be noted that the multi-label text classification method provided in this application embodiment is generally executed by a server, and correspondingly, the multi-label text classification device is generally set in the server.
[0069] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.
[0070] With continued reference to Figure 2, a flowchart of an embodiment of the multi-label text classification method according to this application is shown. Embodiments of this application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) refers to the theories, methods, technologies, and application systems that utilize digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
[0071] Foundational technologies for artificial intelligence generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies mainly encompass computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning / deep learning.
[0072] Multi-level, multi-label text classification is a text classification problem involving multi-level label systems. Different levels of the label tree structure are dependent on each other, and the number of labels at each level may vary. For this type of classification problem, using traditional single-label classification methods or methods that transform it into multiple single-level label classification problems may not guarantee overall data consistency and may also lead to performance losses.
[0073] This application enables the extraction of multi-level features from text through training a multi-label text classification model, thereby achieving more accurate multi-label text classification. Furthermore, during the training of the multi-label text classification model, iterative model training is performed using a hidden layer confidence threshold, simplifying the training process while maintaining model accuracy.
[0074] The multi-label text classification method includes the following steps:
[0075] S201, obtain the input text and the hierarchical labels corresponding to the input text.
[0076] In this embodiment, the input text serves as training data for training a multi-label text classification model. The server acquires the input text and its corresponding hierarchical labels. This involves preprocessing the input text and partitioning the dataset, including word segmentation, text cleaning, and feature extraction. The processed input text is then divided into training, validation, and test sets according to a specific ratio for subsequent model training. The hierarchical labels corresponding to the input text are organized in a tree structure, where each node represents a label and has a unique identifier and label name. For each node, the identifiers of its parent and child nodes need to be recorded. The hierarchical labels corresponding to the input text are used to calculate the model's classification error for model iteration.
[0077] It should be noted that a hierarchical tree structure of labels needs to be predefined, and each label needs to be encoded, such as one-hot encoding. For each input text, its corresponding label encoding can be converted into a label tree, and each node in the label tree corresponds to a label.
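The label tree described above can be sketched as follows. This is only an illustrative sketch: the names LabelNode, build_label_tree, and encode_labels are hypothetical, the four labels are a small slice of the car insurance example, and marking each label together with all of its ancestors in one multi-hot vector is an assumption about how the per-label encodings are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LabelNode:
    """One label in the hierarchy, with links to its parent and children."""
    node_id: int
    name: str
    parent_id: Optional[int] = None
    child_ids: List[int] = field(default_factory=list)


def build_label_tree(rows):
    """Build {node_id: LabelNode} from (node_id, name, parent_id) rows."""
    tree: Dict[int, LabelNode] = {}
    for node_id, name, parent_id in rows:
        tree[node_id] = LabelNode(node_id, name, parent_id)
    # Record the child identifiers on every parent node.
    for node in tree.values():
        if node.parent_id is not None:
            tree[node.parent_id].child_ids.append(node.node_id)
    return tree


def encode_labels(tree, label_ids):
    """Multi-hot vector marking each given label and all of its ancestors."""
    index = {node_id: i for i, node_id in enumerate(sorted(tree))}
    vector = [0] * len(tree)
    for node_id in label_ids:
        current = node_id
        while current is not None:
            vector[index[current]] = 1
            current = tree[current].parent_id
    return vector


# Hypothetical two-level slice of the car insurance hierarchy.
ROWS = [
    (0, "Compulsory Traffic Accident Liability Insurance", None),
    (1, "Commercial Insurance", None),
    (2, "Vehicle Damage Insurance", 1),
    (3, "Theft Insurance", 1),
]
TREE = build_label_tree(ROWS)
```

For instance, encode_labels(TREE, [3]) marks both "Theft Insurance" and its parent "Commercial Insurance", so the encoded target stays consistent with the hierarchy.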
[0078] S202, import the input text and hierarchical labels into the preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers and an output layer.
[0079] In multi-level, multi-label text classification problems, each label is treated as a binary classification problem, and the set of all labels is treated as a multi-class classification problem. For multi-level labels, a tree structure can be used to organize the labels, where each node represents a label and the child nodes of each node are its child labels.
[0080] In this embodiment, the initial classification model can be a deep neural network (DNN) or a convolutional neural network (CNN). This application uses a DNN model as an example. For a DNN, fully connected layers can be used to construct the model. Specifically, each input text is converted into a vector, and then the vector is input into the DNN. The DNN includes several hidden layers, each corresponding to a node in the label tree, used to extract the local features of the text corresponding to the node. For each level of label, a separate node needs to be set up in the output layer of the DNN to output the probability of that label. A sigmoid or softmax activation function can be used to convert the node's output into a probability value, thereby achieving the classification of the corresponding label.
[0081] S203, the input text is transformed into a vector through the input layer to obtain the input vector.
[0082] In this embodiment, a bag-of-words model can be preset in the input layer. The input text is then transformed into a vector using the bag-of-words model to obtain the input vector.
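As a rough illustration of the bag-of-words transformation, the sketch below builds a vocabulary from a small corpus and counts token occurrences. The function names are hypothetical, and the regular-expression tokenizer is a simplification standing in for the word segmentation and text cleaning mentioned earlier.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(texts):
    """Map each distinct token in the corpus to a fixed vector position."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary token occurs in the text."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:          # tokens outside the vocabulary are ignored
            vector[vocabulary[tok]] = n
    return vector
```

The resulting vector has one position per vocabulary token, so every input text maps to a vector of the same length regardless of how long the text is.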
[0083] S204, extract local features of the input text from the input vector through several hidden layers to obtain a local feature relationship tree of the input text.
[0084] In this embodiment, the DNN can use fully connected layers to build the model and a hierarchical structure to arrange several hidden layers. The local features of the input text are extracted from the input vector through the several hidden layers to obtain the local feature relationship tree of the input text.
[0085] Extracting multi-level features from text by using multiple hidden layers arranged in a hierarchical structure can be achieved by stacking multiple hidden layers. In a DNN network, each hidden layer can take the output of the previous layer as input and extract higher-level features by adjusting the weights and biases. Therefore, as the number of hidden layers increases, the network can extract more complex and abstract features, thus better capturing the multi-level features of text data.
[0086] For example, one or more hidden layers can be added, each used to extract more abstract and higher-level local features. By stacking multiple hidden layers, multi-level features of text data can be extracted step by step, thus enabling more accurate classification.
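The stacking described above can be sketched with plain fully connected layers, where each hidden layer consumes the output of the previous one. This is a minimal sketch under stated assumptions: the function names are hypothetical, and ReLU is assumed as the hidden activation because the text does not name one.

```python
def dense(inputs, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Assumed hidden activation; the source does not specify one."""
    return [max(0.0, v) for v in values]


def forward_hidden(input_vector, layers):
    """Feed the vector through each hidden layer and keep every layer's output."""
    outputs = []
    current = input_vector
    for weights, biases in layers:
        current = relu(dense(current, weights, biases))
        outputs.append(current)
    return outputs
```

Keeping every layer's output, rather than only the last one, is what would allow the features from each level to be organized into a tree afterwards.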
[0087] Furthermore, local features of the input text are extracted from the input vector through several hidden layers to obtain a local feature relation tree of the input text, specifically including:
[0088] Several hidden layers with a hierarchical structure extract local features of the input text from the input vector;
[0089] A tree structure is used to organize the extracted local features of the input text, resulting in a local feature relationship tree of the input text.
[0090] In this embodiment, the server extracts local features of the input text sequentially from the input vector based on a tree structure through several hidden layers, and uses a tree structure to organize the extracted local features of the input text to obtain a local feature relationship tree of the input text. By setting several hidden layers with a hierarchical structure, the model can easily extract multi-dimensional features from the text.
[0091] Furthermore, several hidden layers employing a hierarchical structure extract local features of the input text from the input vector, specifically including:
[0092] The root node features corresponding to the local feature relation tree are extracted through the first hidden layer;
[0093] Starting from the root node features, several hidden layers with a hierarchical structure are used to sequentially extract the leaf node features of the local feature tree from the input vector.
[0094] In this embodiment, the root node features corresponding to the local feature relationship tree are first extracted through the first hidden layer. Then, starting from the root node features, the entire input vector is traversed, and several hidden layers with a hierarchical structure are used to extract all leaf node features of the local feature relationship tree from the input vector in turn.
[0095] By employing a hierarchical structure of a label tree to arrange several hidden layers, the model's generalization ability can be enhanced by passing the prediction results of each node to its child nodes. Specifically, during prediction, the root node can be predicted first, and then the prediction result can be passed to its child nodes, and so on, until the leaf nodes. This ensures that the prediction results of child nodes are influenced by their parent nodes, thereby improving the accuracy and stability of the entire model.
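One way to picture passing predictions from parent to child is shown below. It is only a sketch: the text does not say how a parent's result influences its children, so multiplying each node's own probability by its parent's final probability is an assumption, and propagate_top_down is a hypothetical name.

```python
def propagate_top_down(local_probs, parent_of):
    """Combine each node's local probability with its parent's final probability.

    local_probs: {node: probability produced for that node alone}
    parent_of:   {node: parent node, or None for a root}
    """
    final = {}

    def resolve(node):
        if node not in final:
            parent = parent_of[node]
            parent_prob = 1.0 if parent is None else resolve(parent)
            # Assumed rule: a child can never be more probable than its parent.
            final[node] = parent_prob * local_probs[node]
        return final[node]

    for node in local_probs:
        resolve(node)
    return final
```

Under this rule a child label such as "Theft Insurance" cannot receive a higher score than "Commercial Insurance", which keeps the predicted labels consistent across levels.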
[0096] S205, transform the local feature relationship tree of the input text through the output layer to obtain the multi-label classification result of the input text.
[0097] In this embodiment, the output layer has multiple output nodes, each corresponding to a tree node in the local feature relation tree. The server uses the output nodes of the output layer to transform the tree nodes in the local feature relation tree of the input text, thereby obtaining the multi-label classification result of the input text.
[0098] For each label at each level, a separate node needs to be set up in the output layer of the DNN to output the probability of that label. The sigmoid or softmax activation function can be used to convert the output of the node into a probability value, thereby achieving the classification of the corresponding label.
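The per-label output described above can be illustrated with a sigmoid applied to each output node independently. In this sketch the function names and the 0.5 decision threshold are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, label_names, threshold=0.5):
    """Return (name, probability) for each label whose probability >= threshold."""
    probs = [sigmoid(z) for z in logits]
    return [(name, p) for name, p in zip(label_names, probs) if p >= threshold]
```

Because each node is scored independently, any number of labels can be selected for one text, which is what distinguishes this from a single softmax over all labels.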
[0099] S206, based on the multi-label classification results of the input text, the hierarchical labels, and the preset hidden layer confidence threshold, iterate the initial classification model until the model is fitted, thereby obtaining a multi-label text classification model.
[0100] In this embodiment, the server calculates the classification error based on the multi-label classification results and hierarchical labels of the input text, and iterates the initial classification model by passing the classification error and comparing the error. At the same time, the hidden layer iteration is controlled by the hidden layer confidence threshold during the iteration process until the model is fitted, and a multi-label text classification model is obtained.
[0101] It should be noted that when training the model, cross-entropy can be used as the loss function, and the backpropagation algorithm can be used to update the model parameters. At the same time, regularization and other methods can be used to prevent the model from overfitting. For imbalanced label data, resampling, weighting and other methods can be used to balance the label distribution to improve the classification performance of the model.
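For reference, a cross-entropy loss over independent labels can be sketched as below. The averaged binary form is an assumption that matches sigmoid outputs; the name binary_cross_entropy and the clipping constant eps are illustrative.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)
```

An uninformative prediction of 0.5 for every label gives a loss of ln 2, and the loss approaches zero as the predictions approach the targets.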
[0102] Furthermore, based on the multi-label classification results of the input text, hierarchical labels, and preset hidden layer confidence thresholds, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model, specifically including:
[0103] The error between the multi-label classification result of the input text and the hierarchical labels is calculated to obtain the classification error.
[0104] Compare the classification error with the preset error threshold;
[0105] When the classification error exceeds the preset error threshold, the initial classification model is iteratively updated.
[0106] During the iterative update of the initial classification model, the confidence of several hidden layers is continuously monitored, and hidden layers whose confidence reaches the hidden layer confidence threshold are iteratively frozen.
[0107] When the classification error is less than or equal to the preset error threshold, a multi-label text classification model is obtained.
[0108] In this embodiment, the server calculates the error between the multi-label classification result and the hierarchical labels based on the input text using the loss function of the initial classification model, obtains the classification error, compares the classification error with a preset error threshold, and iteratively updates the initial classification model when the classification error is greater than the preset error threshold. During the iterative update of the initial classification model, the confidence of several hidden layers is continuously monitored, and hidden layers whose confidence reaches the hidden layer confidence threshold are iteratively frozen. When the classification error is less than or equal to the preset error threshold, the multi-label text classification model is obtained.
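The compare-and-update loop can be sketched as follows. To stay short, the sketch uses a single-output logistic model as a stand-in for the DNN and omits hidden layer freezing; the names, the learning rate lr, and the max_iters safeguard are assumptions that do not come from the source.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def classification_error(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between predictions and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def train_until_fitted(samples, targets, error_threshold, lr=0.5, max_iters=10000):
    """Update a one-output logistic model until the error meets the threshold."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    error = float("inf")
    n = len(samples)
    for iteration in range(max_iters):
        probs = [sigmoid(sum(w * x for w, x in zip(weights, s)) + bias)
                 for s in samples]
        error = classification_error(probs, targets)
        if error <= error_threshold:
            return weights, bias, error, iteration   # treated as fitted
        # Gradient of the mean cross-entropy with respect to each logit is (p - t) / n.
        for j in range(len(weights)):
            grad = sum((p - t) * s[j] for p, t, s in zip(probs, targets, samples)) / n
            weights[j] -= lr * grad
        bias -= lr * sum(p - t for p, t in zip(probs, targets)) / n
    return weights, bias, error, max_iters
```

The loop mirrors the two branches in the text: while the error exceeds the threshold the parameters are updated, and once it is less than or equal to the threshold the current model is returned.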
[0109] Furthermore, during the iterative update of the initial classification model, the confidence levels of several hidden layers are continuously monitored. Hidden layers whose confidence levels reach the hidden layer confidence threshold are iteratively frozen, specifically including:
[0110] During the iterative update of the initial classification model, obtain the output results of the iterative update of the initial classification model;
[0111] Calculate the confidence level of each hidden layer based on the output results;
[0112] Compare the confidence level of each hidden layer with the hidden layer confidence threshold;
[0113] When the confidence level of the hidden layer reaches the hidden layer confidence threshold, the hidden layer is iteratively frozen.
[0114] In this embodiment, during the iterative update of the initial classification model, the server obtains the output results of the iterative update of the initial classification model, calculates the confidence of each hidden layer based on the output results, compares the confidence of each hidden layer with the hidden layer confidence threshold, and iteratively freezes the hidden layer when the confidence of the hidden layer reaches the hidden layer confidence threshold.
[0115] In one specific embodiment of this application, by setting a hidden layer freezing method based on confidence, when the confidence of the hidden layer is high, the network before that layer is frozen, and only the subsequent finer-grained labels are optimized, so that the optimization direction is focused on the fine-grained labels that are difficult to judge, making the model training process simpler while ensuring model accuracy.
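The freezing rule can be sketched as follows. The source does not define how the confidence of a hidden layer is computed, so this sketch assumes it is the mean of the highest predicted probability over a batch; the function names and the example numbers are hypothetical. Following the description, the layer that reaches the threshold and all layers before it are frozen, and only the later (finer-grained) layers remain trainable.

```python
def level_confidence(batch_probs):
    """Assumed definition: mean of the top predicted probability per sample."""
    return sum(max(probs) for probs in batch_probs) / len(batch_probs)

def frozen_prefix(confidences, threshold):
    """Number of leading hidden layers to freeze.

    The deepest layer whose confidence reaches the threshold is frozen
    together with every layer before it.
    """
    frozen = 0
    for index, confidence in enumerate(confidences):
        if confidence >= threshold:
            frozen = index + 1
    return frozen

def trainable_mask(confidences, threshold):
    """True for layers that are still optimized, False for frozen layers."""
    frozen = frozen_prefix(confidences, threshold)
    return [index >= frozen for index in range(len(confidences))]

# Hypothetical per-level probability outputs for a batch of two samples.
level_outputs = [
    [[0.97, 0.03], [0.95, 0.05]],              # coarse level, confident
    [[0.90, 0.05, 0.05], [0.92, 0.04, 0.04]],  # second level, confident
    [[0.50, 0.30, 0.20], [0.40, 0.35, 0.25]],  # fine level, uncertain
]
confidences = [level_confidence(batch) for batch in level_outputs]
mask = trainable_mask(confidences, threshold=0.9)
```

In a deep learning framework, the mask would typically be applied by disabling gradient updates for the parameters of the frozen layers.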
[0116] Furthermore, after calculating the error between the multi-label classification result of the input text and the hierarchical labels to obtain the classification error, the following steps are also included:
[0117] The classification error is propagated through the input layer, several hidden layers, and output layer of the initial classification model using the backpropagation algorithm, resulting in the input layer error, several hidden layer errors, and output layer error.
[0118] The comparison between classification error and a preset error threshold includes:
[0119] The input layer error, several hidden layer errors, and output layer error are compared with preset error thresholds, and the error comparison results are output.
[0120] In this embodiment, the server propagates the classification error through the input layer, several hidden layers, and the output layer of the initial classification model using a preset backpropagation algorithm, obtaining the input layer error, several hidden layer errors, and the output layer error. The input layer error, several hidden layer errors, and the output layer error are then compared with preset error thresholds, and the error comparison results are output.
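A minimal sketch of this per-layer comparison is given below for a network with one hidden layer. It assumes that the "error" of a layer means the mean absolute value of the error signal backpropagated to that layer, that the hidden layer uses tanh, and that the output uses a sigmoid with cross-entropy loss; none of these choices is stated in the source.

```python
import math

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose_matvec(matrix, vector):
    """Multiply the transpose of matrix (rows x cols) by vector (rows)."""
    return [sum(matrix[r][c] * vector[r] for r in range(len(matrix)))
            for c in range(len(matrix[0]))]

def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)

def layer_errors(x, target, w_hidden, w_out):
    """Backpropagate the output error and summarize it per layer."""
    hidden = [math.tanh(z) for z in matvec(w_hidden, x)]
    output = [1.0 / (1.0 + math.exp(-z)) for z in matvec(w_out, hidden)]
    # Error signal at the output (sigmoid with cross-entropy loss).
    d_out = [o - t for o, t in zip(output, target)]
    # Error signal at the hidden layer (chain rule through tanh).
    d_hidden = [g * (1.0 - h * h)
                for g, h in zip(transpose_matvec(w_out, d_out), hidden)]
    # Error signal that reaches the input layer.
    d_input = transpose_matvec(w_hidden, d_hidden)
    return {"input": mean_abs(d_input),
            "hidden": mean_abs(d_hidden),
            "output": mean_abs(d_out)}

def compare_with_threshold(errors, threshold):
    """True where a layer's error still exceeds the preset threshold."""
    return {name: error > threshold for name, error in errors.items()}

# Hypothetical weights and a single training example.
errors = layer_errors([1.0, 0.5], [1.0, 0.0],
                      [[0.2, -0.1], [0.4, 0.3]],
                      [[0.5, -0.2], [-0.3, 0.6]])
comparison = compare_with_threshold(errors, threshold=0.05)
```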
[0121] In one specific embodiment of this application, assuming the goal is to identify car insurance intent, a neural network model can be constructed using the following four-level label hierarchy:
[0122] The first level includes compulsory traffic accident liability insurance and commercial insurance;
[0123] The second level includes the second-level labels under compulsory traffic accident liability insurance and commercial insurance, such as vehicle damage insurance and theft insurance;
[0124] The third level includes the third-level labels under vehicle damage insurance and theft insurance, such as natural damage and collision damage;
[0125] The fourth level includes the fourth-level labels under natural damage and collision damage, such as landslides and cliff collapses.
[0126] In this example, a four-layer hierarchical structure is used to represent the multi-level, multi-label classification problem of car insurance intent. The number of labels in each layer is different, and there are dependencies between the labels. A corresponding classifier can be introduced into the neural network model to classify the labels in each layer. At the same time, a corresponding loss function is introduced to optimize the model. During the training process, the backpropagation algorithm commonly used in deep learning can be used to train the model. The final model has multi-level classification labels, and then the car insurance intent is identified based on the multi-level classification labels.
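The dependency between levels in this example can be made concrete with a small sketch that stores the label tree as a child-to-parent mapping and checks whether a predicted label sequence is consistent with it. The exact parent assigned to each label below is an assumption for illustration; the description only lists example labels per level.

```python
# Child -> parent mapping for the car insurance example (parents assumed).
PARENT = {
    "compulsory traffic accident liability insurance": None,
    "commercial insurance": None,
    "vehicle damage insurance": "commercial insurance",
    "theft insurance": "commercial insurance",
    "natural damage": "vehicle damage insurance",
    "collision damage": "vehicle damage insurance",
    "landslide": "natural damage",
    "cliff collapse": "natural damage",
}

def label_path(label):
    """Return the labels from the first level down to the given label."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path[::-1]

def is_consistent(labels_by_level):
    """Check that each label is a child of the label one level above it."""
    expected_parent = None
    for label in labels_by_level:
        if PARENT.get(label, "<unknown>") != expected_parent:
            return False
        expected_parent = label
    return True

path = label_path("landslide")
```

A check of this kind is one way to detect predictions that break the overall consistency of the hierarchy, for example a second-level label predicted under the wrong first-level label.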
[0127] S207, Receive text classification instruction and obtain the text to be classified corresponding to the text classification instruction.
[0128] S208: Import the text to be classified into the multi-label text classification model and output the multi-label classification result of the text to be classified.
[0129] In this embodiment, after receiving the text classification instruction, the server obtains the text to be classified corresponding to the text classification instruction, imports the text to be classified into the multi-label text classification model, and outputs the multi-label classification result of the text to be classified.
[0130] In this embodiment, the electronic device on which the multi-label text classification method runs (e.g., the server shown in Figure 1) can receive text classification instructions via a wired or wireless connection. It should be noted that the aforementioned wireless connection methods may include, but are not limited to, 3G / 4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra-wideband) connections, and other currently known or future wireless connection methods.
[0131] Furthermore, the text to be classified is imported into a multi-label text classification model, and the multi-label classification results of the text to be classified are output, specifically including:
[0132] Import the text to be classified into a multi-label text classification model;
[0133] The input layer of the multi-label text classification model performs vector transformation on the text to be classified, resulting in the text vector to be classified.
[0134] By extracting local features of the text to be classified from the text vector through several hidden layers of the multi-label text classification model, a local feature relation tree of the text to be classified is obtained.
[0135] The output layer of the multi-label text classification model is used to transform the local feature relationship tree of the text to be classified, so as to obtain the multi-label classification result of the text to be classified.
[0136] In this embodiment, the multi-label text classification model also includes an input layer, several hidden layers, and an output layer. The server imports the text to be classified into the multi-label text classification model. The input layer of the multi-label text classification model performs vector transformation on the text to be classified to obtain the text vector. The several hidden layers of the multi-label text classification model extract local features of the text to be classified from the text vector to obtain the local feature relation tree of the text to be classified. The output layer of the multi-label text classification model transforms the local feature relation tree of the text to be classified to obtain the multi-label classification result of the text to be classified.
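The three-stage inference flow (input layer, hidden layers, output layer) can be sketched with a toy model. The vocabulary, the bag-of-words vectorization, the per-level weight tables, and the shortened label names are all hypothetical stand-ins; in particular, the per-level score dictionaries only stand in for the local feature relation tree, whose concrete form is not specified in the source.

```python
# Hypothetical vocabulary used by the toy input layer.
VOCAB = ["collision", "stolen", "landslide", "compulsory"]

# Hypothetical per-level weights: label -> weight vector over VOCAB.
LEVELS = [
    {"compulsory insurance": [0, 0, 0, 1],
     "commercial insurance": [1, 1, 1, 0]},
    {"vehicle damage insurance": [1, 0, 1, 0],
     "theft insurance": [0, 1, 0, 0]},
]

def input_layer(text):
    """Vector transformation: count vocabulary words in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]

def hidden_layers(vector):
    """One score table per hierarchy level, from coarse to fine."""
    return [{label: sum(w * v for w, v in zip(weights, vector))
             for label, weights in level.items()}
            for level in LEVELS]

def output_layer(level_scores):
    """Pick the highest-scoring label at each level."""
    return [max(scores, key=scores.get) for scores in level_scores]

def classify(text):
    return output_layer(hidden_layers(input_layer(text)))

result = classify("my car was stolen last night")
```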
[0137] In the above embodiments, this application discloses a multi-label text classification method, belonging to the field of artificial intelligence technology. By acquiring the input text and its corresponding hierarchical labels, the input text and hierarchical labels are imported into a preset initial classification model. The initial classification model includes an input layer, several hidden layers, and an output layer. The input layer performs vector transformation on the input text to obtain an input vector. Several hidden layers extract local features of the input text from the input vector to obtain a local feature relation tree. The output layer transforms the local feature relation tree to obtain the multi-label classification result of the input text. Based on the multi-label classification result of the input text, the hierarchical labels, and a preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model. A text classification instruction is received, the text to be classified corresponding to the text classification instruction is acquired, the text to be classified is imported into the multi-label text classification model, and the multi-label classification result of the text to be classified is output. This application, by training a multi-label text classification model, can extract multi-level features of text, thereby performing multi-label text classification more accurately. Furthermore, when training a multi-label text classification model, the model is iterated by combining the confidence threshold of the hidden layer, which makes the model training process simpler while ensuring the model accuracy.
[0138] It should be emphasized that, in order to further ensure the privacy and security of the text to be classified, the text to be classified can also be stored in a node of a blockchain.
[0139] The blockchain referred to in this application is a novel application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. Essentially, a blockchain is a decentralized database, a chain of data blocks linked together using cryptographic methods. Each data block contains information about a batch of network transactions, used to verify the validity of the information (anti-counterfeiting) and generate the next block. A blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.
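The idea of data blocks linked by cryptographic methods can be illustrated with a toy hash chain. This is only a sketch of the linking and tamper-detection principle using SHA-256; it omits consensus, peer-to-peer transmission, and encryption, and the block fields are assumptions rather than the format of any particular blockchain platform.

```python
import hashlib
import json

def block_digest(index, payload, prev_hash):
    """SHA-256 digest over the block contents in a stable JSON encoding."""
    body = {"index": index, "payload": payload, "prev_hash": prev_hash}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def make_block(index, payload, prev_hash):
    return {"index": index, "payload": payload, "prev_hash": prev_hash,
            "hash": block_digest(index, payload, prev_hash)}

def verify_chain(chain):
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        expected = block_digest(block["index"], block["payload"],
                                block["prev_hash"])
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Store two hypothetical texts to be classified as block payloads.
chain = [make_block(0, "text to be classified A", "0" * 64)]
chain.append(make_block(1, "text to be classified B", chain[0]["hash"]))
```

Changing the payload of any stored block changes its digest, so `verify_chain` reports the modification.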
[0140] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing related hardware with computer-readable instructions. These computer-readable instructions can be stored in a computer-readable storage medium, and when executed, they can include the processes of the embodiments of the methods described above. The aforementioned storage medium can be a non-volatile storage medium such as a magnetic disk, optical disk, or read-only memory (ROM), or random access memory (RAM).
[0141] It should be understood that although the steps in the flowcharts of the accompanying figures are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the accompanying figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
[0142] With further reference to Figure 3, as an implementation of the method shown in Figure 2 above, this application provides an embodiment of a multi-label text classification device. This device embodiment corresponds to the method embodiment shown in Figure 2, and the device can be specifically applied to various electronic devices.
[0143] As shown in Figure 3, the multi-label text classification device 300 described in this embodiment includes:
[0144] The input text acquisition module 301 is used to acquire the input text and the hierarchical labels corresponding to the input text;
[0145] The input text import module 302 is used to import the input text and hierarchical labels into a preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers and an output layer;
[0146] The vector conversion module 303 is used to convert the input text into a vector through the input layer to obtain the input vector.
[0147] The feature extraction module 304 is used to extract local features of the input text from the input vector through several hidden layers to obtain a local feature relation tree of the input text.
[0148] The classification output module 305 is used to transform the local feature relationship tree of the input text through the output layer to obtain the multi-label classification result of the input text;
[0149] The model iteration module 306 is used to iterate the initial classification model based on the multi-label classification results of the input text, the hierarchical labels, and the preset hidden layer confidence threshold until the model is fitted, thus obtaining a multi-label text classification model.
[0150] The classification instruction module 307 is used to receive text classification instructions and obtain the text to be classified corresponding to the text classification instructions.
[0151] The text classification module 308 is used to import the text to be classified into the multi-label text classification model and output the multi-label classification result of the text to be classified.
[0152] Furthermore, the hidden layers are arranged in a hierarchical structure, and the feature extraction module 304 specifically includes:
[0153] The local feature extraction unit is used to extract local features of the input text from the input vector using several hidden layers with a hierarchical structure.
[0154] The tree structure organization unit is used to organize the extracted local features of the input text using a tree structure to obtain the local feature relation tree of the input text.
[0155] Furthermore, the local feature extraction unit specifically includes:
[0156] The root node feature extraction subunit is used to extract the root node features corresponding to the local feature relation tree through the first hidden layer.
[0157] The leaf node feature extraction subunit is used to extract all leaf node features of the local feature relationship tree from the input vector sequentially, starting from the root node features, using several hidden layers with a hierarchical structure.
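The root-to-leaf extraction order can be sketched as follows. The first hidden layer produces the root node features, and each later layer derives child features from the input vector and the parent's features. The node class, the layer call signatures, and the toy arithmetic layers are hypothetical; the source does not describe how a hidden layer computes its features.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureNode:
    level: int
    features: list
    children: list = field(default_factory=list)

def build_feature_tree(input_vector, root_layer, deeper_layers):
    """Build the tree level by level, starting from the root features.

    root_layer(input_vector) returns the root feature vector.
    Each layer in deeper_layers takes (input_vector, parent_features)
    and returns a list of child feature vectors.
    """
    root = FeatureNode(0, root_layer(input_vector))
    frontier = [root]
    for level, layer in enumerate(deeper_layers, start=1):
        next_frontier = []
        for parent in frontier:
            for child_features in layer(input_vector, parent.features):
                child = FeatureNode(level, child_features)
                parent.children.append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return root

def leaves(node):
    """Collect the leaf nodes in left-to-right order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

# Toy layers (assumptions): the root sums the input, deeper layers branch.
def toy_root(vector):
    return [sum(vector)]

def toy_split(vector, parent_features):
    return [[parent_features[0] + vector[0]],
            [parent_features[0] + vector[1]]]

tree = build_feature_tree([1.0, 2.0], toy_root, [toy_split, toy_split])
```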
[0158] Furthermore, the model iteration module 306 specifically includes:
[0159] The error calculation unit is used to calculate the error between the multi-label classification result of the input text and the hierarchical labels to obtain the classification error;
[0160] The error comparison unit is used to compare the classification error with the preset error threshold;
[0161] The iterative update unit is used to iteratively update the initial classification model when the classification error is greater than a preset error threshold.
[0162] The confidence processing unit is used to continuously monitor the confidence of several hidden layers during the iterative update of the initial classification model, and to iteratively freeze the hidden layers whose confidence reaches the hidden layer confidence threshold.
[0163] The model output unit is used to obtain a multi-label text classification model when the classification error is less than or equal to a preset error threshold.
[0164] Furthermore, the confidence processing unit specifically includes:
[0165] The output result acquisition sub-unit is used to obtain the output result of the iterative update of the initial classification model during the iterative update process of the initial classification model;
[0166] The confidence calculation subunit is used to calculate the confidence level of each hidden layer based on the output results;
[0167] The confidence comparison subunit is used to compare the confidence level of each hidden layer with the confidence threshold of the hidden layer.
[0168] The hidden layer freezing subunit is used to iteratively freeze the hidden layer when the confidence level of the hidden layer reaches the hidden layer confidence threshold.
[0169] Furthermore, the model iteration module 306 also includes:
[0170] The backpropagation unit is used to propagate the classification error through the input layer, several hidden layers, and output layer of the initial classification model using the backpropagation algorithm, thereby obtaining the input layer error, several hidden layer errors, and output layer error.
[0171] The error comparison unit specifically includes:
[0172] The error comparison subunit is used to compare the input layer error, the several hidden layer errors and the output layer error with the preset error threshold, and to output the error comparison result.
[0173] Furthermore, the text classification module 308 specifically includes:
[0174] The text to be classified import unit is used to import the text to be classified into the multi-label text classification model;
[0175] The vector transformation unit is used to transform the text to be classified into vectors through the input layer of the multi-label text classification model, so as to obtain the text vector to be classified.
[0176] The local feature extraction unit is used to extract local features of the text to be classified from the text vector through several hidden layers of the multi-label text classification model, and obtain the local feature relation tree of the text to be classified.
[0177] The multi-label classification unit is used to transform the local feature relationship tree of the text to be classified through the output layer of the multi-label text classification model to obtain the multi-label classification result of the text to be classified.
[0178] In the above embodiments, this application discloses a multi-label text classification device, belonging to the field of artificial intelligence technology. By acquiring input text and its corresponding hierarchical labels, the input text and hierarchical labels are imported into a preset initial classification model. The initial classification model includes an input layer, several hidden layers, and an output layer. The input layer performs vector transformation on the input text to obtain an input vector. Several hidden layers extract local features of the input text from the input vector to obtain a local feature relation tree. The output layer transforms the local feature relation tree to obtain a multi-label classification result for the input text. Based on the multi-label classification result, hierarchical labels, and a preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model. A text classification instruction is received, the text to be classified corresponding to the instruction is acquired, the text to be classified is imported into the multi-label text classification model, and the multi-label classification result for the text to be classified is output. This application, by training a multi-label text classification model, can extract multi-level features of text, thereby performing multi-label text classification more accurately. Furthermore, when training a multi-label text classification model, the model is iterated by combining the confidence threshold of the hidden layer, which makes the model training process simpler while ensuring the model accuracy.
[0179] To address the aforementioned technical problems, embodiments of this application also provide a computer device. Please refer to Figure 4, which is a basic structural block diagram of the computer device in this embodiment.
[0180] The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are interconnected via a system bus. It should be noted that only the computer device 4 with components 41-43 is shown in the figure; however, it should be understood that it is not required to implement all the shown components, and more or fewer components can be implemented alternatively. Those skilled in the art will understand that the computer device described here is a device capable of automatically performing numerical calculations and / or information processing according to pre-set or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.
[0181] The computer device can be a desktop computer, laptop, handheld computer, or cloud server, etc. The computer device can interact with the user via a keyboard, mouse, remote control, touchpad, or voice control.
[0182] The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as the hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, etc., equipped on the computer device 4. Of course, the memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device. In this embodiment, the memory 41 is typically used to store the operating system and various application software installed on the computer device 4, such as the computer-readable instructions of the multi-label text classification method. In addition, the memory 41 can also be used to temporarily store various types of data that have been output or will be output.
[0183] In some embodiments, the processor 42 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is used to execute computer-readable instructions stored in the memory 41 or to process data, for example, to execute computer-readable instructions for the multi-label text classification method.
[0184] The network interface 43 may include a wireless network interface or a wired network interface, which is typically used to establish communication connections between the computer device 4 and other electronic devices.
[0185] In the above embodiments, this application discloses a computer device belonging to the field of artificial intelligence technology. By acquiring input text and its corresponding hierarchical labels, the input text and hierarchical labels are imported into a preset initial classification model. The initial classification model includes an input layer, several hidden layers, and an output layer. The input layer performs vector transformation on the input text to obtain an input vector. Several hidden layers extract local features of the input text from the input vector to obtain a local feature relation tree. The output layer transforms the local feature relation tree to obtain a multi-label classification result of the input text. Based on the multi-label classification result of the input text, the hierarchical labels, and a preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model. A text classification instruction is received, the text to be classified corresponding to the text classification instruction is acquired, the text to be classified is imported into the multi-label text classification model, and the multi-label classification result of the text to be classified is output. This application, by training a multi-label text classification model, can extract multi-level features of text, thereby performing multi-label text classification more accurately. Furthermore, when training a multi-label text classification model, the model is iterated by combining the confidence threshold of the hidden layer, which makes the model training process simpler while ensuring the model accuracy.
[0186] This application also provides another embodiment, namely, providing a computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor to cause the at least one processor to perform the steps of the multi-label text classification method described above.
[0187] In the above embodiments, this application discloses a storage medium belonging to the field of artificial intelligence technology. By acquiring input text and its corresponding hierarchical labels, the input text and hierarchical labels are imported into a preset initial classification model. The initial classification model includes an input layer, several hidden layers, and an output layer. The input layer performs vector transformation on the input text to obtain an input vector. Several hidden layers extract local features of the input text from the input vector to obtain a local feature relation tree. The output layer transforms the local feature relation tree to obtain a multi-label classification result of the input text. Based on the multi-label classification result of the input text, the hierarchical labels, and a preset hidden layer confidence threshold, the initial classification model is iterated until the model is fitted, resulting in a multi-label text classification model. A text classification instruction is received, the text to be classified corresponding to the text classification instruction is acquired, the text to be classified is imported into the multi-label text classification model, and the multi-label classification result of the text to be classified is output. This application, by training a multi-label text classification model, can extract multi-level features of text, thereby performing multi-label text classification more accurately. Furthermore, when training a multi-label text classification model, the model is iterated by combining the confidence threshold of the hidden layer, which makes the model training process simpler while ensuring the model accuracy.
[0188] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk), and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.
[0189] This application can be used in a wide variety of general-purpose or special-purpose computer system environments or configurations. Examples include: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. This application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform specific tasks or implement specific abstract data types. This application can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.
[0190] Obviously, the embodiments described above are only some embodiments of this application, not all embodiments. The accompanying drawings show preferred embodiments of this application, but do not limit the patent scope of this application. This application can be implemented in many different forms; the embodiments are provided so that the disclosure of this application can be understood more thoroughly and comprehensively. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or make equivalent substitutions for some of the technical features. Any equivalent structures made using the content of this application's specification and drawings, directly or indirectly applied to other related technical fields, are similarly within the scope of patent protection of this application.
Claims
1. A multi-label text classification method, characterized in that it comprises: obtaining input text and the hierarchical labels corresponding to the input text; importing the input text and the hierarchical labels into a preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers and an output layer; performing vector transformation on the input text through the input layer to obtain an input vector; extracting local features of the input text from the input vector through the several hidden layers to obtain a local feature relation tree of the input text; transforming the local feature relation tree of the input text through the output layer to obtain the multi-label classification result of the input text; iterating the initial classification model based on the multi-label classification result of the input text, the hierarchical labels, and a preset hidden layer confidence threshold until the model is fitted, to obtain a multi-label text classification model; receiving a text classification instruction and obtaining the text to be classified corresponding to the text classification instruction; and importing the text to be classified into the multi-label text classification model and outputting the multi-label classification result of the text to be classified; wherein iterating the initial classification model based on the multi-label classification result of the input text, the hierarchical labels, and the preset hidden layer confidence threshold until the model is fitted, to obtain the multi-label text classification model, specifically includes: calculating the error between the multi-label classification result of the input text and the hierarchical labels to obtain a classification error; comparing the classification error with a preset error threshold; when the classification error is greater than the preset error threshold, iteratively updating the initial classification model; during the iterative update of the initial classification model, continuously monitoring the confidence of the several hidden layers and iteratively freezing the hidden layers whose confidence reaches the hidden layer confidence threshold; and when the classification error is less than or equal to the preset error threshold, obtaining the multi-label text classification model; wherein continuously monitoring the confidence of the several hidden layers during the iterative update of the initial classification model and iteratively freezing the hidden layers whose confidence reaches the hidden layer confidence threshold specifically includes: during the iterative update of the initial classification model, obtaining the output results of the iterative update of the initial classification model; calculating the confidence of each hidden layer based on the output results; comparing the confidence of each hidden layer with the hidden layer confidence threshold; and when the confidence of a hidden layer reaches the hidden layer confidence threshold, iteratively freezing that hidden layer; wherein the iterative freezing targets all network layers before the hidden layer whose confidence reaches the hidden layer confidence threshold, and after the hidden layer is iteratively frozen, the fine-grained labels subsequent to that hidden layer are optimized.
2. The multi-label text classification method as described in claim 1, characterized in that, The hidden layers are arranged in a hierarchical structure. The step of extracting local features of the input text from the input vector through each of the hidden layers to obtain a local feature relation tree of the input text specifically includes: The hierarchical hidden layers each extract local features of the input text from the input vector. The extracted local features of the input text are organized using a tree structure to obtain a local feature relationship tree of the input text.
3. The multi-label text classification method as described in claim 2, characterized in that the step of extracting, by each of the hierarchically arranged hidden layers, local features of the input text from the input vector specifically includes:
extracting the root node features of the local feature relationship tree through the first hidden layer; and
starting from the root node features, using the hierarchically arranged hidden layers to sequentially extract all leaf node features of the local feature relationship tree from the input vector.
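The tree organization described in claims 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions: `FeatureNode`, `build_feature_tree`, and `leaves` are hypothetical names, the per-level dictionaries stand in for the outputs of the hierarchically arranged hidden layers, the feature values are placeholders, and naming the nodes after the car insurance labels from the background section is an assumption made only for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FeatureNode:
    name: str
    feature: List[float]
    children: List["FeatureNode"] = field(default_factory=list)


def build_feature_tree(
    levels: List[Dict[str, Tuple[str, List[float]]]]
) -> FeatureNode:
    """Build a tree from per-level outputs.

    levels[0] holds the single root entry (output of the first hidden layer).
    levels[k] maps node name -> (parent name, feature) for deeper layers.
    """
    root_name, (_, root_feature) = next(iter(levels[0].items()))
    root = FeatureNode(root_name, root_feature)
    index = {root_name: root}
    for level in levels[1:]:  # deeper hidden layers, processed in order
        for name, (parent, feature) in level.items():
            node = FeatureNode(name, feature)
            index[parent].children.append(node)
            index[name] = node
    return root


def leaves(node: FeatureNode) -> List[str]:
    """Collect leaf node names in depth-first order."""
    if not node.children:
        return [node.name]
    result: List[str] = []
    for child in node.children:
        result.extend(leaves(child))
    return result


# Placeholder features; real values would come from the hidden layers.
levels = [
    {"input text": ("", [0.1, 0.2])},
    {
        "Compulsory Traffic Accident Liability Insurance": ("input text", [0.3]),
        "Commercial Insurance": ("input text", [0.7]),
    },
    {
        "Vehicle Damage Insurance": ("Commercial Insurance", [0.4]),
        "Theft Insurance": ("Commercial Insurance", [0.6]),
    },
]
tree = build_feature_tree(levels)
print(leaves(tree))
```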
4. The multi-label text classification method as described in claim 1, characterized in that, after the step of calculating the error between the multi-label classification result of the input text and the hierarchical labels to obtain the classification error, the method further includes:
propagating the classification error through the input layer, the several hidden layers, and the output layer of the initial classification model using the backpropagation algorithm, to obtain an input layer error, several hidden layer errors, and an output layer error;
and the step of comparing the classification error with a preset error threshold specifically includes:
comparing the input layer error, the several hidden layer errors, and the output layer error with the preset error threshold, and outputting an error comparison result.
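The per-layer error comparison of claim 4 can be illustrated with a deliberately small sketch. It assumes a chain of scalar linear layers (one weight per layer) and a squared-error loss, neither of which is specified by the claim; `layer_errors` and `compare_with_threshold` are hypothetical names. With these assumptions, backpropagation reduces to multiplying the output error by each downstream weight in turn.

```python
from typing import Dict, List


def layer_errors(weights: List[float], x: float, target: float) -> List[float]:
    """Backpropagate the output error through a chain of scalar linear layers.

    Returns one error signal per layer, ordered from the first layer
    (input layer) to the last layer (output layer).
    """
    # Forward pass: each layer multiplies its input by a single weight.
    activations = [x]
    for w in weights:
        activations.append(w * activations[-1])

    # Backward pass for the loss 0.5 * (output - target) ** 2.
    delta = activations[-1] - target
    deltas = [delta]
    for w in reversed(weights[1:]):
        delta = delta * w  # chain rule through the downstream layer
        deltas.append(delta)
    deltas.reverse()
    return deltas


def compare_with_threshold(
    names: List[str], deltas: List[float], threshold: float
) -> Dict[str, bool]:
    """Report, per layer, whether the error magnitude exceeds the threshold."""
    return {name: abs(d) > threshold for name, d in zip(names, deltas)}


names = ["input layer", "hidden layer", "output layer"]
deltas = layer_errors([2.0, 0.5, 3.0], x=1.0, target=2.0)
print(deltas)  # [1.5, 3.0, 1.0]
print(compare_with_threshold(names, deltas, threshold=1.2))
```

Here the input layer and hidden layer errors exceed the threshold while the output layer error does not, which is the kind of per-layer comparison result the claim describes.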
5. The multi-label text classification method as described in any one of claims 1 to 4, characterized in that the step of importing the text to be classified into the multi-label text classification model and outputting the multi-label classification result of the text to be classified specifically includes:
importing the text to be classified into the multi-label text classification model;
performing vector conversion on the text to be classified through the input layer of the multi-label text classification model to obtain a vector of the text to be classified;
extracting local features of the text to be classified from that vector through the several hidden layers of the multi-label text classification model to obtain a local feature relationship tree of the text to be classified; and
transforming the local feature relationship tree of the text to be classified through the output layer of the multi-label text classification model to obtain the multi-label classification result of the text to be classified.
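The inference flow of claim 5 (input layer, hidden layers, output layer) can be sketched as a simple composition of three stages. Everything below is a toy stand-in chosen for illustration: the vocabulary, the keyword-count input layer, the per-level scoring functions, and the "score greater than zero" decision rule are all assumptions, and the per-level score dictionaries only approximate the local feature relationship tree of the claim.

```python
from typing import Callable, Dict, List

Vector = List[float]
LevelScores = Dict[str, float]

VOCAB = ["compulsory", "commercial", "theft", "damage"]  # hypothetical vocabulary


def input_layer(text: str) -> Vector:
    """Toy vectorization: count occurrences of each vocabulary word."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def level1_layer(v: Vector) -> LevelScores:
    """Toy first-level features (coarse labels)."""
    return {
        "Compulsory Traffic Accident Liability Insurance": v[0],
        "Commercial Insurance": max(v[1], v[2], v[3]),
    }


def level2_layer(v: Vector) -> LevelScores:
    """Toy second-level features (fine-grained labels)."""
    return {"Theft Insurance": v[2], "Vehicle Damage Insurance": v[3]}


def output_layer(level_scores: List[LevelScores]) -> List[List[str]]:
    """Keep, per level, every label with a positive score."""
    return [
        [label for label, score in scores.items() if score > 0]
        for scores in level_scores
    ]


def classify(
    text: str,
    hidden_layers: List[Callable[[Vector], LevelScores]],
) -> List[List[str]]:
    vector = input_layer(text)                              # vector conversion
    level_scores = [layer(vector) for layer in hidden_layers]  # feature extraction
    return output_layer(level_scores)                       # multi-label result


print(classify("Commercial theft cover", [level1_layer, level2_layer]))
```

The result contains one list of labels per hierarchy level, which keeps the coarse and fine-grained predictions of a single text together.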
6. A multi-label text classification device, characterized in that the multi-label text classification device implements the steps of the multi-label text classification method as described in any one of claims 1 to 5, wherein the multi-label text classification device comprises:
an input text acquisition module, used to obtain an input text and the hierarchical labels corresponding to the input text;
an input text import module, used to import the input text and the hierarchical labels into a preset initial classification model, wherein the initial classification model includes an input layer, several hidden layers, and an output layer;
a vector conversion module, used to perform vector conversion on the input text through the input layer to obtain an input vector;
a feature extraction module, used to extract local features of the input text from the input vector through the several hidden layers to obtain a local feature relationship tree of the input text;
a classification output module, used to transform the local feature relationship tree of the input text through the output layer to obtain a multi-label classification result of the input text;
a model iteration module, used to iterate the initial classification model based on the multi-label classification result of the input text, the hierarchical labels, and a preset hidden layer confidence threshold until the model is fitted, to obtain a multi-label text classification model;
a classification instruction module, used to receive a text classification instruction and obtain the text to be classified corresponding to the text classification instruction; and
a text classification module, used to import the text to be classified into the multi-label text classification model and output the multi-label classification result of the text to be classified.
7. A computer device, characterized in that it includes a memory and a processor, wherein the memory stores computer-readable instructions, and the processor, when executing the computer-readable instructions, implements the steps of the multi-label text classification method as described in any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the multi-label text classification method as described in any one of claims 1 to 5.