Extreme multi-label learning method based on spatial-temporal network clustering reduction ensemble

A learning method and multi-label technology, applied in the field of multi-label text mining, solving problems such as ignored label sparsity, label-level model training, and poor learning scalability, so as to reduce time and space consumption, improve representation ability, and improve the generalization effect

Pending Publication Date: 2022-06-28
YUNNAN UNIV

AI-Extracted Technical Summary

Problems solved by technology

[0004] (1) Traditional multi-label models cannot adapt to extreme multi-label scenarios
[0005] Traditional multi-label learning only focuses on a relatively small number of labels, for example fewer than 100, but with ever-growing Internet data the number of labels now exceeds tens of thousands or even millions. Because the number of labels is so large, the time complexity of traditional multi-label learning methods is too high for them to adapt to extreme multi-label learning scenarios
[0006] (2) Low generalization performance of existing extreme multi-label learning models
[0007] Existing extreme multi-label learning methods are mainly based on tree ensembles, embeddings,...

Abstract

The invention discloses an extreme multi-label learning method based on spatial-temporal network clustering reduction ensemble in the technical field of multi-label text mining. The method comprises the following steps: spatial-temporal network attention ensemble representation; adaptive label relation enhancement and clustering reduction learning; and weighted reduced label set imbalance learning. The method integrates interactive attention among the words, phrases, and labels in multi-label text and explores the dependency relationships among them, effectively improving extreme multi-label text representation capability. An adaptive label relation enhancement and clustering reduction learning mechanism is provided: adaptive label relation enhancement effectively mines the dependency relationships between labels and improves the generalization of the model, while clustering reduction learning allows existing models to be trained on label sets of different magnitudes. A weighted reduced label set imbalance learning mechanism is further provided, which solves problems such as poor model generalization and scalability caused by label sparsity and imbalance.

Examples

  • Experimental program(1)

Example Embodiment

[0058] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
[0059] The present invention provides a technical solution: an extreme multi-label learning method based on space-time network clustering reduction integration, comprising the following steps:
[0060] S1: Spatial-temporal network attention ensemble representation;
[0061] S11: Acquisition of original extreme multi-label data; learning is based on extreme multi-label data acquired from different practical application scenarios;
[0062] S12: Phrase-level representation (CNN) and word-level representation (RNN); given a document representation $x_i \in \mathbb{R}^{d \times n}$, for phrase-level representations a CNN convolution kernel $W_i \in \mathbb{R}^{\omega d}$ and a bias term $b_i$ can be used to learn phrase-level representations of $\omega$-grams; letting the vector $c_i$ denote the words $(e_{i-\omega+1}, \dots, e_i)$, the feature $p_i$ is represented as $p_i = \delta(\mathrm{Conv1D}(W_i, c_i) + b_i)$; for word-level representation, an RNN can be used to learn bidirectional word-level information, expressed as $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$, where $\overrightarrow{h}_i$ and $\overleftarrow{h}_i$ are the forward and backward hidden states at position $i$;
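As an illustration of S12 (not the patent's exact implementation), a minimal PyTorch sketch of the two encoders follows; the embedding size d, hidden size r, ω = 3, and the choice of ReLU for δ are all assumptions:

```python
import torch
import torch.nn as nn

class PhraseWordEncoder(nn.Module):
    """Sketch of S12: phrase-level CNN plus word-level bidirectional RNN."""
    def __init__(self, d=300, r=128, omega=3):
        super().__init__()
        # Conv1D over word positions learns omega-gram phrase features.
        self.conv = nn.Conv1d(d, 2 * r, kernel_size=omega, padding=omega // 2)
        # A BiLSTM learns forward and backward word-level context.
        self.rnn = nn.LSTM(d, r, batch_first=True, bidirectional=True)

    def forward(self, x):
        # x: (batch, n, d) word embeddings of a document.
        p = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, 2r, n) phrase features
        h, _ = self.rnn(x)                            # (batch, n, 2r) word features
        return p, h.transpose(1, 2)                   # both (batch, 2r, n)
```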
[0063] S13: Spatial semantic information representation; through a hybrid attention mechanism comprising a convolutional multi-head self-attention module and a convolutional interactive attention module, the spatial semantic information representation is finally obtained, which considers not only the phrase-to-phrase relationships but also the relationships between phrases and labels;
[0064] S131: The specific technical steps of the multi-head self-attention module:
[0065] S1311: Single-head attention computation; using dot-product attention, let $Q \in \mathbb{R}^{2r \times l}$, $K \in \mathbb{R}^{2r \times l}$, and $V \in \mathbb{R}^{2r \times l}$ denote the query, key, and value embedding matrices, respectively; the attention output matrix is represented as $\mathrm{Attention}(Q, K, V) = V\,\mathrm{softmax}(K^{T} Q / \sqrt{2r})$;
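A minimal sketch of this single-head computation, assuming the standard 1/√(2r) scaling (the scaling factor is not stated in the extracted text):

```python
import torch
import torch.nn.functional as F

def single_head_attention(Q, K, V):
    """Dot-product attention over matrices of shape (2r, l), per S1311."""
    scores = K.transpose(-2, -1) @ Q / (K.shape[-2] ** 0.5)  # (l, l) matching scores
    weights = F.softmax(scores, dim=-2)  # normalize over key positions
    return V @ weights                   # (2r, l) attention output
```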
[0066] S1312: Multi-head attention calculation; based on the single-head attention computed in S1311, the multi-head attention can be calculated as:
[0067] $P = \mathrm{MultiHeadAttention}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \dots, \mathrm{head}_h)$, where $\mathrm{head}_i = \mathrm{Attention}(Q_i, K_i, V_i)$;
[0068] S1313: Convolutional multi-head self-attention calculation; in multi-label text learning, since each document can be assigned to multiple labels, a multi-label attention mechanism is used to focus on different label relationships; based on the matrix $P \in \mathbb{R}^{2r \times l}$ calculated in S1312, the final multi-label attention outputs $S_j$ $(j = 1, 2, \dots, k)$ are obtained, one per label, and together form the matrix $S \in \mathbb{R}^{2r \times k}$;
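The extracted text loses the exact label-attention formula; one common realization (assumed here: one trainable query vector per label) would be:

```python
import torch
import torch.nn as nn

class LabelAttention(nn.Module):
    """Hypothetical sketch of S1313: label-wise attention over P (2r, l)."""
    def __init__(self, k, two_r):
        super().__init__()
        self.w = nn.Parameter(torch.randn(k, two_r))  # one attention query per label

    def forward(self, P):
        # P: (batch, 2r, l) multi-head attention output.
        scores = torch.softmax(self.w @ P, dim=-1)  # (batch, k, l) per-label weights
        return P @ scores.transpose(1, 2)           # (batch, 2r, k) = S
```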
[0069] S132: Specific technical steps of the convolutional interactive attention module:
[0070] S1321: Label graph embedding; Node2Vec [6] is used to generate label co-occurrence graph vectors to explore the structural information of labels: each label is regarded as a node, and if any two labels appear together in a document, there is an edge connecting them; based on random walks, high-order label dependencies are captured through graph embedding, and each label is expressed as a $2r$-dimensional vector, namely $L_j \in \mathbb{R}^{2r}$ $(j = 1, 2, \dots, k)$ denotes the $j$-th label, so the whole label embedding is denoted as $L \in \mathbb{R}^{k \times 2r}$;
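A sketch of S1321 using networkx and the commonly used node2vec package (the walk length, number of walks, and window size below are illustrative assumptions, not values from the patent):

```python
import networkx as nx
from node2vec import Node2Vec  # pip install node2vec

def label_graph_embeddings(label_sets, dim):
    """Embed the label co-occurrence graph; label_sets is an iterable of
    per-document label lists."""
    g = nx.Graph()
    for labels in label_sets:
        g.add_nodes_from(labels)
        # Any two labels that co-occur in a document are connected by an edge.
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                g.add_edge(a, b)
    n2v = Node2Vec(g, dimensions=dim, walk_length=40, num_walks=10, quiet=True)
    model = n2v.fit(window=5, min_count=1)  # random walks + skip-gram
    return {node: model.wv[str(node)] for node in g.nodes}
```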
[0071] S1322: Convolutional interactive attention calculation; based on the label embedding $L \in \mathbb{R}^{k \times 2r}$ obtained in S1321 and the matrices $K \in \mathbb{R}^{2r \times l}$ and $V \in \mathbb{R}^{2r \times l}$, the convolutional interactive attention can be expressed as $I_1 = V \times \mathrm{softmax}(LK)^{T}$; based on the matrix $S \in \mathbb{R}^{2r \times k}$ obtained in S131 and $I_1 \in \mathbb{R}^{2r \times k}$ obtained here, the output of S13 can be expressed as $C = \mathrm{Concat}(S, I_1)$.
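The interactive attention itself is a small matrix computation; a sketch under the dimensions given above:

```python
import torch

def interactive_attention(L, K, V):
    """Sketch of S1322: I1 = V @ softmax(L @ K)^T.
    L: (k, 2r) label embeddings; K, V: (2r, l) key/value matrices."""
    scores = torch.softmax(L @ K, dim=-1)  # (k, l), normalized over word positions
    return V @ scores.transpose(-2, -1)    # (2r, k) label-aware features

# Output of S13 (the concatenation axis is not specified in the extracted text):
# C = torch.cat([S, I1], dim=0)
```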
[0072] S14: Temporal semantic information representation; a hybrid attention mechanism is used to capture the temporal semantic information representation, comprising a recurrent self-attention module and a recurrent interactive attention module;
[0073] S141: Recurrent self-attention module; to better model contextual word-level dependencies, a weighted self-attention mechanism is used to focus on different aspects of the document, which not only learns long-term temporal dependencies but also captures the informative parts of the document; the recurrent self-attention $U \in \mathbb{R}^{2r \times k}$ can be described as $T = \tanh(W_1 H)$, $A = \mathrm{softmax}(W_2 T)^{T}$, $U = HA$;
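A direct sketch of these three equations, where H is the (2r, l) matrix of BiRNN hidden states and the inner size d_a is an assumed hyper-parameter:

```python
import torch
import torch.nn as nn

class RecurrentSelfAttention(nn.Module):
    """Sketch of S141: T = tanh(W1 H), A = softmax(W2 T)^T, U = H A."""
    def __init__(self, two_r, d_a, k):
        super().__init__()
        self.W1 = nn.Parameter(torch.randn(d_a, two_r))
        self.W2 = nn.Parameter(torch.randn(k, d_a))

    def forward(self, H):
        # H: (2r, l) hidden states of one document.
        T = torch.tanh(self.W1 @ H)               # (d_a, l)
        A = torch.softmax(self.W2 @ T, dim=-1).T  # (l, k) attention weights
        return H @ A                              # (2r, k) = U
```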
[0074] S142: Recurrent interactive attention module; similar to the convolutional interactive attention, interactive attention is introduced to capture fine-grained word-level signals by computing matching scores between words and labels, yielding $I_2 \in \mathbb{R}^{2r \times k}$; based on the matrix $U \in \mathbb{R}^{2r \times k}$ obtained in S141 and $I_2$, the output of S14 can be expressed as $R = \mathrm{Concat}(U, I_2)$.
[0075] S15: Spatial-temporal network attention ensemble representation; based on $C \in \mathbb{R}^{2r \times k}$ obtained in S13 and $R \in \mathbb{R}^{2r \times k}$ obtained in S14, an adaptive weighted ensemble strategy is proposed: first $C$ and $R$ are $\ell_2$-normalized, and then transformed through an MLP layer and a fully connected layer into the weights $\alpha \in \mathbb{R}^{k \times 1}$ and $\beta \in \mathbb{R}^{k \times 1}$; after normalizing the weights, the final spatial-temporal network attention ensemble representation is obtained as $M = \alpha \times C + \beta \times R$.
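A sketch of the adaptive weighted ensemble; the shape of the MLP is not given in the extracted text, so a one-hidden-layer MLP producing one weight per label is assumed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveEnsemble(nn.Module):
    """Sketch of S15: M = alpha * C + beta * R with learned, normalized weights."""
    def __init__(self, two_r, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(two_r, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, C, R):
        # C, R: (k, 2r) label-wise spatial/temporal representations.
        a = self.mlp(F.normalize(C, dim=-1))  # (k, 1) raw weight for C
        b = self.mlp(F.normalize(R, dim=-1))  # (k, 1) raw weight for R
        alpha, beta = torch.softmax(torch.cat([a, b], -1), -1).chunk(2, -1)
        return alpha * C + beta * R           # (k, 2r) = M
```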
[0076] S2: Adaptive Label Relation Enhancement and Clustering Reduction Learning;
[0077] S21: Label tree clustering; the feature representation of each label is obtained by summing the inner products of the sparse text features of the documents containing that label, followed by normalization; balanced k-means ($k = 2$) is then applied recursively, iterating until the following condition is met: given a maximum number of labels per cluster, the labels are divided into $S$ clusters such that each label cluster contains no more than the maximum number of labels and no fewer than half of it; once the $S$ clusters are obtained, based on the representation $M$ from module S1, $M$ can be mapped through a fully connected layer to the $S$-dimensional score vector $P$;
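A sketch of the recursive balanced split; scikit-learn's plain k-means is not balanced, so the rank-based reassignment below is a common balancing heuristic (an assumption, not the patent's stated procedure):

```python
import numpy as np
from sklearn.cluster import KMeans

def balanced_2means_tree(label_feats, max_leaf):
    """Recursively split labels into clusters of at most max_leaf members."""
    def split(indices):
        if len(indices) <= max_leaf:
            return [indices]
        km = KMeans(n_clusters=2, n_init=10).fit(label_feats[indices])
        # Force an (almost) even split: rank labels by how much closer they
        # are to centroid 0 than centroid 1, then cut the ranking in half.
        d = km.transform(label_feats[indices])  # distances to both centroids
        order = np.argsort(d[:, 0] - d[:, 1])
        half = len(indices) // 2
        return split(indices[order[:half]]) + split(indices[order[half:]])
    return split(np.arange(len(label_feats)))
```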
[0078] S22: Label relationship enhancement; a bottleneck layer is added on top of the original prediction $P$, conveying label relationships to adaptively realize label enhancement;
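The formal description of the bottleneck is lost in extraction; one plausible reading, sketched here as an assumption, is a low-rank residual layer over the cluster scores:

```python
import torch
import torch.nn as nn

class LabelRelationBottleneck(nn.Module):
    """Hypothetical sketch of S22: residual bottleneck over cluster scores P."""
    def __init__(self, S, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(S, bottleneck)  # compress cluster scores
        self.up = nn.Linear(bottleneck, S)    # expand back, mixing label relations

    def forward(self, P):
        return P + self.up(torch.relu(self.down(P)))  # enhanced scores
```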
[0079] S23: Clustering reduction learning; the $S$ cluster indicators obtained from the S21 label tree clustering are denoted as $y^S \in \{0,1\}^S$; based on the representation obtained by label relation enhancement, an adaptive clustering reduction learning mechanism is proposed, whose objective is the cluster-level loss $L_S$ between the enhanced cluster scores and $y^S$.
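The exact form of L_S is likewise lost in extraction; a plain binary cross-entropy over the S cluster indicators is one minimal reading, sketched here as an assumption:

```python
import torch.nn.functional as F

def cluster_loss(P_hat, y_S):
    """Hypothetical sketch of S23's cluster-level objective L_S."""
    # P_hat: (S,) enhanced cluster logits; y_S: (S,) 0/1 cluster indicators.
    return F.binary_cross_entropy_with_logits(P_hat, y_S.float())
```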
[0081] S3: Weighted Reduced Label Set Imbalanced Learning.
[0082] S31: Reduced label set embedding; based on the $S$ clusters obtained in S21, a reduced label set $U$ can be obtained, and based on the representation $M$ obtained in S1, the reduced label set embedding vector $Q$ can be obtained, described as $Q = \sigma(W_Q M + b_Q)$;
[0083] S32: Weighted imbalance learning; based on the $S$ clusters obtained in S21, the reduced label set $U$ can be obtained, and the corresponding true labels $y^U \in \{0,1\}^{|U|}$ can be found; to address the multi-label imbalance problem, a weighted imbalance loss is used for learning, described as:
[0084] $L_Q = -\sum_{u \in U} \left[ y^U_u (1 - Q_u)^{\gamma_+} \log Q_u + (1 - y^U_u)\, Q_u^{\gamma_-} \log(1 - Q_u) \right]$
[0085] where $y^U$ is the true label vector of the sample over the reduced label set $U$, $Q$ is the embedding obtained in S31, and $\gamma_+$ and $\gamma_-$ weight the contributions of positive and negative samples; usually $\gamma_- > \gamma_+$, with $\gamma_+$ set to 0 and $\gamma_-$ set to 1;
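A sketch of this asymmetric weighting with the stated defaults γ+ = 0 and γ− = 1 (the formula above is itself a reconstruction, so this code inherits that assumption):

```python
import torch

def weighted_imbalance_loss(Q, y_U, gamma_pos=0.0, gamma_neg=1.0):
    """Sketch of S32: down-weight easy negatives, keep positives at full weight.
    Q: predicted probabilities over the reduced label set; y_U: 0/1 targets."""
    eps = 1e-8  # numerical safety for the logs
    pos = y_U * (1 - Q).pow(gamma_pos) * torch.log(Q + eps)
    neg = (1 - y_U) * Q.pow(gamma_neg) * torch.log(1 - Q + eps)
    return -(pos + neg).sum()
```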
[0086] Therefore, based on the above three modules, the extreme multi-label learning method based on spatial-temporal network clustering reduction ensemble is obtained, and the overall model training objective can be described as:
[0087] $L = L_S + L_Q$
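Training then simply minimizes the sum of the two losses; a brief sketch reusing the hypothetical helpers above:

```python
# P_hat, y_S, Q, y_U come from the modules sketched in S22/S23 and S31/S32.
loss = cluster_loss(P_hat, y_S) + weighted_imbalance_loss(Q, y_U)
loss.backward()  # standard end-to-end gradient step
```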
[0088] Table 1: Related terms
[0090] In the description of this specification, description with reference to the terms "one embodiment," "example," "specific example," etc. means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
[0091] The above-disclosed preferred embodiments of the present invention are provided only to help illustrate the present invention. The preferred embodiments do not exhaust all the details, nor do they limit the invention to only the described embodiments. Obviously, many modifications and variations are possible in light of the content of this specification. The present specification selects and specifically describes these embodiments in order to better explain the principles and practical applications of the present invention, so that those skilled in the art can well understand and utilize the present invention. The present invention is to be limited only by the claims and their full scope and equivalents.
