Multi-label learning design method based on hashing method

A multi-label learning design method, applied in computing, special-purpose data processing applications, instruments, etc. It addresses the problem of high-dimensional, sparse label spaces, with the effects of reducing time and space complexity, improving accuracy, and increasing scalability.

Active Publication Date: 2015-06-17
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0011] The purpose of the present invention is to solve the problems encountered when multi-label learning methods are applied to large-scale data, and to propose a multi-label learning design method based on hashing. By combining a hash algorithm with a multi-label learning algorithm based on Bayesian statistics, the correlation bet...

Embodiment Construction

[0031] The invention will be described in further detail below in conjunction with the accompanying drawings.

[0032] As shown in Figure 2, the present invention provides a multi-label learning design method based on hashing. The specific implementation steps are as follows:

[0033] (1) Label correlation extension

[0034] In multi-label learning algorithms based on Bayesian statistical theory, an important step is computing the posterior probability. Given a multi-label training set $D = \{(x_i, Y_i) \mid 1 \le i \le m\}$ and a test sample $x$, where $Y_i$ is the label set vector of sample $x_i$, the posterior probability for the $j$-th category $y_j$ ($1 \le j \le q$) is computed from Bayes' theorem as follows:

[0035] $f(x, y_j) = P(\dots$
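The formula in [0035] is truncated in this copy, so the following does not reproduce it. As a minimal sketch of the kind of computation paragraph [0034] describes, the snippet below applies the generic two-hypothesis form of Bayes' theorem (label present versus label absent), in the style of kNN-based multi-label learners. The function names, the smoothing parameter `s`, and the use of Laplace smoothing are assumptions made for illustration only.

```python
def smoothed_prior(label_column, s=1.0):
    """Laplace-smoothed estimate of P(label present).

    label_column: list of 0/1 flags, one per training sample, saying
    whether the sample carries the label. `s` is a hypothetical
    smoothing constant, not a value taken from the patent.
    """
    m = len(label_column)
    return (s + sum(label_column)) / (2.0 * s + m)


def posterior_label_present(prior, lik_present, lik_absent):
    """P(label present | evidence) by Bayes' theorem.

    prior       : P(label present)
    lik_present : P(evidence | label present)
    lik_absent  : P(evidence | label absent)
    The evidence could be, for example, how many neighbours of the
    test sample carry the label (an assumption for this sketch).
    """
    numerator = prior * lik_present
    denominator = numerator + (1.0 - prior) * lik_absent
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


if __name__ == "__main__":
    prior = smoothed_prior([1, 0, 0, 1])
    print(posterior_label_present(prior, lik_present=0.8, lik_absent=0.2))
```

A label would then be predicted for the test sample when this posterior exceeds the posterior of the "absent" hypothesis, that is, when it is greater than 0.5.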

Abstract

The invention discloses a multi-label learning design method based on hashing. By combining a hashing algorithm with a multi-label learning algorithm based on Bayesian statistics, the method exploits the correlation between labels to improve the predictive performance of the multi-label learning model. Using the characteristics of neighbors, both the labels and the neighbors of the labels are introduced into the computation of the posterior probability, so that the correlation between labels is fully considered and the accuracy of the algorithm is improved. The MinHash algorithm addresses the problem that the label space in multi-label learning on large-scale data is high-dimensional and sparse. Locality-sensitive hashing (LSH) is used to find neighbors quickly and efficiently, which makes learning on large-scale data feasible and improves the scalability of the multi-label learning algorithm.
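The two hashing steps named in the abstract can be illustrated with a short, generic sketch: MinHash compresses a sparse label set into a short signature whose agreement rate estimates Jaccard similarity, and LSH banding groups items whose signatures agree on at least one band. This is the textbook construction, not necessarily the variant used in the patent; the hash family, the prime modulus, the function names, and the band and row counts are all assumptions.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(label_ids, params):
    """Compress a non-empty set of integer label indices into a signature."""
    return tuple(min((a * x + b) % PRIME for x in label_ids)
                 for a, b in params)


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    return sum(u == v for u, v in zip(sig_a, sig_b)) / len(sig_a)


def build_lsh_index(signatures, bands, rows):
    """Bucket each item by every band (slice of `rows` values) of its signature."""
    index = defaultdict(set)
    for item_id, sig in signatures.items():
        for b in range(bands):
            index[(b, sig[b * rows:(b + 1) * rows])].add(item_id)
    return index


def candidate_neighbours(item_id, signatures, index, bands, rows):
    """Items sharing at least one band bucket with `item_id`."""
    sig = signatures[item_id]
    found = set()
    for b in range(bands):
        found |= index.get((b, sig[b * rows:(b + 1) * rows]), set())
    found.discard(item_id)
    return found


if __name__ == "__main__":
    # Hypothetical label sets: each value holds the indices of active labels.
    label_sets = {"a": {1, 5, 9}, "b": {1, 5, 9}, "c": {100, 200}}
    params = make_hash_params(num_hashes=16, seed=1)
    sigs = {k: minhash_signature(v, params) for k, v in label_sets.items()}
    index = build_lsh_index(sigs, bands=4, rows=4)
    print(estimate_jaccard(sigs["a"], sigs["b"]))
    print(candidate_neighbours("a", sigs, index, bands=4, rows=4))
```

The signature length is fixed by `num_hashes` regardless of how many labels exist, which is what makes the representation attractive for a high-dimensional, sparse label space. Only items that fall into a shared bucket need to be compared exactly, so neighbor search avoids a full pairwise scan.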

Description

Technical field

[0001] The invention relates to a multi-label learning design method based on hashing, and belongs to the technical field of machine learning.

Background

[0002] In the traditional supervised learning framework, each sample generally has a single, unambiguous semantic label; that is, each sample belongs to exactly one category. Many algorithms have been proposed under this framework and have achieved good results. However, in many real-world applications the semantic labels of the objects under study are not unique, and a sample can often be assigned a set of several labels. For example, in text classification, a news report may cover multiple aspects of an event and should therefore be assigned to multiple topics (e.g., politics and economics); in bioinformatics, a gene or protein often has multiple functions; in image annotation, an image can often be annotated with multiple subject words. ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 吴建盛, 孙永, 胡海峰
Owner: NANJING UNIV OF POSTS & TELECOMM