
Automatic feature construction method based on attention mechanism and reinforcement learning

The method applies an attention mechanism and reinforcement learning to feature construction in machine learning. It addresses the poor scalability, tedium, and weak generalization of earlier feature engineering approaches, and makes feature selection and iterative optimization more efficient.

Pending Publication Date: 2021-03-09
SOUTHEAST UNIV
0 cites, 4 cited by

AI Technical Summary

Problems solved by technology

Manual feature engineering is often tedious and hard to generalize, which has motivated research on automatic feature generation.
Most early work on automatic feature generation built features by combining strictly predefined operations, which made those methods scale poorly. Later, deep-learning-based methods emerged that implicitly learn high-order feature interactions, but those models lack interpretability.



Examples


Embodiment 1

[0085] A recommendation system usually needs to be explainable: when it presents a recommendation to the user, it should also give the reason for it. Recommendation datasets are therefore well suited to verifying the interpretability of a model. The present invention uses the MovieLens-1M dataset, which is commonly used in experiments in the recommendation field, as an example. The MovieLens-1M dataset contains 1 million ratings of 4000 movies from 6000 users. Detailed statistics are shown in Table 1.

[0086] Table 1 Statistics of MovieLens-1M dataset

[0087]

[0088] The present invention automatically learns feature combinations based on the similarity of features across the global data. Specifically, it measures the correlation between feature fields by their attention scores averaged over the whole dataset. T...
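
The averaging described in [0088] can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention with no learned projections, which the text above does not specify, and the names `attention_scores`, `average_attention`, and the toy embeddings are hypothetical.

```python
import math

def attention_scores(embeddings):
    """Row-wise softmax of scaled dot products between field embeddings.

    Assumption: plain scaled dot-product attention without learned
    query/key projections; the source does not give the exact form.
    """
    d = len(embeddings[0])
    scores = []
    for q in embeddings:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        scores.append([e / total for e in exps])
    return scores

def average_attention(dataset):
    """Average the per-sample attention matrices over all samples."""
    n_fields = len(dataset[0])
    avg = [[0.0] * n_fields for _ in range(n_fields)]
    for sample in dataset:
        scores = attention_scores(sample)
        for i in range(n_fields):
            for j in range(n_fields):
                avg[i][j] += scores[i][j] / len(dataset)
    return avg

# Hypothetical toy data: 2 samples, 3 feature fields, embedding size 3.
toy = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]],
]
avg = average_attention(toy)
```

In this toy input, fields 0 and 1 have similar embeddings in both samples, so `avg[0][1]` comes out larger than `avg[0][2]`; a field pair with a high averaged score is the kind of pair that would be treated as a candidate combination.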

Embodiment 2

[0091] Compared with manually mined feature combinations, the model of the present invention achieves finer-grained feature combinations, and the process is completely transparent. Figure 3 shows the correlations between different values of the input features, which are obtained from the attention scores. The color depth represents the attention score of the vertical-axis element with respect to the horizontal-axis element; the darker the color, the larger the value.

[0092] From Figure 3 it can be seen that the automatic feature generation method based on the self-attention mechanism proposed by the present invention can identify meaningful combination features (i.e. the solid-line rectangle). This is plausible, since young female college students are likely to prefer movies in the romance genre. Generally speaking, high-order feature combinations are difficult to find manually, and digging them out takes a lot of effort and experience. However, the feature inter...

Embodiment 3

[0094] Because SVM is well suited to small-sample learning, four public supervised classification datasets with small sample sizes were selected as experimental data. The statistics of these datasets are summarized in Table 1; they differ in number of instances, number of features, and degree of class imbalance. All datasets are available in the OpenML repository (http://www.openml.org/home).

[0095] Table 1 Statistical information of the data sets in the comparison experiment of machine learning methods

[0096]

[0097] The model has only two hyperparameters: the maximum number of iterations is set to 10, and the embedding size is set to 3. Each reported result is the average over 5-fold cross-validation, and Table 2 shows the experimental results.
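
The evaluation protocol in [0097] can be sketched as follows. The two constants come from the paragraph above; everything else is an assumption for illustration: the helper names are hypothetical, the folds are contiguous and unshuffled, and `majority_baseline` is only a placeholder scorer, not the SVM classifier used in the experiments.

```python
MAX_ITERATIONS = 10  # maximum number of iterations, from [0097]
EMBEDDING_SIZE = 3   # embedding size, from [0097]

def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validated_score(X, y, train_and_score, k=5):
    """Average the score returned by train_and_score over k folds."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(
            [X[j] for j in train_idx], [y[j] for j in train_idx],
            [X[j] for j in test_idx], [y[j] for j in test_idx]))
    return sum(scores) / k

def majority_baseline(X_train, y_train, X_test, y_test):
    """Placeholder scorer (NOT the SVM): predict the most common training label."""
    majority = max(set(y_train), key=y_train.count)
    return sum(1 for label in y_test if label == majority) / len(y_test)

# Hypothetical toy data, only to exercise the harness.
X = [[float(i)] for i in range(10)]
y = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
mean_accuracy = cross_validated_score(X, y, majority_baseline)
```

Passing the scorer as a callable keeps the averaging logic independent of the classifier, so a real SVM training routine could be substituted for the placeholder without changing the harness.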

[0098] Table 2 Comparison of machine learning methods (based on SVM classifier)

[0099]

[0100] From Table 2, it can be seen that compared with the other three methods tha...


Abstract

The invention discloses an automatic feature construction method based on an attention mechanism and reinforcement learning. The method comprises the following steps in order: 1, given a dataset DTR for a classification problem, containing a numerical feature set S, set the maximum number of iterations maxIterations and the embedding size embeddingSize; and 2, pass the dataset and the parameters to the automatic feature construction method and run it to obtain a classification result. The method includes a feature generator based on a self-attention mechanism and a feature selector based on reinforcement learning. Generated features are repeatedly explored and exploited through iteration, and within a limited number of steps the globally optimal feature generation and selection scheme is used to guide feature generation for the test set, so that an optimal classification result is obtained automatically.
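
The explore-and-exploit iteration described in the abstract can be sketched roughly as below. This is not the patented algorithm: pairwise products stand in for the self-attention feature generator, an epsilon-greedy bandit stands in for the reinforcement-learning selector, and a sign-rule accuracy stands in for the classifier. All function names and the toy data are hypothetical.

```python
import itertools
import random

def generate_candidates(n_features):
    """Stand-in generator: every pairwise product of original features."""
    return [(i, j) for i in range(n_features) for j in range(i + 1, n_features)]

def apply_features(rows, chosen):
    """Append the product feature for each chosen (i, j) pair to every row."""
    return [row + [row[i] * row[j] for i, j in chosen] for row in rows]

def evaluate(rows, labels):
    """Toy reward: best accuracy of the rule 'predict 1 when value > 0' over columns."""
    best = 0.0
    for col in range(len(rows[0])):
        hits = sum(1 for row, y in zip(rows, labels) if (row[col] > 0) == (y == 1))
        best = max(best, hits / len(rows))
    return best

def construct_features(rows, labels, max_iterations=10, epsilon=0.1, seed=0):
    """Epsilon-greedy search over candidates; returns (best scheme, best reward)."""
    rng = random.Random(seed)
    candidates = generate_candidates(len(rows[0]))
    value = {c: 1.0 for c in candidates}  # optimistic start so untried pairs get tried
    count = {c: 0 for c in candidates}
    best_scheme, best_score = [], evaluate(rows, labels)
    for _ in range(max_iterations):
        if rng.random() < epsilon:
            choice = rng.choice(candidates)                   # explore
        else:
            choice = max(candidates, key=lambda c: value[c])  # exploit
        reward = evaluate(apply_features(rows, [choice]), labels)
        count[choice] += 1
        value[choice] += (reward - value[choice]) / count[choice]  # running mean
        if reward > best_score:
            best_scheme, best_score = [choice], reward
    return best_scheme, best_score

# Hypothetical toy data: the label depends only on the sign of x1 * x2.
rows = [list(r) for r in itertools.product([-1.0, 1.0], repeat=3)]
labels = [1 if r[1] * r[2] > 0 else 0 for r in rows]
scheme, score = construct_features(rows, labels)
test_rows = apply_features(rows, scheme)  # reuse the best scheme on new data
```

The last line mirrors the idea in the abstract that the best scheme found during the limited iterations is then applied to generate features for the test set; here the training rows are reused only to keep the example short.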

Description

Technical field

[0001] The invention relates to an automatic feature construction method, in particular to an automatic feature construction method based on an attention mechanism and reinforcement learning, and belongs to the technical field of automatic machine learning.

Background technique

[0002] In recent years, automatic machine learning has become a new subfield of machine learning, and every step of machine learning can be moved toward automation. For model selection and hyperparameter optimization, experts have proposed relatively mature and usable frameworks, and classification and regression models can now be built with a low or zero barrier to entry, or even without manual modeling. Today, feature engineering is one of the difficulties in applying AI in industry, and the quality of features is the most important basis for the performance of subsequent learning models.

[0003] Since raw features rarely yield satisfactory results, manual featur...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N7/00, G06N20/00
CPC: G06N20/00, G06N7/01, G06F18/2415
Inventor: 何洁月, 蔡嘉跃, 吴宇
Owner: SOUTHEAST UNIV