Intersection-based automatic feature generation method

An automatic feature generation method, applied in the field of machine learning, that addresses the long time required for feature development and achieves improved efficiency, more comprehensive information extraction, and richer feature dimensions.

Active Publication Date: 2021-02-19
北京融七牛信息技术有限公司
8 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The existing approach does not require technicians to write code, but it does require them to have a deep understanding of the business and the data. The quality of the resulting features depends on the technicians' business experience, and feature development still takes a long time when the volume of data is large.

Examples

Detailed Description of the Embodiments

[0028] Embodiments of the present invention are described below with reference to the drawings, in which like parts are denoted by like reference numerals. In the case of no conflict, the following embodiments and the technical features in the embodiments can be combined with each other.

[0029] As shown in Figure 1, the method of the present invention includes step S1. In step S1, all selected data tables are analyzed, the data type of each field is determined (the data types are character, character categorical, numerical categorical, integer, floating-point, time, and Boolean), and an analysis report is produced for each field.
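The type judgment in step S1 can be illustrated with a minimal Python sketch. The source names the seven data types but does not say how they are detected, so everything below is an assumption made for illustration: the function name `infer_field_type`, the `MAX_CATEGORIES` threshold that separates categorical fields from free-valued ones, and the single `%Y-%m-%d` time format.

```python
from datetime import datetime

# Hypothetical threshold: a field with at most this many distinct values is
# treated as categorical. The source does not specify any such rule.
MAX_CATEGORIES = 10


def _parses(convert, text):
    """Return True if convert(text) succeeds without a ValueError."""
    try:
        convert(text)
        return True
    except ValueError:
        return False


def _is_time(text, fmt="%Y-%m-%d"):
    # Assumption: only one date format is recognised in this sketch.
    return _parses(lambda s: datetime.strptime(s, fmt), text)


def infer_field_type(values):
    """Classify a column of raw strings into one of the seven types named in S1."""
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not present:
        return "character"
    distinct = set(present)
    few_values = len(distinct) <= MAX_CATEGORIES
    if {v.lower() for v in distinct} <= {"true", "false"}:
        return "boolean"
    if all(_is_time(v) for v in present):
        return "time"
    if all(_parses(int, v) for v in present):
        return "numerical categorical" if few_values else "integer"
    if all(_parses(float, v) for v in present):
        return "floating-point"
    return "character categorical" if few_values else "character"
```

For example, `infer_field_type(["1", "2", "1", "2"])` is classified as numerical categorical because it has only two distinct integer values, while fifty distinct integers are classified as integer.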

[0030] The data type determines which binning method and which feature generation operators are appropriate for a field, which improves the quality of the resulting features. The analysis report can guide the user in selecting which fields to use. The analysis report includes commonly used statistical anal...
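As a rough illustration of choosing a binning method by data type, the Python sketch below turns one field into binary (0/1) features. The source does not name specific binning methods, so the two used here are assumptions: equal-width bins for integer and floating-point fields, and one binary feature per category for every other type. The function names and the feature naming scheme are also hypothetical.

```python
def one_hot(values):
    """Categorical field -> one binary feature per distinct category."""
    return {
        "=%s" % c: [1 if v == c else 0 for v in values]
        for c in sorted(set(values))
    }


def equal_width_bins(values, n_bins=3):
    """Numeric field -> one binary feature per equal-width interval."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / n_bins or 1.0
    # Clamp so that the maximum value lands in the last bin.
    idx = [min(int((v - lo) / width), n_bins - 1) for v in values]
    return {
        "_bin%d" % b: [1 if i == b else 0 for i in idx]
        for b in range(n_bins)
    }


def binarize(field_name, values, field_type):
    """Pick a binning method from the field's data type (assumed mapping)."""
    if field_type in ("integer", "floating-point"):
        features = equal_width_bins(values)
    else:
        features = one_hot(values)
    return {field_name + suffix: column for suffix, column in features.items()}
```

With these assumptions, `binarize("city", ["A", "B", "A"], "character categorical")` yields the two features `city=A` and `city=B`, and a numeric field such as `age` yields `age_bin0`, `age_bin1`, and `age_bin2`.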


Abstract

The invention relates to an intersection-based automatic feature generation method, which comprises the steps of: S1, for a data table to be processed, performing binarization according to the type of data in the data table to convert the data into binary features; S2, performing iterative feature crossing on the generated binary features to generate cross features, including: S21, calculating a plurality of feature evaluation indexes based on the binary features; S22, according to the specified number of features to generate and the specified number of iteration rounds, calculating the number m of features to be retained in each round, the number n of cross features, and the number k of features to be crossed; S23, selecting k binary features from the generated binary features, and selecting n cross features from the cross features generated by the previous iteration; S24, performing a pairwise cross operation on the k binary features and the n cross features to generate new cross features; and S25, selecting m cross features from the newly generated cross features and retaining them as the cross features generated by the current iteration. The method greatly improves the user's feature development efficiency.
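The iterative loop of steps S21 to S25 can be sketched in Python as follows. This is only an illustration under several assumptions that the abstract leaves open: the evaluation index (here the absolute difference between the target rate where the feature is 1 and the overall target rate), the rule for deriving m and n (here the requested total spread evenly over the rounds), the cross operation (here a logical AND), and the handling of the first round, which has no previous cross features (here the selected binary features are used in their place). All function and parameter names are hypothetical.

```python
import math


def score(feature, target):
    """Assumed evaluation index: |mean(target | feature == 1) - mean(target)|."""
    hits = [t for f, t in zip(feature, target) if f == 1]
    if not hits:
        return 0.0
    return abs(sum(hits) / len(hits) - sum(target) / len(target))


def top(features, target, count):
    """Keep the `count` best-scoring features from a name -> 0/1 list dict."""
    ranked = sorted(features, key=lambda name: score(features[name], target),
                    reverse=True)
    return {name: features[name] for name in ranked[:count]}


def generate_cross_features(binary, target, total, rounds, k=3):
    # S22 (assumed rule): spread the requested total evenly over the rounds.
    m = math.ceil(total / rounds)
    n = m
    # S21 + S23: score the binary features and pick the k best to cross.
    base = top(binary, target, k)
    previous = dict(base)  # assumption: round 1 has no earlier cross features
    kept = {}
    for _ in range(rounds):
        seeds = top(previous, target, n)  # S23: n features from the last round
        new = {}
        for a, col_a in base.items():  # S24: pairwise crossing
            for b, col_b in seeds.items():
                parts = b.split(" & ")
                if a in parts:
                    continue  # skip crossing a feature with itself
                name = " & ".join(sorted(parts + [a]))
                if name in kept or name in new:
                    continue
                new[name] = [x & y for x, y in zip(col_a, col_b)]
        if not new:
            break
        previous = top(new, target, m)  # S25: retain m of the new features
        kept.update(previous)
    return dict(list(kept.items())[:total])
```

For instance, with three binary features `a`, `b`, `c` and a target equal to `a AND b`, the first round retains `a & b` and the second round crosses it with `c`. The sketch assumes that base feature names do not themselves contain the separator `" & "`.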

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, and more specifically to an automatic feature generation method based on intersection.

Background

[0002] With the emergence of massive data, people increasingly use machine learning techniques to build models that solve practical problems. The basic process of training a machine learning model mainly includes: 1) clarifying the modeling goal and collecting available data; 2) feature generation and feature selection; 3) building a model; 4) evaluating the effect of the model. In this process, feature generation is critical: the quality of the generated features determines the upper limit of the model's performance.

[0003] Currently, the feature generation methods are as follows:

[0004] 1) Manual feature generation

[0005] Technicians develop features through the cleaning and screening of underlying data, the design of feature logic, and the devel...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/22, G06F16/2453, G06F16/2455, G06Q10/06
CPC: G06F16/2282, G06Q10/06393, G06F16/2453, G06F16/2455, Y02P90/30
Inventor: 周楚杰, 杨帆, 黄馨
Owner: 北京融七牛信息技术有限公司