Method and system for determining characteristic significance of machine learning sample

A machine learning and machine learning model technology, applied in the fields of machine learning, instruments, computer components, etc., that addresses the difficulty of effectively determining feature importance for machine learning samples.

Status: Inactive
Publication Date: 2018-05-11
THE FOURTH PARADIGM BEIJING TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] The exemplary embodiments of the present invention aim to overcome the defect in the prior art that it is difficult to effectively determine the importance of each feature of a machine learning sample.

Embodiment Construction

[0077] In order to enable those skilled in the art to better understand the present invention, exemplary embodiments of the present invention will be described in further detail below in conjunction with the accompanying drawings and specific implementation methods.

[0078] In an exemplary embodiment of the present invention, feature importance is determined as follows: a feature pool model is trained on at least some of the features of the machine learning samples, with continuous features discretized first. The importance of each feature is then measured from the prediction effect of the feature pool model.
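The discretization step mentioned in [0078] can be illustrated with a minimal sketch. The excerpt does not say which discretization scheme the patent uses, so equal-width binning is an assumption here, and the function name `equal_width_bins` and the sample `ages` values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to an integer bin index in [0, n_bins - 1].

    Equal-width binning is an assumed scheme; the source text only says
    that continuous features are discretized.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical continuous feature (for example, a user's age).
ages = [18.0, 22.5, 37.0, 41.0, 64.0]
age_bins = equal_width_bins(ages, n_bins=4)
print(age_bins)
```

After this step the feature takes a small number of discrete values, which is the form the feature pool model is trained on.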

[0079] Here, machine learning is a natural product of artificial intelligence research having reached a certain stage. It aims to improve the performance of a system itself through computation and the use of experience. In computer systems, "experience" usually exists in the form of "data". Through machine learning...

Abstract

The invention provides a method and system for determining the feature importance of machine learning samples. The method comprises: (A) obtaining historical data records, each of which includes a label for a machine learning problem and at least one piece of attribute information; (B) using the obtained historical data records to train at least one feature pool model, which provides a prediction result for the machine learning problem based on at least some of the features; and (C) obtaining the effect of the at least one feature pool model and determining the importance of each feature from that effect. In step (B), at least one continuous feature among those features is discretized before the feature pool model is trained. With this method and system, the importance of each feature of a machine learning sample can be determined effectively.
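Steps (A) to (C) can be sketched end to end as follows. This is one possible reading, not the patented implementation: the excerpt does not specify how feature pools are formed, what model is used, or how "effect" is measured. The sketch assumes one feature pool per feature, a per-bin positive-rate table as the model, and AUC on held-out records as the effect. All names (`train_pool_model`, `feature_importance`, the `x` and `noise` features) and the data are hypothetical.

```python
from collections import defaultdict


def discretize(value, lo, hi, n_bins=4):
    """Equal-width bin index for a continuous value (assumed scheme)."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return max(0, min(idx, n_bins - 1))


def train_pool_model(train_rows, feature, n_bins=4):
    """Step (B): fit a per-bin positive-rate table on one discretized feature."""
    values = [r[feature] for r in train_rows]
    lo, hi = min(values), max(values)
    pos, cnt = defaultdict(int), defaultdict(int)
    for r in train_rows:
        b = discretize(r[feature], lo, hi, n_bins)
        cnt[b] += 1
        pos[b] += r["label"]
    prior = sum(r["label"] for r in train_rows) / len(train_rows)

    def predict(row):
        b = discretize(row[feature], lo, hi, n_bins)
        # Fall back to the overall positive rate for bins unseen in training.
        return pos[b] / cnt[b] if cnt[b] else prior

    return predict


def auc(scores, labels):
    """Pairwise AUC; tied positive/negative pairs count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def feature_importance(train_rows, test_rows, features):
    """Step (C): score each single-feature pool model by its held-out AUC."""
    labels = [r["label"] for r in test_rows]
    return {
        f: auc([model(r) for r in test_rows], labels)
        for f in features
        for model in [train_pool_model(train_rows, f)]
    }


# Step (A): hypothetical historical records with a label and two attributes.
train = [{"x": float(i), "noise": float((i * 7) % 5), "label": int(i >= 10)}
         for i in range(20)]
test = [{"x": i + 0.5, "noise": float((i * 3) % 5), "label": int(i >= 10)}
        for i in range(0, 20, 2)]

importance = feature_importance(train, test, ["x", "noise"])
print(importance)
```

In this toy data the label depends only on `x`, so the pool model built on `x` separates the held-out records while the one built on `noise` does no better than chance. Other pool designs (for example, leaving one feature out of a larger pool and measuring the drop in effect) would fit the same three steps.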

Description

Technical field

[0001] The present invention generally relates to the field of artificial intelligence, and more specifically relates to a method and system for determining feature importance for machine learning samples.

Background technique

[0002] With the emergence of massive data, artificial intelligence technology has developed rapidly, and in order to mine value from large amounts of data, it is necessary to generate samples suitable for machine learning based on data records.

[0003] Here, each data record can be regarded as a description about an event or object, corresponding to an instance or sample. In a data record, there are items that reflect some aspect or nature of an event or object, which may be called "attributes".

[0004] In practice, the prediction effect of a machine learning model is related to the selection of the model, the available data and the extraction of features. How to extract the features of machine learning samples from each attribute...

Application Information

IPC(8): G06N99/00, G06K9/62
CPC: G06N20/00, G06F18/2113, G06F18/214
Inventors: 罗远飞, 涂威威
Owner: THE FOURTH PARADIGM BEIJING TECH CO LTD