
Visualization system and method based on interpretable random forest

Random forest visualization technology, applied in a visualization system based on an interpretable random forest, addresses the poor interpretability of random forest models and the inability to display and analyze them.

Status: Inactive | Publication Date: 2021-07-09
UNIV OF ELECTRONICS SCI & TECH OF CHINA
8 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the deficiencies of the prior art by providing a visualization system and method based on an interpretable random forest. It addresses two problems: existing random forest systems cannot analyze and explain the prediction for a feature sample from multiple dimensions and angles, and they cannot display and analyze the random forest model in terms of its data, features, tree and path structure, and prediction results. Both contribute to the poor interpretability of the random forest model.



Examples


Embodiment 1

[0069] Figure 5 shows the system interface for the Titanic passenger survival data. In the heat map of individual-learner predictions in Figure 5, we found several passengers who actually survived but were misclassified as dead. The figure shows the analysis results for one such passenger. This passenger's prediction is red in every individual learner, meaning that each learner predicts death, so the final prediction is also death. In fact, the passenger survived. Next, we look at where this sample falls in the scatter plot.
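The check described in [0069] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label encoding (`DEAD`, `ALIVE`), the vote matrix, and the function names are hypothetical and do not come from the patent. Each row plays the role of one heat-map row, with one vote per individual learner.

```python
from typing import List

DEAD, ALIVE = 0, 1  # hypothetical label encoding (red = DEAD, blue = ALIVE)

def majority_vote(votes: List[int]) -> int:
    """Return the label chosen by most learners (ties fall back to DEAD here)."""
    return ALIVE if sum(votes) * 2 > len(votes) else DEAD

def unanimous_errors(vote_matrix: List[List[int]], truth: List[int]) -> List[int]:
    """Indices of samples where every learner agrees and the shared answer is wrong."""
    flagged = []
    for i, (votes, y) in enumerate(zip(vote_matrix, truth)):
        if len(set(votes)) == 1 and votes[0] != y:
            flagged.append(i)
    return flagged

# Rows are samples, columns are individual learners. All values are made up.
vote_matrix = [
    [DEAD, DEAD, DEAD, DEAD, DEAD],      # every learner predicts death
    [ALIVE, DEAD, ALIVE, ALIVE, DEAD],
    [DEAD, DEAD, ALIVE, DEAD, DEAD],
]
truth = [ALIVE, ALIVE, DEAD]

final = [majority_vote(row) for row in vote_matrix]
flagged = unanimous_errors(vote_matrix, truth)
print(final, flagged)
```

Sample 0 corresponds to the passenger in the example: all learners vote for death, the forest's final answer is death, and the true label is survival, so it is flagged for closer inspection.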

[0070] The misclassified samples lie in a region of the dimensionality-reduction scatter plot in Figure 5 where most samples are red (dead) and only a few are blue (alive). The similarity of the samples in the dimensionality reduction scatter plot means that they are...
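One way to quantify the observation that most samples around a misclassified point are red is to count the labels of its nearest neighbours in the 2-D projection. The sketch below assumes the projected coordinates are already available; the excerpt does not say which dimensionality reduction method the system uses, and the points, labels, and the function name `neighbour_labels` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]

def neighbour_labels(points: List[Point], labels: List[str],
                     query: int, k: int) -> Counter:
    """Count the labels of the k points closest to points[query] in the 2-D projection."""
    qx, qy = points[query]
    others = [(math.hypot(x - qx, y - qy), labels[i])
              for i, (x, y) in enumerate(points) if i != query]
    others.sort(key=lambda pair: pair[0])  # nearest first
    return Counter(label for _, label in others[:k])

# Made-up projected coordinates; index 0 is the misclassified survivor.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (0.2, 0.1), (0.3, 0.3), (5.0, 5.0)]
labels = ["alive", "dead", "dead", "dead", "alive", "alive"]

counts = neighbour_labels(points, labels, query=0, k=4)
print(counts)
```

A neighbourhood dominated by the opposite class is a hint that the sample resembles members of that class in feature space.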

Embodiment 2

[0075] Figure 6 shows the system interface for the breast cancer data, which is used to describe the present invention in further detail.

[0076] The data set used in this use case is the breast cancer data collected by Dr. W.H. Wolberg, gathered to study under what circumstances a patient's breast mass is benign. Many machine learning methods can learn from this data set and achieve high prediction accuracy, and random forest is one of them: a random forest model trained on this data set performs very well. However, the model's decision-making process is opaque to us. In the medical field, an opaque process carries considerable risk, so even if the model is highly accurate, practitioners will not take the risk of relying on its predictions. The interpretability of the model is therefore very important. Next, we analyze the interpretable random forest model based on br...


Abstract

The invention discloses a visualization system and method based on an interpretable random forest. The system comprises a data module, a visualization module, a rendering module and an interaction module; the data module is used for storing, extracting, counting and analyzing training set data and model data; the visualization module is used for performing visualization algorithm mapping on the data information stored in the data module and generating a geometric figure structure composed of space and time sequence after encoding; and the rendering module is used for outputting the geometric figure structure generated by the visualization module and displaying the geometric figure structure on a screen in the interaction module in the form of actual pixel points. The method can analyze and understand the prediction of the feature sample in a multi-dimensional and multi-angle manner, and can display and analyze the random forest model from the aspects of data, features, tree and path structures and prediction results, thereby improving the interpretability of the random forest model.
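The data, visualization, and rendering stages described in the abstract can be pictured as a small pipeline. The following is a minimal sketch, not the patented implementation: the class names mirror the module names, but every field, method, and the text-based "screen" are assumptions made for illustration, and the interaction module is left out.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DataModule:
    """Stores per-tree votes and computes simple statistics over them."""
    tree_votes: Dict[str, List[int]]  # sample id -> one 0/1 vote per tree (hypothetical layout)

    def vote_share(self, sample_id: str) -> float:
        votes = self.tree_votes[sample_id]
        return sum(votes) / len(votes)

@dataclass
class Bar:
    """A geometric primitive: a labelled horizontal bar with a length in cells."""
    label: str
    length: int

class VisualizationModule:
    """Maps stored data to geometric primitives (here, one bar per sample)."""
    def __init__(self, width: int = 20) -> None:
        self.width = width

    def encode(self, data: DataModule) -> List[Bar]:
        return [Bar(sid, round(data.vote_share(sid) * self.width))
                for sid in sorted(data.tree_votes)]

class RenderingModule:
    """Turns primitives into displayable output (text cells stand in for pixels)."""
    def render(self, bars: List[Bar]) -> str:
        return "\n".join(f"{b.label:<4}|{'#' * b.length}" for b in bars)

data = DataModule({"s1": [1, 1, 0, 1], "s2": [0, 0, 0, 1]})
bars = VisualizationModule(width=8).encode(data)
screen = RenderingModule().render(bars)
print(screen)
```

Keeping the encoding step separate from rendering means the same geometric description could be drawn by a different back end without touching the data or the mapping logic.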

Description

Technical field

[0001] The invention relates to the technical field of big data machine learning, and in particular to a visualization system and method based on an interpretable random forest.

Background

[0002] In machine learning, a random forest is an ensemble learner composed of multiple decision trees that makes decisions by voting. The large number of trees and the complexity of their structure make random forests very difficult to understand. Thanks to its strong performance, a random forest can achieve very good prediction results in decision-making tasks in medical, operational, and other fields, but the interpretability of its structure is poor. Presenting the random forest model to users clearly and understandably has therefore become an urgent need.

[0003] Existing systems that apply random forest prediction cannot analyze and understand the prediction of feature samples from multiple dimensions and angles, and cannot display and analyze the random f...
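The voting behaviour described in [0002], together with the idea of a per-tree decision path, can be illustrated with a toy forest. This is a hedged sketch: the nested-dictionary tree layout, the feature names `age` and `fare`, the thresholds, and both function names are hypothetical, chosen only to make the mechanism concrete.

```python
from collections import Counter
from typing import Dict, List, Tuple, Union

# A tree is either a leaf label (str) or an internal node stored as a dict.
Tree = Union[str, Dict[str, object]]

def predict_with_path(tree: Tree, sample: Dict[str, float]) -> Tuple[str, List[str]]:
    """Walk one decision tree; return the leaf label and the tests taken on the way."""
    path: List[str] = []
    node = tree
    while isinstance(node, dict):
        feature, threshold = node["feature"], node["threshold"]
        go_left = sample[feature] <= threshold
        path.append(f"{feature} {'<=' if go_left else '>'} {threshold}")
        node = node["left"] if go_left else node["right"]
    return node, path

def forest_predict(trees: List[Tree], sample: Dict[str, float]) -> str:
    """Majority vote over the individual tree predictions."""
    votes = Counter(predict_with_path(t, sample)[0] for t in trees)
    return votes.most_common(1)[0][0]

# Three made-up single-split trees.
trees: List[Tree] = [
    {"feature": "age", "threshold": 12, "left": "alive", "right": "dead"},
    {"feature": "fare", "threshold": 30, "left": "dead", "right": "alive"},
    {"feature": "age", "threshold": 25, "left": "dead", "right": "alive"},
]
sample = {"age": 30, "fare": 10}

for t in trees:
    print(predict_with_path(t, sample))
print(forest_predict(trees, sample))
```

Even in this tiny case, explaining the final answer means inspecting three separate paths, which shows why forests with many deep trees are hard to read without visual support.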


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06K9/62, G06F3/0488, G06F3/0484, G06F3/0483
CPC: G06F3/0483, G06F3/04842, G06F3/04883, G06F18/24323, G06F18/214
Inventor: 蒲剑苏, 张婷婷, 夏瑜潞, 邵慧, 张景文
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA