[Machine Learning Notes] Plotting and Saving Feature Importance with LightGBM

References:
1. Evaluate Feature Importance using Tree-based Model
2. lgbm.fi.plot: LightGBM Feature Importance Plotting
3. Official LightGBM documentation

Introduction

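Tree-based models can be used to evaluate feature importance. In this post, I walk through the steps for evaluating feature importance with the GBDT model in LightGBM. LightGBM is a gradient boosting framework released by Microsoft that is designed for high accuracy and speed (some tests suggest LightGBM can produce predictions as accurate as XGBoost's while training up to 25 times faster).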

First, we import the required packages: pandas for data preprocessing, LightGBM for the GBDT model, and matplotlib for drawing the feature importance bar chart.

import pandas as pd
import matplotlib.pyplot as plt
import lightgbm as lgb

Next, we load and preprocess the training data. In this example, we use a predictive maintenance dataset.

# Read the data (raw string so the backslashes in the Windows path are not treated as escapes)
train = pd.read_csv(r'E:\Data\predicitivemaintance_processed.csv')
# Drop the columns that are not used by the model
train = train.drop(['Date', 'FailureDate'], axis=1)
# Set the target column
target = 'FailNextWeek'
# One-hot encode the categorical feature
feature_categorical = ['Model']
train = pd.get_dummies(train, columns=feature_categorical)
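To make the one-hot encoding step concrete, here is a minimal sketch on a made-up frame. The column names mirror the ones used above, but the rows are hypothetical and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical toy frame; the values are invented purely for illustration.
toy = pd.DataFrame({
    "Model": ["A", "B", "A", "C"],
    "Age": [3, 5, 2, 7],
    "FailNextWeek": [0, 1, 0, 1],
})

# Each distinct value of "Model" becomes its own indicator column,
# and the original "Model" column is dropped.
encoded = pd.get_dummies(toy, columns=["Model"])
encoded_columns = list(encoded.columns)
print(encoded_columns)
```

The untouched columns keep their order and the new indicator columns are appended after them, so each category later shows up as a separate feature in the importance chart.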

Next, we train the GBDT model on the training data:

lgb_params = {
    'boosting_type': 'gbdt', 'objective': 'binary',
    'num_leaves': 30, 'max_depth': 8, 'learning_rate': 0.01,
    'feature_fraction': 0.5, 'bagging_fraction': 0.8, 'bagging_freq': 12,
}
lgb_train = lgb.Dataset(train.drop(columns=[target]), train[target])
# The number of boosting rounds (originally 'num_round' in the params dict) is passed explicitly
model = lgb.train(lgb_params, lgb_train, num_boost_round=360)

Once the model is trained, we can pass it to LightGBM's plot_importance function to plot the feature importances.

# plot_importance creates its own figure, so the size is passed via figsize
lgb.plot_importance(model, max_num_features=30, figsize=(12, 6), title="Feature Importances")
plt.show()

Saving the feature importance

# lgb.train already returns a Booster, so no .booster_ lookup is needed
# (that attribute only exists on the scikit-learn wrappers such as LGBMClassifier)
booster = model
importance = booster.feature_importance(importance_type='split')
feature_name = booster.feature_name()
# for name, score in zip(feature_name, importance):
#     print(name, score)
feature_importance = pd.DataFrame({'feature_name': feature_name, 'importance': importance})
feature_importance.to_csv('feature_importance.csv', index=False)

Done!
