机器学习实战python版决策树以及Matplotlib注解绘制决策树

这一章代码比较难懂，主要是matplotlib的函数调用参数多，调用灵活，让初学者费解。

<span style="font-size:18px;"><span style="font-size:18px;">import matplotlib.pyplot as pltdecisionNode = dict(boxstyle="sawtooth", fc="0.8")#boxstyle = "swatooth"意思是注解框的边缘是波浪线型的，fc控制的注解框内的颜色深度
leafNode = dict(boxstyle="round4", fc="0.8")
arrow_args = dict(arrowstyle="<-")#箭头符号</span></span>

<span style="font-size:18px;"><span style="font-size:18px;">def plotNode(nodeTxt, centerPt, parentPt, nodeType):createPlot.ax1.annotate(nodeTxt, xy=parentPt#起点位置,  xycoords='axes fraction',xytext=centerPt#注解框位置, textcoords='axes fraction',va="center", ha="center", bbox=nodeType, arrowprops=arrow_args )</span></span>

<span style="font-size:18px;"><span style="font-size:18px;"></span>
</span>

<span style="font-size:18px;"><span style="font-size:18px;">def createPlot(inTree):fig = plt.figure(1, facecolor='white')fig.clf()axprops = dict(xticks=[], yticks=[])createPlot.ax1 = plt.subplot(111, frameon=False, **axprops)    #no ticks#createPlot.ax1 = plt.subplot(111, frameon=False) #ticks for demo puropses plotTree.totalW = float(getNumLeafs(inTree))plotTree.totalD = float(getTreeDepth(inTree))plotTree.xOff = -0.5/plotTree.totalW; plotTree.yOff = 1.0;plotTree(inTree, (0.5,1.0), '')plt.show()</span></span>

<span style="font-size:18px">
</span>

<span style="font-size:18px"><img src="https://img-blog.csdn.net/20151127122340740?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt="" /></span>

<span style="font-size:18px">。</span>

构造注解树

<span style="font-size:18px;">首先我们要知道树的高度和宽度，就是叶子和深度。以便我们好确定注解框的位置。</span>

<span style="font-size:18px;"><span style="font-size:18px;">def getNumLeafs(myTree):numLeafs = 0firstStr = myTree.keys()[0]#父结点键secondDict = myTree[firstStr]for key in secondDict.keys():if type(secondDict[key]).__name__=='dict':#test to see if the nodes are dictonaires, if not they are leaf nodesnumLeafs += getNumLeafs(secondDict[key])else:   numLeafs +=1return numLeafsdef getTreeDepth(myTree):maxDepth = 0firstStr = myTree.keys()[0]secondDict = myTree[firstStr]for key in secondDict.keys():if type(secondDict[key]).__name__=='dict':#test to see if the nodes are dictonaires, if not they are leaf nodesthisDepth = 1 + getTreeDepth(secondDict[key])#递归else:   thisDepth = 1if thisDepth > maxDepth: maxDepth = thisDepth#深度加一return maxDepth</span></span>

<span style="font-size:18px;">通过上面的连个函数，我们就可以获得树的大小了。下面我们要绘制前面的那张图，接下来的代码就不那么简单了。</span>

<span style="font-size:18px;"><span style="font-size:18px;">def plotMidText(cntrPt, parentPt, txtString):#在中间写上分支的条件，比如前面的0，1xMid = (parentPt[0]-cntrPt[0])/2.0 + cntrPt[0]yMid = (parentPt[1]-cntrPt[1])/2.0 + cntrPt[1]createPlot.ax1.text(xMid, yMid, txtString, va="center", ha="center", rotation=30)def plotTree(myTree, parentPt, nodeTxt):#if the first key tells you what feat was split on我们发现这是一层层递归下去了，直到我们找到最下面的子结点，就画上。numLeafs = getNumLeafs(myTree)  #this determines the x width of this treedepth = getTreeDepth(myTree)firstStr = myTree.keys()[0]     #the text label for this node should be thiscntrPt = (plotTree.xOff + (1.0 + float(numLeafs))/2.0/plotTree.totalW, plotTree.yOff)plotMidText(cntrPt, parentPt, nodeTxt)plotNode(firstStr, cntrPt, parentPt, decisionNode)secondDict = myTree[firstStr]plotTree.yOff = plotTree.yOff - 1.0/plotTree.totalDfor key in secondDict.keys():if type(secondDict[key]).__name__=='dict':#test to see if the nodes are dictonaires, if not they are leaf nodes   plotTree(secondDict[key],cntrPt,str(key))        #recursionelse:   #it's a leaf node print the leaf nodeplotTree.xOff = plotTree.xOff + 1.0/plotTree.totalWplotNode(secondDict[key], (plotTree.xOff, plotTree.yOff), cntrPt, leafNode)plotMidText((plotTree.xOff, plotTree.yOff), cntrPt, str(key))plotTree.yOff = plotTree.yOff + 1.0/plotTree.totalD
</span></span>

<span style="font-size:18px;">
上面代码理解起来比较难，我的方法就是，因为这个树比较小，我们可以跟着函数的流程走一遍，自己画一下，大概就能搞懂了，最后才发现，设置初始值真是很有技巧的才行

。

</pre><h2 class="html" name="code"><span style="font-size:18px">测试和存储分类器</span></h2><pre class="html" name="code"><span style="font-size:18px;">就是我们给了一系列的条件，看看根据这个分类器能不能得到我们理想的结果。</span>

<span style="font-size:18px;"><span style="font-size:18px;">def classify(inputTree,featLabels,testVec):firstStr = inputTree.keys()[0]secondDict = inputTree[firstStr]featIndex = featLabels.index(firstStr)key = testVec[featIndex]valueOfFeat = secondDict[key]if isinstance(valueOfFeat, dict): classLabel = classify(valueOfFeat, featLabels, testVec)else: classLabel = valueOfFeatreturn classLabel</span></span>

<span style="font-size:18px;">这也是一个递归函数，一步步找到最低端的子结点。</span>

<span style="font-size:18px;"><span style="font-size:18px;">>>> import trees
>>> myDat,labels = trees.createDataSet()
>>> labels
['no surfacing', 'flippers']
>>> myTree = treePlotter.retrieveTree(0)
>>> myTree
{'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
>>> trees.classify(myTree,labels,[1,0])
'no'
>>> trees.classify(myTree,labels,[1,1])
'yes'</span></span>

<span style="font-size:18px;">我们知道建立一个决策树是很好时间的，我们在一个大的数据集下建立的决策树费时很久，我们希望下次可以直接用而不是再生成一遍，所以我们要存储已建好的决策树。</span>

<span style="font-size:18px;"><span style="font-size:18px;">def storeTree(inputTree,filename):import picklefw = open(filename,'w')pickle.dump(inputTree,fw)fw.close()def grabTree(filename):import picklefr = open(filename)return pickle.load(fr)</span></span>

示例：

<span style="font-size:18px;"><span style="font-size:18px;">
</span></span>

<span style="font-size:18px;"><span style="font-size:18px;">>>> fr = open('lenses.txt')
>>> lenses = [inst.strip().split('\t') for inst in fr.readlines()]
>>> lenses
[['young', 'myope', 'no', 'reduced', 'no lenses'], ['young', 'myope', 'no', 'normal', 'soft'], ['young', 'myope', 'yes', 'reduced', 'no lenses'], ['young', 'myope', 'yes', 'normal', 'hard'], ['young', 'hyper', 'no', 'reduced', 'no lenses'], ['young', 'hyper', 'no', 'normal', 'soft'], ['young', 'hyper', 'yes', 'reduced', 'no lenses'], ['young', 'hyper', 'yes', 'normal', 'hard'], ['pre', 'myope', 'no', 'reduced', 'no lenses'], ['pre', 'myope', 'no', 'normal', 'soft'], ['pre', 'myope', 'yes', 'reduced', 'no lenses'], ['pre', 'myope', 'yes', 'normal', 'hard'], ['pre', 'hyper', 'no', 'reduced', 'no lenses'], ['pre', 'hyper', 'no', 'normal', 'soft'], ['pre', 'hyper', 'yes', 'reduced', 'no lenses'], ['pre', 'hyper', 'yes', 'normal', 'no lenses'], ['presbyopic', 'myope', 'no', 'reduced', 'no lenses'], ['presbyopic', 'myope', 'no', 'normal', 'no lenses'], ['presbyopic', 'myope', 'yes', 'reduced', 'no lenses'], ['presbyopic', 'myope', 'yes', 'normal', 'hard'], ['presbyopic', 'hyper', 'no', 'reduced', 'no lenses'], ['presbyopic', 'hyper', 'no', 'normal', 'soft'], ['presbyopic', 'hyper', 'yes', 'reduced', 'no lenses'], ['presbyopic', 'hyper', 'yes', 'normal', 'no lenses']]
>>> import trees
>>> lensesLabels = ['age','prescript','astigmatic','tearRate']
>>> lensesTree = trees.createTree(lenses,lensesLabels)
>>> lensesTree
{'tearRate': {'reduced': 'no lenses', 'normal': {'astigmatic': {'yes': {'prescript': {'hyper': {'age': {'pre': 'no lenses', 'presbyopic': 'no lenses', 'young': 'hard'}}, 'myope': 'hard'}}, 'no': {'age': {'pre': 'soft', 'presbyopic': {'prescript': {'hyper': 'soft', 'myope': 'no lenses'}}, 'young': 'soft'}}}}}}
>>> import treePlotter
>>> treePlotter.createPlot(lensesTree)</span></span>

<span style="font-size:18px">
</span>

<span style="font-size:18px"><img src="https://img-blog.csdn.net/20151127122636905?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt="" /></span>

<span style="font-size:18px">。</span>

当然这个决策树，还有不少的缺陷，叫过度匹配，就是指匹配选项过多了，要删去一些结点和分支。还有就是我们处理的是标称型数据，就是数据是有标量名字的，而不是数值型数据，数值型数据是随意的。当然我们也可以量化数值型数据变成标称型数据。

<span style="font-size:18px;">
</span>

<span style="font-size:18px;">
</span>

<span style="font-size:18px;">希望大家多多指导。</span>

<span style="font-size:18px;"></span>

机器学习实战python版决策树以及Matplotlib注解绘制决策树相关推荐

01、python数据分析与机器学习实战——Python数据可视化库-Matplotlib
Matplotlib介绍 Matplotlib 是一个 Python 的 2D绘图库,它以各种硬拷贝格式和跨平台的交互式环境生成出版质量级别的图形. Matplotlib基础 1.折线图绘制假设,我 ...
《机器学习实战》学习笔记（三）：决策树
欢迎关注WX公众号:[程序员管小亮] [机器学习]<机器学习实战>读书笔记及代码总目录 https://blog.csdn.net/TeFuirnever/article/details ...
Python3《机器学习实战》学习笔记（三）：决策树实战篇
转载请注明作者和出处: http://blog.csdn.net/c406495762 运行平台: Windows Python版本: Python3.x IDE: Sublime text3 ...
Python3《机器学习实战》学习笔记（三）：决策树实战篇之为自己配个隐形眼镜
转载请注明作者和出处: http://blog.csdn.net/c406495762 运行平台: Windows Python版本: Python3.x IDE: Sublime text3 一前 ...
OpenCV计算机视觉实战(Python版)_002图像基本操作
OpenCV计算机视觉实战(Python版) https://www.bilibili.com/video/BV1ct411F7Te?p=2 数据读取-图像 cv2.IMREAD_COLOR:彩色图像 ...
机器学习：使用Matplotlib注解绘制树形图以及实例运用
前言内容承接上篇文章:决策树(Decision Tree)算法详解及python实现这篇补充上篇没有说到的知识点以及运用. 一.增益率采用与信息增益相同的表达式,增益率定义为: 其中: 称为属性 ...
python predictabel_Python3《机器学习实战》学习笔记（三）：决策树实战篇之为自己配个隐形眼镜...
) p1Vect = np.log(p1Num/p1Denom) #取对数,防止下溢出 p0Vect = np.log(p0Num/p0Denom) return p0Vect,p1Vect,pAbu ...
机器学习实战—朴素贝叶斯及要点注解
书籍:<机器学习实战>中文版 IDE:PyCharm Edu 4.02 环境:Adaconda3 python3.6 #!/usr/bin/env python3 # -*- codin ...
Python3《机器学习实战》学习笔记（二）：决策树基础篇之让我们从相亲说起
转载请注明作者和出处: http://blog.csdn.net/c406495762 运行平台: Windows Python版本: Python3.x IDE: Sublime text3 个人网 ...

机器学习实战python版决策树以及Matplotlib注解绘制决策树

构造注解树

示例：

机器学习实战python版决策树以及Matplotlib注解绘制决策树相关推荐

最新文章

热门文章