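Data Analysis No. 15: Data Visualization

Data visualization

The workflow is always the same: create a canvas (a figure), add elements to it, then display it.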

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

X = np.linspace(0, 2*np.pi, 100)  # 100 evenly spaced points from 0 to 2*pi
Y = np.sin(X)
Y1 = np.cos(X)

plt.title("Hello World!!")
plt.plot(X, Y)
plt.plot(X, Y1)

Bar chart
A bar chart shows the frequency or the value of a feature for each category.

data = [5,25,50,20]
plt.bar(range(len(data)), data)    # first argument: x positions; second argument: bar heights

data = [5,25,50,20]
plt.barh(range(len(data)),data)

Multiple bars (grouped)

data = [[5, 25, 50, 20], [4, 23, 51, 17], [6, 22, 52, 19]]
X = np.arange(4)

plt.bar(X + 0.00, data[0], color='b', width=0.25, label="A")
plt.bar(X + 0.25, data[1], color='g', width=0.25, label="B")
plt.bar(X + 0.50, data[2], color='r', width=0.25, label="C")    # each series is shifted along the x axis by one bar width

plt.legend()   # required to display the label= entries

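np.arange: np.arange(4) returns array([0, 1, 2, 3]), which serves as the x positions. Reusing the same positions and setting bottom= stacks the bars:

data = [[5, 25, 50, 20], [4, 23, 51, 17], [6, 22, 52, 19]]
X = np.arange(4)

plt.bar(X, data[0], color='b', width=0.25)
plt.bar(X, data[1], color='g', width=0.25, bottom=data[0])
plt.bar(X, data[2], color='r', width=0.25, bottom=np.array(data[0]) + np.array(data[1]))    # each series is shifted along the y axis via bottom=

plt.show()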
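
The bottom= offsets above are written out by hand. As a minimal sketch (not part of the original post), the same offsets can be computed for any number of series with a cumulative sum; the `bottoms` name and the Agg backend line are assumptions made so the example runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

data = np.array([[5, 25, 50, 20], [4, 23, 51, 17], [6, 22, 52, 19]])
X = np.arange(data.shape[1])

# bottoms[i] is the height already occupied below series i:
# series 0 starts at 0, series 1 at data[0], series 2 at data[0] + data[1], ...
bottoms = np.vstack([np.zeros(data.shape[1], dtype=data.dtype),
                     np.cumsum(data, axis=0)[:-1]])

fig, ax = plt.subplots()
for row, bottom, color in zip(data, bottoms, ["b", "g", "r"]):
    ax.bar(X, row, width=0.25, bottom=bottom, color=color)
```

With more series, only `data` and the color list need to change.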

3. Scatter plot
A scatter plot is used to assess the correlation between two continuous variables.

N = 50
x = np.random.rand(N)
y = np.random.rand(N)

plt.scatter(x, y)

N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.randn(N)
area = np.pi * (15 * np.random.rand(N))**2  # marker sizes

plt.scatter(x, y, c=colors, alpha=0.5, s=area)   # alpha is the transparency

N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.randint(0, 2, size=N)  # two color classes, 0 or 1
plt.scatter(x, y, c=colors, alpha=0.5, s=area)  # reuses area from the previous block

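At work I was asked to show the distribution of a single column as scattered points, that is, the y coordinate is the value itself and the x coordinate is the index (my own understanding). This can be drawn in Excel; in Python the code is: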

plt.plot(df["age"],".",markersize=50)

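4. Histogram
A histogram shows the distribution of a continuous variable. Before building one, define the bins (value ranges): split the continuous range into intervals, then count how many data points fall into each interval.
(Example: grouping ages into ranges.)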

a = np.random.rand(100)
plt.hist(a, bins=20)   # bins is the number of intervals the x range is cut into
plt.ylim(0,15)
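
To make the binning step concrete, here is a small sketch of the age example using np.histogram, which does the counting that plt.hist draws. The age data and the bin edges are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded so the run is repeatable
ages = rng.integers(0, 80, size=200)     # hypothetical ages in [0, 80)

bin_edges = [0, 20, 40, 60, 80]          # four bins; the last one also includes its right edge
counts, edges = np.histogram(ages, bins=bin_edges)

# One line per bin: value range and number of samples in it
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>2}-{right:<2}: {n}")
```

Every sample lands in exactly one bin, so the counts add up to the sample size.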

To outline each bar of the histogram, set an edge color:

plt.hist(a, bins=20, edgecolor="black")

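5. Box plots
A box plot shows the percentile distribution of a continuous feature. In statistics it is often used to detect outliers in a single variable, or to examine the relationship between a discrete feature and a continuous one.

Upper quartile: the 75th percentile. Lower quartile: the 25th percentile.
A box plot can be drawn with plt.boxplot(x) or through the DataFrame's own API, df.plot(kind="box").
With the first approach x is a two-dimensional array; with the second it is a DataFrame, and each column name labels one box.
Calling plt.boxplot(df.values) gives the same boxes as the second approach.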

x = np.random.randint(20, 100, size=(30, 3))

plt.boxplot(x)
plt.ylim(0, 120)
plt.xticks([1, 2, 3], ['A', 'B', 'C'])  # xticks: positions on the x axis and the label to place at each

plt.hlines(y=np.mean(x, axis=0)[1], xmin=0, xmax=3)  # horizontal line at the mean of the second column
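
The quartiles and the outlier rule behind a box plot can be checked numerically. The sketch below assumes the common 1.5 × IQR fence rule (Matplotlib's default whisker setting, whis=1.5); the sample values are made up, with one planted outlier.

```python
import numpy as np

values = np.array([21, 35, 42, 47, 50, 53, 58, 64, 70, 150])  # 150 is a planted outlier

q1, median, q3 = np.percentile(values, [25, 50, 75])  # lower quartile, median, upper quartile
iqr = q3 - q1                                          # interquartile range: the height of the box

# Points beyond 1.5 * IQR from the box are drawn as individual outlier markers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3, outliers)
```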

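When the box plots have different sample sizes, assign each sample to a Series, collect the Series in a dict and convert the dict to a DataFrame. Index positions missing from a shorter Series become NaN, and the pandas box plot methods drop the NaN values automatically.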

s1 = pd.Series(list1)
s2 = pd.Series(list2)
s3 = pd.Series(list3)
s4 = pd.Series(list4)
result = {"a":s1,"b":s2,"c":s3,"d":s4}
result = pd.DataFrame(result)
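result.plot(kind="box")  # either pandas call works and skips NaN
result.boxplot()
plt.boxplot([result[c].dropna() for c in result.columns])  # plt.boxplot does not skip NaN itself, so drop it per column first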
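
The NaN padding described above is easy to see on small data. This is a sketch with hypothetical lists standing in for list1 to list4, which the original post does not show.

```python
import pandas as pd

# Hypothetical samples of different lengths
samples = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6], "c": [10, 20]}

# Each list becomes a Series; pandas aligns them on the index and pads with NaN
frame = pd.DataFrame({name: pd.Series(vals) for name, vals in samples.items()})
print(frame)

# Dropping NaN per column recovers the original samples, which is what each box summarizes
per_column = [frame[col].dropna().to_numpy() for col in frame.columns]
```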

6. Adjusting colors and adding text

Color codes: color code table (link in the original post)

fig, ax = plt.subplots(facecolor='darkseagreen')
data = [[5,25,50,20],[4,23,51,17],[6,22,52,19]]
X = np.arange(4)

plt.bar(X, data[0], color='darkorange', width=0.25, label='A')
plt.bar(X, data[1], color = 'steelblue', width = 0.25,bottom = data[0],label = 'B')
plt.bar(X, data[2], color = 'violet', width = 0.25,bottom = np.array(data[0]) + np.array(data[1]),label = 'C')
ax.set_title("Figure 1")
plt.legend()

Adding text:
plt.text(0

W = [0.00, 0.25, 0.50]
for i in range(3):
    for a, b in zip(X + W[i], data[i]):
        plt.text(a, b, "%.0f" % b, ha="center", va="bottom")  # a is the x position, b the y position; "%.0f" % b formats the displayed number
plt.xlabel("Group")
plt.ylabel("Num")
plt.text(0.0, 48, "TEXT")

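When visualizing data, text in the figure is often used to annotate features of the plot. The annotate() method makes this easy. Two coordinates matter when using annotate: the point being annotated, xy=(x, y), and the position of the text, xytext=(x, y). The official reference documentation and another user's write-up are linked in the original post.
plt.annotate()
s: str, the annotation text (the parameter is named text in newer Matplotlib releases)
xy: (float, float), the coordinates of the point the arrow points to
xytext: (float, float), the coordinates of the annotation text
weight: str or int, the font weight; the string options, from lightest to heaviest, are {'ultralight', 'light', 'normal', 'regular', 'book', 'medium', 'roman', 'semibold', 'demibold', 'demi', 'bold', 'heavy', 'extra bold', 'black'}
color: str or tuple, the font color; single-character options are {'b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'}, names such as 'black' and 'red' also work, and a tuple holds floats in [0, 1] as RGB or RGBA, e.g. (0.1, 0.2, 0.5) or (0.1, 0.2, 0.5, 0.3)
arrowprops: dict, the settings of the arrow. Its keys include:
arrowstyle: the arrow style, e.g. '->', '|-|', '-|>', or names such as 'simple' and 'fancy'; see the official documentation for details.
connectionstyle: the arrow shape, straight or curved; options include 'arc3', 'arc', 'angle', 'angle3'. A curved arrow can avoid being hidden by the plotted curve.
color: the arrow color, same values as the color parameter above.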


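X = np.linspace(0, 2*np.pi, 100)  # evenly spaced sample points
Y = np.sin(X)
Y1 = np.cos(X)

plt.plot(X, Y)
plt.plot(X, Y1)
plt.annotate('Points', xy=(1, np.sin(1)), xytext=(2, 0.5), fontsize=16, arrowprops=dict(arrowstyle="->"))
plt.title("这是一副测试图！")  # Chinese title ("This is a test figure!"); it needs a font with Chinese glyphs to render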
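
The parameter list above also covers weight, color and connectionstyle, which the example does not use. Here is a minimal sketch combining them; the annotated point, text position and curvature value are arbitrary choices, and the Agg backend line is an assumption so the code runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

X = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(X, np.sin(X))

note = ax.annotate(
    "peak",
    xy=(np.pi / 2, 1.0),        # the point being annotated
    xytext=(3.0, 0.8),          # where the text is placed
    weight="bold",              # font weight
    color="r",                  # font color
    arrowprops=dict(
        arrowstyle="->",
        connectionstyle="arc3,rad=0.3",  # curved arrow; rad controls the bend
        color="k",                        # arrow color
    ),
)
```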

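plt.rc("font", family="SimHei", size=15)  # use a font with Chinese glyphs so Chinese text displays; LaTeX-style fonts also look good
plt.rcParams["axes.unicode_minus"] = False  # render the minus sign correctly with this font
plt.rcParams['agg.path.chunksize'] = 10000  # draw long paths in chunks; adjust the value when very large line plots fail to render
plt.figure(figsize=(12, 8))  # adjust the figure size in Jupyter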

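7. Subplots: several plots on one canvas
In the code below, plt.subplot(211) splits the canvas into 2 rows and 1 column and selects the first block for drawing; plt.subplot(212) selects the second.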

plt.figure(figsize=(15, 8))
plt.subplot(211)
plt.plot(vir)
plt.subplot(212)
plt.plot(vir)

For the parameters, see the link in the original post.

%pylab inline
pylab.rcParams['figure.figsize'] = (10, 6)  # adjust the figure size

# x and n_bins are not defined in the original post; the two values below are assumed sample data
n_bins = 10
x = np.random.randn(1000, 3)

fig, axes = plt.subplots(nrows=2, ncols=2, facecolor='darkslategray', figsize=(18, 12))  # split the canvas into 2 rows and 2 columns
ax0, ax1, ax2, ax3 = axes.flatten()  # unpack the 2x2 array of Axes; an Axes can also be selected by index
# The two lines above set up the base figure. Each Axes is then drawn on directly, for example:
# ax0.plot(df["age"])
# ax0.set_title("minzgi")
# ax0.set_xticks([10, 20, 30])

colors = ['red', 'tan', 'lime']
ax0.hist(x, n_bins, density=True, histtype='bar', color=colors, label=colors)  # density= replaces the removed normed= argument
ax0.legend(prop={'size': 10})
ax0.set_title('bars with legend')

ax1.hist(x, n_bins, density=True, histtype='bar', stacked=True)
ax1.set_title('stacked bar')

ax2.hist(x, n_bins, histtype='step', stacked=True, fill=False)
ax2.set_title('stack step (unfilled)')

# Make a multiple-histogram of data-sets with different length.
x_multi = [np.random.randn(n) for n in [10000, 5000, 2000]]
ax3.hist(x_multi, n_bins, histtype='bar')
ax3.set_title('different sample sizes')

fig.tight_layout()  # adjust subplot parameters to give specified padding, for a cleaner layout
plt.show()

Drawing subplots in a loop

fig, axes = plt.subplots(ncols=3, nrows=6, figsize=(20, 40))
for index, ax in enumerate(axes.flatten()):
    ax.plot(df["JTBefore"], df.iloc[:, index], ".")
    ax.set_xlabel("radial runout value", fontsize=16)
    ax.set_ylabel("normalized value", fontsize=16)
    ax.set_xticks([i for i in range(10)])  # tick positions on the x axis
    ax.set_xticklabels([i for i in range(10)])  # rename the x ticks
    ax.grid(axis="x", linestyle="--")
    ax.set_title(df.columns[index] + "_correlation: %s" % (df["JTBefore"].corr(df.iloc[:, index])), fontsize=16)
plt.savefig(r"C:\Users\Administrator\Desktop\a.png")

8. Sharing the x axis or y axis

# ShareX or ShareY
N_points = 100000
n_bins = 20

# Generate a normal distribution, center at x=0 and y=5
x = np.random.randn(N_points)
y = .4 * x + np.random.randn(100000) + 5

fig, axs = plt.subplots(1, 2, sharey=True, tight_layout=True)   # sharey shares the y axis; tight_layout makes the layout more compact

# We can set the number of bins with the `bins` kwarg
axs[0].hist(x, bins=n_bins)
axs[1].hist(y, bins=n_bins)

9. Pandas API

df.plot.scatter(x = "height",y = "weight",c = "born")
or
df.plot(kind="scatter",x="180",y="77",c="1918")

df['birth_state'].value_counts()[:50].plot.barh()

grouped = df.groupby("birth_state")
gs = grouped.size()
gs[gs >=10].sort_values().plot.bar()

df[['height','weight']].plot.hist()

df4 = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
df4.plot.hist(alpha=0.5)

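Histogram parameters:
normed: whether to convert the counts into frequencies (newer Matplotlib releases use density instead);
cumulative: whether to accumulate the counts or frequencies.
See the link in the original post.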
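
A short sketch of what these two options change, using the values returned by Matplotlib's hist. It uses density= (the current name of the normalization option); the random sample and the Agg backend line are assumptions for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
sample = rng.normal(size=1000)

fig, (ax_count, ax_density, ax_cum) = plt.subplots(1, 3, figsize=(12, 3))

# Plain counts: the bar heights add up to the sample size
counts, edges, _ = ax_count.hist(sample, bins=20)

# density=True: the bar areas (height * bin width) add up to 1
density, _, _ = ax_density.hist(sample, bins=20, density=True)

# density=True with cumulative=True: a running total that ends at 1
cum_freq, _, _ = ax_cum.hist(sample, bins=20, density=True, cumulative=True)
```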

df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.plot(kind = "box")

Drawing several plots on one set of axes with pandas.DataFrame.plot:


ax = du_offer.plot(x='max_load', y='w0', legend='w0')
du_offer.plot(x='max_load', y='w1', legend='w1', title=du, ax=ax)

.plot() parameters: see the link in the original post.

10. seaborn

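distplot(): the distribution of a single continuous variable

import seaborn as sns
sns.set()   # global settings

tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")   # load the data

sns.distplot(iris.sepal_length)   # histogram of one variable; distplot is deprecated in newer seaborn releases in favor of histplot/displot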

In a notebook:

! followed by a command    # runs that line in the terminal (shell) instead of Python

# compare several variables in one figure
sns.distplot(iris.sepal_length,bins = 20,kde = False)
sns.distplot(iris.sepal_width,bins = 20,kde = False)

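.jointplot(): the relationship between two variables

# the result is a scatter plot plus a histogram of each variable
sns.jointplot(x="sepal_length", y="sepal_width", data=iris)  # three panels: the joint relationship and the two marginal histograms
or
sns.jointplot(x=tips["total_bill"], y=tips["tip"])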

.pairplot(): given a dataset, plots the pairwise relationships between all of its features

sns.pairplot(iris)

Categorical (discrete) variables

sns.stripplot(x="day", y="total_bill", data=tips)

sns.stripplot(x="day", y="total_bill", data=tips, jitter=True, hue="smoker")   # jitter=True spreads out points that would pile up on a line; hue= groups the points by that column

Code from work

result_data = pd.DataFrame()
value_mean = []
pur_data = [sv_qian_3, sv_hou_3, sv_qian_4, sv_hou_4, sv_qian_16, sv_hou_16]
for i in pur_data:
    result_data = pd.concat([result_data, i])  # DataFrame.append was removed in pandas 2.0
    value_mean.append(np.mean(i).values[0])
plt.figure(figsize=(18, 12))
plt.rc("font", family="SimHei", size=18)
sns.stripplot(x="name", y="SV", data=result_data, jitter=True)
for i, value in enumerate(value_mean):
    plt.annotate("mean: %d" % int(value), xy=(i, 5), fontsize=15)
# value_mean
plt.ylabel("XX value", fontsize=18)
plt.xlabel("XXX", fontsize=18)

# points do not overlap
sns.swarmplot(x="day", y="total_bill", data=tips)  # the points never overlap

sns.barplot(x="tip", y="day", hue="smoker", data=tips, estimator=np.median)  # the bar length is the mean by default; estimator changes the statistic

Linear relationships
Showing the relationship between two continuous variables

anscombe = sns.load_dataset("anscombe")  # dataset used by the examples below

sns.lmplot(x="total_bill", y="tip", data=tips)   # pass two continuous variables

sns.lmplot(x="x", y="y", data=anscombe.query("dataset == 'I'"), scatter_kws={"s": 80})
# data=anscombe.query("dataset == 'I'") selects subset "I"; scatter_kws sets the point size

sns.lmplot(x="x", y="y", data=anscombe.query("dataset == 'II'"), ci=None, scatter_kws={"s": 80})     # ci=None hides the confidence interval

sns.lmplot(x="x", y="y", data=anscombe.query("dataset == 'II'"), order=2, ci=None, scatter_kws={"s": 80})     # order=2 fits up to the quadratic term, y = w*x + b + w1*x**2

sns.lmplot(x="x", y="y", data=anscombe.query("dataset == 'III'"), robust=True, ci=None, scatter_kws={"s": 80})   # robust=True keeps outliers from dominating the fit

sns.lmplot(x="total_bill", y="tip", hue="smoker", data=tips)
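
What order=2 buys can be seen without seaborn. The sketch below uses numpy.polyfit on made-up, exactly quadratic data and compares a straight-line fit with a second-order fit; it illustrates the idea of the polynomial term and is not a reproduction of seaborn's internals.

```python
import numpy as np

x = np.linspace(-3, 3, 25)
y = 0.5 * x**2 - 1.0 * x + 2.0           # hypothetical data, exactly quadratic

line_coeffs = np.polyfit(x, y, deg=1)    # y ~ w*x + b
quad_coeffs = np.polyfit(x, y, deg=2)    # y ~ w1*x**2 + w*x + b

# Largest absolute residual of each fit
line_error = np.abs(np.polyval(line_coeffs, x) - y).max()
quad_error = np.abs(np.polyval(quad_coeffs, x) - y).max()
print(line_error, quad_error)
```

The straight line cannot follow the curvature, so its residual stays large, while the second-order fit recovers the coefficients.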

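For using legends, see:
Using legends (link in the original post)
The legend position is set with the parameter below; a coordinate greater than 1 places the legend outside the plot area: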
plt.legend(bbox_to_anchor=(1.05,1.04))
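
A fuller sketch of placing the legend outside the plot. The loc argument and the commented savefig line are additions for illustration: loc names the corner of the legend box that is pinned to the anchor point, and bbox_inches="tight" keeps the outside legend from being cropped when saving.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="A")
ax.plot([0, 1, 2], [0, 2, 3], label="B")

# (1.05, 1.0) is in axes coordinates: x > 1 lies to the right of the plot area
legend = ax.legend(bbox_to_anchor=(1.05, 1.0), loc="upper left")

# fig.savefig("legend_outside.png", bbox_inches="tight")  # hypothetical file name
```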

Drawing legend() with multiple subplots

ax[0][1].scatter(np.random.choice(1000, len(df2)), df2['Tw'], s=0.5, c='blue', label='blue: non-alarm data')  # give the label when plotting
handles01, labels01 = ax[0][1].get_legend_handles_labels()
ax[0][1].legend(handles01, labels01, loc='upper left')

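df.plot() parameters in detail:
see the link in the original post

For matplotlib commands and formats, see:
configuration files and parameter settings (link in the original post)

Saving a figure to a file:
plt.savefig()

plt.savefig("figpth.png", dpi=400, bbox_inches="tight")  # dpi is the resolution; bbox_inches="tight" trims the blank margin around the figure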

The rc method:

Plotting several legend groups in one figure (note: use plt, not df.plot):

plt.scatter(data=df_ceshi[df_ceshi["标签"] == 1], x="temp", y="wensheng")
plt.scatter(data=df_ceshi[df_ceshi["标签"] == 2], x="temp", y="wensheng")
plt.legend(["1", "2"])
plt.show()

or

for name, group in grouped:
    print(group)
    print(type(group))
    plt.scatter(data=group, x="temp", y="wensheng")
# plt.legend(["1","2"])
plt.show()

Adding vertical and horizontal lines:

plt.axvline(x=90, ls="-", c='red')  # vertical line
plt.axhline(y=15, ls="-", c="yellow")  # horizontal line

Adding vertical and horizontal shaded bands:

plt.axvspan(xmin=1, xmax=10, facecolor="b", alpha=0.3)  # vertical band
plt.axhspan(ymin=1, ymax=10, facecolor="b", alpha=0.3)  # horizontal band

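Changing the tick positions and tick labels of an axis (the x axis is used as the example):
The main function is plt.xticks().

plt.scatter(data=df, x="Unnamed: 0", y="temp_1")
plt.scatter(data=df, x="Unnamed: 0", y="temp_2")
plt.legend(["temp1", "temp2"])
plt.xticks(ticks=[2, 4, 6], labels=["low speed", "medium speed", "high speed"])  # ticks: positions to mark; labels: text shown at those positions

Note that ticks are positions in data coordinates. Here the x column is an integer index, so integer positions such as range(x, y) are a natural choice.
Do not confuse this with plt.xlabel(), which sets the label of the x axis.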
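
A small sketch of the difference between tick labels and the axis label, written with the Axes methods that correspond to plt.xticks() and plt.xlabel(). The data and label texts are made up; the tick positions are deliberately non-integer to show that any data coordinate works.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 2.5, 5, 7.5], [1, 3, 2, 4])

ax.set_xticks([2.5, 5, 7.5])                    # where the ticks sit, in data coordinates
ax.set_xticklabels(["low", "medium", "high"])   # the text shown at each tick
ax.set_xlabel("speed")                          # the single label of the whole axis

tick_texts = [t.get_text() for t in ax.get_xticklabels()]
```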

Code from work:

"""
@FILE   : Feature_count_plot.py
@Modify Time:2020/6/16
@Description:
Plot each feature, and reduce the features with PCA
"""
import os

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

plt.rc("font", family="SimHei", size=20)
plt.rcParams["axes.unicode_minus"] = False


def exists(path):
    # create the directory if it does not exist yet
    if not os.path.exists(path):
        os.makedirs(path)


def feature_plot(path, df):
    # plot each feature directly
    plt.figure(figsize=(18, 12))
    df = df.iloc[:, 5:-1]  # feature columns only
    # df = df.applymap(lambda x:round(x,2))
    # for frequency-domain plots the data format is changed to complex
    for columns, val in df.items():  # iteritems() was removed in pandas 2.0
        plt.plot(val)
        plt.title(columns)
        exists(os.path.dirname(path) + "\\" + "All_plot")
        plt.savefig(os.path.dirname(path) + "\\" + "All_plot" + "\\" + columns + ".png")
        plt.clf()


def data_pca(path, df):
    # reduce the multi-dimensional features, then plot
    # df = df.applymap(lambda x:round(x,2))
    x_data = df.iloc[:, 5:-2]
    estimator = PCA(n_components=2)
    X_pca = estimator.fit_transform(x_data)
    plt.plot(X_pca[:, 0])
    plt.title("principal component 1")
    exists(os.path.dirname(path) + "\\" + "Pca_plot")
    plt.savefig(os.path.dirname(path) + "\\" + "Pca_plot" + "\\" + "feature_1.png")
    plt.clf()
    plt.plot(X_pca[:, 1])
    plt.title("principal component 2")
    plt.savefig(os.path.dirname(path) + "\\" + "Pca_plot" + "\\" + "feature_2.png")


def All_plot(df):
    # several plots in one figure
    # df = pd.read_csv(r"E:\轴承特征指标数据-程序\太北SS4 0276A 11\result\time\result.csv")
    df = df.iloc[:, 5:-1]
    fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(18, 12))
    ax = axes.flatten()
    i = 0
    for columns, val in df.items():
        ax[i].plot(val)
        ax[i].set_title(columns)
        #     ax[i].set_xticks([4000,8000,12000])
        #     ax[i].set_xticklabels(["low speed", "medium speed", "high speed"])
        i += 1
    fig.tight_layout()
    plt.savefig(r"C:\Users\yunda\Desktop\a.png")
    plt.suptitle('Title', x=0.5, y=0.5)  # add a figure title; x= and y= set its position


if __name__ == "__main__":
    path = r"E:\轴承特征指标数据-程序\太北HXD1 1433B 35\result\fre\result.csv"
    df = pd.read_csv(path)
    feature_plot(path, df)
    data_pca(path, df)
    # All_plot(path,df)


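Drawing different data in different figures and saving each of them:
Give each figure a name with plt.figure(num="fig") when creating it. To work on a particular figure later, call plt.figure(num="fig") with the same name just before the operation.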

ex1:
plt.figure(num="fig1")
plt.plot(df["Temp_1"])
plt.figure(num="fig2")
plt.plot(df["Temp_2"])

plt.figure(num="fig2")
plt.savefig("fig2.png")  # only figure "fig2" is saved here; the file name is a placeholder

ex2:
plt.figure(num="fig1")
plt.figure(num="fig2")
for i in [1, 2]:
    plt.figure(num="fig1")
    plt.plot(df["Temp_" + str(i)])
for i in [3, 4]:
    plt.figure(num="fig2")
    plt.plot(df["Temp_" + str(i)])
plt.figure(num="fig1")
plt.savefig(r"C:\Users\yunda\Desktop\1.png")
plt.figure(num="fig2")
plt.savefig(r"C:\Users\yunda\Desktop\2.png")
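
The behavior the two examples rely on can be checked directly: calling plt.figure with an existing name returns the same Figure object and makes it current, instead of creating a new one. A minimal sketch with made-up data (the Agg backend line is an assumption so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig1 = plt.figure(num="fig1")
plt.plot([1, 2, 3])

fig2 = plt.figure(num="fig2")
plt.plot([3, 2, 1])
plt.plot([1, 1, 1])

same_fig1 = plt.figure(num="fig1")   # re-activates "fig1"; no new figure is created
plt.title("first")                   # goes to fig1, the current figure

# Number of plotted lines per named figure
lines_per_figure = {f.get_label(): len(f.axes[0].lines) for f in (fig1, fig2)}
plt.close("all")
```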

Clearing the canvas:

Use clf() to clear figure.
Use cla() to clear axes.
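
A brief sketch of the difference, under the assumption of a figure with two subplots: cla() empties one Axes and leaves the rest of the figure alone, while clf() removes every Axes from the figure.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, (left, right) = plt.subplots(1, 2)
left.plot([1, 2, 3])
right.plot([3, 2, 1])

left.cla()                                        # clear only the left Axes
lines_after_cla = (len(left.lines), len(right.lines))
axes_after_cla = len(fig.axes)                    # both Axes still exist

fig.clf()                                         # clear the whole figure
axes_after_clf = len(fig.axes)
plt.close(fig)
```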

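Removing the white margins of a matplotlib figure (the blank area around the axes):
Method 1:

self.fig.tight_layout(pad=1.5)  # when the Figure is stored on an object
plt.tight_layout(pad=1.5)  # pyplot equivalent for the current figure

Method 2: use subplots_adjust()

plt.subplots_adjust(left=0.06, right=0.97, top=0.94, bottom=0.08)  # positions of the subplot edges, as fractions of the figure size

Setting the rotation of an axis label:
plt.xlabel("K", rotation=0)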

Plotting in a pop-up window (zoomable):

# plotting in a pop-up window
# note: is_pyqt5 and the qt4/qt5 backend split belong to older Matplotlib releases;
# newer releases provide matplotlib.backends.backend_qtagg instead
import sys
import time

import numpy as np

from matplotlib.backends.qt_compat import QtCore, QtWidgets, is_pyqt5
if is_pyqt5():
    from matplotlib.backends.backend_qt5agg import (
        FigureCanvas, NavigationToolbar2QT as NavigationToolbar)
else:
    from matplotlib.backends.backend_qt4agg import (
        FigureCanvas, NavigationToolbar2QT as NavigationToolbar)
from matplotlib.figure import Figure


class ApplicationWindow(QtWidgets.QMainWindow):
    def __init__(self):
        super().__init__()
        self._main = QtWidgets.QWidget()
        self.setCentralWidget(self._main)
        layout = QtWidgets.QVBoxLayout(self._main)

        static_canvas = FigureCanvas(Figure(figsize=(5, 3)))
        layout.addWidget(static_canvas)
        self.addToolBar(NavigationToolbar(static_canvas, self))

#         dynamic_canvas = FigureCanvas(Figure(figsize=(5, 3)))
#         layout.addWidget(dynamic_canvas)
#         self.addToolBar(QtCore.Qt.BottomToolBarArea,
#                         NavigationToolbar(dynamic_canvas, self))

#         self._static_ax = static_canvas.figure.subplots()
#         t = np.linspace(0, 10, 501)
#         self._static_ax.plot(t, np.tan(t), ".")

#         self._dynamic_ax = dynamic_canvas.figure.subplots()
#         self._timer = dynamic_canvas.new_timer(
#             100, [(self._update_canvas, (), {})])
#         self._timer.start()

#     def _update_canvas(self):
#         self._dynamic_ax.clear()
#         t = np.linspace(0, 10, 101)
#         # Shift the sinusoid as a function of time.
#         self._dynamic_ax.plot(t, np.sin(t + time.time()))
#         self._dynamic_ax.figure.canvas.draw()


if __name__ == "__main__":
    qapp = QtWidgets.QApplication(sys.argv)
    app = ApplicationWindow()
    app.show()
    qapp.exec_()

Adding a secondary axis in matplotlib:

fig, ax1 = plt.subplots(figsize=(18, 12))
ax2 = ax1.twinx()
ax1.plot(vir_g, color='b')
ax2.plot(speed, color='g')
ax2.set_yticks([-800, 10], [' ', " "])
ax1.set_yticks([-40, -20, 0, 20, 40, 60, 80, 100], [""] * 8)  # one (empty) label per tick
for index, i in enumerate(result_index):
    plt.axvline(x=i*4096, ls="--", c='red')
    plt.annotate(dis["station"][index], xy=(i*4096-200, 80), fontsize=16, rotation=90)

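Adding grid lines to the plot background
Use matplotlib's plt.grid(linestyle='-.').

Displaying an axis in reverse order
Set plt.xlim(6000, 0), passing the larger limit first; the y axis works the same way with plt.ylim.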
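
A short sketch of both ways to reverse an axis: passing the limits in descending order, as described above, or calling invert_yaxis(), a Matplotlib Axes method not mentioned in the original post. The data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 2000, 4000, 6000], [1, 2, 3, 4])

ax.set_xlim(6000, 0)               # larger limit first: the x axis now runs from 6000 down to 0
x_reversed = ax.xaxis_inverted()

ax.invert_yaxis()                  # flips the y axis without naming explicit limits
y_reversed = ax.yaxis_inverted()
```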

Drawing a heat map
There are two main ways, using either matplotlib or seaborn (related link in the original post).
Method 1:

import matplotlib.pyplot as plt
from matplotlib import font_manager
import numpy as np
np.random.seed(30)
data = np.random.randint(70, 100, (30, 8))   # the input is a two-dimensional array
plt.imshow(data)
plt.xticks(range(0, 8), ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])
plt.yticks(range(0, 30), np.array(range(1, 31), dtype='U3'))
# show the color bar
plt.colorbar()
# Chinese title: "Heat map of indicators A-H for 30 products"; the font file supplies the Chinese glyphs
plt.title('30个产品的ABCDEFGH指标热力图', fontsize=25, color='#0033cc', fontproperties=font_manager.FontProperties(fname="STKAITI.TTF"))
plt.show()

Method 2:

plt.figure(dpi=120)
# sns.heatmap(data=result["车体垂向平稳指标"])
# cmap=plt.get_cmap('Greens')
# tt = pd.DataFrame(result_lis,index=["1～3Hz","3～6Hz","6～10Hz","10～20Hz"])
tt = pd.DataFrame(result_lis, index=["10～20Hz", "6～10Hz", "3～6Hz", "1～3Hz"])
sns.heatmap(data=tt, cmap=plt.get_cmap('Greens'))
# sns.heatmap(data=mm,cmap=sns.cubehelix_palette(as_cmap=True))  # a different color style
plt.xticks(range(5, 50, 5), [str(11000+i*100) for i in range(11)])
# plt.yticks(range(0, 4),["1～3Hz","3～6Hz","6～10Hz","10～20Hz"])
plt.yticks(rotation=0)
plt.title("test", fontsize=14)
plt.xlabel("test", fontsize=14)
plt.ylabel("test", fontsize=14)

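Moving the axes so that the x and y axes cross at 0
The main function is plt.gca() (gca stands for "get current axes").
The four spines of the plot, "bottom", "top", "left" and "right", can then be moved.
Typical usage: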

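ax = plt.gca()
ax.xaxis.set_ticks_position('bottom')   # pin the x ticks to the bottom spine before moving it; in practice this line can be omitted
ax.spines['bottom'].set_position(('data', 0))  # move the x axis to y = 0; 'data' means the 0 is in data coordinates
ax.spines['left'].set_position(('data', 0))  # move the y axis to x = 0

ax.spines['top'].set_color('none')  # make the top spine invisible
ax.spines['right'].set_color('none')  # make the right spine invisible

ax.spines['top'].set_visible(False)  # this also removes the frame line
ax.spines['right'].set_visible(False)

See also the link in the original post.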

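Custom legends
At work I drew two plt.plot() series in one figure (on twin axes), but plt.legend() showed only the first series by default, and setting the labels manually did not help either.
It turned out that the objects returned by the plot calls have to be captured in variables and passed to the legend, as in the following code:

fig, ax1 = plt.subplots(figsize=(18, 12))
ax2 = ax1.twinx()                            # share the x axis

m1, = ax1.plot(df["公里标"], df["rms"], "blue")
m2, = ax2.plot(df["公里标"], df["speed"], "red")

plt.legend([m1, m2], ["RMS", "speed"])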

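seaborn bar chart
At work I needed to show statistics for each locomotive depot in sorted order; sns.barplot() can do this.

sns.barplot(x=x, y=y, data=data, order=["xx"], orient="h")  # order lists the categories in display order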
