一、pandas

1、Common pandas data types

(1) Series: one-dimensional labeled array (the labels are the index); you can also supply your own index.
(2) DataFrame: two-dimensional; a container of Series.
Notes on selecting rows or columns in pandas:
A number in square brackets selects rows, e.g. df[:20]; a string selects a column label, e.g. df["row_labels"], and the result is a Series because it is a single column; selecting two columns still returns a DataFrame.
pandas also has further, optimized ways of selecting, as sketched below:
df.loc selects rows by index label
df.iloc selects rows by position
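
A minimal sketch contrasting the three selection styles on a small made-up frame:

import pandas as pd
import numpy as np
# hypothetical 4x3 frame just to contrast the selection styles
df=pd.DataFrame(np.arange(12).reshape(4,3),index=list("abcd"),columns=["x","y","z"])
print(df[:2])# bracket + slice: the first two ROWS
print(df["y"])# bracket + string: column "y", a Series
print(df[["x","y"]])# list of column names: still a DataFrame
print(df.loc["b"])# loc: row by LABEL
print(df.iloc[1])# iloc: row by POSITION (the same row as above)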

import pandas as pd
# (1) Series and DataFrame
t=pd.Series([1,13,21,12,3,4],index=list("abcdef"))
# custom index: the number of elements must match the number of labels; the default index starts at 0
print(t)
print(type(t))
# build a Series from a dict: the keys become the index, the values become the data
t1={"name": "xiaoyu","age":30,"tel":10086}
t2=pd.Series(t1)
print(t2)
# changing dtype works the same way as in numpy
print(t.dtype)
t3=t.astype(float)
print(t3.dtype)
# slicing and indexing a Series
# access by index label
print(t2["name"])
print(t["f"])
# access by position; positions start at 0
print(t2[2])
# take contiguous / non-contiguous rows
print("contiguous",t2[:2])
print("non-contiguous",t2[[0,2]])
print(t2[["name","tel"]])
print(t[t>10])
# the index is iterable
for i in t2.index:
    print(i)
print(type(t2.index))
print(len(t2.index))
print(list(t2.index))
print(list(t2.index)[:2])
print(t2.values)
print(type(t2.values))
a     1
b    13
c    21
d    12
e     3
f     4
dtype: int64
<class 'pandas.core.series.Series'>
name    xiaoyu
age         30
tel      10086
dtype: object
int64
float64
xiaoyu
4
10086
contiguous name    xiaoyu
age         30
dtype: object
non-contiguous name    xiaoyu
tel      10086
dtype: object
name    xiaoyu
tel      10086
dtype: object
b    13
c    21
d    12
dtype: int64
name
age
tel
<class 'pandas.core.indexes.base.Index'>
3
['name', 'age', 'tel']
['name', 'age']
['xiaoyu' 30 10086]
<class 'numpy.ndarray'>
import pandas as pd
import numpy as np
# create a DataFrame
t=pd.DataFrame(np.arange(12).reshape(3,4))
print(t)
# index = row labels (vertical, axis 0); columns = column labels (horizontal, axis 1); both default to 0,1,2,...
t1=pd.DataFrame(np.arange(12).reshape(3,4),index=list("abc"),columns=list("wxyz"))
print(t1)
t2={"name":["xiao","yu"],"age":[12,22],"tel":[100,101]}#字典
print(t2,type(t2))
t3=pd.DataFrame(t2)
print(t3,type(t3))
t4=[{"name":"xiao","age":21,"tel":10086},{"name":"xzz","age":24,"tel":10066}]#列表
print(t4,type(t4))
t5=pd.DataFrame(t4)
print(t5,type(t5))
# missing data becomes NaN
t6=[{"name":"xiao","age":21,"tel":10086},{"name":"xzz","age":24},{"age":24,"tel":10086}]
t7=pd.DataFrame(t6)
print(t7)
   0  1   2   3
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
   w  x   y   z
a  0  1   2   3
b  4  5   6   7
c  8  9  10  11
{'name': ['xiao', 'yu'], 'age': [12, 22], 'tel': [100, 101]} <class 'dict'>
   name  age  tel
0  xiao   12  100
1    yu   22  101 <class 'pandas.core.frame.DataFrame'>
[{'name': 'xiao', 'age': 21, 'tel': 10086}, {'name': 'xzz', 'age': 24, 'tel': 10066}] <class 'list'>
   name  age    tel
0  xiao   21  10086
1   xzz   24  10066 <class 'pandas.core.frame.DataFrame'>
   name  age      tel
0  xiao   21  10086.0
1   xzz   24      NaN
2   NaN   24  10086.0



Sorting: df.sort_values(by="column") sorts by the given column; the ascending parameter controls the order (True, the default, is ascending; False is descending).
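
A quick sketch of sort_values on a made-up frame:

import pandas as pd
df=pd.DataFrame({"name":["a","b","c"],"count":[30,10,20]})
print(df.sort_values(by="count"))# ascending=True is the default
print(df.sort_values(by="count",ascending=False))# descending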

import pandas as pd
import numpy as np
t3=pd.DataFrame(np.arange(12).reshape(3,4),index=list("abc"),columns=list("WXYZ"))
print(t3)
print(t3.loc["a","Z"])
print(type(t3.loc["a","Z"]))
print(t3.loc["a"])# take a whole row
print(type(t3.loc["a"]))
print(t3.loc[:,"Y"])# take a whole column
# take multiple rows or columns
print(t3.loc[["a","c"]])
print(t3.loc[["a","c"],:])
print(t3.loc[:,["W","Y"]])
print(t3.loc[["a","b"],["W","Y"]])
print(type(t3.loc[["a","b"],["W","Y"]]))
print(t3.loc["a":"c",["W","Y"]])# label slices with a colon also work, but in loc the slice is closed: row "c" after the colon is included
# iloc selects by position
print(t3.iloc[1])# a row
print(t3.iloc[:,1])# a column
print(t3.iloc[:,[0,2]])# non-contiguous columns
print(t3.iloc[1:,:2])
t3.iloc[1:,:2]=30
print(t3)
t3.iloc[1:,:2]=np.nan# no error here: pandas automatically casts the ints to float, whereas numpy would raise
print(t3)
   W  X   Y   Z
a  0  1   2   3
b  4  5   6   7
c  8  9  10  11
3
<class 'numpy.int32'>
W    0
X    1
Y    2
Z    3
Name: a, dtype: int32
<class 'pandas.core.series.Series'>
a     2
b     6
c    10
Name: Y, dtype: int32
   W  X   Y   Z
a  0  1   2   3
c  8  9  10  11
   W  X   Y   Z
a  0  1   2   3
c  8  9  10  11
   W   Y
a  0   2
b  4   6
c  8  10
   W  Y
a  0  2
b  4  6
<class 'pandas.core.frame.DataFrame'>
   W   Y
a  0   2
b  4   6
c  8  10
W    4
X    5
Y    6
Z    7
Name: b, dtype: int32
a    1
b    5
c    9
Name: X, dtype: int32
   W   Y
a  0   2
b  4   6
c  8  10
   W  X
b  4  5
c  8  9
    W   X   Y   Z
a   0   1   2   3
b  30  30   6   7
c  30  30  10  11
     W    X   Y   Z
a  0.0  1.0   2   3
b  NaN  NaN   6   7
c  NaN  NaN  10  11

2、Reading external data with pandas

Read a csv file: pd.read_csv("./dogName.csv")
Read from a database: pd.read_sql(sql_sentence, connection)
Read from MongoDB:
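
The MongoDB line was left blank above; a minimal sketch, assuming a local MongoDB instance and a made-up database/collection name, would query with pymongo and build the DataFrame from the returned documents:

import pandas as pd
from pymongo import MongoClient
# assumptions: MongoDB on localhost, database "test", collection "dog_names"
client=MongoClient("mongodb://localhost:27017")
collection=client["test"]["dog_names"]
data=list(collection.find({},{"_id":0}))# fetch the documents, dropping Mongo's _id field
df=pd.DataFrame(data)
print(df.head())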

3、Indexing and handling missing values in pandas

(1) Boolean indexing in pandas
df[df["count_aname"]>800]  # rows where count_aname is greater than 800
df[1000>df["count_aname"]>800] does not work and raises an error; write it instead as df[(df["count_aname"]>800)&(df["count_aname"]<1000)]
With multiple conditions like this, wrap each condition in parentheses and join them with & (and) or | (or), as sketched below.
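
A runnable sketch of that rule, with a made-up count_aname column:

import pandas as pd
df=pd.DataFrame({"name":["a","b","c","d"],"count_aname":[500,900,1200,950]})
print(df[df["count_aname"]>800])# single condition
print(df[(df["count_aname"]>800)&(df["count_aname"]<1000)])# each condition in parentheses, joined with &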
(2) pandas string methods

import pandas as pd
import numpy as np
t2={"name":["xiao","yu"],"age":[12,22],"tel":[100,101],"info":["王伟/潘粤明/肖战","肖战/王一博/李现"]}
df=pd.DataFrame(t2)
print(df.head(1))
print(df["info"].str.split("/"))# a Series whose elements are lists
print(df["info"].str.split("/").tolist())# tolist converts the Series into a list
'''
[['王伟', '潘粤明', '肖战'], ['肖战', '王一博', '李现']]
one big list in which each element is itself a list
'''
   name  age  tel       info
0  xiao   12  100  王伟/潘粤明/肖战
0    [王伟, 潘粤明, 肖战]
1    [肖战, 王一博, 李现]
Name: info, dtype: object
[['王伟', '潘粤明', '肖战'], ['肖战', '王一博', '李现']]

(3) Handling missing values in pandas

import pandas as pd
import numpy as np
# handling missing values
t3=pd.DataFrame(np.arange(12).reshape(3,4),index=list("abc"),columns=list("WXYZ"))
t3.iloc[1:,:2]=np.nan
print(t3)
print(pd.isnull(t3))# which cells are NaN
print(pd.notnull(t3))# which cells are not NaN; the exact opposite of isnull
# boolean indexing
print(t3[pd.notnull(t3["W"])])
print(pd.notnull(t3["W"]))
# how="all": drop a row only if ALL of its values are NaN
print(t3.dropna(axis=0,how="all"))
# drop rows containing NaN; the default how="any" drops a row as soon as it has a single NaN
print(t3.dropna(axis=0))
print(t3)
# inplace=True modifies in place; the drops above leave the original t3 unchanged unless you reassign it
# t3=t3.dropna(axis=0) is equivalent to the in-place t3.dropna(axis=0,how="any",inplace=True)
# t3.dropna(axis=0,how="any",inplace=True)
# print(t3)
# fill NaN with the mean
print(t3["Y"].mean())
t4=t3.fillna(t3["Y"].mean())
print(t4)
t2={"name":["xiao","yu"],"age":[12,22],"tel":[100,101],"info":["王伟/潘粤明/肖战","肖战/王一博/李现"]}
t2=pd.DataFrame(t2)
t2["age"][1]=np.nan
print(t2)
print(t2["age"].mean())# mean
print(t2["age"].median())# median
     W    X   Y   Z
a  0.0  1.0   2   3
b  NaN  NaN   6   7
c  NaN  NaN  10  11
       W      X      Y      Z
a  False  False  False  False
b   True   True  False  False
c   True   True  False  False
       W      X     Y     Z
a   True   True  True  True
b  False  False  True  True
c  False  False  True  True
     W    X  Y  Z
a  0.0  1.0  2  3
a     True
b    False
c    False
Name: W, dtype: bool
     W    X   Y   Z
a  0.0  1.0   2   3
b  NaN  NaN   6   7
c  NaN  NaN  10  11
     W    X  Y  Z
a  0.0  1.0  2  3
     W    X   Y   Z
a  0.0  1.0   2   3
b  NaN  NaN   6   7
c  NaN  NaN  10  11
6.0
     W    X   Y   Z
a  0.0  1.0   2   3
b  6.0  6.0   6   7
c  6.0  6.0  10  11
   name   age  tel       info
0  xiao  12.0  100  王伟/潘粤明/肖战
1    yu   NaN  101  肖战/王一博/李现
12.0
12.0

4、Common pandas statistical methods

Data source: https://www.kaggle.com/damianpanek/sunday-eda/data?select=IMDB-Movie-Data.csv

import pandas as pd
import numpy as np
# the 1000 most popular movies from 2006-2016: average rating, number of directors, etc.
# read the data
df=pd.read_csv("datasets_1474_2639_IMDB-Movie-Data.csv")
# print(df)
# print(df.info())
print(df.head(1))
# average movie rating
print(df["Rating"].mean())
# number of distinct directors
# print(df["Director"].tolist())
print(len(set(df["Director"].tolist())))
# print(set(df["Director"].tolist()))
# unique() achieves the same thing
print(len(df["Director"].unique()))
# print(df["Director"].unique())
# number of distinct actors
print("raw data")
print(df["Actors"].tolist())
print("data split on ,")
temp_actors_list=df["Actors"].str.split(",").tolist()
print(temp_actors_list)
# flatten the nested lists
actors_list=[i for j in temp_actors_list for i in j]
# np.array(temp_actors_list).flatten() would not work: the sublists are ragged, so flatten() has no effect
# print(actors_list)
actors_num=len(set(actors_list))
print(actors_num)
print(df["Runtime (Minutes)"].max())
print(df["Runtime (Minutes)"].min())
print(df["Runtime (Minutes)"].argmax())
print(df["Runtime (Minutes)"].argmin())

import pandas as pd
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
# count the genres: a movie can belong to several genres, so how do we count them?
# idea: build an all-zeros frame whose columns are the genre names; whenever a genre appears in a row, flip that 0 to 1
file_path="./datasets_1474_2639_IMDB-Movie-Data.csv"
df=pd.read_csv(file_path)
print(df.head(1))
print(df["Genre"])
# 1000 rows, one column per genre
temp_list=df["Genre"].str.split(",").tolist()# a list of lists: [[],[],[]]
print(temp_list)
genre_list=list(set([i for j in temp_list for i in j]))# deduplicate with a set
print(genre_list)
# build an all-zeros frame; counting each genre is then just summing each column
zeros_df=pd.DataFrame(np.zeros((df.shape[0],len(genre_list))),columns=genre_list)
print(zeros_df)
# set the matching genre positions to 1 for each movie
for i in range(df.shape[0]):
    zeros_df.loc[i,temp_list[i]]=1  # assign to several columns of one row at once
print(zeros_df.head(5))
# sum each genre column
genre_count=zeros_df.sum(axis=0)
print(genre_count)
# sort
genre_count=genre_count.sort_values()
# plot
plt.figure(figsize=(20,8),dpi=80)
_x=genre_count.index
_y=genre_count.values
plt.bar(range(len(_x)),_y,width=0.4,color="orange")# bar chart
plt.xticks(range(len(_x)),_x)
plt.show()
   Rank                    Title  ... Revenue (Millions) Metascore
0     1  Guardians of the Galaxy  ...             333.13      76.0

[1 rows x 12 columns]
0       Action,Adventure,Sci-Fi
1      Adventure,Mystery,Sci-Fi
2               Horror,Thriller
3       Animation,Comedy,Family
4      Action,Adventure,Fantasy
                 ...
995         Crime,Drama,Mystery
996                      Horror
997         Drama,Music,Romance
998            Adventure,Comedy
999       Comedy,Family,Fantasy
Name: Genre, Length: 1000, dtype: object
[['Action', 'Adventure', 'Sci-Fi'], ['Adventure', 'Mystery', 'Sci-Fi'], ['Horror', 'Thriller'], ['Animation', 'Comedy', 'Family'], ['Action', 'Adventure', 'Fantasy'], ..., ['Drama', 'Music', 'Romance'], ['Adventure', 'Comedy'], ['Comedy', 'Family', 'Fantasy']]
['Mystery', 'Drama', 'Western', 'Horror', 'Romance', 'Animation', 'Crime', 'Fantasy', 'Music', 'Sport', 'Thriller', 'Family', 'War', 'Biography', 'Comedy', 'Adventure', 'Sci-Fi', 'History', 'Musical', 'Action']
     Mystery  Drama  Western  Horror  ...  Sci-Fi  History  Musical  Action
0        0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
1        0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
2        0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
3        0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
4        0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
..       ...    ...      ...     ...  ...     ...      ...      ...     ...
995      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
996      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
997      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
998      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
999      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0

[1000 rows x 20 columns]
   Mystery  Drama  Western  Horror  ...  Sci-Fi  History  Musical  Action
0      0.0    0.0      0.0     0.0  ...     1.0      0.0      0.0     1.0
1      1.0    0.0      0.0     0.0  ...     1.0      0.0      0.0     0.0
2      0.0    0.0      0.0     1.0  ...     0.0      0.0      0.0     0.0
3      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     0.0
4      0.0    0.0      0.0     0.0  ...     0.0      0.0      0.0     1.0

[5 rows x 20 columns]
Mystery      106.0
Drama        513.0
Western        7.0
Horror       119.0
Romance      141.0
Animation     49.0
Crime        150.0
Fantasy      101.0
Music         16.0
Sport         18.0
Thriller     195.0
Family        51.0
War           13.0
Biography     81.0
Comedy       279.0
Adventure    259.0
Sci-Fi       120.0
History       29.0
Musical        5.0
Action       303.0
dtype: float64

5、Merging data in pandas

The join method and the merge method.


import pandas as pd
import numpy as np
# merging data: by default, join combines rows whose row index matches
df1=pd.DataFrame(np.ones((2,4)),index=["A","B"],columns=list("abcd"))
df2=pd.DataFrame(np.zeros((3,3)),index=["A","B","C"],columns=list("xyz"))
print(df1)
print(df2)
dfhb1=df1.join(df2)# the rows follow df1
dfhb2=df2.join(df1)# the rows follow df2
print(dfhb1)
print("..........")
print(dfhb2)
# merge combines on column values
df3=pd.DataFrame(np.zeros((3,3)),columns=list("fax"))
print(df3)
dfhb3=df1.merge(df3,on="a")# merge on column "a"; the default is the intersection, i.e. an inner join
print(dfhb3)
df3=pd.DataFrame(np.arange(9).reshape((3,3)),columns=list("fax"))
print(df3)
dfhb4=df1.merge(df3,on="a")
print(dfhb4)
df1.loc["A","a"]=100
dfhb5=df1.merge(df3,on="a")
print(dfhb5)
dfhb6=df1.merge(df3,on="a",how="inner")# the default: inner join, intersection
print(dfhb6)
dfhb7=df1.merge(df3,on="a",how="outer")# outer join, union
print(dfhb7)
dfhb8=df1.merge(df3,on="a",how="left")# left join, keep all rows of df1
print(dfhb8)
dfhb9=df1.merge(df3,on="a",how="right")# right join, keep all rows of df3
print(dfhb9)
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0
     x    y    z
A  0.0  0.0  0.0
B  0.0  0.0  0.0
C  0.0  0.0  0.0
     a    b    c    d    x    y    z
A  1.0  1.0  1.0  1.0  0.0  0.0  0.0
B  1.0  1.0  1.0  1.0  0.0  0.0  0.0
..........
     x    y    z    a    b    c    d
A  0.0  0.0  0.0  1.0  1.0  1.0  1.0
B  0.0  0.0  0.0  1.0  1.0  1.0  1.0
C  0.0  0.0  0.0  NaN  NaN  NaN  NaN
     f    a    x
0  0.0  0.0  0.0
1  0.0  0.0  0.0
2  0.0  0.0  0.0
Empty DataFrame
Columns: [a, b, c, d, f, x]
Index: []
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
1  1.0  1.0  1.0  1.0  0  2
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
       a    b    c    d    f    x
0  100.0  1.0  1.0  1.0  NaN  NaN
1    1.0  1.0  1.0  1.0  0.0  2.0
2    4.0  NaN  NaN  NaN  3.0  5.0
3    7.0  NaN  NaN  NaN  6.0  8.0
       a    b    c    d    f    x
0  100.0  1.0  1.0  1.0  NaN  NaN
1    1.0  1.0  1.0  1.0  0.0  2.0
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
1  4.0  NaN  NaN  NaN  3  5
2  7.0  NaN  NaN  NaN  6  8

If t1 and t2 have no column name in common, use t1.merge(t2,left_on="a",right_on="b",how="right").
Here left_on names the key column on the left side and right_on the key column on the right side.
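
A small sketch of that case, with made-up frames whose key columns have different names:

import pandas as pd
t1=pd.DataFrame({"a":[1,2,3],"val1":["x","y","z"]})
t2=pd.DataFrame({"b":[2,3,4],"val2":["p","q","r"]})
# no shared column name, so name the key on each side explicitly
print(t1.merge(t2,left_on="a",right_on="b",how="right"))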

6、Grouping and aggregation in pandas

Statistics on Starbucks stores worldwide. Data source: https://www.kaggle.com/starbucks/store-locations/data

Grouping by two keys gives two index levels, i.e. a composite index; see the next section.

import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
file_path="./directory.csv"
df=pd.read_csv(file_path)
print(df.head(1))
print(df.info())
# group
grouped=df.groupby(by="Country")
print(grouped)
print("-"*100)
# DataFrameGroupBy
# it can be iterated; each item is a (key, sub-DataFrame) tuple
# for i,j in grouped:
#     print(i)
#     print("*"*10)
#     print(j,type(j))
#     print("-"*100)
# print the rows where Country is US
print(df[df["Country"]=="US"])
print("-"*100)
# aggregation methods can be called on the groups: count the stores per country
country_count=grouped["Brand"].count()
print(country_count)
print("-"*100)
# store counts for the US and for China
print(country_count["US"])
print(country_count["CN"])
print("-"*100)
# store counts for each Chinese province
china_data=df[df["Country"]=="CN"]# select the China rows
groupbyProvince=china_data.groupby(by="State/Province").count()["Brand"]# take the Brand column
print(groupbyProvince)
print("-"*100)
# group by several keys: returns a Series with two index levels
grouped1=df["Brand"].groupby(by=[df["Country"],df["State/Province"]]).count()
print(grouped1)
print(type(grouped1))
print("-"*100)
# group by several keys, returning a DataFrame
grouped2=df[["Brand"]].groupby(by=[df["Country"],df["State/Province"]]).count()
grouped3=df.groupby(by=[df["Country"],df["State/Province"]])[["Brand"]].count()
grouped4=df.groupby(by=[df["Country"],df["State/Province"]]).count()[["Brand"]]
# the three lines above are equivalent
print(grouped2,type(grouped2))
print("-"*100)
print(grouped3,type(grouped3))
print("-"*100)
print(grouped4,type(grouped4))
       Brand  Store Number     Store Name Ownership Type     Street Address  \
0  Starbucks  47370-257954  Meritxell, 96       Licensed  Av. Meritxell, 96

               City State/Province Country Postcode Phone Number  \
0  Andorra la Vella              7      AD    AD500    376818720

                  Timezone  Longitude  Latitude
0  GMT+1:00 Europe/Andorra       1.53     42.51
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25600 entries, 0 to 25599
Data columns (total 13 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   Brand           25600 non-null  object
 1   Store Number    25600 non-null  object
 2   Store Name      25600 non-null  object
 3   Ownership Type  25600 non-null  object
 4   Street Address  25598 non-null  object
 5   City            25585 non-null  object
 6   State/Province  25600 non-null  object
 7   Country         25600 non-null  object
 8   Postcode        24078 non-null  object
 9   Phone Number    18739 non-null  object
 10  Timezone        25600 non-null  object
 11  Longitude       25599 non-null  float64
 12  Latitude        25599 non-null  float64
dtypes: float64(2), object(11)
memory usage: 2.5+ MB
None
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x0000022CED62E240>
----------------------------------------------------------------------------------------------------
           Brand  Store Number                        Store Name  \
11964  Starbucks   3513-125945           Safeway-Anchorage #1809
11965  Starbucks   74352-84449           Safeway-Anchorage #2628
11966  Starbucks  12449-152385         Safeway - Anchorage #1813
11967  Starbucks  24936-233524          100th & C St - Anchorage
11968  Starbucks    8973-85630              Old Seward & Diamond
...          ...           ...                               ...
25567  Starbucks   74385-87621             Safeway-Laramie #2466
25568  Starbucks   73320-24375          Ridley's - Laramie #1131
25569  Starbucks  22425-219024            Laramie - Grand & 30th
25570  Starbucks  10849-103163      I-80 & Dewar Dr-Rock Springs
25571  Starbucks  10769-102454  Coffeen & Brundage Lane-Sheridan

      Ownership Type                                     Street Address  \
11964       Licensed                               5600 Debarr Rd Ste 9
11965       Licensed                                     1725 Abbott Rd
11966       Licensed                                    1501 Huffman Rd
11967  Company Owned  320 W. 100th Ave, 100, Southgate Shopping Ctr ...
11968  Company Owned                                 1005 E Dimond Blvd
...              ...                                                ...
25567       Licensed                                       554 N 3rd St
25568       Licensed                                      3112 E. Grand
25569  Company Owned                                     3021 Grand Ave
25570  Company Owned                                   118 Westland Way
25571  Company Owned                                   2208 Coffeen Ave

               City State/Province Country   Postcode    Phone Number  \
11964     Anchorage             AK      US  995042300    907-339-0900
11965     Anchorage             AK      US  995073444    907-339-2800
11966     Anchorage             AK      US  995153596    907-339-1300
11967     Anchorage             AK      US      99515  (907) 227-9631
11968     Anchorage             AK      US  995152050    907-344-4160
...             ...            ...     ...        ...             ...
25567       Laramie             WY      US  820723012    307-721-5107
25568       Laramie             WY      US  820705141    307-742-8146
25569       Laramie             WY      US      82070    307-742-3262
25570  Rock Springs             WY      US  829015751    307-362-7145
25571     Sheridian             WY      US  828016213    307-672-5129

                          Timezone  Longitude  Latitude
11964  GMT-09:00 America/Anchorage    -149.78     61.21
11965  GMT-09:00 America/Anchorage    -149.84     61.14
11966  GMT-09:00 America/Anchorage    -149.85     61.11
11967  GMT-09:00 America/Anchorage    -149.89     61.13
11968  GMT-09:00 America/Anchorage    -149.86     61.14
...                            ...        ...       ...
25567     GMT-07:00 America/Denver    -105.59     41.32
25568     GMT-07:00 America/Denver    -105.56     41.31
25569     GMT-07:00 America/Denver    -105.56     41.31
25570     GMT-07:00 America/Denver    -109.25     41.58
25571     GMT-07:00 America/Denver    -106.94     44.77

[13608 rows x 13 columns]
----------------------------------------------------------------------------------------------------
Country
AD        1
AE      144
AR      108
AT       18
AU       22
      ...
TT        3
TW      394
US    13608
VN       25
ZA        3
Name: Brand, Length: 73, dtype: int64
----------------------------------------------------------------------------------------------------
13608
2734
----------------------------------------------------------------------------------------------------
State/Province
11    236
12     58
13     24
14      8
15      8
21     57
22     13
23     16
31    551
32    354
33    315
34     26
35     75
36     13
37     75
41     21
42     76
43     35
44    333
45     21
46     16
50     41
51    104
52      9
53     24
61     42
62      3
63      3
64      2
91    162
92     13
Name: Brand, dtype: int64
----------------------------------------------------------------------------------------------------
Country  State/Province
AD       7                  1
AE       AJ                 2
         AZ                48
         DU                82
         FU                 2
                          ..
US       WV                25
         WY                23
VN       HN                 6
         SG                19
ZA       GT                 3
Name: Brand, Length: 545, dtype: int64
<class 'pandas.core.series.Series'>
----------------------------------------------------------------------------------------------------
                        Brand
Country State/Province
AD      7                   1
AE      AJ                  2
        AZ                 48
        DU                 82
        FU                  2
...                       ...
US      WV                 25
        WY                 23
VN      HN                  6
        SG                 19
ZA      GT                  3

[545 rows x 1 columns] <class 'pandas.core.frame.DataFrame'>
----------------------------------------------------------------------------------------------------
                        Brand
Country State/Province
AD      7                   1
AE      AJ                  2
        AZ                 48
        DU                 82
        FU                  2
...                       ...
US      WV                 25
        WY                 23
VN      HN                  6
        SG                 19
ZA      GT                  3

[545 rows x 1 columns] <class 'pandas.core.frame.DataFrame'>
----------------------------------------------------------------------------------------------------
                        Brand
Country State/Province
AD      7                   1
AE      AJ                  2
        AZ                 48
        DU                 82
        FU                  2
...                       ...
US      WV                 25
        WY                 23
VN      HN                  6
        SG                 19
ZA      GT                  3

[545 rows x 1 columns] <class 'pandas.core.frame.DataFrame'>

7、Indexes and composite indexes in pandas

① df.swaplevel() swaps the inner and outer index levels, so you can select starting from what was the inner level;

Note: to select by the inner index level, swap the inner and outer levels first.
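
A minimal sketch of swaplevel on a made-up Series with a two-level index:

import pandas as pd
s=pd.Series([1,2,3],index=pd.MultiIndex.from_tuples([("CN","31"),("CN","32"),("US","CA")],names=["Country","State/Province"]))
print(s["CN"])# the outer level can be selected directly
print(s.swaplevel()["31"])# to select by the inner level, swap the levels first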

import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
file_path="./directory.csv"
df=pd.read_csv(file_path)
# print(df.head(1))
# print(df.info())
# index methods and attributes
grouped2=df[["Brand"]].groupby(by=[df["Country"],df["State/Province"]]).count()
print(grouped2.index)# a composite (multi) index
df1=pd.DataFrame(np.ones((2,4)),index=["A","B"],columns=list("abcd"))
df2=pd.DataFrame(np.zeros((3,3)),index=["A","B","C"],columns=list("xyz"))
print(df1)
print(df2)
# df1.index=["a","b"]  # assign new index values
# print(df1)
# reindex selects rows by the given labels; labels not present become all-NaN rows
print(df1.reindex(["a","f"]))
print(df1)
print(df1.set_index("a").index)
df3=df1.set_index("a",drop=True)# make column "a" the index; with the default drop=True the column is removed from the data
print(df3)
df3=df1.set_index("a",drop=False)# drop=False keeps column "a" in the data as well
print(df3)
print(df1["d"].unique())# distinct values of the column
df4=df1.set_index("b",drop=True)# index values may repeat
print("-"*100)
print(df4)
print(df1.set_index("b").index.unique())# distinct index values
print(len(df1.set_index("b").index))# number of index entries
print(list(df1.set_index("b").index))# cast the index to a list; an index is iterable
print("-"*100)
df11=pd.DataFrame(np.ones((2,4)),index=["A","B"],columns=list("abcd"))
print(df11)
df11.loc["A","a"]=100
print(df11)
df5=df11.set_index(["a","b"])# set a two-level index; df11 itself is unchanged
print(df5)
print(df11.set_index(["a","b"]).index)
df6=df11.set_index(["a","b","c"],drop=False)# set a three-level index, keep the columns; df11 itself is unchanged
print(df6)
MultiIndex([('AD',  '7'),
            ('AE', 'AJ'),
            ('AE', 'AZ'),
            ('AE', 'DU'),
            ('AE', 'FU'),
            ('AE', 'RK'),
            ('AE', 'SH'),
            ('AE', 'UQ'),
            ('AR',  'B'),
            ('AR',  'C'),
            ...
            ('US', 'UT'),
            ('US', 'VA'),
            ('US', 'VT'),
            ('US', 'WA'),
            ('US', 'WI'),
            ('US', 'WV'),
            ('US', 'WY'),
            ('VN', 'HN'),
            ('VN', 'SG'),
            ('ZA', 'GT')],
           names=['Country', 'State/Province'], length=545)
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0
     x    y    z
A  0.0  0.0  0.0
B  0.0  0.0  0.0
C  0.0  0.0  0.0
    a   b   c   d
a NaN NaN NaN NaN
f NaN NaN NaN NaN
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0
Float64Index([1.0, 1.0], dtype='float64', name='a')
       b    c    d
a
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0
       a    b    c    d
a
1.0  1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0  1.0
[1.]
----------------------------------------------------------------------------------------------------
       a    c    d
b
1.0  1.0  1.0  1.0
1.0  1.0  1.0  1.0
Float64Index([1.0], dtype='float64', name='b')
2
[1.0, 1.0]
----------------------------------------------------------------------------------------------------
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0
       a    b    c    d
A  100.0  1.0  1.0  1.0
B    1.0  1.0  1.0  1.0
             c    d
a     b
100.0 1.0  1.0  1.0
1.0   1.0  1.0  1.0
MultiIndex([(100.0, 1.0),
            (  1.0, 1.0)],
           names=['a', 'b'])
               a    b    c    d
a     b   c
100.0 1.0 1.0  100.0  1.0  1.0  1.0
1.0   1.0 1.0    1.0  1.0  1.0  1.0

8、Time series in pandas

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
# generate a time series: start is the start time, end the end time, freq the frequency; "D" means day
pd_time=pd.date_range(start="20171230",end="20180131",freq="D")
print(pd_time)
pd_time=pd.date_range(start="20171230",end="20180131",freq="10D")# every ten days
print(pd_time)
# periods is the number of timestamps
pd_time=pd.date_range(start="20171230",periods=10,freq="10D")
print(pd_time)
# "M" means month end
pd_time=pd.date_range(start="20171230",periods=10,freq="M")
print(pd_time)
# "H" means hour
pd_time=pd.date_range(start="20171230",periods=10,freq="H")
print(pd_time)
pd_time=pd.date_range(start="2017/12/30",periods=10,freq="H")
print(pd_time)
pd_time=pd.date_range(start="2017-12-30 10:30:45",periods=10,freq="H")
print(pd_time)
DatetimeIndex(['2017-12-30', '2017-12-31', '2018-01-01', '2018-01-02',
               '2018-01-03', '2018-01-04', '2018-01-05', '2018-01-06',
               '2018-01-07', '2018-01-08', '2018-01-09', '2018-01-10',
               '2018-01-11', '2018-01-12', '2018-01-13', '2018-01-14',
               '2018-01-15', '2018-01-16', '2018-01-17', '2018-01-18',
               '2018-01-19', '2018-01-20', '2018-01-21', '2018-01-22',
               '2018-01-23', '2018-01-24', '2018-01-25', '2018-01-26',
               '2018-01-27', '2018-01-28', '2018-01-29', '2018-01-30',
               '2018-01-31'],
              dtype='datetime64[ns]', freq='D')
DatetimeIndex(['2017-12-30', '2018-01-09', '2018-01-19', '2018-01-29'], dtype='datetime64[ns]', freq='10D')
DatetimeIndex(['2017-12-30', '2018-01-09', '2018-01-19', '2018-01-29',
               '2018-02-08', '2018-02-18', '2018-02-28', '2018-03-10',
               '2018-03-20', '2018-03-30'],
              dtype='datetime64[ns]', freq='10D')
DatetimeIndex(['2017-12-31', '2018-01-31', '2018-02-28', '2018-03-31',
               '2018-04-30', '2018-05-31', '2018-06-30', '2018-07-31',
               '2018-08-31', '2018-09-30'],
              dtype='datetime64[ns]', freq='M')
DatetimeIndex(['2017-12-30 00:00:00', '2017-12-30 01:00:00',
               '2017-12-30 02:00:00', '2017-12-30 03:00:00',
               '2017-12-30 04:00:00', '2017-12-30 05:00:00',
               '2017-12-30 06:00:00', '2017-12-30 07:00:00',
               '2017-12-30 08:00:00', '2017-12-30 09:00:00'],
              dtype='datetime64[ns]', freq='H')
DatetimeIndex(['2017-12-30 00:00:00', '2017-12-30 01:00:00',
               '2017-12-30 02:00:00', '2017-12-30 03:00:00',
               '2017-12-30 04:00:00', '2017-12-30 05:00:00',
               '2017-12-30 06:00:00', '2017-12-30 07:00:00',
               '2017-12-30 08:00:00', '2017-12-30 09:00:00'],
              dtype='datetime64[ns]', freq='H')
DatetimeIndex(['2017-12-30 10:30:45', '2017-12-30 11:30:45',
               '2017-12-30 12:30:45', '2017-12-30 13:30:45',
               '2017-12-30 14:30:45', '2017-12-30 15:30:45',
               '2017-12-30 16:30:45', '2017-12-30 17:30:45',
               '2017-12-30 18:30:45', '2017-12-30 19:30:45'],
              dtype='datetime64[ns]', freq='H')

9、Resampling in pandas

Resampling converts a time series from one frequency to another. Converting high-frequency data to a lower frequency is downsampling; converting low-frequency data to a higher frequency is upsampling. pandas provides the resample method to perform this frequency conversion.

# resampling
import pandas as pd
import numpy as np
t=pd.DataFrame(np.random.uniform(10,50,(100,1)),index=pd.date_range(start="20170101",periods=100))
print(t)
print(t.resample("M").mean())
print(t.resample("10D").count())
                    0
2017-01-01  31.238575
2017-01-02  22.248967
2017-01-03  17.524832
2017-01-04  44.035044
2017-01-05  29.195196
...               ...
2017-04-06  30.878312
2017-04-07  20.670926
2017-04-08  43.626255
2017-04-09  49.956429
2017-04-10  48.605538

[100 rows x 1 columns]
                    0
2017-01-31  28.165566
2017-02-28  27.370483
2017-03-31  28.530910
2017-04-30  35.559494
             0
2017-01-01  10
2017-01-11  10
2017-01-21  10
2017-01-31  10
2017-02-10  10
2017-02-20  10
2017-03-02  10
2017-03-12  10
2017-03-22  10
2017-04-01  10
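
The definition above also covers upsampling; a minimal sketch, resampling made-up monthly data to 10-day bins and forward-filling the new rows:

import pandas as pd
import numpy as np
m=pd.DataFrame(np.arange(3),index=pd.date_range("20170131",periods=3,freq="M"))
print(m.resample("10D").ffill())# upsampling creates new rows at the higher frequency; ffill fills them from the previous value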

10、A few practice examples

(1) Example 1: find the top 25 Chinese cities by number of Starbucks stores and plot them.
Data source: https://www.kaggle.com/starbucks/store-locations/data

import pandas as pd
import numpy as np
import matplotlib
from matplotlib import pyplot as plt

# Plot the Starbucks store ranking for Chinese cities with matplotlib
file_path="./directory.csv"
df=pd.read_csv(file_path)
df=df[df["Country"]=="CN"]
# Take the top 25 cities
data=df.groupby(by="City").count()["Brand"].sort_values(ascending=False)[:25]
_x=data.index
_y=data.values

# Plot
font = {'family': 'MicroSoft YaHei', 'weight': 'bold'}
matplotlib.rc("font", **font)
plt.figure(figsize=(20,8),dpi=80)
plt.barh(range(len(_x)),_y,height=0.3,color="orange")
plt.yticks(range(len(_x)),_x)
plt.show()
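To rank by province instead of by city, the same groupby/count pattern applies. A hedged sketch, assuming the Kaggle file's "State/Province" column (which for CN holds numeric province codes rather than names):

# Assumes df is the CN-filtered frame from above and that the CSV has a
# "State/Province" column, as in the Kaggle dataset
data_prov = df.groupby(by="State/Province").count()["Brand"].sort_values(ascending=False)[:25]
print(data_prov)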


(2) 911 emergency-call data (2015 onwards): count the number of calls for each type of emergency, and track how the monthly call counts for each type change over time.
Data source: https://www.kaggle.com/mchirico/montcoalert/data
① Count the number of calls for each type of emergency

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

pd.set_option('display.max_columns', None)
df=pd.read_csv("./911.csv")
# print(df.head(10))
# print(df.info())
# Get the categories: the part of "title" before the colon
print(df["title"].str.split(":"))  # a Series of lists
time_list=df["title"].str.split(":").tolist()  # convert the Series to a plain list
# print(time_list)
cate_list=list(set([i[0] for i in time_list]))  # deduplicate to get the category names
print(cate_list)  # there are only three categories
# Build an all-zero DataFrame with one column per category
zero_df=pd.DataFrame(np.zeros((df.shape[0],len(cate_list))),columns=cate_list)
# Fill it in
# Method 1: boolean indexing on the rows whose title contains the category name
for cate in cate_list:
    zero_df.loc[df["title"].str.contains(cate), cate]=1  # matching rows get a 1 in that column
print(zero_df)
# Method 2: row by row (noticeably slower)
# for i in range(df.shape[0]):
#     zero_df.loc[i,time_list[i][0]]=1
# print(zero_df)
# Count the number of calls per category
sum_ret=zero_df.sum(axis=0)
print(sum_ret)
0              [EMS,  BACK PAINS/INJURY]
1             [EMS,  DIABETIC EMERGENCY]
2                 [Fire,  GAS-ODOR/LEAK]
3              [EMS,  CARDIAC EMERGENCY]
4                      [EMS,  DIZZINESS]
...
663517    [Traffic,  VEHICLE ACCIDENT -]
663518          [EMS,  GENERAL WEAKNESS]
663519          [EMS,  VEHICLE ACCIDENT]
663520            [Fire,  BUILDING FIRE]
663521    [Traffic,  VEHICLE ACCIDENT -]
Name: title, Length: 663522, dtype: object
['Fire', 'Traffic', 'EMS']
        Fire  Traffic  EMS
0        0.0      0.0  1.0
1        0.0      0.0  1.0
2        1.0      0.0  0.0
3        0.0      0.0  1.0
4        0.0      0.0  1.0
...      ...      ...  ...
663517   0.0      1.0  0.0
663518   0.0      0.0  1.0
663519   0.0      0.0  1.0
663520   1.0      0.0  0.0
663521   0.0      1.0  0.0

[663522 rows x 3 columns]
Fire       100622.0
Traffic    230208.0
EMS        332700.0
dtype: float64
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

pd.set_option('display.max_columns', None)
df=pd.read_csv("./911.csv")
# print(df.head(10))
# print(df.info())
# Get the categories
print(df["title"].str.split(":"))  # a Series of lists
time_list=df["title"].str.split(":").tolist()
# print(time_list)
cate_list=[i[0] for i in time_list]
cate_df=pd.DataFrame(np.array(cate_list).reshape((df.shape[0],1)))
df["cate"]=cate_df
print(cate_df)
print(df)
print(df.groupby(by="cate").count())
print(df.groupby(by="cate").count()["title"])
0              [EMS,  BACK PAINS/INJURY]
1             [EMS,  DIABETIC EMERGENCY]
2                 [Fire,  GAS-ODOR/LEAK]
3              [EMS,  CARDIAC EMERGENCY]
4                      [EMS,  DIZZINESS]
...
663517    [Traffic,  VEHICLE ACCIDENT -]
663518          [EMS,  GENERAL WEAKNESS]
663519          [EMS,  VEHICLE ACCIDENT]
663520            [Fire,  BUILDING FIRE]
663521    [Traffic,  VEHICLE ACCIDENT -]
Name: title, Length: 663522, dtype: object
              0
0           EMS
1           EMS
2          Fire
3           EMS
4           EMS
...         ...
663517  Traffic
663518      EMS
663519      EMS
663520     Fire
663521  Traffic

[663522 rows x 1 columns]
              lat        lng  \
0       40.297876 -75.581294
1       40.258061 -75.264680
2       40.121182 -75.351975
3       40.116153 -75.343513
4       40.251492 -75.603350
...           ...        ...
663517  40.157956 -75.348060
663518  40.136306 -75.428697
663519  40.013779 -75.300835
663520  40.121603 -75.351437
663521  40.015046 -75.299674

                                                     desc      zip  \
0       REINDEER CT & DEAD END;  NEW HANOVER; Station ...  19525.0
1       BRIAR PATH & WHITEMARSH LN;  HATFIELD TOWNSHIP...  19446.0
2       HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-St...  19401.0
3       AIRY ST & SWEDE ST;  NORRISTOWN; Station 308A;...  19401.0
4       CHERRYWOOD CT & DEAD END;  LOWER POTTSGROVE; S...      NaN
...                                                   ...      ...
663517  SUNSET AVE & WOODLAND AVE; EAST NORRITON; 2020...  19403.0
663518  EAGLEVILLE RD & BUNTING CIR;  LOWER PROVIDENCE...  19403.0
663519  HAVERFORD STATION RD;  LOWER MERION; Station 3...  19041.0
663520  MARSHALL ST & HAWS AVE; NORRISTOWN; 2020-07-29...  19401.0
663521  HAVERFORD STATION RD & W MONTGOMERY AVE; LOWER...  19041.0

                              title            timeStamp                twp  \
0            EMS: BACK PAINS/INJURY  2015-12-10 17:10:52        NEW HANOVER
1           EMS: DIABETIC EMERGENCY  2015-12-10 17:29:21  HATFIELD TOWNSHIP
2               Fire: GAS-ODOR/LEAK  2015-12-10 14:39:21         NORRISTOWN
3            EMS: CARDIAC EMERGENCY  2015-12-10 16:47:36         NORRISTOWN
4                    EMS: DIZZINESS  2015-12-10 16:56:52   LOWER POTTSGROVE
...                             ...                  ...                ...
663517  Traffic: VEHICLE ACCIDENT -  2020-07-29 15:46:51      EAST NORRITON
663518        EMS: GENERAL WEAKNESS  2020-07-29 15:52:19   LOWER PROVIDENCE
663519        EMS: VEHICLE ACCIDENT  2020-07-29 15:52:52       LOWER MERION
663520          Fire: BUILDING FIRE  2020-07-29 15:54:08         NORRISTOWN
663521  Traffic: VEHICLE ACCIDENT -  2020-07-29 15:52:46       LOWER MERION

                                           addr  e     cate
0                        REINDEER CT & DEAD END  1      EMS
1                    BRIAR PATH & WHITEMARSH LN  1      EMS
2                                      HAWS AVE  1     Fire
3                            AIRY ST & SWEDE ST  1      EMS
4                      CHERRYWOOD CT & DEAD END  1      EMS
...                                         ... ..      ...
663517                SUNSET AVE & WOODLAND AVE  1  Traffic
663518              EAGLEVILLE RD & BUNTING CIR  1      EMS
663519                     HAVERFORD STATION RD  1      EMS
663520                   MARSHALL ST & HAWS AVE  1     Fire
663521  HAVERFORD STATION RD & W MONTGOMERY AVE  1  Traffic

[663522 rows x 10 columns]
            lat     lng    desc     zip   title  timeStamp     twp    addr  \
cate
EMS      332692  332692  332692  304855  332692     332692  332480  332692
Fire     100622  100622  100622   88867  100622     100622  100545  100622
Traffic  230208  230208  230208  189601  230208     230208  230204  230208

              e
cate
EMS      332692
Fire     100622
Traffic  230208
cate
EMS        332692
Fire       100622
Traffic    230208
Name: title, dtype: int64
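Both methods give the same counts. A third, fully vectorized sketch (not from the original) avoids building the zero matrix by hand:

# Extract the category in one vectorized step, then count
df["cate"] = df["title"].str.split(":", expand=True)[0]
print(df["cate"].value_counts())
# pd.get_dummies(df["cate"]).sum(axis=0) would reproduce zero_df's column sums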

② Count how the number of 911 calls changes from month to month

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

# Count how the number of 911 calls changes from month to month
pd.set_option('display.max_columns', None)
df=pd.read_csv("./911.csv")
# print(df.head(10))
# print(df.info())
df["timeStamp"]=pd.to_datetime(df["timeStamp"])  # convert the timestamp column to datetime64
df.set_index("timeStamp",inplace=True)           # use it as a DatetimeIndex
print(df.head(10))
# Monthly call counts
print("Monthly 911 call counts")
countByMonth=df.resample("M").count()["title"]
print(countByMonth.head())
# Plot
_x=countByMonth.index
_y=countByMonth.values
_x=[i.strftime("%Y-%m-%d") for i in _x]  # format the dates
plt.figure(figsize=(20,8),dpi=80)
plt.plot(range(len(_x)),_y)
plt.xticks(range(len(_x)),_x,rotation=45)
plt.show()
                           lat        lng  \
timeStamp
2015-12-10 17:10:52  40.297876 -75.581294
2015-12-10 17:29:21  40.258061 -75.264680
2015-12-10 14:39:21  40.121182 -75.351975
2015-12-10 16:47:36  40.116153 -75.343513
2015-12-10 16:56:52  40.251492 -75.603350
2015-12-10 15:39:04  40.253473 -75.283245
2015-12-10 16:46:48  40.182111 -75.127795
2015-12-10 16:17:05  40.217286 -75.405182
2015-12-10 16:51:42  40.289027 -75.399590
2015-12-10 17:35:41  40.102398 -75.291458

                                                                  desc  \
timeStamp
2015-12-10 17:10:52  REINDEER CT & DEAD END;  NEW HANOVER; Station ...
2015-12-10 17:29:21  BRIAR PATH & WHITEMARSH LN;  HATFIELD TOWNSHIP...
2015-12-10 14:39:21  HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-St...
2015-12-10 16:47:36  AIRY ST & SWEDE ST;  NORRISTOWN; Station 308A;...
2015-12-10 16:56:52  CHERRYWOOD CT & DEAD END;  LOWER POTTSGROVE; S...
2015-12-10 15:39:04  CANNON AVE & W 9TH ST;  LANSDALE; Station 345;...
2015-12-10 16:46:48  LAUREL AVE & OAKDALE AVE;  HORSHAM; Station 35...
2015-12-10 16:17:05  COLLEGEVILLE RD & LYWISKI RD;  SKIPPACK; Stati...
2015-12-10 16:51:42  MAIN ST & OLD SUMNEYTOWN PIKE;  LOWER SALFORD;...
2015-12-10 17:35:41  BLUEROUTE  & RAMP I476 NB TO CHEMICAL RD; PLYM...

                         zip                        title                twp  \
timeStamp
2015-12-10 17:10:52  19525.0       EMS: BACK PAINS/INJURY        NEW HANOVER
2015-12-10 17:29:21  19446.0      EMS: DIABETIC EMERGENCY  HATFIELD TOWNSHIP
2015-12-10 14:39:21  19401.0          Fire: GAS-ODOR/LEAK         NORRISTOWN
2015-12-10 16:47:36  19401.0       EMS: CARDIAC EMERGENCY         NORRISTOWN
2015-12-10 16:56:52      NaN               EMS: DIZZINESS   LOWER POTTSGROVE
2015-12-10 15:39:04  19446.0             EMS: HEAD INJURY           LANSDALE
2015-12-10 16:46:48  19044.0         EMS: NAUSEA/VOMITING            HORSHAM
2015-12-10 16:17:05  19426.0   EMS: RESPIRATORY EMERGENCY           SKIPPACK
2015-12-10 16:51:42  19438.0        EMS: SYNCOPAL EPISODE      LOWER SALFORD
2015-12-10 17:35:41  19462.0  Traffic: VEHICLE ACCIDENT -           PLYMOUTH

                                                         addr  e
timeStamp
2015-12-10 17:10:52                    REINDEER CT & DEAD END  1
2015-12-10 17:29:21                BRIAR PATH & WHITEMARSH LN  1
2015-12-10 14:39:21                                  HAWS AVE  1
2015-12-10 16:47:36                        AIRY ST & SWEDE ST  1
2015-12-10 16:56:52                  CHERRYWOOD CT & DEAD END  1
2015-12-10 15:39:04                     CANNON AVE & W 9TH ST  1
2015-12-10 16:46:48                  LAUREL AVE & OAKDALE AVE  1
2015-12-10 16:17:05              COLLEGEVILLE RD & LYWISKI RD  1
2015-12-10 16:51:42             MAIN ST & OLD SUMNEYTOWN PIKE  1
2015-12-10 17:35:41  BLUEROUTE  & RAMP I476 NB TO CHEMICAL RD  1
Monthly 911 call counts
timeStamp
2015-12-31     7916
2016-01-31    13096
2016-02-29    11396
2016-03-31    11059
2016-04-30    11287
Freq: M, Name: title, dtype: int64


③ Count how the monthly number of calls of each type changes

# Define a plotting helper
def plot_img(df,label):
    countByMonth=df.resample("M").count()["title"]
    # print(countByMonth.head())
    # Plot
    _x=countByMonth.index
    _y=countByMonth.values
    _x=[i.strftime("%Y-%m-%d") for i in _x]  # format the dates
    plt.plot(range(len(_x)),_y,label=label)
    plt.xticks(range(len(_x)),_x,rotation=45)

# Count how the monthly number of calls of each type changes
print("Monthly 911 call counts per call type")
pd.set_option('display.max_columns', None)
df=pd.read_csv("./911.csv")
# Convert the time strings to datetimes (set as the index further down)
df["timeStamp"]=pd.to_datetime(df["timeStamp"])
# Add a column with the call category
time_list=df["title"].str.split(":").tolist()
cate_list=[i[0] for i in time_list]
# Note: the indexes have to match here, otherwise the assignment fails
cate_df=pd.DataFrame(np.array(cate_list).reshape((df.shape[0],1)))  # RangeIndex 0..663521, same as df
df["cate"]=cate_df
df.set_index("timeStamp",inplace=True)  # now re-index by time

plt.figure(figsize=(20, 8), dpi=80)
# groupby returns an iterable of (name, sub-DataFrame) tuples
print(df.groupby(by="cate"))
for group_name,group_data in df.groupby(by="cate"):
    # plot each category separately
    plot_img(group_data,group_name)
plt.legend(loc="best")
plt.show()
Monthly 911 call counts per call type
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x00000265C8559358>
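The index-alignment caveat in the code above is easy to trip over: assigning a Series to a DataFrame column aligns on index labels, not on position. A minimal sketch, made up for illustration:

import pandas as pd

s = pd.Series([10, 20, 30], index=[0, 1, 2])
df = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
df["b"] = s         # no common labels, so the whole column becomes NaN
print(df)
df["c"] = s.values  # raw values ignore the index and assign by position
print(df)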


(3) Air-quality data for five cities: Beijing, Guangzhou, Shanghai, Shenyang and Chengdu. Plot how PM2.5 in these five cities changes over time. Data source: https://www.kaggle.com/uciml/pm25-data-for-five-chinese-cities
A DatetimeIndex can be thought of as a set of timestamps, while a PeriodIndex represents time spans.
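A minimal sketch of the distinction (not from the original):

import pandas as pd

ts = pd.date_range("2017-01-01", periods=3, freq="D")  # DatetimeIndex: points in time
per = ts.to_period("D")                                # PeriodIndex: whole-day spans
print(ts)
print(per)
print(per.to_timestamp())                              # convert back to timestamps

With that distinction in mind, take Beijing as the example: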

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

pd.set_option('display.max_columns', None)
file_path="./BeijingPM20100101_20151231.csv"
file_path1="./ChengduPM20100101_20151231.csv"
file_path2="./GuangzhouPM20100101_20151231.csv"
file_path3="./ShanghaiPM20100101_20151231.csv"
file_path4="./ShenyangPM20100101_20151231.csv"
df=pd.read_csv(file_path)
print(df.head())
print(df.info())
# Combine the separate year/month/day/hour columns into a pandas PeriodIndex
period=pd.PeriodIndex(year=df["year"],month=df["month"],day=df["day"],hour=df["hour"],freq="H")
print(period)
print(type(period))
# Add the new time series as a column of the original data
df["datetime"]=period
print(df.head(10))
# Set datetime as the index
df.set_index("datetime",inplace=True)
# Downsample to weekly means
df=df.resample("7D").mean()
# Handle missing data: drop the NaN rows
print(df["PM_US Post"])
data=df["PM_US Post"].dropna()
data_china=df["PM_Dongsi"]

# Plot
_x=data.index
_x=[i.strftime("%Y%m%d") for i in _x]  # format the dates
_x_china=[i.strftime("%Y%m%d") for i in data_china.index]
_y=data.values
_y_china=data_china.values

plt.figure(figsize=(20,8),dpi=80)
plt.plot(range(len(_x)),_y,label="PM_US Post")
plt.plot(range(len(_x_china)),_y_china,label="PM_CN Post")
plt.xticks(range(0,len(_x),10),list(_x)[::10],rotation=45)  # step through the label list
plt.legend(loc="best")
plt.show()
  No  year  month  day  hour  season  PM_Dongsi  PM_Dongsihuan  \
0   1  2010      1    1     0       4        NaN            NaN
1   2  2010      1    1     1       4        NaN            NaN
2   3  2010      1    1     2       4        NaN            NaN
3   4  2010      1    1     3       4        NaN            NaN
4   5  2010      1    1     4       4        NaN            NaN

   PM_Nongzhanguan  PM_US Post  DEWP  HUMI    PRES  TEMP cbwd    Iws  \
0              NaN         NaN -21.0  43.0  1021.0 -11.0   NW   1.79
1              NaN         NaN -21.0  47.0  1020.0 -12.0   NW   4.92
2              NaN         NaN -21.0  43.0  1019.0 -11.0   NW   6.71
3              NaN         NaN -21.0  55.0  1019.0 -14.0   NW   9.84
4              NaN         NaN -20.0  51.0  1018.0 -12.0   NW  12.97

   precipitation  Iprec
0            0.0    0.0
1            0.0    0.0
2            0.0    0.0
3            0.0    0.0
4            0.0    0.0
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52584 entries, 0 to 52583
Data columns (total 18 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   No               52584 non-null  int64
 1   year             52584 non-null  int64
 2   month            52584 non-null  int64
 3   day              52584 non-null  int64
 4   hour             52584 non-null  int64
 5   season           52584 non-null  int64
 6   PM_Dongsi        25052 non-null  float64
 7   PM_Dongsihuan    20508 non-null  float64
 8   PM_Nongzhanguan  24931 non-null  float64
 9   PM_US Post       50387 non-null  float64
 10  DEWP             52579 non-null  float64
 11  HUMI             52245 non-null  float64
 12  PRES             52245 non-null  float64
 13  TEMP             52579 non-null  float64
 14  cbwd             52579 non-null  object
 15  Iws              52579 non-null  float64
 16  precipitation    52100 non-null  float64
 17  Iprec            52100 non-null  float64
dtypes: float64(11), int64(6), object(1)
memory usage: 7.2+ MB
None
PeriodIndex(['2010-01-01 00:00', '2010-01-01 01:00', '2010-01-01 02:00',
             '2010-01-01 03:00', '2010-01-01 04:00', '2010-01-01 05:00',
             '2010-01-01 06:00', '2010-01-01 07:00', '2010-01-01 08:00',
             '2010-01-01 09:00',
             ...
             '2015-12-31 14:00', '2015-12-31 15:00', '2015-12-31 16:00',
             '2015-12-31 17:00', '2015-12-31 18:00', '2015-12-31 19:00',
             '2015-12-31 20:00', '2015-12-31 21:00', '2015-12-31 22:00',
             '2015-12-31 23:00'],
            dtype='period[H]', length=52584, freq='H')
<class 'pandas.core.indexes.period.PeriodIndex'>
   No  year  month  day  hour  season  PM_Dongsi  PM_Dongsihuan  \
0   1  2010      1    1     0       4        NaN            NaN
1   2  2010      1    1     1       4        NaN            NaN
2   3  2010      1    1     2       4        NaN            NaN
3   4  2010      1    1     3       4        NaN            NaN
4   5  2010      1    1     4       4        NaN            NaN
5   6  2010      1    1     5       4        NaN            NaN
6   7  2010      1    1     6       4        NaN            NaN
7   8  2010      1    1     7       4        NaN            NaN
8   9  2010      1    1     8       4        NaN            NaN
9  10  2010      1    1     9       4        NaN            NaN

   PM_Nongzhanguan  PM_US Post  DEWP  HUMI    PRES  TEMP cbwd    Iws  \
0              NaN         NaN -21.0  43.0  1021.0 -11.0   NW   1.79
1              NaN         NaN -21.0  47.0  1020.0 -12.0   NW   4.92
2              NaN         NaN -21.0  43.0  1019.0 -11.0   NW   6.71
3              NaN         NaN -21.0  55.0  1019.0 -14.0   NW   9.84
4              NaN         NaN -20.0  51.0  1018.0 -12.0   NW  12.97
5              NaN         NaN -19.0  47.0  1017.0 -10.0   NW  16.10
6              NaN         NaN -19.0  44.0  1017.0  -9.0   NW  19.23
7              NaN         NaN -19.0  44.0  1017.0  -9.0   NW  21.02
8              NaN         NaN -19.0  44.0  1017.0  -9.0   NW  24.15
9              NaN         NaN -20.0  37.0  1017.0  -8.0   NW  27.28

   precipitation  Iprec          datetime
0            0.0    0.0  2010-01-01 00:00
1            0.0    0.0  2010-01-01 01:00
2            0.0    0.0  2010-01-01 02:00
3            0.0    0.0  2010-01-01 03:00
4            0.0    0.0  2010-01-01 04:00
5            0.0    0.0  2010-01-01 05:00
6            0.0    0.0  2010-01-01 06:00
7            0.0    0.0  2010-01-01 07:00
8            0.0    0.0  2010-01-01 08:00
9            0.0    0.0  2010-01-01 09:00
datetime
2010-01-01     71.627586
2010-01-08     69.910714
2010-01-15    163.654762
2010-01-22     68.069307
2010-01-29     53.583333
...
2015-11-27    242.642857
2015-12-04    145.437500
2015-12-11     88.750000
2015-12-18    204.139241
2015-12-25    209.244048
Freq: 7D, Name: PM_US Post, Length: 313, dtype: float64
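Although all five file paths are defined, the code above only plots Beijing. A hedged sketch of the full five-city comparison, assuming every file shares the same year/month/day/hour and "PM_US Post" columns:

import pandas as pd
from matplotlib import pyplot as plt

files = {
    "Beijing":   "./BeijingPM20100101_20151231.csv",
    "Chengdu":   "./ChengduPM20100101_20151231.csv",
    "Guangzhou": "./GuangzhouPM20100101_20151231.csv",
    "Shanghai":  "./ShanghaiPM20100101_20151231.csv",
    "Shenyang":  "./ShenyangPM20100101_20151231.csv",
}
plt.figure(figsize=(20, 8), dpi=80)
for city, path in files.items():
    df = pd.read_csv(path)
    df.index = pd.PeriodIndex(year=df["year"], month=df["month"],
                              day=df["day"], hour=df["hour"], freq="H")
    # Weekly mean of the US-post PM2.5 reading, missing weeks dropped
    data = df["PM_US Post"].resample("7D").mean().dropna()
    plt.plot(range(len(data)), data.values, label=city)
plt.legend(loc="best")
plt.show()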
