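Pandas Quick-Start Tutorial for Data Analysis, Part 2: pandas Data Structures

Creating Data

A Series is similar to a Python list. A DataFrame is similar to a dictionary whose values are Series.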

create.py


#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# create.py

import pandas as pd

print("\n\nCreate a Series")
s = pd.Series(['banana', 42])
print(s)

print("\n\nCreate a Series with an explicit index")
s = pd.Series(['Wes McKinney', 'Creator of Pandas'], index=['Person', 'Who'])
print(s)

# Note: on Python 3.6+ with pandas 0.23+, columns keep the dict's insertion
# order; older versions usually sorted the column names alphabetically.
print("\n\nCreate a DataFrame")
scientists = pd.DataFrame({
    'Name': ['Rosaline Franklin', 'William Gosset'],
    'Occupation': ['Chemist', 'Statistician'],
    'Born': ['1920-07-25', '1876-06-13'],
    'Died': ['1958-04-16', '1937-10-16'],
    'Age': [37, 61]})
print(scientists)

print("\n\nCreate a DataFrame with explicit order (index and columns)")
scientists = pd.DataFrame(
    data={'Occupation': ['Chemist', 'Statistician'],
          'Born': ['1920-07-25', '1876-06-13'],
          'Died': ['1958-04-16', '1937-10-16'],
          'Age': [37, 61]},
    index=['Rosaline Franklin', 'William Gosset'],
    columns=['Occupation', 'Born', 'Died', 'Age'])
print(scientists)

Output:


$ ./create.py


Create a Series
0    banana
1        42
dtype: object


Create a Series with an explicit index
Person         Wes McKinney
Who       Creator of Pandas
dtype: object


Create a DataFrame
                Name    Occupation        Born        Died  Age
0  Rosaline Franklin       Chemist  1920-07-25  1958-04-16   37
1     William Gosset  Statistician  1876-06-13  1937-10-16   61


Create a DataFrame with explicit order (index and columns)
                     Occupation        Born        Died  Age
Rosaline Franklin       Chemist  1920-07-25  1958-04-16   37
William Gosset     Statistician  1876-06-13  1937-10-16   61
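
The list and dictionary analogies from the start of this section can be checked directly. The following is a minimal sketch on hypothetical two-row data (not the tutorial's CSV file): a column pulled out of a DataFrame is a Series, and the DataFrame can be iterated like a dictionary of columns.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the example above.
scientists = pd.DataFrame({
    'Name': ['Rosaline Franklin', 'William Gosset'],
    'Age': [37, 61],
})

# Like a list, a Series supports positional access and len().
ages = scientists['Age']          # selecting one column gives a Series
print(type(ages).__name__)        # Series
print(len(ages), ages.iloc[0])    # 2 37

# Like a dict, a DataFrame maps column names to Series.
for column_name, column in scientists.items():
    print(column_name, type(column).__name__)

print(list(scientists.keys()))    # ['Name', 'Age']
```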

Series

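Official documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html

Series attributes

Attribute	Description
loc	Select a subset by index label
iloc	Select a subset by integer position
dtype or dtypes	Data type of the Series contents
T	Transpose of the Series
shape	Dimensions of the data
size	Number of elements
values	The Series data as an ndarray or ndarray-like object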
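
As a quick illustration of the attributes in the table, here is a small sketch on a hypothetical Series with string labels (the values and labels are made up for the example).

```python
import pandas as pd

# Hypothetical Series with string labels as its index.
ages = pd.Series([37, 61, 90], index=['Franklin', 'Gosset', 'Nightingale'])

print(ages.loc['Gosset'])   # select by index label -> 61
print(ages.iloc[0])         # select by integer position -> 37
print(ages.dtype)           # an integer dtype (int64 on most platforms)
print(ages.shape)           # (3,)
print(ages.size)            # 3
print(ages.values)          # the underlying NumPy array
print(ages.T.equals(ages))  # a Series is 1-D, so its transpose holds the same data
```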

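Series methods

Method	Description
append	Concatenate two or more Series (removed in pandas 2.0; use pd.concat instead)
corr	Compute the correlation with another Series
cov	Compute the covariance with another Series
describe	Compute summary statistics
drop_duplicates	Return a Series without duplicates
equals	Test whether two Series contain the same elements
get_values	Get the values of the Series, same as the values attribute (removed in pandas 1.0)
hist	Draw a histogram
min	Return the minimum value
max	Return the maximum value
mean	Return the arithmetic mean
median	Return the median
mode	Return the mode(s)
replace	Replace values in the Series with the specified value
sample	Return a random sample of values from the Series
sort_values	Sort by values
to_frame	Convert to a DataFrame
transpose	Return the transpose
unique	Return the unique values as a numpy.ndarray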
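
A few of these methods in action, as a sketch on hypothetical values. Because Series.append is no longer available in pandas 2.0 and later, the example joins Series with pd.concat, which works across versions.

```python
import pandas as pd

ages = pd.Series([37, 61, 61, 90], name='Age')  # hypothetical values

print(ages.describe())                             # count, mean, std, quartiles
unique_ages = ages.drop_duplicates()               # Series without repeated values
print(unique_ages.tolist())                        # [37, 61, 90]
print(ages.unique())                               # NumPy array of distinct values
print(ages.sort_values(ascending=False).tolist())  # [90, 61, 61, 37]
print(ages.replace(61, 62).tolist())               # [37, 62, 62, 90]
print(ages.to_frame().shape)                       # (4, 1)

# pd.concat joins two or more Series end to end.
combined = pd.concat([ages, pd.Series([45], name='Age')], ignore_index=True)
print(combined.tolist())                           # [37, 61, 61, 90, 45]
```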

series.py


#!/usr/bin/python3
# -*- coding: utf-8 -*-
# CreateDate: 2018-3-14
# series.py

import pandas as pd
import numpy as np

scientists = pd.DataFrame(
    data={'Occupation': ['Chemist', 'Statistician'],
    'Born': ['1920-07-25', '1876-06-13'],
    'Died': ['1958-04-16', '1937-10-16'],
    'Age': [37, 61]},
    index=['Rosaline Franklin', 'William Gosset'],
    columns=['Occupation', 'Born', 'Died', 'Age'])
print(scientists)

# A row or a column taken from a DataFrame is a Series
# (despite its name, first_row holds the row labelled 'William Gosset')
first_row = scientists.loc['William Gosset']
print(type(first_row))
print(first_row)

# index and keys() return the same thing
print(first_row.index)
print(first_row.keys())
print(first_row.values)

print(first_row.index[0])
print(first_row.keys()[0])

# A pandas Series behaves much like a numpy.ndarray
ages = scientists['Age']
print(ages)

# Descriptive statistics; see http://pandas.pydata.org/pandas-docs/stable/basics.html#descriptive-statistics
print(ages.mean())
print(ages.min())
print(ages.max())
print(ages.std())

scientists = pd.read_csv('../data/scientists.csv')
ages = scientists['Age']
print(ages)
print(ages.mean())
print(ages.describe())
print(ages[ages > ages.mean()])
print(ages > ages.mean())
manual_bool_values = [True, True, False, False, True, True, False, False]
print(ages[manual_bool_values])

print(ages + ages)
print(ages * ages)
print(ages + 100)
print(ages * 2)
print(ages + pd.Series([1, 100]))
# print(ages + np.array([1, 100])) raises an error: a NumPy array added to a Series must have the same length
print(ages + np.array([1, 100, 1, 100, 1, 100, 1, 100]))

# Sorting: arithmetic aligns on the index automatically, so reversing the order does not change the sum
print(ages)
rev_ages = ages.sort_index(ascending=False)
print(rev_ages)
print(ages * 2)
print(ages + rev_ages)

Output:


$ python3 series.py 
                     Occupation        Born        Died  Age
Rosaline Franklin       Chemist  1920-07-25  1958-04-16   37
William Gosset     Statistician  1876-06-13  1937-10-16   61
<class 'pandas.core.series.Series'>
Occupation    Statistician
Born            1876-06-13
Died            1937-10-16
Age                     61
Name: William Gosset, dtype: object
Index(['Occupation', 'Born', 'Died', 'Age'], dtype='object')
Index(['Occupation', 'Born', 'Died', 'Age'], dtype='object')
['Statistician' '1876-06-13' '1937-10-16' 61]
Occupation
Occupation
Rosaline Franklin    37
William Gosset       61
Name: Age, dtype: int64
49.0
37
61
16.97056274847714
0    37
1    61
2    90
3    66
4    56
5    45
6    41
7    77
Name: Age, dtype: int64
59.125
count     8.000000
mean     59.125000
std      18.325918
min      37.000000
25%      44.000000
50%      58.500000
75%      68.750000
max      90.000000
Name: Age, dtype: float64
1    61
2    90
3    66
7    77
Name: Age, dtype: int64
0    False
1     True
2     True
3     True
4    False
5    False
6    False
7     True
Name: Age, dtype: bool
0    37
1    61
4    56
5    45
Name: Age, dtype: int64
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64
0    1369
1    3721
2    8100
3    4356
4    3136
5    2025
6    1681
7    5929
Name: Age, dtype: int64
0    137
1    161
2    190
3    166
4    156
5    145
6    141
7    177
Name: Age, dtype: int64
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64
0     38.0
1    161.0
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
dtype: float64
0     38
1    161
2     91
3    166
4     57
5    145
6     42
7    177
Name: Age, dtype: int64
0    37
1    61
2    90
3    66
4    56
5    45
6    41
7    77
Name: Age, dtype: int64
7    77
6    41
5    45
4    56
3    66
2    90
1    61
0    37
Name: Age, dtype: int64
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64
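
The NaN values and the unchanged sums in the output above both come from index alignment. The sketch below reproduces the effect on a hypothetical four-element Series and shows one possible way to avoid the NaN values, using Series.add with fill_value; whether treating missing labels as 0 is appropriate depends on the data.

```python
import pandas as pd

ages = pd.Series([37, 61, 90, 66])   # hypothetical subset of the ages above
bonus = pd.Series([1, 100])          # shorter Series: labels 0 and 1 only

# The + operator aligns on index labels; unmatched labels become NaN.
print(ages + bonus)

# Series.add with fill_value treats missing labels as 0 instead.
filled = ages.add(bonus, fill_value=0)
print(filled.tolist())               # [38.0, 161.0, 90.0, 66.0]

# Reversing the order does not change the sum, because labels still match.
rev_ages = ages.sort_index(ascending=False)
print((ages + rev_ages).sort_index().tolist())   # [74, 122, 180, 132]
```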

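DataFrame

DataFrame is the most common pandas object; it can be thought of as Python's way of storing spreadsheet-like data. Many common Series features are also available on DataFrame.

Subsetting methods

Note that ix is deprecated and should no longer be used (it was removed in pandas 1.0).

Common DataFrame indexing operations:

Method	Description
df[val]	Select a single column
df[[column1, column2, ...]]	Select multiple columns
df.loc[val]	Select a row by label
df.loc[[label1, label2, ...]]	Select multiple rows by label
df.loc[:, val]	Select a column by label
df.loc[val1, val2]	Select rows and columns by label
df.iloc[row_number]	Select a row by position
df.iloc[[row1, row2, ...]]	Select multiple rows by position
df.iloc[:, where]	Select columns by position
df.iloc[where_i, where_j]	Select rows and columns by position
df.at[label_i, label_j]	Select a single value by row and column label
df.iat[i, j]	Select a single value by row and column position
reindex method	Select rows or columns by label
get_value, set_value	Get or set a single value by row and column label (removed in pandas 1.0; use at)
df[bool]	Select rows with a boolean Series
df[[bool1, bool2, ...]]	Select rows with a list of booleans
df[start:stop:step]	Select rows by position with a slice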
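
The indexing forms in the table can be tried side by side. This is a sketch on a small hypothetical frame indexed by surname; the names and values are illustrative only.

```python
import pandas as pd

# Hypothetical frame indexed by name.
df = pd.DataFrame(
    {'Occupation': ['Chemist', 'Statistician', 'Nurse'],
     'Age': [37, 61, 90]},
    index=['Franklin', 'Gosset', 'Nightingale'])

print(df['Age'])                       # one column -> Series
print(df[['Occupation', 'Age']])       # several columns -> DataFrame
print(df.loc['Gosset'])                # one row by label -> Series
print(df.loc[['Franklin', 'Gosset']])  # several rows by label
print(df.loc[:, 'Age'])                # all rows, one column by label
print(df.loc['Gosset', 'Age'])         # row and column by label -> 61
print(df.iloc[0])                      # first row by position
print(df.iloc[[0, 2]])                 # rows 0 and 2 by position
print(df.iloc[:, 1])                   # second column by position
print(df.at['Nightingale', 'Age'])     # single value by label -> 90
print(df.iat[0, 1])                    # single value by position -> 37
print(df[df['Age'] > 50])              # rows where the condition holds
print(df[0:3:2])                       # every second row by position
```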

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# CreateDate: 2018-3-31
# df.py

import pandas as pd
import numpy as np

scientists = pd.read_csv('../data/scientists.csv')
print(scientists[scientists['Age'] > scientists['Age'].mean()])
first_half = scientists[:4]
second_half = scientists[4:]
print(first_half)
print(second_half)
print(first_half + second_half)
print(scientists * 2)

Output:



$ python3 df.py 
                   Name        Born        Died  Age     Occupation
1        William Gosset  1876-06-13  1937-10-16   61   Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90          Nurse
3           Marie Curie  1867-11-07  1934-07-04   66        Chemist
7          Johann Gauss  1777-04-30  1855-02-23   77  Mathematician
                   Name        Born        Died  Age    Occupation
0     Rosaline Franklin  1920-07-25  1958-04-16   37       Chemist
1        William Gosset  1876-06-13  1937-10-16   61  Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90         Nurse
3           Marie Curie  1867-11-07  1934-07-04   66       Chemist
            Name        Born        Died  Age          Occupation
4  Rachel Carson  1907-05-27  1964-04-14   56           Biologist
5      John Snow  1813-03-15  1858-06-16   45           Physician
6    Alan Turing  1912-06-23  1954-06-07   41  Computer Scientist
7   Johann Gauss  1777-04-30  1855-02-23   77       Mathematician
  Name Born Died  Age Occupation
0  NaN  NaN  NaN  NaN        NaN
1  NaN  NaN  NaN  NaN        NaN
2  NaN  NaN  NaN  NaN        NaN
3  NaN  NaN  NaN  NaN        NaN
4  NaN  NaN  NaN  NaN        NaN
5  NaN  NaN  NaN  NaN        NaN
6  NaN  NaN  NaN  NaN        NaN
7  NaN  NaN  NaN  NaN        NaN
                                       Name                  Born  \
0        Rosaline FranklinRosaline Franklin  1920-07-251920-07-25   
1              William GossetWilliam Gosset  1876-06-131876-06-13   
2  Florence NightingaleFlorence Nightingale  1820-05-121820-05-12   
3                    Marie CurieMarie Curie  1867-11-071867-11-07   
4                Rachel CarsonRachel Carson  1907-05-271907-05-27   
5                        John SnowJohn Snow  1813-03-151813-03-15   
6                    Alan TuringAlan Turing  1912-06-231912-06-23   
7                  Johann GaussJohann Gauss  1777-04-301777-04-30   

                   Died  Age                            Occupation  
0  1958-04-161958-04-16   74                        ChemistChemist  
1  1937-10-161937-10-16  122              StatisticianStatistician  
2  1910-08-131910-08-13  180                            NurseNurse  
3  1934-07-041934-07-04  132                        ChemistChemist  
4  1964-04-141964-04-14  112                    BiologistBiologist  
5  1858-06-161858-06-16   90                    PhysicianPhysician  
6  1954-06-071954-06-07   82  Computer ScientistComputer Scientist  
7  1855-02-231855-02-23  154            MathematicianMathematician
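
The all-NaN table produced by first_half + second_half follows from index alignment: the two halves share no row labels, so no cell has a partner to be added to. The sketch below shows this on a hypothetical single-column frame and, as one possible alternative when the goal is to stack the halves, uses pd.concat.

```python
import pandas as pd

# Hypothetical numeric frame standing in for scientists.csv.
df = pd.DataFrame({'Age': [37, 61, 90, 66]})
first_half = df[:2]      # index labels 0, 1
second_half = df[2:]     # index labels 2, 3

# No label appears in both halves, so every aligned cell is NaN.
summed = first_half + second_half
print(summed)

# To stack the halves back together, concatenate instead of adding.
restored = pd.concat([first_half, second_half])
print(restored.equals(df))   # True
```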

Modifying Columns

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Author:    xurongzhong#126.com wechat:pythontesting qq:37391319
# QQ groups: 144081101 591302926  567351477
# CreateDate: 2018-06-07
# change.py

import pandas as pd
import numpy as np
import random

scientists = pd.read_csv('../data/scientists.csv')
print(scientists['Born'].dtype)
print(scientists['Died'].dtype)
print(scientists.head())

# Convert the strings to datetime; format codes: https://docs.python.org/3.5/library/datetime.html
born_datetime = pd.to_datetime(scientists['Born'], format='%Y-%m-%d')
died_datetime = pd.to_datetime(scientists['Died'], format='%Y-%m-%d')
# Add the new columns
scientists['born_dt'], scientists['died_dt'] = (born_datetime, died_datetime)
print(scientists.shape)
print(scientists.head())

random.seed(42)
# Shuffles the Age column of scientists in place; pandas reports this
# as the SettingWithCopyWarning visible in the output.
random.shuffle(scientists['Age'])
print(scientists.head())

scientists['age_days_dt'] = (scientists['died_dt'] - scientists['born_dt'])
print(scientists.head())

Output:

$ python3 change.py 
object
object
                   Name        Born        Died  Age    Occupation
0     Rosaline Franklin  1920-07-25  1958-04-16   37       Chemist
1        William Gosset  1876-06-13  1937-10-16   61  Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90         Nurse
3           Marie Curie  1867-11-07  1934-07-04   66       Chemist
4         Rachel Carson  1907-05-27  1964-04-14   56     Biologist
(8, 7)
                   Name        Born        Died  Age    Occupation    born_dt  \
0     Rosaline Franklin  1920-07-25  1958-04-16   37       Chemist 1920-07-25   
1        William Gosset  1876-06-13  1937-10-16   61  Statistician 1876-06-13   
2  Florence Nightingale  1820-05-12  1910-08-13   90         Nurse 1820-05-12   
3           Marie Curie  1867-11-07  1934-07-04   66       Chemist 1867-11-07   
4         Rachel Carson  1907-05-27  1964-04-14   56     Biologist 1907-05-27   

     died_dt  
0 1958-04-16  
1 1937-10-16  
2 1910-08-13  
3 1934-07-04  
4 1964-04-14  
/usr/lib/python3.5/random.py:272: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  x[i], x[j] = x[j], x[i]
                   Name        Born        Died  Age    Occupation    born_dt  \
0     Rosaline Franklin  1920-07-25  1958-04-16   66       Chemist 1920-07-25   
1        William Gosset  1876-06-13  1937-10-16   56  Statistician 1876-06-13   
2  Florence Nightingale  1820-05-12  1910-08-13   41         Nurse 1820-05-12   
3           Marie Curie  1867-11-07  1934-07-04   77       Chemist 1867-11-07   
4         Rachel Carson  1907-05-27  1964-04-14   90     Biologist 1907-05-27   

     died_dt  
0 1958-04-16  
1 1937-10-16  
2 1910-08-13  
3 1934-07-04  
4 1964-04-14  
                   Name        Born        Died  Age    Occupation    born_dt  \
0     Rosaline Franklin  1920-07-25  1958-04-16   66       Chemist 1920-07-25   
1        William Gosset  1876-06-13  1937-10-16   56  Statistician 1876-06-13   
2  Florence Nightingale  1820-05-12  1910-08-13   41         Nurse 1820-05-12   
3           Marie Curie  1867-11-07  1934-07-04   77       Chemist 1867-11-07   
4         Rachel Carson  1907-05-27  1964-04-14   90     Biologist 1907-05-27   

     died_dt age_days_dt  
0 1958-04-16  13779 days  
1 1937-10-16  22404 days  
2 1910-08-13  32964 days  
3 1934-07-04  24345 days  
4 1964-04-14  20777 days
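
The SettingWithCopyWarning in the output comes from random.shuffle swapping elements of the column one pair at a time. One possible alternative, sketched here on a hypothetical three-row frame rather than the tutorial's CSV, is to shuffle with Series.sample and assign the raw values back. The sketch also converts the timedelta column to plain integer days with the .dt.days accessor.

```python
import pandas as pd

# Hypothetical three-row stand-in for scientists.csv.
scientists = pd.DataFrame({
    'Born': ['1920-07-25', '1876-06-13', '1820-05-12'],
    'Died': ['1958-04-16', '1937-10-16', '1910-08-13'],
    'Age': [37, 61, 90],
})

# Shuffle with sample(); .values drops the index so the shuffled
# order is not undone by label alignment on assignment.
scientists['Age'] = scientists['Age'].sample(frac=1, random_state=42).values

born_dt = pd.to_datetime(scientists['Born'], format='%Y-%m-%d')
died_dt = pd.to_datetime(scientists['Died'], format='%Y-%m-%d')
scientists['age_days'] = (died_dt - born_dt).dt.days   # Timedelta -> integer days
print(scientists)
```

The age_days values match the day counts printed for the same three scientists in the output above.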

Importing and Exporting Data

out.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Author:    china-testing#126.com wechat:pythontesting QQ group:630011153
# CreateDate: 2018-3-31
# out.py

import pandas as pd
import numpy as np
import random

scientists = pd.read_csv('../data/scientists.csv')

names = scientists['Name']
print(names)
names.to_pickle('../output/scientists_names_series.pickle')
scientists.to_pickle('../output/scientists_df.pickle')

# .p, .pkl and .pickle are common pickle file extensions
scientists_from_pickle = pd.read_pickle('../output/scientists_df.pickle')
print(scientists_from_pickle)

names.to_csv('../output/scientist_names_series.csv')
scientists.to_csv('../output/scientists_df.tsv', sep='\t')
# Do not write the row index
scientists.to_csv('../output/scientists_df_no_index.csv', index=False)

# A Series can be converted to a DataFrame and then exported to an Excel file
names_df = names.to_frame()
# Note: writing the legacy .xls format relies on the xlwt engine, which pandas 2.0+ no longer supports
names_df.to_excel('../output/scientists_names_series_df.xls')
names_df.to_excel('../output/scientists_names_series_df.xlsx')

scientists.to_excel('../output/scientists_df.xlsx', sheet_name='scientists',
                    index=False)

Output:

$ python3 out.py 
0       Rosaline Franklin
1          William Gosset
2    Florence Nightingale
3             Marie Curie
4           Rachel Carson
5               John Snow
6             Alan Turing
7            Johann Gauss
Name: Name, dtype: object
                   Name        Born        Died  Age          Occupation
0     Rosaline Franklin  1920-07-25  1958-04-16   37             Chemist
1        William Gosset  1876-06-13  1937-10-16   61        Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90               Nurse
3           Marie Curie  1867-11-07  1934-07-04   66             Chemist
4         Rachel Carson  1907-05-27  1964-04-14   56           Biologist
5             John Snow  1813-03-15  1858-06-16   45           Physician
6           Alan Turing  1912-06-23  1954-06-07   41  Computer Scientist
7          Johann Gauss  1777-04-30  1855-02-23   77       Mathematician

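Note: recent pandas versions also provide Series.to_excel, so a Series can usually be written to an Excel file directly; converting it to a DataFrame first also works.

More export methods (some of them, such as to_msgpack and to_sparse, were removed in pandas 1.0):

Method	Description
to_clipboard	Save data to the system clipboard for pasting
to_dense	Convert data to a regular "dense" DataFrame
to_dict	Convert data to a Python dictionary
to_gbq	Convert data to a Google BigQuery table
to_hdf	Save data in the Hierarchical Data Format (HDF)
to_msgpack	Save data to a portable JSON-like binary file
to_html	Convert data to an HTML table
to_json	Convert data to a JSON string
to_latex	Convert data to a LaTeX tabular environment
to_records	Convert data to a record array
to_string	Render the DataFrame as a string for stdout
to_sparse	Convert data to a SparseDataFrame
to_sql	Save data to a SQL database
to_stata	Convert data to a Stata dta file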
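
Several of the export methods in the table need no files or extra packages, so they are easy to try. The sketch below uses a hypothetical two-row frame and only in-memory conversions.

```python
import json
import pandas as pd

# Hypothetical rows, not read from scientists.csv.
df = pd.DataFrame({'Name': ['John Snow', 'Alan Turing'], 'Age': [45, 41]})

print(df.to_dict(orient='list'))      # {'Name': [...], 'Age': [...]}
json_text = df.to_json(orient='records')
rows = json.loads(json_text)          # parse the JSON string back into Python objects
print(rows)
print(df.to_string(index=False))      # plain-text table
print(df.to_records(index=False))     # NumPy record array
print(df.to_html(index=False)[:6])    # the markup starts with '<table'
```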

Reading CSV Files

read_csv.py


#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Author:    china-testing#126.com wechat:pythontesting QQ group:630011153
# CreateDate: 2018-3-9
# read_csv.py

import pandas as pd

df = pd.read_csv("1.csv", header=None)  # do not treat the first line as column names
print("df:")
print(df)

print("df.head():")
print(df.head())  # head(n=5) returns the first 5 rows by default; tail is the counterpart
print("df.tail():")
print(df.tail())

df = pd.read_csv("1.csv")  # by default the first line is read as column names
print("df:")
print(df)

# Custom column names ('号码' = number, '群号' = group number)
df = pd.read_csv("1.csv", names=['号码','群号'])
print("df:")
print(df)

# Custom column names, skipping the first line
df = pd.read_csv("1.csv", skiprows=[0], names=['号码','群号'])
print("df:")
print(df)

Output:


df:
           0          1
0         qq    qqgroup
1   37391319  144081101
2   37391320  144081102
3   37391321  144081103
4   37391322  144081104
5   37391323  144081105
6   37391324  144081106
7   37391325  144081107
8   37391326  144081108
9   37391327  144081109
10  37391328  144081110
11  37391329  144081111
12  37391330  144081112
13  37391331  144081113
14  37391332  144081114
15  37391333  144081115
df.head():
          0          1
0        qq    qqgroup
1  37391319  144081101
2  37391320  144081102
3  37391321  144081103
4  37391322  144081104
df.tail():
           0          1
11  37391329  144081111
12  37391330  144081112
13  37391331  144081113
14  37391332  144081114
15  37391333  144081115
df:
          qq    qqgroup
0   37391319  144081101
1   37391320  144081102
2   37391321  144081103
3   37391322  144081104
4   37391323  144081105
5   37391324  144081106
6   37391325  144081107
7   37391326  144081108
8   37391327  144081109
9   37391328  144081110
10  37391329  144081111
11  37391330  144081112
12  37391331  144081113
13  37391332  144081114
14  37391333  144081115
df:
          号码         群号
0         qq    qqgroup
1   37391319  144081101
2   37391320  144081102
3   37391321  144081103
4   37391322  144081104
5   37391323  144081105
6   37391324  144081106
7   37391325  144081107
8   37391326  144081108
9   37391327  144081109
10  37391328  144081110
11  37391329  144081111
12  37391330  144081112
13  37391331  144081113
14  37391332  144081114
15  37391333  144081115
df:
          号码         群号
0   37391319  144081101
1   37391320  144081102
2   37391321  144081103
3   37391322  144081104
4   37391323  144081105
5   37391324  144081106
6   37391325  144081107
7   37391326  144081108
8   37391327  144081109
9   37391328  144081110
10  37391329  144081111
11  37391330  144081112
12  37391331  144081113
13  37391332  144081114
14  37391333  144081115
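
The same read_csv options can be tried without the 1.csv file. The sketch below feeds a hypothetical two-row sample through io.StringIO; the English column names number and group are stand-ins chosen for this example.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same layout as 1.csv.
csv_text = "qq,qqgroup\n37391319,144081101\n37391320,144081102\n"

no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(no_header.shape)            # (3, 2): the header line is kept as data

with_header = pd.read_csv(io.StringIO(csv_text))
print(list(with_header.columns))  # ['qq', 'qqgroup']

renamed = pd.read_csv(io.StringIO(csv_text), skiprows=[0],
                      names=['number', 'group'])
print(renamed.dtypes)             # both columns parse as integers
print(renamed.shape)              # (2, 2)
```

With header=None the header text lands in row 0, which forces both columns to be read as strings; skipping that row lets the values parse as integers.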

Writing CSV Files

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# write_csv.py

import pandas as pd

data = {'qq': [37391319, 37391320], 'group': [1, 2]}

df = pd.DataFrame(data=data, columns=['qq','group'])
df.to_csv('2.csv',index=False)
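
To check what to_csv writes without creating 2.csv, the data can be round-tripped through an in-memory buffer. This is a minimal sketch that uses io.StringIO in place of a file path.

```python
import io
import pandas as pd

df = pd.DataFrame({'qq': [37391319, 37391320], 'group': [1, 2]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)    # same call as above, written to memory
print(buffer.getvalue())

buffer.seek(0)                    # rewind before reading the text back
round_trip = pd.read_csv(buffer)
print(round_trip.equals(df))      # True
```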

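Reading and writing Excel files works much like CSV, except that read_excel is used for reading. excel_summary_demo, which sums data across multiple Excel files, can serve as a worked example of Excel I/O, so the details are not repeated here.

The article "使用pandas处理excel" (Processing Excel with pandas) collects more material on handling Excel with pandas for further study.

Discussion: QQ groups 144081101, 591302926, 567351477; free DingTalk group 21745728

Last edited: 2018.08.26 10:31:45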
