1. Official documentation
ndarray.size
Number of elements in the array. pandas exposes the same attribute on Series and DataFrame:
>>> import pandas as pd
>>> s = pd.Series({'a': 1, 'b': 2, 'c': 3})
>>> s.size
3
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
>>> df.size
4
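As a quick check of the definition above, the following sketch compares size with shape. The variable names (series_size, frame_size) are only illustrative; the point is that a Series has as many elements as rows, while a DataFrame has rows times columns.

```python
import pandas as pd

s = pd.Series({'a': 1, 'b': 2, 'c': 3})
df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# For a Series, size is simply its length.
series_size = s.size

# For a DataFrame, size is the number of rows times the number of columns.
rows, cols = df.shape
frame_size = df.size

print(series_size, len(s))
print(frame_size, rows * cols)
```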
2. size includes NaN values; count does not:
In [46]:
df = pd.DataFrame({'a': [0, 0, 1, 2, 2, 2], 'b': [1, 2, 3, 4, np.nan, 4], 'c': np.random.randn(6)})
df
Out[46]:
   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943
In [48]:
print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())
a
0 2
1 1
2 2
Name: b, dtype: int64
a
0 2
1 1
2 3
dtype: int64
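Because size counts every row in a group and count skips NaN, their difference gives the number of missing values per group. A minimal sketch using the same 'a' and 'b' columns as above (the names sizes, counts and missing are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 1, 2, 2, 2],
                   'b': [1, 2, 3, 4, np.nan, 4]})

g = df.groupby('a')['b']
sizes = g.size()    # rows per group, NaN included
counts = g.count()  # non-NaN values per group

# The difference is the number of NaN entries of 'b' in each group.
missing = sizes - counts
print(missing)
```

Only group 2 contains a NaN in 'b', so it is the only group where the two results differ.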
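3. Even when the data has no NA values, the result of count() is more verbose: size() returns one number per group, whereas count() returns a separate count for every column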
In [114]:
grouped = fec_mrbo.groupby(['cand_nm',labels])
grouped.size().unstack(0)
Out[114]:
cand_nm              Obama, Barack  Romney, Mitt
contb_receipt_amt
(0, 1]                       493.0          77.0
(1, 10]                    40070.0        3681.0
(10, 100]                 372280.0       31853.0
(100, 1000]               153991.0       43357.0
(1000, 10000]              22284.0       26186.0
(10000, 100000]                2.0           1.0
(100000, 1000000]              3.0           NaN
(1000000, 10000000]            4.0           NaN
In [115]:
grouped = fec_mrbo.groupby(['cand_nm',labels])
grouped.count().unstack(0)
Out[115]:
cmte_id cand_id contbr_nm contbr_city contbr_st ... memo_cd memo_text form_tp file_num parties
cand_nm Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt ... Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt
contb_receipt_amt
(0, 1] 493.0 77.0 493.0 77.0 493.0 77.0 493.0 77.0 493.0 77.0 ... 31.0 1.0 138.0 10.0 493.0 77.0 493.0 77.0 493.0 77.0
(1, 10] 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 ... 4645.0 14.0 4781.0 53.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0
(10, 100] 372280.0 31853.0 372280.0 31853.0 372280.0 31853.0 372276.0 31853.0 372280.0 31853.0 ... 33331.0 74.0 33789.0 236.0 372280.0 31853.0 372280.0 31853.0 372280.0 31853.0
(100, 1000] 153991.0 43357.0 153991.0 43357.0 153991.0 43357.0 153991.0 43355.0 153987.0 43357.0 ... 31674.0 347.0 31897.0 849.0 153991.0 43357.0 153991.0 43357.0 153991.0 43357.0
(1000, 10000] 22284.0 26186.0 22284.0 26186.0 22284.0 26186.0 22284.0 26185.0 22284.0 26186.0 ... 16622.0 640.0 16693.0 2217.0 22284.0 26186.0 22284.0 26186.0 22284.0 26186.0
(10000, 100000] 2.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 ... 0.0 1.0 1.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0
(100000, 1000000] 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN ... 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN
(1000000, 10000000] 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN ... 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN
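The difference in shape can be reproduced on a small frame. The data below is made up purely for illustration (the frame toy and its columns cand, bucket, city, amt are hypothetical stand-ins, not the fec_mrbo data): size() yields a Series with one value per group, while count() yields a DataFrame with one column per non-key column. Selecting a single column before calling count() is one way to keep the output compact.

```python
import pandas as pd

# Hypothetical toy data standing in for the real contribution records.
toy = pd.DataFrame({
    'cand': ['A', 'A', 'B', 'B', 'B'],
    'bucket': ['low', 'high', 'low', 'low', 'high'],
    'city': ['x', 'y', 'x', None, 'z'],
    'amt': [5, 500, 7, 8, 900],
})
grouped = toy.groupby(['cand', 'bucket'])

sizes = grouped.size()    # Series: one number per (cand, bucket) group
counts = grouped.count()  # DataFrame: one count column per non-key column

print(sizes.unstack(0))
print(counts.unstack(0))

# Counting a single column gives output as compact as size(),
# but still ignores missing values in that column.
amt_counts = grouped['amt'].count()
print(amt_counts.unstack(0))
```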