

Pandas Indexing and Selecting Data

In this chapter, we discuss how to slice and dice the data and how to obtain subsets of large pandas objects.

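The Python and NumPy indexing operator "[]" and the attribute operator "." give quick and easy access to pandas data structures across a wide range of use cases. However, because the type of the data being accessed is not known in advance, using the standard operators directly has some optimization limits. For production code, we recommend the optimized pandas data access methods described in this chapter.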

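Pandas has offered three types of multi-axis indexing, listed in the table below. Two of them, .loc and .iloc, remain in current versions; .ix was deprecated in pandas 0.20 and removed in pandas 1.0. All three are indexers used with square brackets, for example df.loc[...], not functions that are called.

No.  Indexer   Description
1    .loc()    Label based
2    .iloc()   Integer position based
3    .ix()     Both label and integer based (removed in pandas 1.0)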

.loc()

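Pandas provides various methods for purely label-based indexing. When slicing with .loc, both the start bound and the stop bound are included. Integers are valid labels, but they refer to the label and not to the position.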
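The difference between labels and positions is easiest to see on an index whose integer labels do not match the positions. The following is a minimal sketch; the Series `s` and its values are made up for illustration.

```python
import pandas as pd

# A Series whose integer labels (10, 20, 30) differ from the positions (0, 1, 2)
s = pd.Series(["x", "y", "z"], index=[10, 20, 30])

by_label = s.loc[10]          # looks up the label 10, not position 10
label_slice = s.loc[10:20]    # label slice: both end points are included
position_slice = s.iloc[0:1]  # positional slice: the stop position is excluded

print(by_label)
print(list(label_slice))
print(list(position_slice))

# 0 is a valid position but not a label, so .loc raises KeyError
try:
    s.loc[0]
except KeyError:
    print("0 is not a label in the index")
```

With .loc, the integer is always matched against the index labels, so code that relies on positions should use .iloc instead.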

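.loc() supports several kinds of access, such as:

  • A single scalar label
  • A list of labels
  • A slice object
  • A Boolean array

loc takes two single/list/range indexers separated by ",". The first one selects rows and the second one selects columns.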
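As a small sketch of the row and column indexers working together, the example below uses a fixed frame (the values are chosen only for illustration) so that each result is easy to check by eye.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["a", "b", "c"],
)

cell = df.loc["b", "C"]              # one label per axis -> a single value
rows = df.loc[["a", "c"], "A"]       # list of row labels, one column -> Series
block = df.loc["a":"b", ["A", "B"]]  # row slice, list of columns -> DataFrame

print(cell)
print(rows)
print(block)
```

The shape of the result follows from the indexers: a single label removes that axis, while a list or a slice keeps it.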

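Example 1

# Import pandas under the alias pd and NumPy under the alias np
import pandas as pd
import numpy as np

# 8 x 4 frame of random values with string row labels
df = pd.DataFrame(np.random.randn(8, 4),
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                  columns=['A', 'B', 'C', 'D'])

# Select all rows for a specific column
print(df.loc[:, 'A'])

Running the example above produces output like the following (the values are random, so they change on every run):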

a    0.015860
b   -0.014135
c    0.446061
d    1.801269
e   -1.404779
f   -0.044016
g    0.996651
h    0.764672
Name: A, dtype: float64

Example 2

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4),
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                  columns=['A', 'B', 'C', 'D'])

# Select all rows for multiple columns, given as a list
print(df.loc[:, ['A', 'C']])

Running the example above produces output like the following:

          A         C
a -0.529735 -1.067299
b -2.230089 -1.798575
c  0.685852  0.333387
d  1.061853  0.131853
e  0.990459  0.189966
f  0.057314 -0.370055
g  0.453960 -0.624419
h  0.666668 -0.433971

Example 3

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4),
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                  columns=['A', 'B', 'C', 'D'])

# Select a few rows for multiple columns, both given as lists
print(df.loc[['a', 'b', 'f', 'h'], ['A', 'C']])
# Select all rows for multiple columns, given as a list
print(df.loc[:, ['A', 'C']])

Running the example above produces output like the following:

          A         C
a -1.959731  0.720956
b  1.318976  0.199987
f -1.117735 -0.181116
h -0.147029  0.027369
          A         C
a -1.959731  0.720956
b  1.318976  0.199987
c  0.839221 -1.611226
d  0.722810  1.649130
e -0.524845 -0.037824
f -1.117735 -0.181116
g -0.642907  0.443261
h -0.147029  0.027369

Example 4

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4),
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                  columns=['A', 'B', 'C', 'D'])

# Select a range of rows for all columns; both 'a' and 'h' are included
print(df.loc['a':'h'])

Running the example above produces output like the following:

          A         B         C         D
a  1.556186  1.765712  1.060657  0.810279
b  1.377965 -0.183283 -0.224379  0.963105
c -0.530016  0.167183 -0.066459  0.074198
d -1.515189 -1.453529 -1.559400  1.072148
e -0.487399  0.436143 -1.045622 -0.029507
f  0.552548  0.410745  0.570222 -0.628133
g  0.865293 -0.638388  0.388827 -0.469282
h -0.690596  1.765139 -0.492070 -0.176074

Example 5

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4),
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                  columns=['A', 'B', 'C', 'D'])

# Compare the values in row 'a' with 0; the result is a Boolean Series
# that can itself be used as an indexer
print(df.loc['a'] > 0)

Running the example above produces output like the following:

A    False
B     True
C    False
D     True
Name: a, dtype: bool
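A Boolean result like the one above can be passed back to .loc to filter rows or columns. The sketch below assumes a small fixed frame (the values are illustrative only) so the selection is reproducible.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1.5, -0.3, 0.8, -2.0], "B": [-0.1, 0.2, -0.4, 0.9]},
    index=["a", "b", "c", "d"],
)

mask = df["A"] > 0            # Boolean Series aligned with the row index
positive_rows = df.loc[mask]  # keeps only the rows where the mask is True

# A Boolean mask built from a row selects along the column axis
positive_cols = df.loc[:, df.loc["a"] > 0]

print(positive_rows)
print(positive_cols)
```

Here the row mask keeps rows 'a' and 'c', and the column mask keeps only column 'A', because that is the only positive value in row 'a'.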

.iloc()

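Pandas provides various methods for purely integer-based indexing. As in Python and NumPy, positions are 0-based, and a slice excludes its stop position.

The various access methods are as follows:

  • An integer
  • A list of integers
  • A slice (range of values)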
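The three access forms can be sketched on a small frame with string row labels, which makes it clear that .iloc ignores the labels and works only with positions. The frame and its values are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]},
    index=["w", "x", "y", "z"],
)

first_row = df.iloc[0]       # single integer -> the first row as a Series
picked = df.iloc[[0, 2]]     # list of integers -> rows at positions 0 and 2
middle = df.iloc[1:3]        # slice -> positions 1 and 2 (stop is excluded)
last_cell = df.iloc[-1, -1]  # negative positions count from the end

print(first_row)
print(picked)
print(middle)
print(last_cell)
```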

Example 1

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Select the first four rows (positions 0 to 3) for all columns
print(df.iloc[:4])

Running the example above produces output like the following:

          A         B         C         D
0  0.277146  0.274234  0.860555 -1.312323
1 -1.064776  2.082030  0.695930  2.409340
2  0.033953 -1.155217  0.113045 -0.028330
3  0.241075 -2.156415  0.939586 -1.670171

Example 2

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Integer slicing: the first four rows, then rows 1 to 4 with columns 2 and 3
print(df.iloc[:4])
print(df.iloc[1:5, 2:4])

Running the example above produces output like the following:

          A         B         C         D
0  1.346210  0.251839  0.975964  0.319049
1  0.459074  0.038155  0.893615  0.659946
2 -1.097043  0.017080  0.869331 -1.443731
3  1.008033 -0.189436 -0.483688 -1.167312
          C         D
1  0.893615  0.659946
2  0.869331 -1.443731
3 -0.483688 -1.167312
4  1.566395 -1.292206

Example 3

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Select by lists of positions, then by row slice, then by column slice
print(df.iloc[[1, 3, 5], [1, 3]])
print(df.iloc[1:3, :])
print(df.iloc[:, 1:3])

Running the example above produces output like the following:

          B         D
1  0.081257  0.009109
3  1.037680 -1.467327
5  1.106721  0.320468
          A         B         C         D
1 -0.133711  0.081257 -0.031869  0.009109
2  0.895576 -0.513450 -0.048573  0.698965
          B         C
0  0.442735 -0.949859
1  0.081257 -0.031869
2 -0.513450 -0.048573
3  1.037680 -0.801157
4 -0.547456 -0.255016
5  1.106721  0.688142
6 -0.466452  0.219914
7  1.583112  0.982030

.ix()

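Besides pure label-based and integer-based indexing, older versions of pandas provided a hybrid method for selecting and subsetting objects, the .ix() operator. It was deprecated in pandas 0.20 and removed in pandas 1.0, so the examples in this section run only on older versions; current code should use .loc or .iloc instead.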

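Example 1

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Integer slicing: on an integer index .ix treats 4 as a label,
# so the row labelled 4 is included
# (requires pandas < 1.0; df.loc[:4] gives the same rows in current versions)
print(df.ix[:4])

Running the example above produces output like the following: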

          A         B         C         D
0 -1.449975 -0.002573  1.349962  0.539765
1 -1.249462 -0.800467  0.483950  0.187853
2  1.361273 -1.893519  0.307613 -0.119003
3 -0.103433 -1.058175 -0.587307 -0.114262
4 -0.612298  0.873136 -0.607457  1.047772

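Example 2

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Index slicing: all rows, column 'A' by label
# (requires pandas < 1.0; df.loc[:, 'A'] is the equivalent in current versions)
print(df.ix[:, 'A'])

Running the example above produces output like the following: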

0    1.539915
1    1.359477
2    0.239694
3    0.563254
4    2.123950
5    0.341554
6   -0.075717
7   -0.606742
Name: A, dtype: float64
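On pandas 1.0 and later, where .ix is not available, a selection that mixes positions on one axis with labels on the other can be written with .loc or .iloc by translating one side. The following is one possible sketch; the frame is made up, and `Index.get_loc` is used to turn a column label into a position.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6]},
    index=["a", "b", "c"],
)

# Rows by position, column by label: translate the positions into labels
by_loc = df.loc[df.index[[0, 2]], "B"]

# The same selection with .iloc: translate the column label into a position
by_iloc = df.iloc[[0, 2], df.columns.get_loc("B")]

print(by_loc)
print(by_iloc.equals(by_loc))
```

Either form makes it explicit which axis is addressed by label and which by position, which is the ambiguity that .ix left implicit.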

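Use of Notations

Getting values from a pandas object with multi-axis indexing uses the following notation:

Object     Indexer notation                             Description
Series     s.loc[indexer]                               Scalar value when the indexer is a single label
DataFrame  df.loc[row_index,col_index]                  Scalar value when both indexers are single labels
Panel      p.loc[item_index,major_index, minor_index]   (Panel was removed in pandas 0.25)

Note: .iloc() and .ix() accept the same indexing options and produce the same kinds of return values (.ix() only in pandas versions before 1.0).

Now let us see how each operation can be performed on a DataFrame object. The examples below use the basic indexing operator "[]":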

Example 1

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# A single column name returns that column as a Series
print(df['A'])

Running the example above produces output like the following:

0    0.028277
1   -1.037595
2   -0.563495
3   -1.196961
4   -0.805250
5   -0.911648
6   -0.355171
7   -0.232612
Name: A, dtype: float64

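Example 2

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# A list of column names returns a DataFrame with those columns
print(df[['A', 'B']])

Running the example above produces output like the following: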

          A         B
0 -0.767339 -0.729411
1 -0.563540 -0.639142
2  0.873589 -2.166382
3  0.900330  0.253875
4 -0.520105  0.064438
5 -1.452176 -0.440864
6 -0.291556 -0.861924
7 -1.464235  0.313168

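Example 3

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# A slice inside [] selects rows; 2:2 starts and stops at the same
# position, so the result is empty
print(df[2:2])

Running the example above produces the following output: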

Empty DataFrame
Columns: [A, B, C, D]
Index: []
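The empty result comes from the half-open slice: the stop position is excluded, so a slice needs a stop greater than its start to return rows. A short sketch with a made-up single-column frame:

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30, 40]})

empty = df[2:2]     # start equals stop, so no rows are selected
one_row = df[2:3]   # only the row at position 2
first_two = df[:2]  # rows at positions 0 and 1

print(len(empty))
print(one_row)
print(first_two)
```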

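Attribute Access

Columns can be selected using the attribute operator ".".

Example

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Equivalent to df['A'] when the column name is a valid Python identifier
print(df.A)

Running the example above produces output like the following: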

0    0.104820
1   -1.206600
2    0.469083
3   -0.821226
4   -1.238865
5    1.083185
6   -0.827833
7   -0.199558
Name: A, dtype: float64
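Attribute access is a convenience and does not work for every column name. The sketch below uses made-up column names to show two cases where "[]" is needed: a name that collides with an existing DataFrame method, and a name that is not a valid Python identifier.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "count": [3, 4], "my col": [5, 6]})

same = df.A.equals(df["A"])  # attribute access matches df["A"]

# "count" is also a DataFrame method, so df.count is the method, not the column
is_method = callable(df.count)
count_col = df["count"]

# "my col" is not a valid Python identifier, so only [] can reach it
spaced = df["my col"]

print(same, is_method)
print(list(count_col), list(spaced))
```

For this reason, "[]" or .loc is usually the safer choice in code that has to handle arbitrary column names.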