In Pandas, loc and iloc are two commonly used indexers that are easy to confuse. Understanding the difference between them is essential for selecting the data you actually intend to select.
Label vs. Location
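The key distinction between loc and iloc lies in how they select data: loc selects rows (and columns) by their index labels, while iloc selects them by their integer position, counting from 0.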
Examples:
Consider a DataFrame df whose single column holds letters and whose integer index is non-monotonic:
import pandas as pd
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e', 'f']}, index=[49, 48, 47, 0, 1, 2])
loc (Label-Based Slicing):
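As a minimal sketch of label-based selection, the snippet below rebuilds the DataFrame from the example and looks up rows by their index labels. The variable names `by_label` and `label_slice` are illustrative and not part of the original article.

```python
import pandas as pd

# Same DataFrame as in the example: letters as values,
# a non-monotonic integer index as labels.
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e', 'f']},
                  index=[49, 48, 47, 0, 1, 2])

# loc treats 0 as an index label, so this is the row labelled 0
# (the fourth row), not the first row.
by_label = df.loc[0, 'col1']

# A label slice includes both endpoints: the rows labelled 0 and 1.
label_slice = df.loc[0:1]

print(by_label)     # d
print(label_slice)
```

Note that the slice returns two rows, because with loc the end label is part of the result.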
iloc (Location-Based Slicing):
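As a minimal sketch of position-based selection, the snippet below uses the same DataFrame and selects rows by where they sit, ignoring the index labels entirely. The variable names `by_position`, `position_slice`, and `last_row` are illustrative and not part of the original article.

```python
import pandas as pd

# Same DataFrame as in the example: letters as values,
# a non-monotonic integer index as labels.
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e', 'f']},
                  index=[49, 48, 47, 0, 1, 2])

# iloc treats 0 as a position: first row, first column.
by_position = df.iloc[0, 0]

# A positional slice excludes its end, like a Python list slice,
# so 0:1 yields only the first row (labelled 49).
position_slice = df.iloc[0:1]

# Negative positions count from the end, so -1 is the last row.
last_row = df.iloc[-1]

print(by_position)       # a
print(position_slice)
print(last_row['col1'])  # f
```

The same expression `0:1` therefore returns one row with iloc and two rows with loc.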
Key Differences:
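The following table highlights the differences between loc and iloc in various scenarios. Here s denotes the Series df['col1'] from the example, and each entry in the Object column is the value passed inside the brackets, as in s.loc[...] or s.iloc[...]: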
Object | Description | loc | iloc |
---|---|---|---|
0 | Single item | Value at index label 0 ('d') | Value at index location 0 ('a') |
0:1 | Slice | Two rows (labels 0 and 1) | One row (first row at location 0) |
1:47 | Slice with out-of-bounds end | Zero rows (empty Series) | Five rows (location 1 onwards) |
1:47:-1 | Slice with negative step | Three rows (labels 1 back to 47) | Zero rows (empty Series) |
[2, 0] | Integer list | Two rows with given labels | Two rows with given locations |
s > 'e' | Boolean series (indicating true values) | One row (containing 'f') | NotImplementedError |
(s>'e').values | Boolean array | One row (containing 'f') | Same as loc |
999 | Integer object not in index | KeyError | IndexError (out of bounds) |
-1 | Integer object not in index | KeyError | Returns last value in s |
lambda x: x.index[3] | Callable applied to the Series (here returning the index label at position 3, which is 0) | s.loc[s.index[3]], i.e. 'd' | s.iloc[s.index[3]], i.e. 'a' |
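Several rows of the table can be checked directly. The sketch below builds s as a standalone Series with the same values and index as df['col1']; the helper `error_name` and all result variable names are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Equivalent to df['col1'] from the example.
s = pd.Series(['a', 'b', 'c', 'd', 'e', 'f'], index=[49, 48, 47, 0, 1, 2])

# Slice 1:47. With loc, label 1 sits after label 47, so nothing is selected.
# With iloc, the end position is clipped to the length of the Series.
loc_slice = s.loc[1:47]      # empty
iloc_slice = s.iloc[1:47]    # positions 1 to 5

# Slice 1:47:-1. With loc, this walks back from label 1 to label 47.
loc_rev = s.loc[1:47:-1]     # labels 1, 0, 47
iloc_rev = s.iloc[1:47:-1]   # empty

# A boolean NumPy array works with both indexers.
mask = (s > 'e').values
loc_mask = s.loc[mask]
iloc_mask = s.iloc[mask]

def error_name(lookup):
    """Hypothetical helper: return the name of the exception a lookup raises."""
    try:
        lookup()
    except (KeyError, IndexError) as exc:
        return type(exc).__name__
    return None

loc_missing = error_name(lambda: s.loc[999])    # label 999 does not exist
iloc_missing = error_name(lambda: s.iloc[999])  # position 999 is out of bounds

# A callable receives the Series and its return value is used as the key.
# Here it returns the label at position 3 of the index, which is 0.
loc_call = s.loc[lambda x: x.index[3]]
iloc_call = s.iloc[lambda x: x.index[3]]

print(len(loc_slice), len(iloc_slice))  # 0 5
print(list(loc_rev.index))              # [1, 0, 47]
print(loc_missing, iloc_missing)        # KeyError IndexError
print(loc_call, iloc_call)              # d a
```

Passing the boolean Series itself (rather than its `.values`) is left out of the sketch, since iloc rejects it with an error, as the table notes.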