In Python's pandas library, loc and iloc are indexers for selecting rows and columns from a DataFrame or Series. Both use square-bracket syntax, but they differ in what the values inside the brackets mean.
loc operates on labels, the index values attached to rows or columns, and retrieves the rows (or columns) whose labels match the selection. Label slices include both endpoints: on a DataFrame with the default integer index (0, 1, 2, ...), df.loc[:5] returns the rows labeled 0 through 5, which is six rows, not five.
iloc, on the other hand, operates on integer positions. It selects rows (or columns) by where they sit in the DataFrame, counting from 0, and its slices exclude the end position as ordinary Python slices do. For example, df.iloc[:5] returns the first five rows (positions 0 through 4), whatever their labels are.
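The inclusive and exclusive endpoints are easy to check on a small frame. The following is a minimal sketch; the DataFrame and its "value" column are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical 8-row frame with the default RangeIndex (labels 0..7).
df = pd.DataFrame({"value": range(10, 18)})

label_slice = df.loc[:5]      # labels 0 through 5, end label included
position_slice = df.iloc[:5]  # positions 0 through 4, end position excluded

print(len(label_slice))     # 6
print(len(position_slice))  # 5
```

With the default index, labels and positions coincide, so the only visible difference is the extra row that loc includes.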
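Consider the following Series with a non-monotonic index:

    import pandas as pd
    s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])

Slicing the Series with loc and with iloc:

    s.loc[0:2]   # rows from label 0 through label 2 (end label inclusive)
    s.iloc[:5]   # rows at positions 0 through 4 (end position exclusive)

The results are different. loc returns the three rows labeled 0, 1, and 2:

    0    d
    1    e
    2    f

iloc returns the first five rows by position, whatever their labels:

    49    a
    48    b
    47    c
    0     d
    1     e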
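The same Series also shows how the two indexers treat a single value, and what happens when a label slice names an endpoint that is not in an unsorted index. This is a sketch for illustration; the variable names are not from the original example.

```python
import pandas as pd

s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])

# Scalar lookup: loc matches the label 0, iloc takes position 0.
by_label = s.loc[0]      # 'd'
by_position = s.iloc[0]  # 'a'

# A label slice whose endpoint is missing from an unsorted index fails,
# because pandas cannot tell where label 5 would fall.
try:
    s.loc[:5]
    missing_label_error = None
except KeyError as exc:
    missing_label_error = exc

# After sorting the index, the same slice works and stops at the
# last label that is <= 5.
sorted_slice = s.sort_index().loc[:5]
print(list(sorted_slice.index))  # [0, 1, 2]
```

Label slices therefore need both endpoints to exist in the index, or a sorted index in which a missing endpoint can be placed.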
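To summarize the general differences between loc and iloc: loc selects by label, and its slices include the end label; iloc selects by integer position, and its slices exclude the end position, as ordinary Python slices do.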
Both indexers also accept a second argument that selects columns of a DataFrame. iloc takes integer positions on both axes, so df.iloc[:, 0] is the first column. loc takes labels on both axes, so columns can be selected by name.
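As a rough illustration of selecting rows and columns together, the sketch below picks the same cells once by label and once by position. The frame, its row labels ("x", "y", "z"), and its column names ("price", "qty") are hypothetical.

```python
import pandas as pd

# Hypothetical frame; the row labels and column names are made up.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 20, 30]},
    index=["x", "y", "z"],
)

by_label = df.loc["x":"y", "qty"]  # rows labeled x through y, column "qty"
by_position = df.iloc[0:2, 1]      # first two rows, second column

same = by_label.equals(by_position)
print(same)  # True
```

The label slice "x":"y" includes row "y", while the position slice 0:2 stops before position 2, so both select the same two rows.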
For further information, refer to the pandas documentation on [indexing and slicing](https://pandas.pydata.org/docs/user_guide/indexing.html).