What's the Difference Between pandas' `loc` and `iloc` for DataFrame Selection?

DDD

Release: 2024-12-22 00:27:40

How are iloc and loc different?

In Python's pandas library, loc and iloc are indexers (used with square brackets, not called like functions) for selecting and slicing the rows and columns of a DataFrame or Series. They look similar, but they interpret what you pass them in different ways.

loc vs. iloc: Label-Based vs. Location-Based Selection

loc operates on labels, which are the index values attached to rows or columns. It retrieves the rows (or columns) whose labels match the selection. Label slices include both endpoints: on a DataFrame with the default integer index (0, 1, 2, ...), df.loc[:5] returns the six rows labeled 0 through 5.

iloc, on the other hand, operates on integer positions. It selects rows (or columns) by where they sit in the DataFrame, counting from 0, and its slices exclude the end position, as ordinary Python slices do. For example, df.iloc[:5] returns the first five rows (positions 0 through 4), whatever their labels are.
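The inclusive versus exclusive endpoint is easy to check on a small frame. The following is a minimal sketch; the frame `df` and its column name `value` are made up for illustration:

```python
import pandas as pd

# Hypothetical frame: ten rows with the default index, labels 0..9
df = pd.DataFrame({"value": range(10, 20)})

by_label = df.loc[:5]      # labels 0 through 5, end label included
by_position = df.iloc[:5]  # positions 0 through 4, end position excluded

print(len(by_label))       # 6
print(len(by_position))    # 5
```

With the default index the labels and positions happen to coincide, so the only visible difference is the extra row that loc includes.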

Examples to Illustrate the Distinction

Consider the following Series with a non-monotonic index:

import pandas as pd

s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])

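Using loc and iloc with the same slice:

s.loc[:5]   # slice by label (end label included)
s.iloc[:5]  # slice by position (end position excluded)

The results are different:

s.loc[:5] looks for the label 5. That label is not in the index, and because the index is not sorted, pandas cannot tell where the slice should end and raises a KeyError. Had the index been sorted first (s.sort_index().loc[:5]), the slice would return every row whose label is at most 5:

0    d
1    e
2    f

s.iloc[:5] returns the rows at positions 0 to 4 (position 5 is excluded), regardless of their labels:

49    a
48    b
47    c
0     d
1     e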
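The behavior of both indexers on this Series can be checked directly. The sketch below reuses the Series from the example; the variable names are illustrative only, and the error behavior reflects how pandas handles a missing slice endpoint on an unsorted index:

```python
import pandas as pd

s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])

# Position-based: always the first five rows, whatever the labels are
first_five = s.iloc[:5]
print(first_five.tolist())      # ['a', 'b', 'c', 'd', 'e']

# Label-based with a label that exists: everything up to and including label 1
up_to_label_1 = s.loc[:1]
print(up_to_label_1.tolist())   # ['a', 'b', 'c', 'd', 'e']

# Label-based with a missing label on an unsorted index raises KeyError
try:
    s.loc[:5]
    missing_label_ok = True
except KeyError:
    missing_label_ok = False
print(missing_label_ok)         # False

# After sorting the index, the same slice works and stops at the last label <= 5
sorted_slice = s.sort_index().loc[:5]
print(sorted_slice.tolist())    # ['d', 'e', 'f']
```

Note that s.loc[:1] and s.iloc[:5] return the same rows here only because label 1 happens to sit at position 4.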

General Differences

To summarize the general differences between loc and iloc:

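loc: label-based. It selects by index and column labels, and label slices include the end label.
iloc: position-based. It selects by 0-based integer position, and position slices exclude the end position.
Missing labels: loc raises a KeyError for a single label that is not in the index, and a label slice with a missing endpoint works only when the index is sorted. iloc ignores labels entirely, so a non-monotonic index makes no difference to it.
Out-of-range positions: an iloc slice that runs past the end is truncated to the available rows, while a single out-of-range position raises an IndexError.
Performance: iloc can be faster than loc in some scenarios because it does not need to look up labels, but the gap depends on the index and the operation, so measure before relying on it.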
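How each indexer reacts to a missing label or an out-of-range position can be verified with a few lines. This is a minimal sketch that reuses the example Series; the flag names are illustrative only:

```python
import pandas as pd

s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2])

# A position slice that runs past the end is truncated, not rejected
print(len(s.iloc[:100]))        # 6

# A single out-of-range position raises IndexError
try:
    s.iloc[100]
    raised_index_error = False
except IndexError:
    raised_index_error = True

# A single label that is not in the index raises KeyError
try:
    s.loc[5]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(raised_index_error, raised_key_error)   # True True
```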

Additional Considerations

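Both indexers accept a row selector and a column selector, separated by a comma. With iloc, both are integer positions (for example, df.iloc[0:2, 0:2]). With loc, both are labels, so columns can be selected by name (for example, df.loc["x":"y", ["name", "age"]]), and the row selector can also be a boolean mask.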
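As a small illustration of selecting rows and columns together, consider the sketch below. The frame, its row labels, and its column names are hypothetical:

```python
import pandas as pd

# Hypothetical frame with string row labels and three columns
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy"],
        "age": [31, 45, 27],
        "city": ["Oslo", "Rome", "Lima"],
    },
    index=["x", "y", "z"],
)

# loc: row labels and column labels; both ends of a label slice are included
by_label = df.loc["x":"y", ["name", "age"]]

# iloc: row positions and column positions; the end position is excluded
by_position = df.iloc[0:2, 0:2]

print(by_label.equals(by_position))   # True

# loc also accepts a boolean mask for the rows
over_30 = df.loc[df["age"] > 30, "name"]
print(over_30.tolist())               # ['Ann', 'Bob']
```

Selecting columns by name with loc keeps the code readable when the column order may change, while iloc is convenient when only the layout is known.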

For further information, refer to the pandas documentation on [indexing and slicing](https://pandas.pydata.org/docs/user_guide/indexing.html).
