
How Can I Efficiently Filter Pandas DataFrames Using 'IN' and 'NOT IN' Operators?

Barbara Streisand
Release: 2024-12-29 16:22:19


Filtering Pandas DataFrames with "IN" and "NOT IN": A Simpler Solution

When working with Pandas DataFrames, you often need to keep only the rows whose value in a particular column belongs to a set of predefined values, or only the rows whose value does not. These are the equivalents of the SQL "IN" and "NOT IN" operators.

Alternative to the Merge-Based Approach

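Some users emulate this filtering with a merge, joining the DataFrame against a second DataFrame that holds the target values. This works, but it requires building an extra DataFrame and, for "NOT IN", an additional step to isolate the unmatched rows, which is more machinery than the task needs.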
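The article does not show the merge-based approach itself. The following is a minimal sketch of one way it is commonly written, so the contrast with isin is concrete. The `lookup` DataFrame and the variable names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"country": ["US", "UK", "Germany", "China"]})
# Hypothetical lookup table holding the values to match against.
lookup = pd.DataFrame({"country": ["UK", "China"]})

# "IN" via an inner join: only rows with a match in `lookup` survive.
# merge builds a new 0..n-1 index, so the original row labels are lost.
merged_in = df.merge(lookup, on="country", how="inner")

# "NOT IN" via a left join plus the indicator column, keeping unmatched rows.
flagged = df.merge(lookup, on="country", how="left", indicator=True)
merged_not_in = flagged.loc[flagged["_merge"] == "left_only", ["country"]]
```

Both results need a second DataFrame, and the "NOT IN" case also needs the extra `_merge` column and a follow-up filter.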

Using pd.Series.isin

A simpler solution is the pd.Series.isin method. It returns a boolean Series that marks, for each element, whether it appears in the given values, and that mask serves both "IN" and "NOT IN" filtering.

"IN" Filtering

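To select rows where a specific column matches any value in a provided list, build a boolean mask with isin, where something is the column (a Series) and somewhere is the list of values, then index the DataFrame with that mask: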

something.isin(somewhere)
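As a small illustration of what the mask looks like before it is used for indexing, here is a sketch with made-up data. The `orders` DataFrame and the `wanted` list are hypothetical names, not part of the original example.

```python
import pandas as pd

# Hypothetical data for illustration.
orders = pd.DataFrame({"sku": ["A1", "B2", "C3", "A1"], "qty": [5, 1, 7, 2]})
wanted = ["A1", "C3"]

mask = orders["sku"].isin(wanted)   # boolean Series: True, False, True, True
selected = orders[mask]             # keeps the rows labelled 0, 2 and 3
```

Indexing with the mask keeps the original row labels, which is why the selected rows are still labelled 0, 2 and 3.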

"NOT IN" Filtering

Conversely, to select rows where the column value matches none of the values in the list, negate the mask with the ~ operator:

~something.isin(somewhere)
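The negation has to be the element-wise ~ operator rather than Python's `not` keyword. The sketch below, using a hypothetical `scores` Series, shows the inverted mask and what happens if `not` is used instead.

```python
import pandas as pd

scores = pd.Series([10, 20, 30, 40])   # hypothetical values
excluded = [20, 40]

mask = scores.isin(excluded)
not_in = scores[~mask]                 # the values 10 and 30 remain

# Python's `not` asks for a single truth value, which a multi-element
# Series cannot provide, so it raises ValueError instead of inverting.
try:
    not mask
    not_raised = True
except ValueError:
    not_raised = False
```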

Example Usage

Consider the following example:

import pandas as pd

df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})

countries_to_keep = ['UK', 'China']

df_in = df[df.country.isin(countries_to_keep)]
df_not_in = df[~df.country.isin(countries_to_keep)]

print(df_in)
print(df_not_in)

Output:

    country
1        UK
3     China
    country
0        US
2   Germany
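The values passed to isin do not have to be a list. Any list-like works, including a set or a column of another DataFrame, which is roughly the counterpart of SQL's "IN (subquery)". The sketch below assumes a hypothetical second table named `allowed`.

```python
import pandas as pd

df = pd.DataFrame({"country": ["US", "UK", "Germany", "China"]})
# Hypothetical second table whose column supplies the allowed values.
allowed = pd.DataFrame({"name": ["China", "UK", "France"]}, index=[10, 11, 12])

# Series.isin compares values only; the differing index of `allowed` is ignored.
from_series = df[df["country"].isin(allowed["name"])]
from_set = df[df["country"].isin({"UK", "China"})]
```

Both calls select the same two rows, so the choice between a list, a set, or another column comes down to where the values already live.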

As demonstrated, pd.Series.isin filters a DataFrame in a single expression and keeps the original row labels. It removes the need for a merge-based workaround, which makes the filtering code shorter and easier to read.


source:php.cn