Home > Backend Development > Python Tutorial > How Can I Efficiently Perform SQL's IN and NOT IN Filtering in Pandas?

How Can I Efficiently Perform SQL's IN and NOT IN Filtering in Pandas?

Susan Sarandon
Release: 2024-12-20 20:05:17
Original
311 people have browsed it

How Can I Efficiently Perform SQL's IN and NOT IN Filtering in Pandas?

Object-Oriented SQL-Like Filtering in Pandas

Filtering Pandas dataframes is a crucial task in data analysis, and understanding how to effectively utilize SQL-like operators can greatly simplify this process. Specifically, the IN and NOT IN operators are widely used for selecting rows based on membership criteria.

Pandas Equivalent of SQL's IN and NOT IN

To achieve the equivalent of SQL's IN operator in Pandas, you can leverage the isin() method on a Series. This method checks whether each element in the Series is contained within a specified list or set. For example, to filter a dataframe based on a list of countries to include:

df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})
countries_to_keep = ['UK', 'China']
df[df['country'].isin(countries_to_keep)]
Copy after login

This will return a dataframe containing only the rows where the country column values are present in the countries_to_keep list.

To perform the equivalent of SQL's NOT IN, simply negate the isin() method using the ~ operator:

df[~df['country'].isin(countries_to_keep)]
Copy after login

This will produce a dataframe excluding the rows where the country column values match the specified list.

Example

Consider the following dataframe:

df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})
countries_to_keep = ['UK', 'China']
Copy after login

To select only rows with countries in countries_to_keep:

df_filtered_in = df[df['country'].isin(countries_to_keep)]
Copy after login

This will result in:

    country
1        UK
3     China
Copy after login

To select rows with countries not in countries_to_keep:

df_filtered_not_in = df[~df['country'].isin(countries_to_keep)]
Copy after login

This will produce:

    country
0        US
2   Germany
Copy after login

Conclusion

The isin() method in Pandas provides a convenient and efficient way to implement IN and NOT IN-style filtering, offering a powerful tool for manipulating dataframes based on membership criteria.

The above is the detailed content of How Can I Efficiently Perform SQL's IN and NOT IN Filtering in Pandas?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template