When working with pandas DataFrames, you often need to filter rows based on the values in a particular column. This mirrors SQL queries that retrieve rows with a filter such as WHERE column_name = some_value.
To select rows where a column value matches a scalar value, some_value, use the equality operator ==:
df.loc[df['column_name'] == some_value]
To select rows where a column value is in an array, some_values, use the isin method:
df.loc[df['column_name'].isin(some_values)]
Multiple conditions can be combined using the element-wise & operator (logical AND):
df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
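Note: Wrap each condition in parentheses. The & operator binds more tightly than comparison operators such as >= and <=, so without parentheses the expression is parsed as df['column_name'] >= (A & df['column_name']) <= B, which is not the intended filter.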
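As a minimal sketch of combining conditions, the snippet below uses a small made-up DataFrame; the data and the bounds A and B are assumptions chosen only for illustration. It also shows the | operator, which combines conditions with a logical OR in the same way.

```python
import pandas as pd

# Hypothetical sample data; values and bounds are illustrative only.
df = pd.DataFrame({'column_name': [1, 5, 10, 15, 20]})
A, B = 5, 15

# Each comparison is parenthesized because & binds more tightly
# than >= and <=.
in_range = df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]

# | combines conditions element-wise with a logical OR.
outside = df.loc[(df['column_name'] < A) | (df['column_name'] > B)]

print(in_range['column_name'].tolist())  # [5, 10, 15]
print(outside['column_name'].tolist())   # [1, 20]
```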
To select rows where a column value does not equal some_value, use the inequality operator !=:
df.loc[df['column_name'] != some_value]
To select rows where a column value is not in some_values, negate the isin result using ~:
df.loc[~df['column_name'].isin(some_values)]
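The two exclusion filters can be sketched side by side as follows. The DataFrame contents and the values of some_value and some_values are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'column_name': ['a', 'b', 'c', 'a', 'd']})
some_value = 'a'
some_values = ['a', 'b']

# Keep rows whose value differs from a single scalar.
not_equal = df.loc[df['column_name'] != some_value]

# Keep rows whose value is not in the list: ~ inverts the boolean mask.
not_in = df.loc[~df['column_name'].isin(some_values)]

print(not_equal['column_name'].tolist())  # ['b', 'c', 'd']
print(not_in['column_name'].tolist())     # ['c', 'd']
```

Both filters return a new DataFrame and keep the original row labels, so the result is only stored if it is assigned to a variable.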
Consider the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': np.arange(8),
                   'D': np.arange(8) * 2})
print(df)
Select rows where A is foo:
print(df.loc[df['A'] == 'foo'])
Select rows where B is one or three:
print(df.loc[df['B'].isin(['one', 'three'])])
Create an index and select rows using it:
df = df.set_index(['B'])
print(df.loc['one'])
Select rows with multiple indexed values:
print(df.loc[df.index.isin(['one', 'two'])])
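The index-based selections can be tried in isolation with the self-contained sketch below. It rebuilds a reduced version of the example data (only columns B and C, an assumption made to keep the snippet short) and sets B as the index.

```python
import numpy as np
import pandas as pd

# Reduced copy of the example data, with B used as the index.
df = pd.DataFrame({
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': np.arange(8),
}).set_index('B')

# Label lookup: returns every row whose index label is 'one'.
one = df.loc['one']

# Boolean mask on the index: rows labeled 'one' or 'two',
# kept in their original order.
one_or_two = df.loc[df.index.isin(['one', 'two'])]

print(one['C'].tolist())         # [0, 1, 6]
print(one_or_two['C'].tolist())  # [0, 1, 2, 4, 5, 6]
```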