Pandas Multiple Conditions Indexing: Unexpected Behavior
With pandas, applying filters to a DataFrame is a common operation. However, when using multiple conditions, especially with logical operators like AND and OR, unexpected results can occur.
Problem:
When filtering rows based on values in two columns, the AND operator appears to behave like OR, and vice versa. Consider the following example:
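<code class="python">import pandas as pd

df = pd.DataFrame({'a': range(5), 'b': range(5)})

# Mark some cells with -1
df.loc[1, 'a'] = -1
df.loc[1, 'b'] = -1
df.loc[3, 'a'] = -1
df.loc[4, 'b'] = -1

# Keep rows where both columns differ from -1
df1 = df[(df.a != -1) & (df.b != -1)]
# Keep rows where at least one column differs from -1
df2 = df[(df.a != -1) | (df.b != -1)]

print(pd.concat([df, df1, df2], axis=1,
                keys=['original df', 'using AND (&)', 'using OR (|)']))</code>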
Explanation:
The behavior is not a bug. A boolean mask passed to df[...] selects the rows to keep, so each condition describes what stays in the result, not what is dropped.
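AND Operator:
(df.a != -1) & (df.b != -1) means "keep the rows in which a is not -1 and b is not -1".
OR Operator:
(df.a != -1) | (df.b != -1) means "keep the rows in which a is not -1 or b is not -1".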
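Therefore, the AND operator seems to behave like OR when you think in terms of dropped rows: keeping only rows where both columns differ from -1 is the same as dropping every row in which either column contains -1. Conversely, the OR operator seems to behave like AND: keeping rows where at least one column differs from -1 is the same as dropping only the rows in which both columns contain -1.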
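The keep-versus-drop equivalence can be checked directly. The sketch below rebuilds the example's data with literal lists and compares each "keep" mask with the negation of the matching "drop" mask; the variable names are illustrative only.

```python
import pandas as pd

# Same values as the example DataFrame after the -1 assignments
df = pd.DataFrame({"a": [0, -1, 2, -1, 4], "b": [0, -1, 2, 3, -1]})

# Masks written in terms of rows to keep
keep_and = (df.a != -1) & (df.b != -1)
keep_or = (df.a != -1) | (df.b != -1)

# The same filters written in terms of rows to drop (De Morgan's laws)
drop_if_either = (df.a == -1) | (df.b == -1)
drop_if_both = (df.a == -1) & (df.b == -1)

print(keep_and.equals(~drop_if_either))  # True
print(keep_or.equals(~drop_if_both))     # True

print(df[keep_and].index.tolist())  # [0, 2]
print(df[keep_or].index.tolist())   # [0, 2, 3, 4]
```

Row 1, where both columns are -1, is the only row removed by the OR filter, while the AND filter also removes rows 3 and 4.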
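Additional Note:
Chained indexing such as df['a'][1] = -1 is unreliable for assignment, because it may modify a temporary copy instead of the original DataFrame. Using .loc or .iloc, for example df.loc[1, 'a'] = -1, is the safer habit.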
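As a minimal sketch of setting individual cells with .loc in a single indexing step (the DataFrame here is a fresh illustrative copy of the example's starting data):

```python
import pandas as pd

df = pd.DataFrame({"a": range(5), "b": range(5)})

# One indexing operation: row label 1, column "a"
df.loc[1, "a"] = -1

# Several row labels can be assigned at once
df.loc[[1, 4], "b"] = -1

print(df)
```

Because the row and column are passed together, the assignment is applied to df itself rather than to an intermediate object.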