Creating a Column Based on Conditional Logic in Python
When working with Pandas DataFrames, we often need to create a new column whose values depend on a comparison between existing columns. Two common ways to do this are a custom function applied row by row and NumPy's np.where function with nested conditions.
To illustrate, consider the following DataFrame:
<code class="python">import pandas as pd

df = pd.DataFrame({
    "A": [2, 3, 1],
    "B": [2, 1, 3]
})</code>
We want to create a new column C based on the following criteria: C is 0 when A equals B, 1 when A is greater than B, and -1 when A is less than B.
Using a Custom Function
One approach is to create a custom function that implements the conditional logic and apply it to the DataFrame:
<code class="python">def f(row):
    if row['A'] == row['B']:
        return 0
    elif row['A'] > row['B']:
        return 1
    else:
        return -1

df['C'] = df.apply(f, axis=1)</code>
Using np.where
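Alternatively, we can use the np.where function to compute the whole column in one expression:
<code class="python">import numpy as np

df['C'] = np.where(df['A'] == df['B'], 0,
                   np.where(df['A'] > df['B'], 1, -1))</code>
This approach is vectorized: the comparisons run on entire columns at once, whereas apply with axis=1 calls a Python function for every row. It is therefore usually faster on large datasets.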
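Nested np.where calls become harder to read as the number of branches grows. One possible alternative, which is not part of the original example, is np.select, which takes a list of conditions and a matching list of values. A minimal sketch, assuming the same DataFrame as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 1], "B": [2, 1, 3]})

# Conditions are evaluated in order; the first one that is True wins.
conditions = [df["A"] == df["B"], df["A"] > df["B"]]
choices = [0, 1]

# default covers the remaining case, A < B.
df["C"] = np.select(conditions, choices, default=-1)
```

Each additional branch becomes one more entry in both lists, so the expression stays flat instead of nesting deeper.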
Result:
Both approaches produce the following result:
<code class="python">print(df)
   A  B  C
0  2  2  0
1  3  1  1
2  1  3 -1</code>
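To confirm that the two approaches agree, the columns they produce can be compared directly. The sketch below is only an illustrative check; the names c_apply, c_where, and same are introduced here and do not appear in the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 1], "B": [2, 1, 3]})

def f(row):
    # Row-wise version of the criteria.
    if row["A"] == row["B"]:
        return 0
    elif row["A"] > row["B"]:
        return 1
    else:
        return -1

# Column computed row by row (hypothetical name c_apply).
c_apply = df.apply(f, axis=1)

# Column computed with nested np.where (hypothetical name c_where).
c_where = np.where(df["A"] == df["B"], 0,
                   np.where(df["A"] > df["B"], 1, -1))

# True only if every element matches.
same = bool((c_apply.to_numpy() == c_where).all())
print(same)
```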