NumPy "where" with Multiple Conditions: Handling Three Conditions
Problem Description:
Adding a new column to a DataFrame is awkward when its value depends on more than two conditions. In this scenario, the goal is to create an "energy_class" column from the "consumption_energy" column: "high" for values of 400 or more, "medium" for values between 200 and 400, and "low" for values of 200 or less.
Solution:
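A single numpy.where call tests one condition and chooses between only two outcomes, so three classes would require nesting calls. numpy.select handles this more cleanly: it takes a list of conditions and a matching list of choices, and gives each row the choice that belongs to the first condition that is true.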
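For comparison, here is a minimal sketch of the nested numpy.where form next to numpy.select. The array values and variable names are made up for illustration and are not taken from the original data.

```python
import numpy as np

# Illustrative values only (an assumption for this sketch)
values = np.array([459, 186, 250, 400, 200])

# Nested np.where: each call picks between two outcomes
nested = np.where(values >= 400, "high",
                  np.where(values > 200, "medium", "low"))

# np.select: the first condition that is True determines the choice
conditions = [values >= 400, values > 200]
choices = ["high", "medium"]
selected = np.select(conditions, choices, default="low")

print(nested.tolist())
print(selected.tolist())
```

Both forms should produce the same labels. The np.select version tends to stay readable as more classes are added, since each new class is one more entry in each list rather than another level of nesting.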
Python Code:
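<code class="python">import numpy as np

# Column to classify and the conditions, checked in order
col = 'consumption_energy'
conditions = [
    df2[col] >= 400,
    (df2[col] < 400) & (df2[col] > 200),
    df2[col] <= 200,
]

# Label for each condition, in the same order
choices = ['high', 'medium', 'low']

# Add the "energy_class" column; rows matching no condition
# (for example, missing values) get the default label
df2['energy_class'] = np.select(conditions, choices, default='unknown')</code>
Example Output:
   consumption_energy energy_class
0                 459         high
1                 416         high
2                 186          low
3                 250       medium
4                 411         high
5                 210       medium
6                 343       medium
7                 328       medium
8                 208       medium
9                 223       medium
Additional Note:
The default argument sets the value for rows that match none of the conditions, such as rows where "consumption_energy" is missing. Because the choices here are strings, the default should be a string too: with default=np.nan, older NumPy versions store the text 'nan' rather than a real NaN, and recent versions raise a TypeError because strings and floats have no common dtype. You can change 'unknown' to any label that fits your needs.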
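A minimal sketch of when the default comes into play, assuming a small made-up array with one missing reading and a hypothetical "unknown" label:

```python
import numpy as np

# Illustrative readings with one missing value (assumed data)
readings = np.array([459.0, np.nan, 186.0, 250.0])

conditions = [
    readings >= 400,
    (readings < 400) & (readings > 200),
    readings <= 200,
]
choices = ["high", "medium", "low"]

# Comparisons with NaN are always False, so the NaN entry matches
# no condition and receives the default label
labels = np.select(conditions, choices, default="unknown")

print(labels.tolist())
```

Keeping the default a string, like the choices, means the result stays a plain string array and the unmatched rows are easy to filter afterwards.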