How to Divide a Pandas DataFrame by a Column Value
Splitting a Pandas DataFrame based on a column value can be useful for creating separate subsets of data. Suppose you have a DataFrame with a column named 'Sales' and you want to divide it into two DataFrames: one containing rows where 'Sales' is less than a specified value, and another containing rows where 'Sales' is greater than or equal to that value.
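To achieve this, you can use boolean indexing with the following steps:
Create the boolean masks: Compare the 'Sales' column against the threshold s, once with >= and once with <. Each comparison returns a boolean Series with one True or False entry per row.
Split the DataFrame: Apply the boolean masks to the original DataFrame, as in df[mask], to create two new DataFrames.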
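The two steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the sample data, the threshold value, and the names mask_high and mask_low are assumptions chosen for this sketch.

```python
import pandas as pd

# Illustrative data and threshold (assumed for this sketch)
df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
s = 30

# Step 1: build one boolean mask per condition
mask_high = df['Sales'] >= s
mask_low = df['Sales'] < s

# Step 2: apply each mask to the original DataFrame
df1 = df[mask_high]
df2 = df[mask_low]
```

Each mask is a boolean Series aligned with the DataFrame's index, so indexing with it keeps only the rows marked True and preserves their original index labels.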
Alternatively, you can invert the first mask using the ~ operator:
```python
mask = df['Sales'] >= s
df1 = df[mask]
df2 = df[~mask]
```

Here's an example to illustrate the process:

```python
import pandas as pd

df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
print(df)

s = 30

# Rows where Sales is greater than or equal to s
df1 = df[df['Sales'] >= s]
print(df1)

# Rows where Sales is less than s
df2 = df[df['Sales'] < s]
print(df2)
```
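The output will be as follows. The column order shown here comes from an older pandas version that sorted dictionary keys alphabetically; recent versions keep the order in which the columns were defined, so Sales is printed before A:

```
   A  Sales
0  3     10
1  4     20
2  7     30
3  6     40
4  1     50
   A  Sales
2  7     30
3  6     40
4  1     50
   A  Sales
0  3     10
1  4     20
```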
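If the same split is needed in several places, the mask-and-invert pattern can be wrapped in a small function. The helper below is hypothetical (split_by_threshold is a name introduced here for illustration, not a pandas API) and is only a sketch of one way to package the technique.

```python
import pandas as pd


def split_by_threshold(df, column, threshold):
    """Hypothetical helper: return (rows >= threshold, all remaining rows)."""
    mask = df[column] >= threshold
    # ~mask selects every row the mask rejected
    return df[mask], df[~mask]


df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
high, low = split_by_threshold(df, 'Sales', 30)
print(high['Sales'].tolist())  # [30, 40, 50]
print(low['Sales'].tolist())   # [10, 20]
```

One design point to keep in mind: comparisons against a missing value (NaN) evaluate to False, so with the inverted mask any row whose 'Sales' value is missing ends up in the second DataFrame. With two explicit masks (>= and <), such a row would appear in neither result.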
This demonstrates how to split a Pandas DataFrame into two based on a specified column value using boolean indexing.