
How to Create a New Race Label Column in Pandas Based on Multiple Ethnicity Columns?

DDD
Release: 2024-12-10 11:33:14

Creating New Column Based on Values from Multiple Columns Using a Function in Pandas

In pandas, a new column often has to be derived from the values of several existing columns. When the logic is too involved for a single arithmetic expression, a common approach is to write a custom function and apply it to the dataframe row by row.

Example Scenario

Consider the following dataframe with six ethnicity-related indicator columns:

import pandas as pd

df = pd.DataFrame({
    'ERI_Hispanic': [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    'ERI_AmerInd_AKNatv': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    'ERI_Asian': [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    'ERI_Black_Afr.Amer': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    'ERI_HI_PacIsl': [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    'ERI_White': [1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
})

The goal is to create a new column named 'race_label' that classifies each row. The rules are checked in order, and the first one that matches determines the label:

  1. If ERI_Hispanic equals 1, return "Hispanic".
  2. If the sum of all non-Hispanic ERI columns (ERI_AmerInd_AKNatv, ERI_Asian, ERI_Black_Afr.Amer, ERI_HI_PacIsl, and ERI_White) is greater than 1, return "Two or More".
  3. Otherwise, if one of the non-Hispanic ERI columns equals 1, return the corresponding race label ("A/I AK Native", "Asian", "Black/AA", "Haw/Pac Isl.", or "White").
  4. If none of the ERI columns is set, return "Other".
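Rule 2 depends on how many of the five non-Hispanic indicators are set in a row. As a minimal sketch, that count can be computed for all rows at once by summing across the columns. The three-row frame and the names `sample`, `non_hispanic`, and `n_races` are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical list of the five non-Hispanic indicator columns.
non_hispanic = [
    'ERI_AmerInd_AKNatv', 'ERI_Asian', 'ERI_Black_Afr.Amer',
    'ERI_HI_PacIsl', 'ERI_White',
]

# Illustrative three-row frame (not the article's data).
sample = pd.DataFrame({
    'ERI_Hispanic':       [0, 0, 1],
    'ERI_AmerInd_AKNatv': [0, 0, 0],
    'ERI_Asian':          [0, 1, 1],
    'ERI_Black_Afr.Amer': [0, 0, 0],
    'ERI_HI_PacIsl':      [0, 0, 0],
    'ERI_White':          [1, 1, 1],
})

# Sum across columns (axis=1) to count the indicators set in each row.
n_races = sample[non_hispanic].sum(axis=1)
print(n_races.tolist())  # [1, 2, 2]
```

The last row also has a count of 2, but it would still be labeled "Hispanic" because rule 1 is checked before rule 2.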

Solution

The solution involves two steps: creating a custom function to perform the classification and applying the function to the dataframe row-wise.

1. Defining the Custom Function

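def label_race(row):
    """Return the race label for one row of ERI indicator columns."""
    # Rules are checked in order; the first match wins.
    if row['ERI_Hispanic'] == 1:
        return 'Hispanic'
    # More than one non-Hispanic indicator is set.
    elif row['ERI_AmerInd_AKNatv'] + row['ERI_Asian'] + row['ERI_Black_Afr.Amer'] + row['ERI_HI_PacIsl'] + row['ERI_White'] > 1:
        return 'Two or More'
    # At most one non-Hispanic indicator is set; return its label.
    elif row['ERI_AmerInd_AKNatv'] == 1:
        return 'A/I AK Native'
    elif row['ERI_Asian'] == 1:
        return 'Asian'
    elif row['ERI_Black_Afr.Amer'] == 1:
        return 'Black/AA'
    elif row['ERI_HI_PacIsl'] == 1:
        return 'Haw/Pac Isl.'
    elif row['ERI_White'] == 1:
        return 'White'
    # No indicator is set.
    else:
        return 'Other'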

This function takes a row of the dataframe as input and returns the appropriate race label based on the provided criteria.
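The row passed to such a function is a pandas Series indexed by column name, which is why a lookup like `row['ERI_Hispanic']` works. One quick way to see this is to pull a single row out with `iloc`. The two-column frame and the names `sample` and `first_row` below are made up for illustration.

```python
import pandas as pd

# Illustrative two-row frame (not the article's data).
sample = pd.DataFrame({
    'ERI_Hispanic': [0, 1],
    'ERI_White':    [1, 0],
})

first_row = sample.iloc[0]        # one row, returned as a Series
print(type(first_row).__name__)   # Series
print(list(first_row.index))      # ['ERI_Hispanic', 'ERI_White']
print(first_row['ERI_White'])     # 1
```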

2. Applying the Function to the Dataframe

To create the new 'race_label' column, call the dataframe's apply() method with axis=1. With axis=1, pandas passes each row to label_race in turn and collects the returned labels into a new Series.

df['race_label'] = df.apply(label_race, axis=1)
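Row-wise `apply` calls the Python function once per row, which can be slow on large dataframes. As an alternative (not the method used above), the same rules could be expressed with `numpy.select`, which evaluates a list of boolean conditions in order and takes the first match for each row. This is only a sketch, assuming the indicator columns contain nothing but 0 and 1; the frame `sample` and the names `non_hispanic`, `conditions`, and `labels` are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative frame (not the article's data).
sample = pd.DataFrame({
    'ERI_Hispanic':       [1, 0, 0, 0],
    'ERI_AmerInd_AKNatv': [0, 0, 0, 0],
    'ERI_Asian':          [0, 1, 1, 0],
    'ERI_Black_Afr.Amer': [0, 0, 0, 0],
    'ERI_HI_PacIsl':      [0, 0, 0, 0],
    'ERI_White':          [1, 1, 0, 0],
})

non_hispanic = [
    'ERI_AmerInd_AKNatv', 'ERI_Asian', 'ERI_Black_Afr.Amer',
    'ERI_HI_PacIsl', 'ERI_White',
]

# Conditions are listed in priority order; the first True one wins.
conditions = [
    sample['ERI_Hispanic'] == 1,
    sample[non_hispanic].sum(axis=1) > 1,
    sample['ERI_AmerInd_AKNatv'] == 1,
    sample['ERI_Asian'] == 1,
    sample['ERI_Black_Afr.Amer'] == 1,
    sample['ERI_HI_PacIsl'] == 1,
    sample['ERI_White'] == 1,
]
labels = [
    'Hispanic', 'Two or More', 'A/I AK Native', 'Asian',
    'Black/AA', 'Haw/Pac Isl.', 'White',
]

# Rows matching no condition fall through to the default label.
sample['race_label'] = np.select(conditions, labels, default='Other')
print(sample['race_label'].tolist())
# ['Hispanic', 'Two or More', 'Asian', 'Other']
```

The ordering of `conditions` mirrors the if/elif chain, so a Hispanic row with several other indicators set is still labeled "Hispanic".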

The resulting dataframe with the new column is displayed below:

    ERI_Hispanic  ERI_AmerInd_AKNatv  ERI_Asian  ERI_Black_Afr.Amer  ERI_HI_PacIsl  ERI_White  \
0             0                  0         0                     0             0          1
1             1                  0         0                     0             0          0
2             0                  0         0                     0             0          1
3             0                  0         0                     0             0          1
4             0                  0         0                     0             0          0
5             0                  0         0                     0             0          1
6             0                  0         1                     0             0          1
7             0                  0         0                     0             1          1
8             0                  0         0                     0             0          1
9             0                  0         0                     0             0          1

    race_label
0        White
1     Hispanic
2        White
3        White
4        Other
5        White
6  Two or More
7  Two or More
8        White
9        White
