How to Merge DataFrames Based on a Column, Preserving the First DataFrame\'s Information?

Mary-Kate Olsen
Release: 2024-10-31 05:17:02
Original
1012 people have browsed it

How to Merge DataFrames Based on a Column, Preserving the First DataFrame's Information?

How to Merge DataFrames Based on a Column, Preserving the First DataFrame's Information

Given two dataframes with overlapping but not identical rows, you can merge them based on a specific column to add information from one to the other. In this case, you want to merge two dataframes, df1 and df2, where df1 contains information about age and df2 contains information about sex. Here's how to do it while keeping the information from the first dataframe.

Solution

To merge the dataframes and retain the first dataframe's information, use the map function along with a Series created by set_index:

<code class="python">df1['Sex'] = df1['Name'].map(df2.set_index('Name')['Sex'])</code>
Copy after login

For example:

<code class="python">df1 = pd.DataFrame({'Name': ['Tom', 'Sara', 'Eva', 'Jack', 'Laura'], 
                     'Age': [34, 18, 44, 27, 30], 
                     'Sex': None})

df2 = pd.DataFrame({'Name': ['Tom', 'Paul', 'Eva', 'Jack', 'Michelle'], 
                     'Sex': ['M', 'M', 'F', 'M', 'F']})

df1['Sex'] = df1['Name'].map(df2.set_index('Name')['Sex'])

print(df1)</code>
Copy after login

Output:

   Name  Age Sex
0   Tom  34   M
1  Sara  18  NaN
2   Eva  44   F
3  Jack  27   M
4 Laura  30  NaN
Copy after login
Copy after login

Alternative Solution

Another way to achieve the same result is by merging the dataframes using a left join:

<code class="python">df = df1.merge(df2[['Name', 'Sex']], on='Name', how='left')</code>
Copy after login

Example:

<code class="python">df = df1.merge(df2[['Name', 'Sex']], on='Name', how='left')

print(df)</code>
Copy after login

Output:

   Name  Age Sex
0   Tom  34   M
1  Sara  18  NaN
2   Eva  44   F
3  Jack  27   M
4 Laura  30  NaN
Copy after login
Copy after login

Handling Multiple Join Columns

If you need to merge on multiple columns, use the same left join approach with the additional join columns specified:

<code class="python">df = df1.merge(df2[['Name', 'Sex', 'Year', 'Code']], on=['Year', 'Code'], how='left')</code>
Copy after login

Error Handling: Duplicate Join Columns

If you encounter an error due to duplicate join columns, handle it by removing duplicates or using a dictionary for mapping.

The above is the detailed content of How to Merge DataFrames Based on a Column, Preserving the First DataFrame\'s Information?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template