Given two dataframes with overlapping but not identical rows, you can merge them based on a specific column to add information from one to the other. In this case, you want to merge two dataframes, df1 and df2, where df1 contains information about age and df2 contains information about sex. Here's how to do it while keeping the information from the first dataframe.
To merge the dataframes and retain the first dataframe's information, use the map function along with a Series created by set_index:
<code class="python">df1['Sex'] = df1['Name'].map(df2.set_index('Name')['Sex'])</code>
For example:
<code class="python">df1 = pd.DataFrame({'Name': ['Tom', 'Sara', 'Eva', 'Jack', 'Laura'], 'Age': [34, 18, 44, 27, 30], 'Sex': None}) df2 = pd.DataFrame({'Name': ['Tom', 'Paul', 'Eva', 'Jack', 'Michelle'], 'Sex': ['M', 'M', 'F', 'M', 'F']}) df1['Sex'] = df1['Name'].map(df2.set_index('Name')['Sex']) print(df1)</code>
Output:
Name Age Sex 0 Tom 34 M 1 Sara 18 NaN 2 Eva 44 F 3 Jack 27 M 4 Laura 30 NaN
Another way to achieve the same result is by merging the dataframes using a left join:
<code class="python">df = df1.merge(df2[['Name', 'Sex']], on='Name', how='left')</code>
Example:
<code class="python">df = df1.merge(df2[['Name', 'Sex']], on='Name', how='left') print(df)</code>
Output:
Name Age Sex 0 Tom 34 M 1 Sara 18 NaN 2 Eva 44 F 3 Jack 27 M 4 Laura 30 NaN
If you need to merge on multiple columns, use the same left join approach with the additional join columns specified:
<code class="python">df = df1.merge(df2[['Name', 'Sex', 'Year', 'Code']], on=['Year', 'Code'], how='left')</code>
If you encounter an error due to duplicate join columns, handle it by removing duplicates or using a dictionary for mapping.
The above is the detailed content of How to Merge DataFrames Based on a Column, Preserving the First DataFrame\'s Information?. For more information, please follow other related articles on the PHP Chinese website!