How to Find Rows in One Pandas DataFrame That Are Not in Another?

Barbara Streisand
Release: 2024-12-09 07:59:11

Obtaining DataFrame Rows Not Present in Another DataFrame

To get the rows of one DataFrame (df1) that do not appear in another (df2), left-merge df1 with df2 using indicator=True and keep the rows that the merge marks as left_only:

import pandas as pd

# Create the two DataFrames.
df1 = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 3], 'col2': [10, 11, 12, 13, 14, 10]})
df2 = pd.DataFrame(data={'col1': [1, 2, 3], 'col2': [10, 11, 12]})

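# Left-join df1 to the de-duplicated df2 so each df1 row matches at most one df2 row.
# indicator=True adds a '_merge' column: 'both' if the row was found in df2, 'left_only' if not.
df_all = df1.merge(df2.drop_duplicates(), on=['col1', 'col2'], how='left', indicator=True)

# Boolean condition that marks the rows of df1 that are not in df2.
condition = df_all['_merge'] == 'left_only'

# Filter df1 by position. df_all has one row per df1 row, in the same order,
# so the mask lines up even when df1 does not have the default 0..n-1 index.
result = df1[condition.to_numpy()]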

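Because the merge compares col1 and col2 together, a row of df1 is excluded only when the same pair of values occurs in a row of df2. For the data above, result contains the rows (4, 13), (5, 14), and (3, 10). Alternative solutions that check each column independently can give incorrect results: the row (3, 10) would be treated as present in df2 because 3 appears in df2's col1 and 10 appears in its col2, even though no row of df2 holds that pair.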
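
The difference between the two kinds of check can be seen in a minimal sketch that reuses the DataFrames from the example. The column-wise version, built here with Series.isin, is only one illustrative way such a check might be written. The variable names in_df2_per_column, naive_result, merged, and rowwise_result are assumptions introduced for this comparison.

```python
import pandas as pd

# Same data as in the example above.
df1 = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 3], 'col2': [10, 11, 12, 13, 14, 10]})
df2 = pd.DataFrame(data={'col1': [1, 2, 3], 'col2': [10, 11, 12]})

# Column-wise check (illustrative): a row counts as "in df2" when each of its
# values appears somewhere in the matching df2 column, not necessarily in the same row.
in_df2_per_column = df1['col1'].isin(df2['col1']) & df1['col2'].isin(df2['col2'])
naive_result = df1[~in_df2_per_column]

# Row-wise check via merge: a row counts as "in df2" only when the whole
# (col1, col2) pair occurs in a single df2 row.
merged = df1.merge(df2.drop_duplicates(), on=['col1', 'col2'], how='left', indicator=True)
rowwise_result = df1[(merged['_merge'] == 'left_only').to_numpy()]

print(naive_result)    # misses the row (3, 10)
print(rowwise_result)  # includes the row (3, 10)
```

The column-wise check drops the row (3, 10) because both of its values occur in df2, just not together, while the merge keeps it.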

source:php.cn