How to Efficiently Drop Consecutive Duplicates in Pandas?

Mary-Kate Olsen
Release: 2024-11-13 17:29:02

Efficient Dropping of Consecutive Duplicates in Pandas

When working with pandas Series and DataFrames, it's often necessary to remove duplicate values. The built-in drop_duplicates() method, however, removes every repeated value wherever it appears (keeping only the first occurrence by default), not just values that repeat consecutively. When only consecutive duplicates should be dropped, comparing each value with its neighbor is a simple, vectorized alternative.
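To make the difference concrete, here is a minimal sketch of what drop_duplicates() does to a small Series. The sample values are hypothetical and chosen only so that one value appears both in a run and again later on.

```python
import pandas as pd

# Hypothetical sample data: the value 2 appears in a run and again at the end.
a = pd.Series([1, 2, 2, 3, 2])

# drop_duplicates() keeps only the first occurrence of each value,
# so the final 2 is removed even though it does not follow another 2.
globally_unique = a.drop_duplicates()
print(globally_unique.tolist())  # [1, 2, 3]
```

If the goal is to collapse only runs of repeated values, the desired result for this data would be 1, 2, 3, 2 instead.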

One approach uses the shift() function. Comparing a Series a with its shifted version (a.shift(-1)) produces a boolean mask that is True wherever a value differs from the one that follows it. Selecting with this mask keeps one value per run of consecutive duplicates (with a period of -1, the last value of each run), as seen in the following example:

a.loc[a.shift(-1) != a]
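A short runnable sketch of this expression may help. It assumes a is a Series of hypothetical sample values, and it prints the kept index labels to show which element of each run survives.

```python
import pandas as pd

# Hypothetical sample data with one run of consecutive duplicates (index 1 and 2).
a = pd.Series([1, 2, 2, 3, 2])

# shift(-1) moves every value up by one position, so each element is
# compared with the element that follows it. The last row is compared
# with NaN, which never equals anything, so it is always kept.
mask = a.shift(-1) != a
kept_last = a.loc[mask]

print(kept_last.tolist())        # [1, 2, 3, 2]
print(kept_last.index.tolist())  # [0, 2, 3, 4]
```

Index label 1 is dropped and label 2 is kept, so this variant retains the last value of each run.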

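Another method uses the diff() function, which calculates the difference between each row and the previous one; a difference of 0 marks a consecutive duplicate. This only works for data that supports subtraction, such as numeric values, and it is reported to be slower than the shift() method for large datasets. It is used as follows: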

a.loc[a.diff() != 0]
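As an illustration, the diff() expression can be run on a small numeric Series. The data below is hypothetical, and a is again assumed to be a Series rather than a DataFrame.

```python
import pandas as pd

# Hypothetical numeric sample data.
a = pd.Series([1, 2, 2, 3, 2])

# diff() subtracts the previous row from each row. The first row has no
# predecessor, so its difference is NaN, and NaN != 0 evaluates to True.
mask = a.diff() != 0
kept_first = a.loc[mask]

print(kept_first.tolist())        # [1, 2, 3, 2]
print(kept_first.index.tolist())  # [0, 1, 3, 4]
```

Because each row is compared with the one before it, the first value of each run is the one that is kept.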

The original answer suggested using shift() with a period of -1, which compares each value with the one after it and therefore keeps the last value of each run. To keep the first value of each run instead, the correct usage is shift(1) (or simply shift(), since the default period is 1):

a.loc[a.shift(1) != a]
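The sketch below applies the shift(1) form to string data, since shift() does not depend on subtraction. The Series name s and its values are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical non-numeric sample data with one run of consecutive duplicates.
s = pd.Series(["a", "b", "b", "c", "b"])

# shift() with the default period of 1 moves every value down by one
# position, so each element is compared with the element before it.
# The first row is compared with a missing value and is always kept.
mask = s.shift() != s
first_of_each_run = s.loc[mask]

print(first_of_each_run.tolist())        # ['a', 'b', 'c', 'b']
print(first_of_each_run.index.tolist())  # [0, 1, 3, 4]
```

The original index labels are preserved; if a fresh 0-based index is wanted, reset_index(drop=True) can be chained onto the result.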

Both the shift() and diff() methods drop consecutive duplicates with vectorized operations. shift() works with any data type, while diff() is limited to data that supports subtraction, so the choice between them should be based on the specific context and performance requirements.


source:php.cn