pandas Practical Guide: Tips for quickly deleting rows of data
Overview:
pandas is a widely used Python data analysis library with powerful tools for processing and manipulating data. During data processing, it is often necessary to delete rows that are not needed. This article introduces several techniques for deleting rows with pandas and provides a code example for each.
1. Delete row data with specific conditions
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)
Now suppose we want to delete the rows whose Gender is Male. We can use the following code:
df = df.drop(df[df['Gender'] == 'Male'].index)
After running this, the rows whose Gender is Male are deleted from df.
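The same deletion can also be written by keeping the rows that do not match the condition. The following is a minimal sketch that compares the two approaches on the DataFrame above; the variable names `dropped` and `kept` are illustrative and do not come from the original example.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)

# Approach used in this article: look up the matching index labels, then drop them.
dropped = df.drop(df[df['Gender'] == 'Male'].index)

# Alternative: build the opposite Boolean mask and keep the rows that do not match.
kept = df[df['Gender'] != 'Male']

# Both results contain only Alice's row.
print(dropped.equals(kept))
```

With unique index labels, as here, the two forms give the same result. If the index contained repeated labels, drop() would remove every row sharing a matched label, so the mask form may be the safer choice in that case.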
Code analysis:
df['Gender'] == 'Male' is a comparison that returns a Boolean Series marking the rows whose Gender value is Male;
df[df['Gender'] == 'Male'].index returns the index labels of the rows whose Gender is 'Male';
df.drop() deletes rows based on those index labels.
Rows that contain null values can be deleted as well. Suppose one Age value is missing:
import pandas as pd
import numpy as np
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, np.nan, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)
We can use the dropna() method to delete rows containing null values:
df = df.dropna()
After running this, the rows containing null values are deleted from df.
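By default, dropna() removes a row if any of its columns is null. The sketch below shows how the `subset` and `how` parameters narrow that behavior. Note that the data is a hypothetical variation of the example above: a missing Gender for David has been added so that the difference between the calls is visible.

```python
import numpy as np
import pandas as pd

# Hypothetical variation of the article's data: David's Gender is also missing.
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, np.nan, 40],
                   'Gender': ['Female', 'Male', 'Male', None]})

# Default: drop a row if ANY column is null (removes Charlie and David).
any_null = df.dropna()

# Only look at the Age column when deciding (removes Charlie only).
age_only = df.dropna(subset=['Age'])

# Drop a row only if ALL of its columns are null (removes nothing here).
all_null = df.dropna(how='all')

print(len(any_null), len(age_only), len(all_null))
```

Restricting the check with `subset` is often useful when only some columns are required for the later analysis.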
We can also use the drop_duplicates() method to delete duplicate rows:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Bob', 'David'],
        'Age': [25, 30, 30, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)
Now we can use the following code to delete duplicate rows:
df = df.drop_duplicates()
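Which copy of a duplicate is kept, and which columns are compared, can be controlled with the `keep` and `subset` parameters of drop_duplicates(). The following sketch applies them to the DataFrame above; the result variable names are illustrative only.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Bob', 'David'],
        'Age': [25, 30, 30, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)

# Default (keep='first'): the first Bob row (index 1) is kept.
keep_first = df.drop_duplicates()

# keep='last': the last Bob row (index 2) is kept instead.
keep_last = df.drop_duplicates(keep='last')

# keep=False: every row that has a duplicate is removed.
keep_none = df.drop_duplicates(keep=False)

# subset: compare only the Gender column, keeping the first row per value.
by_gender = df.drop_duplicates(subset=['Gender'])

print(list(keep_first.index), list(keep_last.index),
      list(keep_none.index), list(by_gender.index))
```

Note that the original index labels are preserved, so the result may have gaps in its index. Calling reset_index(drop=True) afterwards renumbers the rows if that matters.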
2. Delete rows based on row index
Sometimes we need to delete rows by their index. The drop() method deletes rows based on index labels.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)
Suppose we want to delete the row with index 2. We can use the following code:
df = df.drop(2)
After running, the row with index 2 is deleted.
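The value passed to drop() is an index label, not a position, which matters once the index is no longer the default 0, 1, 2, ... numbering. The sketch below illustrates this with the same data, using Name as a hypothetical index, and also shows the `errors='ignore'` option for labels that may not exist.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)

# Default integer index: label 2 is Charlie's row.
by_label = df.drop(2)

# With Name as the index (an assumption for illustration), drop by name instead.
named = df.set_index('Name')
without_charlie = named.drop('Charlie')

# Dropping a label that does not exist raises KeyError by default...
try:
    df.drop(99)
    missing_label_raises = False
except KeyError:
    missing_label_raises = True

# ...unless errors='ignore' is passed, in which case nothing is removed.
unchanged = df.drop(99, errors='ignore')

print(list(by_label['Name']), list(without_charlie.index), missing_label_raises)
```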
3. Delete multiple rows
Sometimes you need to delete multiple rows, which can be achieved by passing in an index list or using slicing.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)
Example 1: Delete rows with indexes 1 and 2
df = df.drop([1, 2])
Example 2: Delete rows with indexes 1 to 3 (the slice df.index[1:4] is positional and excludes the end position 4)
df = df.drop(df.index[1:4])
Both of the above approaches delete multiple rows in a single call.
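To see the two approaches side by side, the sketch below applies each of them to a fresh copy of the original DataFrame, so that one example does not affect the other. It also repeats the slice on a hypothetical Name index, to show that df.index[1:4] selects labels by position and then drops by label.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Gender': ['Female', 'Male', 'Male', 'Male']}
df = pd.DataFrame(data)

# Example 1: pass a list of index labels (rows 1 and 2 are removed).
by_list = df.drop([1, 2])

# Example 2: slice the index by position, then drop the selected labels
# (positions 1, 2 and 3 are removed; position 4 is excluded and out of range).
by_slice = df.drop(df.index[1:4])

# Same slice with Name as the index (an assumption for illustration):
# the labels at positions 1 to 3 are 'Bob', 'Charlie' and 'David'.
named = df.set_index('Name')
named_slice = named.drop(named.index[1:4])

print(list(by_list['Name']), list(by_slice['Name']), list(named_slice.index))
```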
Conclusion:
This article introduced several techniques for deleting rows with pandas, with a code example for each. During data processing, these techniques help remove unnecessary rows quickly and efficiently. We hope readers can apply them flexibly in practice to make data processing faster and more accurate.