How do you delete rows and columns with drop() in Python?

WBOY
Release: 2023-04-19 15:03:06

The pandas drop() function comes in handy during feature engineering and when splitting data sets: it removes rows or columns from a DataFrame by label.

The syntax of drop() is shown below. To delete rows, use the index parameter; to delete columns, use the columns parameter:

DataFrame.drop(labels=None, axis=0, index=None, columns=None, inplace=False)

Parameters:

labels: The label(s) of the rows or columns to be deleted, either a single label or a list of labels. Used together with axis.

axis: The axis to delete from. 0 (or 'index') means rows, 1 (or 'columns') means columns. The default is 0.

index: The index label(s) of the rows to be deleted, either a single label or a list of labels. Equivalent to labels with axis=0.

columns: The name(s) of the columns to be deleted, either a single column name or a list of column names. Equivalent to labels with axis=1.

inplace: Whether to modify the original DataFrame. The default is False, which leaves the original DataFrame unchanged and returns a new one; with True, the original is modified and drop() returns None.
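
The parameters above can be tried out on a small example. This is a minimal sketch on a hypothetical toy DataFrame (the column names a, b, c and the index labels x, y, z are made up for illustration); it shows that labels with axis is interchangeable with the index and columns keywords, and what inplace changes.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the parameters
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["x", "y", "z"],
)

# Two equivalent ways to delete a column
cols_by_axis = df.drop("b", axis=1)
cols_by_keyword = df.drop(columns="b")

# Two equivalent ways to delete rows
rows_by_axis = df.drop(["x", "z"], axis=0)
rows_by_keyword = df.drop(index=["x", "z"])

# inplace=False (the default) returns a new DataFrame; df itself is unchanged
print(df.shape)                       # (3, 3)
print(cols_by_axis.columns.tolist())  # ['a', 'c']
print(rows_by_axis.index.tolist())    # ['y']

# inplace=True modifies the object it is called on and returns None
working_copy = df.copy()
result = working_copy.drop(columns="c", inplace=True)
print(result)                         # None
print(working_copy.columns.tolist())  # ['a', 'b']
```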

Delete columns

Usage scenario 1: Delete unnecessary features.

For example: if some features have little impact on the results, you can delete the independent variables that are unrelated to the dependent variable; to avoid multicollinearity, you should also delete independent variables that are strongly correlated with each other.

# data is a pandas DataFrame that has already been loaded
df = data.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)
df

Code explanation:

data is the data set; the list in square brackets names the 3 columns to be deleted;

axis=1 means the operation applies to columns.
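
The multicollinearity case mentioned in this scenario can be handled with drop() as well. The following is only a sketch under stated assumptions: the toy data, the BalanceInThousands column, and the 0.9 threshold are all hypothetical, and dropping the later column of each highly correlated pair is one common heuristic, not the only option.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; BalanceInThousands is Balance rescaled,
# so the two columns are perfectly correlated
data = pd.DataFrame({
    "Age": [42, 41, 39, 43, 44],
    "Balance": [0.0, 83807.86, 159660.80, 0.0, 125510.82],
    "Tenure": [2, 1, 8, 1, 2],
})
data["BalanceInThousands"] = data["Balance"] / 1000

# Absolute pairwise correlations between the columns
corr = data.corr().abs()

# Keep only the part above the diagonal so each pair is checked once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Assumed threshold: treat |correlation| > 0.9 as "strongly correlated"
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

df = data.drop(to_drop, axis=1)
print(to_drop)
print(df.columns.tolist())
```

In this sketch only BalanceInThousands exceeds the threshold, so it is the only column removed.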


Usage scenario 2: Delete the dependent variable

# Independent variables and dependent variable
x_data = df.drop(['Exited'],axis=1)
y_data = df['Exited']
x_data

Code explanation:

Pass the field to be deleted to drop(); here it returns df without the column named "Exited", while df itself is left unchanged;

['Exited'] is the dependent variable we want to remove; a single field can be written this way, as a one-element list.
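
To see what the two resulting objects look like, here is a minimal sketch on made-up data. Only the Exited column name comes from the article; the CreditScore and Age columns and all values are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the "Exited" column name comes from the article
df = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Age": [42, 41, 42, 39],
    "Exited": [1, 0, 1, 0],
})

x_data = df.drop(["Exited"], axis=1)  # features: every column except the target
y_data = df["Exited"]                 # target, as a Series

print(x_data.columns.tolist())  # ['CreditScore', 'Age']
print(y_data.tolist())          # [1, 0, 1, 0]
print("Exited" in df.columns)   # True: df still has the column
```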


Delete rows

Usage scenario 3: When splitting the data set, first sample the training set, then remove the samples assigned to the training set; the remaining rows form the test set.

# Sample the training set (80% of the rows)
train_data = data.sample(frac = 0.8, random_state = 0)
# The remaining rows form the test set
test_data = data.drop(train_data.index)

Code explanation:

Pass row index labels to drop() to delete those rows;

train_data is the training set we sampled, and train_data.index holds its row index labels;

axis=0 means deleting rows; it is the default value, so it can be omitted.
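
The split above can be checked on a small example. This is a minimal sketch on a hypothetical 10-row DataFrame (the feature column and its values are made up); it confirms that the two parts do not overlap and together cover the whole data set.

```python
import pandas as pd

# Hypothetical 10-row data set with the default 0..9 index
data = pd.DataFrame({"feature": range(10), "Exited": [0, 1] * 5})

# 80% of the rows, sampled reproducibly, form the training set
train_data = data.sample(frac=0.8, random_state=0)
# Dropping those index labels leaves the remaining 20% as the test set
test_data = data.drop(train_data.index)

print(len(train_data), len(test_data))  # 8 2

# The two parts share no index labels and together cover every row
overlap = train_data.index.intersection(test_data.index)
covered = sorted(train_data.index.tolist() + test_data.index.tolist())
print(len(overlap))                    # 0
print(covered == data.index.tolist())  # True
```

Note that drop() removes rows by label, so this approach relies on the index labels being unique. If the index contains duplicates, every row sharing a dropped label is removed; calling data.reset_index(drop=True) beforehand avoids that.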

source:yisu.com