Easy way to read Excel files using Pandas
In data analysis and processing, it is often necessary to read data from Excel files and perform various operations. Pandas is a powerful Python data analysis library that provides a simple and convenient way to read Excel files. This article will introduce how to use Pandas to read Excel files and provide specific code examples.
Before you begin, make sure you have the Pandas library installed. Reading .xlsx files also requires the openpyxl package. Both can be installed using the following command:
pip install pandas openpyxl
Next, we assume that there is an Excel file named "example.xlsx", which contains a worksheet named "Sheet1". In this worksheet, there is some data including name, age and gender. We will read data from this Excel file.
First, let us import the Pandas library and read the Excel file:
import pandas as pd
df = pd.read_excel('example.xlsx', sheet_name='Sheet1')
In the code above, the read_excel function reads the Excel file. Here, example.xlsx is the name of the Excel file to read, and sheet_name='Sheet1' is the name of the worksheet to read. If the sheet_name parameter is not specified, the first worksheet is read by default.
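The different ways of choosing a worksheet can be compared in a small sketch. It assumes openpyxl is installed and builds a throwaway two-sheet workbook in a temporary directory, so the sample data and the second sheet are illustrative assumptions rather than part of the example.xlsx file described above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the article's workbook.
people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Carol"], "Age": [28, 35, 42], "Gender": ["F", "M", "F"]}
)
cities = pd.DataFrame({"City": ["Paris", "Lima"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.xlsx")

    # Build a two-sheet workbook so the example is self-contained.
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        people.to_excel(writer, sheet_name="Sheet1", index=False)
        cities.to_excel(writer, sheet_name="Sheet2", index=False)

    by_default = pd.read_excel(path)                    # no sheet_name: first worksheet
    by_name = pd.read_excel(path, sheet_name="Sheet2")  # worksheet chosen by name
    by_position = pd.read_excel(path, sheet_name=1)     # worksheet chosen by 0-based position
    all_sheets = pd.read_excel(path, sheet_name=None)   # dict mapping sheet name to DataFrame

print(by_default)
print(list(all_sheets))
```

Passing sheet_name=None is convenient when the worksheet names are not known in advance, since the result is a dictionary keyed by sheet name.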
After reading the Excel file, Pandas stores the data as a DataFrame in the variable df. A DataFrame is a two-dimensional labeled data structure, similar to a table in Excel. The name of each column is called a column label, and the index of each row is called a row label.
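To make column labels and row labels concrete, here is a minimal sketch that builds a DataFrame in memory instead of reading a file. The sample rows are invented for illustration and only mirror the name, age and gender columns described earlier.

```python
import pandas as pd

# Hypothetical rows mirroring the columns of the worksheet described above.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Carol"], "Age": [28, 35, 42], "Gender": ["F", "M", "F"]}
)

column_labels = list(df.columns)  # the column labels
row_labels = list(df.index)       # the row labels (a default 0-based range here)
n_rows, n_cols = df.shape         # table dimensions as (rows, columns)

print(column_labels)
print(row_labels)
print(n_rows, n_cols)
```

A DataFrame returned by read_excel exposes the same attributes, so df.columns is a quick way to check which headers were actually read from the sheet.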
Now, we can perform various operations on the read data, such as viewing the data of the first few rows, obtaining the data of a certain column, filtering the data, etc.
View the data of the first few rows:
print(df.head())
Get the data of a column:
name_column = df['Name']
print(name_column)
Filter data:
filtered_data = df[df['Age'] > 30]
print(filtered_data)
In the code above, df.head() returns the first few rows of the DataFrame (the first 5 rows by default). df['Name'] returns the data in the column named "Name", and df[df['Age'] > 30] keeps only the rows whose "Age" value is greater than 30.
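The three operations can be tried without an Excel file by using a small in-memory DataFrame. The rows below are made up for illustration, and the combined filter at the end is an extra shown only to suggest how conditions can be chained.

```python
import pandas as pd

# Hypothetical data with the same columns as the worksheet described above.
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Carol", "Dan", "Eve", "Frank"],
        "Age": [28, 35, 42, 19, 31, 30],
        "Gender": ["F", "M", "F", "M", "F", "M"],
    }
)

first_rows = df.head()       # first 5 rows by default
first_two = df.head(2)       # pass a number to change how many rows are returned
name_column = df["Name"]     # a single column, returned as a Series
older = df[df["Age"] > 30]   # boolean mask keeps rows where the condition holds

# Conditions can be combined with & (and) or | (or); each needs its own parentheses.
older_women = df[(df["Age"] > 30) & (df["Gender"] == "F")]

print(first_rows)
print(older)
print(older_women)
```

Note that the comparison is strict, so a row with an age of exactly 30 is not included in the filtered result.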
In addition to reading Excel files, Pandas also provides some other methods to process Excel files, such as writing data to Excel files, adding new worksheets, etc. The following are some commonly used methods:
Write DataFrame to Excel file:
df.to_excel('output.xlsx', sheet_name='Sheet2', index=False)
The code above writes the DataFrame to the "Sheet2" worksheet of a file named "output.xlsx". Setting index=False leaves the row index out of the output.
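The effect of index=False is easiest to see by writing the same DataFrame both ways and reading each file back. This is a sketch under the assumption that openpyxl is installed. The file names and sample data are hypothetical, and the files are written to a temporary directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data to write out.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [28, 35]})

with tempfile.TemporaryDirectory() as tmp_dir:
    without_index_path = os.path.join(tmp_dir, "output_no_index.xlsx")
    with_index_path = os.path.join(tmp_dir, "output_with_index.xlsx")

    # Same data, written once without and once with the row index.
    df.to_excel(without_index_path, sheet_name="Sheet2", index=False)
    df.to_excel(with_index_path, sheet_name="Sheet2")  # index=True is the default

    without_index = pd.read_excel(without_index_path, sheet_name="Sheet2")
    with_index = pd.read_excel(with_index_path, sheet_name="Sheet2")

print(list(without_index.columns))
print(list(with_index.columns))
```

When the index is written, it is read back as an extra leading column, which is usually unwanted if the index is just the default row numbering.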
Add a new worksheet to an existing Excel file:
# Append mode requires the openpyxl engine
with pd.ExcelWriter('example.xlsx', mode='a', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='Sheet2', index=False)
The code above uses pd.ExcelWriter to open an existing Excel file, with mode='a' selecting append mode so that the worksheets already in the file are kept. The df.to_excel() method then writes the DataFrame to the "Sheet2" worksheet.
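A quick way to confirm that appending kept the original worksheet is to list the sheet names afterwards. The sketch below assumes openpyxl is installed and uses a temporary workbook with made-up data rather than a real example.xlsx.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data; the filtered subset is what gets appended as a new sheet.
df = pd.DataFrame({"Name": ["Alice", "Bob", "Carol"], "Age": [28, 35, 42]})
over_30 = df[df["Age"] > 30]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.xlsx")

    # Create the workbook with a single sheet first.
    df.to_excel(path, sheet_name="Sheet1", index=False, engine="openpyxl")

    # Reopen it in append mode and add a second sheet.
    with pd.ExcelWriter(path, mode="a", engine="openpyxl") as writer:
        over_30.to_excel(writer, sheet_name="Sheet2", index=False)

    # Inspect the result: both sheets should now be present.
    with pd.ExcelFile(path) as workbook:
        sheet_names = workbook.sheet_names
        appended = pd.read_excel(workbook, sheet_name="Sheet2")

print(sheet_names)
print(appended)
```

If a worksheet with the target name already exists, append mode raises an error by default. Recent Pandas versions accept an if_sheet_exists argument on pd.ExcelWriter (for example if_sheet_exists='replace') to control this.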
With Pandas, we can easily read and process Excel files and perform a wide range of operations on them, which makes data analysis and processing more efficient and convenient. This concludes the introduction and sample code for a simple way to read Excel files using Pandas. Hope this helps!