Concatenating Multiple CSV Files into a Single DataFrame
To import multiple CSV files into pandas and combine them into one large DataFrame, follow these two steps:
- Read the CSV files: Use glob.glob() to obtain a list of all CSV files in the designated directory. Then read each CSV file using pd.read_csv() and store the resulting DataFrames in a list.
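import glob
import os
import pandas as pd
# Get data file names (sorted, because glob returns them in arbitrary order)
path = r'C:\DRO\DCL_rawdata_files'
filenames = sorted(glob.glob(os.path.join(path, "*.csv")))
# Read each file into its own DataFrame
dfs = []
for filename in filenames:
    dfs.append(pd.read_csv(filename))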
- Concatenate the DataFrames: Use pd.concat() to combine all the DataFrames in the list into a single DataFrame. Set ignore_index=True so the result gets a new sequential index (0 to n-1) instead of repeating each file's original row labels.
# Concatenate all data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
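The effect of ignore_index can be seen in a minimal sketch. The two small frames below are hypothetical stand-ins for DataFrames read from separate CSV files; their names and values are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for two DataFrames read from separate CSV files
first = pd.DataFrame({"value": [10, 20]})
second = pd.DataFrame({"value": [30, 40]})

# Without ignore_index, each frame keeps its own row labels, so labels repeat
kept = pd.concat([first, second])

# With ignore_index=True, the result is renumbered from 0
renumbered = pd.concat([first, second], ignore_index=True)

kept_index = kept.index.tolist()
renumbered_index = renumbered.index.tolist()
print(kept_index)        # [0, 1, 0, 1]
print(renumbered_index)  # [0, 1, 2, 3]
```

Repeated labels are legal in pandas, but they make label-based lookups such as kept.loc[0] return several rows, which is usually not what is wanted after merging files.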
Additional Considerations:
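- pd.concat() matches columns by name. If a column is missing from some files, the rows from those files get NaN in that column, so check that all CSV files share the same columns when you expect a gap-free result.
- If the CSV files use different column names or formats, rename or convert the columns to a common layout before concatenating.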
- To trace each row back to its origin, add a column to each DataFrame holding the source file name or another unique identifier before concatenating.
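The considerations above can be sketched as follows. This is one possible approach, not the only one: the helper name combine_csv_files and the column name source_file are hypothetical, and the two temporary files exist only to make the example runnable.

```python
import tempfile
from pathlib import Path

import pandas as pd


def combine_csv_files(directory):
    """Read every CSV in `directory` and tag each row with its source file name."""
    frames = []
    for csv_path in sorted(Path(directory).glob("*.csv")):
        frame = pd.read_csv(csv_path)
        # `source_file` is a hypothetical column name chosen for this example
        frame["source_file"] = csv_path.name
        frames.append(frame)
    if not frames:
        raise ValueError(f"No CSV files found in {directory}")
    # Columns missing from a file are filled with NaN by pd.concat
    return pd.concat(frames, ignore_index=True)


# Demo with two throwaway files whose columns only partly overlap
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("id,score\n1,0.5\n2,0.7\n")
    Path(tmp, "b.csv").write_text("id,label\n3,x\n")
    combined = combine_csv_files(tmp)

print(combined)
```

In the printed result, the rows from a.csv have NaN under label and the row from b.csv has NaN under score, which makes mismatched columns easy to spot before further processing.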