The goal of this task is to import multiple CSV files from a directory into a single pandas DataFrame. Here's how to accomplish this:
First, import the necessary libraries for file handling and data manipulation:
import pandas as pd
import glob
import os
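To read and concatenate the CSV files, collect the matching file paths with glob.glob, read each file into its own DataFrame with pd.read_csv, and combine the resulting DataFrames with pd.concat. Here's example code that combines these steps: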
# Get file names
path = r"C:\DRO\DCL_rawdata_files"
filenames = glob.glob(os.path.join(path, "*.csv"))

dfs = []
for filename in filenames:
    dfs.append(pd.read_csv(filename, header=0))

# Concatenate data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
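One caveat with the loop above: pd.concat raises a ValueError when it is given an empty list, which is what happens if the directory contains no CSV files. Below is a minimal sketch of a wrapper that guards against this and sorts the paths so the row order is reproducible. The function name load_csv_folder and its pattern parameter are hypothetical, chosen for illustration, and are not part of pandas.

```python
import glob
import os

import pandas as pd


def load_csv_folder(path, pattern="*.csv"):
    """Read every file matching `pattern` under `path` into one DataFrame.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Sort so the row order does not depend on the OS directory listing order.
    filenames = sorted(glob.glob(os.path.join(path, pattern)))
    if not filenames:
        # pd.concat would raise ValueError on an empty list; fail more clearly.
        raise FileNotFoundError(f"No files matching {pattern!r} in {path!r}")
    dfs = [pd.read_csv(filename, header=0) for filename in filenames]
    return pd.concat(dfs, ignore_index=True)
```

It could be called as big_frame = load_csv_folder(r"C:\DRO\DCL_rawdata_files"), reusing the directory from the example above.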
To differentiate between data from different CSV files, you can add a new column to identify each file. Here are a few options for doing so:
Option 1: Add File Name as a Column
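# Run this loop before pd.concat so the new column is carried into big_frame
for filename, df in zip(filenames, dfs):
    # Base name of the file without its directory or extension
    df["file_name"] = os.path.splitext(os.path.basename(filename))[0]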
Option 2: Add File Source as a Column
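import numpy as np
# Label each row of big_frame "File0", "File1", ... according to the DataFrame it came from
big_frame["Source"] = np.repeat([f"File{i}" for i in range(len(dfs))], [len(df) for df in dfs])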
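Either label can also be attached while each file is being read, which avoids a second pass over the list of DataFrames. The following is a minimal sketch under that assumption; the function name read_tagged_csvs is hypothetical, and DataFrame.assign is used here as one possible way to add the column, not as the only one.

```python
import glob
import os

import pandas as pd


def read_tagged_csvs(path):
    """Read all CSV files under `path`, tagging each row with its source file.

    Hypothetical helper for illustration; not a pandas function.
    """
    dfs = []
    for filename in sorted(glob.glob(os.path.join(path, "*.csv"))):
        # File name without its directory or extension.
        stem = os.path.splitext(os.path.basename(filename))[0]
        # assign returns a new DataFrame with the extra "file_name" column.
        dfs.append(pd.read_csv(filename, header=0).assign(file_name=stem))
    # Note: pd.concat raises ValueError if no files were found.
    return pd.concat(dfs, ignore_index=True)
```

Because every row carries its source name, the combined result can later be split back per file with big_frame.groupby("file_name").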
By following these steps, you can efficiently import multiple CSV files into a single cohesive DataFrame in Python, making it easy to analyze and process data from various sources.