How Can I Combine Multiple CSV Files into a Single Pandas DataFrame in Python?

Mary-Kate Olsen
Release: 2024-12-28 14:02:10

Merging Multiple CSV Files into a Single DataFrame in Python

The goal is to read every CSV file in a directory into a single pandas DataFrame. Here's how to accomplish this:

Importing Necessary Libraries

First, import the necessary libraries for file handling and data manipulation:

import pandas as pd
import glob
import os

Reading and Concatenating CSV Files

To read and concatenate the CSV files, follow these steps:

  1. Get File Names: Use glob.glob() to get a list of CSV file names in the specified directory.
  2. Read Files and Create DataFrames: Read each CSV file into a pandas DataFrame.
  3. Concatenate DataFrames: Use pd.concat() to combine all the DataFrames into a single DataFrame.

Code for Concatenation

Here's example code that combines these steps:

# Get file names
path = r"C:\DRO\DCL_rawdata_files"
filenames = glob.glob(os.path.join(path, "*.csv"))

dfs = []
for filename in filenames:
    dfs.append(pd.read_csv(filename, header=0))

# Concatenate data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
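
The loop above can also be wrapped in a small helper. The sketch below is one possible arrangement, not the only way to do it: the function name load_csv_folder and its pattern parameter are hypothetical, and the temporary directory with two tiny files exists only so the example runs on its own. Sorting the file names keeps the row order reproducible, and the explicit check avoids the ValueError that pd.concat() raises when given an empty list.

```python
import glob
import os
import tempfile

import pandas as pd


def load_csv_folder(path, pattern="*.csv"):
    """Read every CSV matching `pattern` in `path` into one DataFrame (hypothetical helper)."""
    # sorted() gives a reproducible order; glob.glob() makes no ordering guarantee.
    filenames = sorted(glob.glob(os.path.join(path, pattern)))
    if not filenames:
        # pd.concat() raises ValueError on an empty list, so fail with a clearer message.
        raise FileNotFoundError(f"No files matching {pattern!r} in {path!r}")
    frames = [pd.read_csv(filename) for filename in filenames]
    return pd.concat(frames, ignore_index=True)


# Demo on two throwaway files in a temporary directory (illustrative data only).
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2]}).to_csv(os.path.join(tmp, "first.csv"), index=False)
    pd.DataFrame({"a": [3]}).to_csv(os.path.join(tmp, "second.csv"), index=False)
    combined = load_csv_folder(tmp)

print(combined)
```

With ignore_index=True the combined frame gets a fresh 0..n-1 index instead of repeating each file's own row numbers.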

Adding Identity to DataFrames

To differentiate between data from different CSV files, you can add a new column to identify each file. Here are a few options for doing so:

Option 1: Add File Name as a Column

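# Run this before pd.concat(): tag each frame with its file's base name (no extension)
for filename, df in zip(filenames, dfs):
    df["file_name"] = os.path.splitext(os.path.basename(filename))[0]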
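
The file name can also be attached at the moment each file is read, so the tagging cannot be forgotten before concatenation. This is a minimal sketch, assuming pathlib is acceptable in place of glob and os; the jan.csv and feb.csv files and their contents are made up so the example is self-contained.

```python
import os
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Illustrative input files; replace `tmp` with your own directory.
    pd.DataFrame({"a": [1, 2]}).to_csv(os.path.join(tmp, "jan.csv"), index=False)
    pd.DataFrame({"a": [3]}).to_csv(os.path.join(tmp, "feb.csv"), index=False)

    # Path.stem is the file name without directory or extension, e.g. "jan".
    tagged = [
        pd.read_csv(p).assign(file_name=p.stem)
        for p in sorted(Path(tmp).glob("*.csv"))
    ]
    big_frame = pd.concat(tagged, ignore_index=True)

print(big_frame)
```

DataFrame.assign() returns a new frame with the extra column, so each file is read and labeled in a single expression.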

Option 2: Add File Source as a Column

import numpy as np  # np.repeat() repeats each label once per row of its source file
big_frame["Source"] = np.repeat([f"File{i}" for i in range(len(dfs))], [len(df) for df in dfs])
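
A source label can also be produced by pd.concat() itself through its keys argument, which avoids counting rows by hand. The sketch below uses two small in-memory frames as stand-ins for the frames read from disk; the level name "row" is an arbitrary choice made for illustration.

```python
import pandas as pd

# Stand-ins for the DataFrames read from the CSV files.
dfs = [pd.DataFrame({"a": [1, 2]}), pd.DataFrame({"a": [3]})]
labels = [f"File{i}" for i in range(len(dfs))]

big_frame = (
    # keys= adds an outer index level holding one label per input frame.
    pd.concat(dfs, keys=labels, names=["Source", "row"])
    # Move the label level into an ordinary column ...
    .reset_index(level="Source")
    # ... and replace the leftover per-file row numbers with a fresh index.
    .reset_index(drop=True)
)

print(big_frame)
```

Because the labels are tied to the frames inside the concat call, they stay aligned even if some files are empty.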

Conclusion

By following these steps, you can load multiple CSV files into a single DataFrame, and the added column records which file each row came from.

source:php.cn