How can you split text in a Pandas column into multiple rows using delimiters?

Mary-Kate Olsen
Release: 2024-11-16 10:39:03

Splitting Text in a Column into Multiple Rows with Pandas

When a column of tabular data holds several delimited values per cell, pandas can split those values so that each one ends up in its own row. Consider a CSV file with a column whose text must be split on specific delimiters.

Problem Statement

Suppose you have a CSV file with a column named "Seatblocks" that contains strings representing multiple sets of seats. The sets are separated by spaces, and the fields within each set are separated by colons. Your goal is to split these seat sets into separate rows. For instance, the following Seatblocks value:

2:218:10:4,6 1:13:36:1,12 1:13:37:1,13

should result in three separate rows:

2:218:10:4,6
1:13:36:1,12
1:13:37:1,13

Solution Using Pandas

To efficiently split the Seatblocks column and create multiple rows, you can utilize the following steps:

  1. Split by Space: Use the str.split() method to split each cell of the "Seatblocks" column on spaces, which yields a Series of lists:

    import pandas as pd  # df is assumed to be already loaded from the CSV file

    s = df['Seatblocks'].str.split(' ')
  2. Apply the Series Constructor: Convert each list into a row of a new dataframe by applying pd.Series, so that every seat set lands in its own column:

    s = s.apply(pd.Series)
  3. Flatten the DataFrame: Stack the columns into a single Series with a two-level index (original row label, position of the seat set). The trailing dropna() is a safeguard: rows with fewer seat sets are padded with NaN, and whether stack() drops that padding depends on the pandas version:

    s = s.stack().dropna()
  4. Drop the Inner Index Level and Name the Series: Remove the innermost index level so that each value is labeled with its original row index, then name the Series 'Seatblocks':

    s.index = s.index.droplevel(-1)
    s.name = 'Seatblocks'
  5. Delete the Original Column: Remove the original "Seatblocks" column from the dataframe:

    del df['Seatblocks']
  6. Join the Split Series: Finally, join the Series back onto the original dataframe. The join aligns on the index, so each original row is repeated once per seat set:

    df = df.join(s)
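
The steps above can be put together in one short script. This is a minimal sketch, not the author's exact code: the sample data and the CustomerName column are made up for illustration, and str.split(..., expand=True) is used here in place of apply with the Series constructor to build the intermediate dataframe in a single call.

```python
import pandas as pd

# Hypothetical sample data; the CustomerName column is an assumption.
df = pd.DataFrame({
    "CustomerName": ["McCartney", "Lennon"],
    "Seatblocks": [
        "2:218:10:4,6 1:13:36:1,12 1:13:37:1,13",
        "1:13:38:1,14",
    ],
})

# Steps 1-3: split on spaces into columns, stack the columns into one
# long Series, and drop the padding left by rows with fewer seat sets.
s = df["Seatblocks"].str.split(" ", expand=True).stack().dropna()

# Step 4: drop the inner index level so every piece keeps the label
# of the row it came from, then name the Series.
s.index = s.index.droplevel(-1)
s.name = "Seatblocks"

# Steps 5-6: remove the original column and join the pieces back.
# The join aligns on the index, repeating each row once per seat set.
result = df.drop(columns="Seatblocks").join(s)
print(result)
```

On pandas 0.25 or later, df.assign(Seatblocks=df['Seatblocks'].str.split(' ')).explode('Seatblocks') should produce the same rows with less code.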

Alternative for Splitting by Colon

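If each colon-separated field should also get its own column, apply the split to the stacked Series s from step 4, which holds one seat set per value, and join the result in place of s in step 6 (pandas is assumed to be imported as pd):

s = s.apply(lambda x: pd.Series(x.split(':')))
df = df.join(s)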

This will create a dataframe with each colon-separated string in its own column.
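
As a hedged illustration of the colon split, the sketch below starts from a Series that already holds one seat set per value (the sample values and index are assumptions) and uses str.split(':', expand=True), a vectorized substitute for the apply call, to produce one column per field.

```python
import pandas as pd

# One seat set per value, as produced by the row-splitting steps.
# The repeated index label 0 marks that all three came from the same row.
s = pd.Series(
    ["2:218:10:4,6", "1:13:36:1,12", "1:13:37:1,13"],
    index=[0, 0, 0],
    name="Seatblocks",
)

# Split every value on colons; expand=True returns a dataframe with
# one column per field, labeled 0, 1, 2, 3. Values remain strings.
parts = s.str.split(":", expand=True)
print(parts)
```

The resulting columns are numbered rather than named, because the text does not say what each field means; rename them once their meaning is known.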
