
How to Split a Pandas DataFrame String Column into Two Columns?

Mary-Kate Olsen
Release: 2024-12-24 04:41:18


When working with tabular data, it's often necessary to manipulate the data to extract specific pieces of information. One common task is splitting a single column of string values into multiple columns, each containing a portion of the original string.

Problem and Requirement

Suppose we have a DataFrame named df with one column called row that contains string values in the following format:

          row
0    00000 UNITED STATES
1    01000 ALABAMA
2    01001 Autauga County, AL
3    01003 Baldwin County, AL
4    01005 Barbour County, AL

Our goal is to split the row column into two columns: fips, which holds the five-digit code at the start of each string, and row, which holds the remaining text.
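To follow along, the sample DataFrame can be rebuilt from the values shown above. This is a minimal sketch; it assumes each value is a plain string with a single space between the code and the name.

```python
import pandas as pd

# Sample data copied from the table above; each value is "<5-digit code> <name>".
df = pd.DataFrame({
    "row": [
        "00000 UNITED STATES",
        "01000 ALABAMA",
        "01001 Autauga County, AL",
        "01003 Baldwin County, AL",
        "01005 Barbour County, AL",
    ]
})

print(df)
```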

Solution using str.split()

One way to split the row column is to use the str.split() method. It splits each string on a delimiter, which can be a plain string or a regular expression. In our case, the code and the name are separated by spaces, so we can use the following regular expression:

r' +'

This regular expression matches one or more spaces. We can then use str.split() to split the row column at the first match and assign the resulting columns to fips and row as follows:

import pandas as pd

# Split at the first run of spaces: the code goes to 'fips', the rest stays in 'row'
df[['fips', 'row']] = df['row'].str.split(r' +', n=1, expand=True)

The n=1 argument limits the operation to a single split, so names that contain spaces stay in one piece. The expand=True argument makes str.split() return a DataFrame with one column per piece, rather than a Series of lists.

Result

After executing the str.split() code, df[['fips', 'row']] will look like this:

    fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL
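The effect of the n and expand arguments is easier to see on a small Series. The sketch below is illustrative only; it splits on a single literal space and uses two of the sample values.

```python
import pandas as pd

s = pd.Series(["00000 UNITED STATES", "01001 Autauga County, AL"])

# Without expand, each element is a Python list of the split pieces.
as_lists = s.str.split(" ", n=1)

# With expand=True, the pieces are spread across DataFrame columns 0 and 1.
as_columns = s.str.split(" ", n=1, expand=True)

# Without n=1, every space is a split point, so names break into several columns.
no_limit = s.str.split(" ", expand=True)

print(as_lists.iloc[1])
print(as_columns.shape)
print(no_limit.shape)
```

Keeping n=1 is what guarantees exactly two columns here, regardless of how many spaces the name contains.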

Alternative Solution using str.extract()

Another way to split the row column is to use the str.extract() method. This method takes a regular expression containing capture groups and returns a DataFrame with one column per group. In our case, we can use the following regular expression:

r'(\d{5}) +(\D+)'

The first group captures a sequence of five digits. The second group captures the run of non-digit characters that follows one or more spaces. We can then use str.extract() to pull out both groups and assign the resulting DataFrame to the fips and row columns as follows:

import pandas as pd

# Group 1 becomes 'fips'; group 2 becomes the new 'row'
df[['fips', 'row']] = df['row'].str.extract(r'(\d{5}) +(\D+)')

Result

After executing the str.extract() code, df[['fips', 'row']] will look like this:

    fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL
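With str.extract(), named capture groups can supply the column names directly. The following is a sketch rather than part of the original solution; the group name "name" is a hypothetical choice for illustration, and the second group uses .+ so that it also accepts names containing digits.

```python
import pandas as pd

df = pd.DataFrame({"row": ["00000 UNITED STATES", "01001 Autauga County, AL"]})

# Named groups (?P<label>...) become the column names of the extracted DataFrame.
parts = df["row"].str.extract(r"(?P<fips>\d{5}) +(?P<name>.+)")

print(list(parts.columns))
print(parts.loc[1, "name"])
```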

Both of the above solutions split the row column into fips and row columns. str.split() is the simpler choice when a delimiter cleanly separates the parts. str.extract() instead describes the expected shape of the whole string and captures only the groups you define; rows that do not match the pattern come back as missing values (NaN).
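One practical difference between the two approaches is how they treat a row that does not fit the expected format. The sketch below uses a hypothetical malformed value with no leading code to show the behavior of each method.

```python
import pandas as pd

# The second value is a hypothetical malformed entry with no leading code.
s = pd.Series(["01000 ALABAMA", "ALABAMA"])

split_result = s.str.split(" ", n=1, expand=True)
extract_result = s.str.extract(r"(\d{5}) +(.+)")

# str.split keeps whatever text is present and leaves the missing piece empty.
print(split_result)

# str.extract yields missing values for every group when the pattern does not match.
print(extract_result)
```

With str.split(), the malformed name ends up in the first column, where the code was expected; with str.extract(), the whole row is flagged as missing, which makes bad input easier to detect.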
