How to Convert a PySpark String Column to a Date Column?

Barbara Streisand
Release: 2024-12-01 11:26:10

Converting PySpark String to Date Format

You have a PySpark DataFrame with a string column holding dates in a fixed format such as MM-dd-yyyy (the example below uses MM/dd/yyyy), and you need to convert it to a date column.

Solution:

To convert a PySpark string column to a date column, use the to_date function; from Spark 2.2 onward it accepts the format string as a second argument. On an older version of Spark (< 2.2), follow the alternative approach below:

Alternative Approach for Spark < 2.2:

Use a combination of unix_timestamp and from_unixtime functions:

from pyspark.sql.functions import unix_timestamp, from_unixtime

# Example DataFrame with string dates
df = spark.createDataFrame(
    [("11/25/1991",), ("11/24/1991",), ("11/30/1991",)],
    ["date_str"]
)

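# Parse the string to Unix seconds, turn that back into a
# "yyyy-MM-dd HH:mm:ss" string, then cast the result to a date
df2 = df.select(
    "date_str",
    from_unixtime(unix_timestamp("date_str", "MM/dd/yyyy")).cast("date").alias("date")
)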

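This creates a new column named date from the string column. Note that from_unixtime on its own returns a string in yyyy-MM-dd HH:mm:ss form, so the result has to be cast to date (for example with .cast("date")) to get a true DateType column.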
