How to Obtain a Cartesian Product in Pandas
In Pandas, a DataFrame is a tabular data structure, and data analysis often requires combining several of them. One such operation is the Cartesian product (cross join), which pairs every row of one DataFrame with every row of another. The result has len(df1) * len(df2) rows.
Merging for Cartesian Product (Pandas >= 1.2)
The merge function in Pandas supports the Cartesian product directly. In version 1.2 and later, pass how='cross' and do not specify any key columns:
from pandas import DataFrame
df1 = DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = DataFrame({'col3': [5, 6]})
df1.merge(df2, how='cross')
This returns a new DataFrame with all combinations of rows from df1 and df2.
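As a quick check of the claim above, the following minimal sketch runs the cross merge on the same two frames and prints the result. It assumes pandas 1.2 or later is installed; the variable name product is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

# how='cross' requires pandas >= 1.2; no key columns are passed.
product = df1.merge(df2, how='cross')

# Each of the 2 rows in df1 is paired with each of the 2 rows in df2.
print(product)
print(len(product) == len(df1) * len(df2))
```

The result has four rows and the columns col1, col2, col3. Note that the row count grows multiplicatively, so a cross merge of two large frames can exhaust memory quickly.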
Merging for Cartesian Product (Pandas < 1.2)
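Earlier versions of Pandas do not accept how='cross', but merge can still produce the Cartesian product if both DataFrames share a key column that holds the same constant value in every row. Because every key matches every other key, the merge pairs each row of df1 with each row of df2. Selecting the original columns afterwards drops the helper key:
from pandas import DataFrame, merge
df1 = DataFrame({'key': [1, 1], 'col1': [1, 2], 'col2': [3, 4]})
df2 = DataFrame({'key': [1, 1], 'col3': [5, 6]})
merge(df1, df2, on='key')[['col1', 'col2', 'col3']]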
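When the frames do not already carry a key column, the same trick can be wrapped in a small helper that adds a temporary constant key and removes it again. This is a sketch only: cross_join and the column name _cross_key are hypothetical names chosen for illustration, and the code assumes neither frame already has a column with that name.

```python
import pandas as pd

def cross_join(left, right, key='_cross_key'):
    """Cartesian product via a temporary constant key (hypothetical helper)."""
    # Assumption: `key` does not clash with an existing column in either frame.
    # assign() returns copies, so the input frames are left unchanged.
    merged = left.assign(**{key: 1}).merge(right.assign(**{key: 1}), on=key)
    # Remove the helper column so only the original columns remain.
    return merged.drop(columns=key)

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

result = cross_join(df1, df2)
print(result)
```

On pandas 1.2 and later, how='cross' is the simpler choice; a helper like this is mainly useful for code that must also run on older versions.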