Home > Backend Development > Python Tutorial > Conditional merge with pandas

Conditional merge with pandas

WBOY
Release: 2024-02-22 13:07:09
forward
1228 people have browsed it

Conditional merge with pandas

Question content

I have a pandas dataframe as shown below, which details additional calls to a region:

commsdate area day0 incremental day1 incremental day2 incremental
01/01/24 sales 43 36 29
01/01/24 service 85 74 66
02/01/24 sales 56 42 31
02/01/24 service 73 62 49
03/01/24 sales 48 32 twenty four
03/01/24 service 67 58 46

I am trying to calculate the number of calls received by date, so a sales call received on January 1st will be day0_incremental (43) of that date and January 2nd will be day0 of January 2 plus 1 Day1 on January 1 (36) 56) and January 3 will be day0 on January 3 plus day1 on January 2 plus day2 on January 1 (48 42 29), resulting in the following data frame:

CallDate Sales Service
01/01/24 43 85
02/01/24 92 147
03/01/24 119 195
04/01/24 63 107
05/01/24 twenty four 46

I have successfully created a shell of the data frame for the second table with no values ​​under the range column but don't know what to do next:

df['commsdate'] = pd.to_datetime(df['commsdate'], format='%d/%m/%y')
areaunique = df['area'].unique().tolist()
from datetime import timedelta
calldate = pd.date_range(start=min(df['commsdate']), end=max(df['commsdate'])+timedelta(days=6), freq='d')

data = {area: [] for area in areaunique}

dfnew = pd.dataframe(data)

dfnew['calldate'] = calldate

dfnew = dfnew.melt(id_vars=['calldate'], var_name='area')

dfnew = dfnew.pivot(index='calldate', columns='area', values='value')

dfnew = dfnew.reset_index()

dfnew = dfnew[['calldate'] + areaunique]
Copy after login

I've started writing a for loop, but I've only gotten this far:

for i in range(1,len(areaunique)+1):
    dfnew.columns(i) =
Copy after login


Correct answer


You can dialpivot,shiftandadd

df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=true)
tmp = df.pivot(index='commsdate', columns='area')

out = (tmp['day0 incremental']
       .add(tmp['day1 incremental'].shift(freq='1d'), fill_value=0)
       .add(tmp['day2 incremental'].shift(freq='2d'), fill_value=0)
       .reset_index().rename_axis(columns=none)
      )
Copy after login

Alternatively, programmatically use functools.reduce using numbers extracted from the dayx … string:

from functools import reduce
import re

reg = re.compile(r'day(\d+)')

df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=true)
tmp = df.pivot(index='commsdate', columns='area')

out = reduce(lambda a,b: a.add(b, fill_value=0),
             (tmp[d].shift(freq=f'{reg.search(d).group(1)}d') for d in
              tmp.columns.get_level_values(0).unique())
            ).reset_index().rename_axis(columns=none)
Copy after login

Output:

CommsDate  Sales  Service
0 2024-01-01   43.0     85.0
1 2024-01-02   92.0    147.0
2 2024-01-03  119.0    195.0
3 2024-01-04   63.0    107.0
4 2024-01-05   24.0     46.0
Copy after login

The above is the detailed content of Conditional merge with pandas. For more information, please follow other related articles on the PHP Chinese website!

source:stackoverflow.com
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template