Extracting records at a specified position after groupby grouping in Python

不言
Release: 2018-04-20 13:45:30

This article shows how to extract the record at a specified position within each group after a pandas groupby. I hope it serves as a useful reference.

When analyzing or modeling data, the first step is to process the data and extract the information we need. This article introduces some uses of groupby that make data processing more convenient.

When we use groupby to extract information, we usually compute statistics (max, min, var, etc.) over each group. But what if we want the second record and the second-to-last record of each group? The first and last methods return the first and last (non-null) values of each group. For other positions pandas provides nth, but it leaves out groups that are too short to have a record at that position. Writing the selection ourselves with apply lets us decide what to return for such groups. The approach is described below.
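For comparison, here is a minimal sketch of the built-in first, last and nth methods. The sample data is invented for illustration, and actionTime is assumed to be an integer timestamp. Note that the index of the nth result differs between pandas versions (group keys in older releases, original row labels from pandas 2.0 on), so only the values are inspected here.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [100, 105, 130, 200, 300, 310],
})

times = action.groupby("userid")["actionTime"]

first_times = times.first()   # first value of each group
last_times = times.last()     # last value of each group
second_times = times.nth(1)   # second record; user 2 has only one row and is left out

print(first_times.tolist())
print(last_times.tolist())
print(second_times.tolist())
```

User 2 does not appear in the nth result at all, which is the case the apply-based approach handles by returning NaN instead.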

1) Data introduction

The action table has three columns: userid, actionType and actionTime, which represent the user id, the type of user behavior, and the time at which the behavior occurred, respectively.

[Figure: sample rows of the action table]
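To make the snippets in this article easy to try, a small table with the same three columns can be built by hand. This is only a sketch: the rows are invented, and storing actionTime as an integer timestamp is an assumption, since the original data is not available.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
# actionTime is assumed to be an integer timestamp.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [100, 105, 130, 200, 300, 310],
})

print(action)
```

User 2 has a single record on purpose, so that the short-group case discussed later can be reproduced.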

2) Grouping operation

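# action is a pandas DataFrame with the columns userid, actionType and actionTime
a = action.groupby('userid')                # group whole rows by user
b = action.groupby('userid')['actionTime']  # group only the actionTime column by user
print(type(a))
print(type(b))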

After grouping, we can see that a and b are a DataFrameGroupBy object and a SeriesGroupBy object, respectively.
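As a small illustration of the two grouped objects, the sketch below prints their type names and the number of records per user. The sample data is invented, and the variable sizes is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [100, 105, 130, 200, 300, 310],
})

a = action.groupby("userid")
b = action.groupby("userid")["actionTime"]

print(type(a).__name__)  # DataFrameGroupBy
print(type(b).__name__)  # SeriesGroupBy

# Number of records per user; groups with one record need special care
# when a later step asks for the second record.
sizes = b.size()
print(sizes)
```

Checking group sizes first shows how many users have too few records for the position we want.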

3) Retrieving values

① The second / second-to-last operation time of each user

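import numpy as np
# Second and second-to-last actionTime per user; NaN for users with fewer than two records
action.groupby('userid')['actionTime'].apply(lambda i: i.iloc[1] if len(i) > 1 else np.nan)
action.groupby('userid')['actionTime'].apply(lambda i: i.iloc[-2] if len(i) > 1 else np.nan)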
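The positions used above refer to the row order inside each group, so "second operation time" only means second in time if the rows are already ordered by actionTime. The following sketch assumes they may not be and sorts first. The sample data and the variable names ordered, second and second_last are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, deliberately not sorted by time.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [130, 100, 105, 200, 310, 300],
})

# Sort so that positions within each group follow chronological order.
ordered = action.sort_values(["userid", "actionTime"])
grouped = ordered.groupby("userid")["actionTime"]

# Second and second-to-last time per user; NaN when a user has fewer than two records.
second = grouped.apply(lambda s: s.iloc[1] if len(s) > 1 else np.nan)
second_last = grouped.apply(lambda s: s.iloc[-2] if len(s) > 1 else np.nan)

print(second)
print(second_last)
```

Because one group yields NaN, the result column is stored as floating point even though the input times are integers.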

② The second / second-to-last operation time of a given behavior type for each user

# Keep only rows with actionType == 2 before grouping (np is numpy, imported as np)
action[action['actionType'] == 2].groupby('userid')['actionTime'].apply(lambda i: i.iloc[1] if len(i) > 1 else np.nan)
action[action['actionType'] == 2].groupby('userid')['actionTime'].apply(lambda i: i.iloc[-2] if len(i) > 1 else np.nan)
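One detail of filtering before grouping is worth seeing on concrete data: users with no record of the chosen type do not appear in the result at all, which is different from appearing with NaN. A minimal sketch, using invented sample data and the hypothetical names type2 and second_type2:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [100, 105, 130, 200, 300, 310],
})

# Keep only behavior type 2, then group the remaining rows by user.
type2 = action[action["actionType"] == 2]
second_type2 = type2.groupby("userid")["actionTime"].apply(
    lambda s: s.iloc[1] if len(s) > 1 else np.nan
)

print(second_type2)
```

Here user 2 has no type 2 record, so the result contains only users 1 and 3.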

PS: Some users may have only one record, in which case indexing the second or second-to-last position directly raises an IndexError. That is why the code checks the group length with if first.
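The error mentioned above can be reproduced on a single one-element Series, which is what apply receives for a user with one record. This is a small sketch with an invented value; the flag raised is a name introduced here for illustration.

```python
import pandas as pd

# A group with a single record, as seen by the function passed to apply.
single = pd.Series([200], name="actionTime")

raised = False
try:
    single.iloc[1]  # there is no second element
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

The same applies to iloc[-2], since a one-element Series has no second-to-last element either.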

In this way we can extract the record at any position within each group.
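The length check can be generalized so that any position, positive or negative, is handled by one function. The helpers value_at and nth_per_group below are hypothetical names introduced for this sketch, not pandas functions, and the sample data is invented.

```python
import numpy as np
import pandas as pd


def value_at(series, pos):
    """Return the value at integer position pos, or NaN if the series is too short."""
    n = len(series)
    if -n <= pos < n:
        return series.iloc[pos]
    return np.nan


def nth_per_group(df, key, column, pos):
    """For each group of df[key], take df[column] at position pos (NaN if missing)."""
    return df.groupby(key)[column].apply(lambda s: value_at(s, pos))


# Hypothetical sample data; the values are invented for illustration only.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 3, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [100, 105, 130, 200, 300, 310],
})

third = nth_per_group(action, "userid", "actionTime", 2)
third_last = nth_per_group(action, "userid", "actionTime", -3)
print(third)
print(third_last)
```

The range test -n <= pos < n covers both directions, so the same call works for the third record and the third-to-last record without separate conditions.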
