pandas groupby sort within groups

What you want to do is itself another groupby, applied to the result of the first one: sort each group and take its first three elements.

Starting from the result of the first groupby:

In [60]: df_agg = df.groupby(['job','source']).agg({'count':'sum'})

We group by the first level of the index:

In [63]: g = df_agg['count'].groupby('job', group_keys=False)

Then we sort each group in descending order and take the first three elements:

In [64]: res = g.apply(lambda x: x.sort_values(ascending=False).head(3))

However, there is a shortcut for this, nlargest:

In [65]: g.nlargest(3)
Out[65]:
job     source
market  A         5
        D         4
        B         3
sales   E         7
        C         6
        B         4
dtype: int64

So in one go, this looks like:

df_agg['count'].groupby('job', group_keys=False).nlargest(3)
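
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The input frame `df` is not shown above, so the sample data below is an assumption, chosen only so that the result matches the output printed earlier; the variable name `top3` is likewise just illustrative.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not taken from the text above),
# picked so the top three counts per job match the output shown earlier.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# First groupby: total count per (job, source) pair.
df_agg = df.groupby(["job", "source"]).agg({"count": "sum"})

# Second groupby on the 'job' index level: keep the three largest counts
# per job. group_keys=False avoids adding 'job' to the index a second time.
top3 = df_agg["count"].groupby(level="job", group_keys=False).nlargest(3)

print(top3)
```

Grouping with `level="job"` is used here instead of passing `'job'` positionally; both refer to the same index level, but the keyword makes it explicit that the key is an index level rather than a column.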
