Count unique values per group with Pandas

You need nunique:

df = df.groupby('domain')['ID'].nunique()

print (df)
domain
'facebook.com'    1
'google.com'      1
'twitter.com'     2
'vk.com'          3
Name: ID, dtype: int64
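
To try this end to end, here is a minimal sketch. The sample DataFrame is an assumption (the original data is not shown here); the rows were chosen only so that the counts agree with the output above.

```python
import pandas as pd

# Hypothetical sample data, picked so the per-domain counts match the output above.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": ["'vk.com'", "'vk.com'", "'twitter.com'", "'vk.com'",
               "'facebook.com'", "'vk.com'", "'google.com'",
               "'twitter.com'", "'vk.com'"],
})

# Number of distinct IDs seen for each domain.
counts = df.groupby("domain")["ID"].nunique()
print(counts)
```

The result is stored in a separate name (`counts`) so that `df` is still the original DataFrame for later steps.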

If you need to strip ' characters (starting again from the original DataFrame, not the Series assigned above):

df = df.ID.groupby([df.domain.str.strip("'")]).nunique()
print (df)
domain
facebook.com    1
google.com      1
twitter.com     2
vk.com          3
Name: ID, dtype: int64

Or as Jon Clements commented:

df.groupby(df.domain.str.strip("'"))['ID'].nunique()
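
As a quick check, the two spellings above should give the same counts. This is only a sketch on hypothetical sample data (the rows are an assumption, not the original poster's data):

```python
import pandas as pd

# Hypothetical sample data with quoted domain strings.
df = pd.DataFrame({
    "ID": [123, 123, 456, 456, 789],
    "domain": ["'vk.com'", "'twitter.com'", "'vk.com'",
               "'google.com'", "'vk.com'"],
})

# Strip the surrounding quotes once and reuse the cleaned key.
clean_domain = df["domain"].str.strip("'")

# Variant 1: group the ID column by the cleaned key.
via_series = df["ID"].groupby([clean_domain]).nunique()

# Variant 2: group the DataFrame by the cleaned key, then select ID.
via_frame = df.groupby(clean_domain)["ID"].nunique()

print(via_series.to_dict() == via_frame.to_dict())
```

Neither variant modifies `df`; the cleaned key is only used for grouping.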

You can retain the column name like this:

df = df.groupby(by='domain', as_index=False).agg({'ID': pd.Series.nunique})
print(df)
    domain  ID
0       fb   1
1      ggl   1
2  twitter   2
3       vk   3

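The difference is that groupby(...)['ID'].nunique() returns a Series indexed by domain, while groupby(..., as_index=False).agg(...) returns a DataFrame that keeps domain as a regular column.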
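
A small sketch of that difference, again on hypothetical data (the rows are an assumption chosen to match the fb/ggl/twitter/vk output above). It also shows `reset_index()` as one way to turn the Series form into a DataFrame:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": ["vk", "vk", "twitter", "vk", "fb", "vk", "ggl", "twitter", "vk"],
})

# Series: index is 'domain', values are the unique-ID counts.
as_series = df.groupby("domain")["ID"].nunique()

# DataFrame: 'domain' stays a regular column next to 'ID'.
as_frame = df.groupby("domain", as_index=False).agg({"ID": "nunique"})

# The Series can be converted to the same shape afterwards.
also_frame = as_series.reset_index()

print(type(as_series).__name__, type(as_frame).__name__)
print(as_frame)
```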
