You need nunique:
df = df.groupby('domain')['ID'].nunique()
print(df)

domain
'facebook.com'    1
'google.com'      1
'twitter.com'     2
'vk.com'          3
Name: ID, dtype: int64
If you need to strip the ' characters:
df = df.ID.groupby([df.domain.str.strip("'")]).nunique()
print(df)

domain
facebook.com    1
google.com      1
twitter.com     2
vk.com          3
Name: ID, dtype: int64
Or as Jon Clements commented:
df.groupby(df.domain.str.strip("'"))['ID'].nunique()
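For anyone who wants to run this end to end, here is a minimal sketch. The input frame is hypothetical (the answer does not show the original data); the rows were chosen only so that the counts match the output printed above.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original post.
# The domain values carry literal single quotes, as in the output above.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "'vk.com'", "'vk.com'", "'twitter.com'", "'vk.com'",
        "'facebook.com'", "'vk.com'", "'google.com'",
        "'twitter.com'", "'vk.com'",
    ],
})

# Group by the domain with the surrounding quotes stripped,
# then count the distinct IDs inside each group.
counts = df.groupby(df["domain"].str.strip("'"))["ID"].nunique()
print(counts)
```

Repeated (ID, domain) pairs are counted once, which is the point of nunique as opposed to count or size.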
You can retain the column name like this:
df = df.groupby(by='domain', as_index=False).agg({'ID': pd.Series.nunique})
print(df)

    domain  ID
0       fb   1
1      ggl   1
2  twitter   2
3       vk   3
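The difference is that calling nunique() on the selected column returns a Series indexed by domain, while agg() with as_index=False returns a DataFrame that keeps domain as a regular column.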
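The sketch below shows the two return types side by side. The data is hypothetical, and the string alias "nunique" is used in place of pd.Series.nunique; both should give the same result here. The reset_index() variant is an alternative not mentioned in the answer.

```python
import pandas as pd

# Hypothetical sample data, only for illustrating the return types.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3],
    "domain": ["vk", "vk", "vk", "fb", "fb"],
})

# Selecting the column and calling nunique() yields a Series
# whose index holds the group keys.
as_series = df.groupby("domain")["ID"].nunique()

# as_index=False plus agg() yields a DataFrame with 'domain'
# kept as an ordinary column.
as_frame = df.groupby("domain", as_index=False).agg({"ID": "nunique"})

# Alternative (an addition, not from the answer): turn the Series
# into a DataFrame afterwards.
also_frame = as_series.reset_index()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame)
```

Which form is more convenient depends on what comes next: the Series is handy for lookups by domain, while the DataFrame is easier to merge or export.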