python pandas remove duplicate columns

Here’s a one-line solution to remove columns based on duplicate column names:

df = df.loc[:,~df.columns.duplicated()]

How it works:

Suppose the columns of the data frame are ['alpha','beta','alpha']

df.columns.duplicated() returns a boolean array with one True or False per column. False means the column name has not appeared before that position; True means the same name already occurred earlier. For the example above, the returned value would be [False,False,True].

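Pandas allows indexing with boolean values, selecting only the positions that are True. Since we want to keep the non-duplicated columns, the boolean array above has to be inverted with the ~ operator (i.e. ~[False,False,True] gives [True,True,False]). Note that ~ works here because duplicated() returns a NumPy array, not a plain Python list.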

Finally, df.loc[:,[True,True,False]] selects only the non-duplicated columns using the aforementioned indexing capability.
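
The steps above can be traced with a small sketch. The data frame contents and the names dup_mask, keep_mask and deduped are made up for illustration; only the column names come from the example above.

```python
import pandas as pd

# Hypothetical frame with a repeated column name, as in the example above.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["alpha", "beta", "alpha"])

# Step 1: flag every column whose name already appeared earlier.
dup_mask = df.columns.duplicated()

# Step 2: invert the mask so True marks the columns to keep.
keep_mask = ~dup_mask

# Step 3: select all rows and only the columns where the mask is True.
deduped = df.loc[:, keep_mask]

print(dup_mask.tolist())      # [False, False, True]
print(keep_mask.tolist())     # [True, True, False]
print(list(deduped.columns))  # ['alpha', 'beta']
```

The first occurrence of each name is the one that survives, so the remaining alpha column holds the values 1 and 4.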

Note: the above only checks column names, not column values.
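
To make the note concrete, here is a small sketch with made-up data. It assumes a frame where one column repeats a name but holds different values, and another column has a unique name but identical values.

```python
import pandas as pd

# Hypothetical data: "beta" has the same values as the first "alpha",
# while the second "alpha" has different values but a repeated name.
df = pd.DataFrame(
    [[1, 1, 9], [2, 2, 8]],
    columns=["alpha", "beta", "alpha"],
)

result = df.loc[:, ~df.columns.duplicated()]

# "beta" is kept because its name is unique, even though its values
# duplicate those of "alpha". The second "alpha" is dropped because of
# its name, even though its values (9, 8) are not duplicates.
print(result)
```

So if two columns share a name but carry different data, this approach silently discards the later one.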
