First, to convert a Categorical column to its numerical codes, you can do this more easily with dataframe['c'].cat.codes.
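As a minimal sketch of what .cat.codes returns for a single column (the data and the column name 'c' are made up for illustration): each value is replaced by the integer position of its category in .cat.categories, and a missing value gets the code -1.

```python
import pandas as pd

# Hypothetical example data; 'c' is just an illustrative column name.
dataframe = pd.DataFrame({'c': ['low', 'high', None, 'low']})
dataframe['c'] = dataframe['c'].astype('category')

# Categories inferred by astype('category') are sorted: ['high', 'low'].
categories = list(dataframe['c'].cat.categories)

# Each value becomes the index of its category; the missing value becomes -1.
codes = dataframe['c'].cat.codes.tolist()

print(categories)
print(codes)
```

Because the codes follow the order of the categories, they are only meaningful as an ordering if the categorical itself was created with a meaningful order.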
Further, you can automatically select all columns with a certain dtype in a dataframe using select_dtypes. This way, you can apply the above operation to multiple, automatically selected columns.
First, make an example dataframe:
    In [75]: df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':list('abcab'), 'col3':list('ababb')})

    In [76]: df['col2'] = df['col2'].astype('category')

    In [77]: df['col3'] = df['col3'].astype('category')

    In [78]: df.dtypes
    Out[78]:
    col1       int64
    col2    category
    col3    category
    dtype: object
Then, by using select_dtypes to select the columns and applying .cat.codes to each of them, you get the following result:
    In [80]: cat_columns = df.select_dtypes(['category']).columns

    In [81]: cat_columns
    Out[81]: Index([u'col2', u'col3'], dtype='object')

    In [83]: df[cat_columns] = df[cat_columns].apply(lambda x: x.cat.codes)

    In [84]: df
    Out[84]:
       col1  col2  col3
    0     1     0     0
    1     2     1     1
    2     3     2     0
    3     4     0     1
    4     5     1     1
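For reuse outside an interactive session, the same steps can be wrapped in a small function. This is only a sketch: the name encode_categoricals is hypothetical (not a pandas function), and returning a copy instead of modifying the dataframe in place is a choice made here, not something the steps above require.

```python
import pandas as pd


def encode_categoricals(frame):
    """Return a copy of frame with every category column replaced by its codes.

    Hypothetical helper for illustration; not part of pandas.
    """
    result = frame.copy()
    # Pick up all columns whose dtype is 'category'.
    cat_columns = result.select_dtypes(['category']).columns
    # Replace each selected column by its integer codes.
    result[cat_columns] = result[cat_columns].apply(lambda x: x.cat.codes)
    return result


# Same example data as above.
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                   'col2': list('abcab'),
                   'col3': list('ababb')})
df['col2'] = df['col2'].astype('category')
df['col3'] = df['col3'].astype('category')

encoded = encode_categoricals(df)
print(encoded)
```

Since df itself is left untouched here, df['col2'].cat.categories is still available afterwards if you need to map a code back to its label: position i in the categories corresponds to code i.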