Boolean Series key will be reindexed to match DataFrame index

Your approach will work despite the warning, but it’s best not to rely on implicit, unclear behavior.

Solution 1, make the selection of indices in a_list a boolean mask:

df[df.index.isin(a_list) & df.a_col.isnull()]

Solution 2, do it in two steps:

df2 = df.loc[a_list]
df2[df2.a_col.isnull()]

Solution 3, if you want a one-liner, use query with the trick that NaN is never equal to itself, so a_col != a_col is true only where a_col is null:

df.loc[a_list].query('a_col != a_col')
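As a quick sanity check, the three solutions can be compared on a small made-up frame. The data and the contents of a_list below are assumptions for illustration only; the column name a_col follows the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: default index labels 0..4, NaN in a_col at labels 1 and 3.
df = pd.DataFrame({"a_col": [1.0, np.nan, 3.0, np.nan, 5.0]})
a_list = [0, 1, 2]  # hypothetical index labels to keep

# Solution 1: one boolean mask aligned on the full frame.
s1 = df[df.index.isin(a_list) & df.a_col.isnull()]

# Solution 2: select the rows first, then build the mask from the subset.
df2 = df.loc[a_list]
s2 = df2[df2.a_col.isnull()]

# Solution 3: NaN != NaN, so the query keeps only the null rows.
s3 = df.loc[a_list].query("a_col != a_col")

print(s1.index.tolist(), s2.index.tolist(), s3.index.tolist())
```

One difference to keep in mind: Solution 1 returns rows in the order of df, while Solutions 2 and 3 return them in the order of a_list, so the results can differ in ordering if a_list is not sorted the same way as the index.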

The warning arises because the boolean Series df.a_col.isnull() has the length of df, while df.loc[a_list] has the length of a_list, i.e. it is shorter. The mask therefore contains index labels that are not present in df.loc[a_list].

What pandas does is reindex the boolean series on the index of the calling dataframe. In effect, it gets from df.a_col.isnull() the values corresponding to the indices in a_list. This works, but the behavior is implicit, and could easily change in the future, so that’s what the warning is about.
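The reindexing described above can be made explicit. This is a minimal sketch on made-up data, assuming a pandas version that still emits this warning; the names sub, mask, implicit and explicit are illustrative only.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical data: NaN in a_col at index labels 1 and 3.
df = pd.DataFrame({"a_col": [1.0, np.nan, 3.0, np.nan, 5.0]})
a_list = [0, 1, 2]

sub = df.loc[a_list]       # 3 rows
mask = df.a_col.isnull()   # 5 values, indexed like df

# Implicit: pandas aligns the longer mask to sub's index and warns about it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = sub[mask]
for w in caught:
    print(w.category.__name__, w.message)

# Explicit: do the same alignment yourself, so no warning is needed.
explicit = sub[mask.reindex(sub.index)]

print(implicit.index.tolist(), explicit.index.tolist())
```

Writing the reindex step out keeps the result independent of how pandas chooses to handle mismatched boolean keys in the future.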
