Your approach will work despite the warning, but it’s best not to rely on implicit, unclear behavior.
Solution 1, make the selection of indices in a_list a boolean mask:
df[df.index.isin(a_list) & df.a_col.isnull()]
Solution 2, do it in two steps:
df2 = df.loc[a_list]
df2[df2.a_col.isnull()]
Solution 3, if you want a one-liner, use a trick found here:
df.loc[a_list].query('a_col != a_col')
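To compare the three solutions side by side, here is a minimal sketch on made-up data. The contents of df and a_list are assumptions for illustration only; the names follow the question. Solution 3 relies on the fact that NaN never compares equal to itself, so a_col != a_col is true exactly on the null rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question): two nulls, only one of
# which sits at an index listed in a_list.
df = pd.DataFrame(
    {"a_col": [1.0, np.nan, 3.0, np.nan, 5.0]},
    index=["r0", "r1", "r2", "r3", "r4"],
)
a_list = ["r1", "r2", "r4"]

# Solution 1: both conditions are masks over the full length of df.
res1 = df[df.index.isin(a_list) & df.a_col.isnull()]

# Solution 2: select the rows first, then build the mask on the subset.
df2 = df.loc[a_list]
res2 = df2[df2.a_col.isnull()]

# Solution 3: NaN != NaN, so the query keeps only the null rows.
res3 = df.loc[a_list].query("a_col != a_col")

print(res1)
print(res2)
print(res3)
```

On this data all three select the same single row. Note that Solution 1 returns rows in the order of df, while Solutions 2 and 3 return them in the order of a_list, which matters if a_list is not sorted like the index or contains duplicates.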
The warning comes from the fact that the boolean vector df.a_col.isnull() is the length of df, while df.loc[a_list] is of the length of a_list, i.e. shorter. Therefore, some indices in df.a_col.isnull() are not in df.loc[a_list].
What pandas does is reindex the boolean series on the index of the calling dataframe. In effect, it takes from df.a_col.isnull() the values corresponding to the indices in a_list. This works, but the behavior is implicit and could easily change in the future, which is what the warning is about.
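The implicit step can be spelled out by hand. The sketch below, again on assumed sample data, performs the alignment explicitly with Series.reindex, which is roughly what happens behind the warning. The variable names subset, mask and aligned are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"a_col": [1.0, np.nan, 3.0, np.nan, 5.0]},
    index=["r0", "r1", "r2", "r3", "r4"],
)
a_list = ["r1", "r2", "r4"]

subset = df.loc[a_list]        # as long as a_list
mask = df.a_col.isnull()       # as long as df

# Align the mask to the subset's index explicitly instead of letting
# pandas do it implicitly (and warn about it).
aligned = mask.reindex(subset.index)

result = subset[aligned]
print(len(mask), len(aligned))
print(result)
```

Because the mask passed to subset now has the same index as subset, no reindexing is needed at indexing time, so this form should not trigger the warning.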