You can use numpy.where:
    import numpy as np

    def my_fun(var1, var2, var3):
        # keep the difference where it is positive, otherwise use 0
        df[var3] = np.where((df[var1] - df[var2]) > 0, df[var1] - df[var2], 0)
        return df

    df1 = my_fun('age1', 'age2', 'diff')
    print(df1)

       age1  age2  diff
    0    23    10    13
    1    45    20    25
    2    21    50     0
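For a version that runs on its own, here is a minimal sketch. The sample DataFrame is reconstructed from the printed output, and passing the frame in as a parameter (via `add_diff`, a hypothetical name) instead of reading a global `df` is an assumption of mine, not part of the original function.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (assumption).
df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})

def add_diff(frame, var1, var2, var3):
    # Hypothetical helper: same logic as my_fun, but the frame is an argument.
    delta = frame[var1] - frame[var2]
    # Keep positive differences, replace everything else with 0.
    frame[var3] = np.where(delta > 0, delta, 0)
    return frame

df1 = add_diff(df, 'age1', 'age2', 'diff')
print(df1)
```

Computing the difference once and reusing it avoids evaluating the subtraction twice, which is only a small tidy-up over the inline form.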
The error is explained in more detail here.
A slower solution uses apply, which needs axis=1 to process the data row by row:
    def my_fun(x, var1, var2, var3):
        print(x)
        if (x[var1] - x[var2]) > 0:
            x[var3] = x[var1] - x[var2]
        else:
            x[var3] = 0
        return x

    print(df.apply(lambda x: my_fun(x, 'age1', 'age2', 'diff'), axis=1))

       age1  age2  diff
    0    23    10    13
    1    45    20    25
    2    21    50     0
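As a variation on the row-wise approach, the function can return a single value per row and let pandas assemble the new column. This is only a sketch: `row_diff` is a hypothetical helper name, and the sample data is reconstructed from the printed output.

```python
import pandas as pd

# Sample data reconstructed from the printed output (assumption).
df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})

def row_diff(row, var1, var2):
    # Hypothetical helper: called once per row because of axis=1.
    delta = row[var1] - row[var2]
    return delta if delta > 0 else 0

# args passes the extra positional arguments through to row_diff.
df['diff'] = df.apply(row_diff, axis=1, args=('age1', 'age2'))
print(df)
```

Because each row is a Series holding scalars, the plain `if`/`else` comparison works here; it is still a Python-level loop, so expect it to be slower than the vectorized version on large frames.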
It is also possible to use loc, but note that it assigns into the DataFrame in place, so existing data can sometimes be overwritten:
    def my_fun(x, var1, var2, var3):
        print(x)
        mask = (x[var1] - x[var2]) > 0
        x.loc[mask, var3] = x[var1] - x[var2]
        x.loc[~mask, var3] = 0
        return x

    print(my_fun(df, 'age1', 'age2', 'diff'))

       age1  age2  diff
    0    23    10  13.0
    1    45    20  25.0
    2    21    50   0.0
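If overwriting the input is a concern, one possible way around it is to work on a copy. The sketch below assumes that is acceptable for your data size; `add_diff_loc` is a hypothetical name and the sample data is reconstructed from the printed output.

```python
import pandas as pd

# Sample data reconstructed from the printed output (assumption).
df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})

def add_diff_loc(frame, var1, var2, var3):
    # Hypothetical helper: copy first so the caller's frame is left untouched.
    out = frame.copy()
    delta = out[var1] - out[var2]
    mask = delta > 0
    out[var3] = 0                      # default for non-positive differences
    out.loc[mask, var3] = delta[mask]  # fill in the positive differences
    return out

result = add_diff_loc(df, 'age1', 'age2', 'diff')
print(result)
print(df.columns.tolist())  # the original frame has no 'diff' column
```

Setting the default before the masked assignment means every row has a value from the start, rather than relying on a second assignment with the inverted mask.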