The reason you are getting this warning is given in the description of fun.aggregate (see ?dcast):

    "aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified"

So, an aggregation function is needed when there is more than one value for one spot in the wide dataframe.
An explanation based on your data:
When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get:

      Id Task A B
    1  3    1 2 3
    2  3    2 3 0
    3  4    1 3 3
    4  4    2 1 3

This makes sense, because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get this (including a warning message):

    Aggregation function missing: defaulting to length
      Id A B
    1  3 2 2
    2  4 2 2

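To reproduce both calls, here is a minimal sketch. The full df is not shown in this answer, so the data frame below is an assumption, reconstructed from the wide output of the first call; it also assumes the reshape2 package is installed.

```r
library(reshape2)

# Assumed reconstruction of the data, derived from the outputs shown above
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Id + Task + Type identifies every row, so each output cell gets one value
wide_full <- dcast(df, Id + Task ~ Type, value.var = "Freq")

# Id + Type does not, so dcast falls back to length() and prints the message
wide_counts <- dcast(df, Id ~ Type, value.var = "Freq")
```

In wide_counts the cells hold the number of Freq values that landed in each spot, not the Freq values themselves.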
Now, looking back at the top part of your data:

    Id Task Type Freq
     3    1    A    2
     3    1    B    3
     3    2    A    3
     3    2    B    0

You see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while the wide dataframe has room for only one value per Id in each Type column. Therefore dcast wants to aggregate these values into one value. The default aggregation function is length, but you can use other aggregation functions like sum, mean, sd or a custom function by specifying them with fun.aggregate.
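If you want to check in advance whether a formula will need aggregation, one possible approach is to count how many rows fall into each output cell. This is a base R sketch, again using an assumed reconstruction of df based on the outputs in this answer.

```r
# Assumed reconstruction of the data, derived from the outputs in this answer
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Number of Freq values per Id/Type cell; anything above 1 needs aggregating
cell_counts <- table(Id = df$Id, Type = df$Type)

# anyDuplicated() returns 0 when the key columns identify each row uniquely
dup_full  <- anyDuplicated(df[c("Id", "Task", "Type")])
dup_short <- anyDuplicated(df[c("Id", "Type")])
```

Here dup_full is 0, while dup_short is non-zero, which matches the counts of 2 that dcast reports when it defaults to length.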
For example, with fun.aggregate = sum you get:

      Id A B
    1  3 5 3
    2  4 4 6

Now there is no warning because dcast is being told what to do when there is more than one value: return the sum of the values.
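The same pattern works for other aggregation functions. The sketch below assumes reshape2 is installed and uses a reconstruction of df based on the outputs in this answer; the paste-based function is only an illustrative custom aggregator, not something from the question.

```r
library(reshape2)

# Assumed reconstruction of the data, derived from the outputs in this answer
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Sum of the values in each Id/Type cell (the result shown above)
wide_sum <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)

# Mean of the values in each cell
wide_mean <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = mean)

# Illustrative custom function: keep every value, collapsed into one string
wide_all <- dcast(df, Id ~ Type, value.var = "Freq",
                  fun.aggregate = function(x) paste(x, collapse = ", "))
```

Whatever function you pass has to return a single value per cell, which is why the custom function collapses the values into one string.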