SummarySE (Rmisc package) to produce a barplot with error bars (ggplot2)

Neither of the code variants you posted works, because both call summarySE() incorrectly:

  • Version 1: You use Family as the measurement variable, which means you ask the function for the mean, standard deviation, etc. of Family.
  • Version 2: You correctly group by Family, but now you supply many measurement variables. This does not work because summarySE() expects a single measurement variable. Try to imagine what the output table would look like with several measurement variables and you will see why: you would need 13 columns for sd, 13 columns for ci, and so on.

The problem with your data is that “Swimming”, “Not.Swimming”, “Running”, etc. are actually values, not variables. (Explaining this in detail is too much for this answer; see here if you need more information.) So you need to convert your data into so-called long format:

library(tidyr)
long_behaviours <- gather(behaviours, variable, value, -Family)
long_behaviours[c(1, 120, 313, 730), ]
##     Family     variable       value
## 1       v4     Swimming -0.48055680
## 120     G8 Not.Swimming -0.05086028
## 313     G8  Not.Running -0.07139534
## 730     v4  Not.Hunting -0.22489721

As you can see from the few lines that I “randomly” picked from the resulting data frame, there is now a column that gives you the predictor and a single column with the numeric value. Now, you can use value as the measurement variable in summarySE and group by the other two:

library(Rmisc)
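# groupvars is the full argument name; "groupvar" only works through R's partial argument matching
sum_behaviours <- summarySE(long_behaviours, measurevar = "value",
                            groupvars = c("Family", "variable"), na.rm = TRUE)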
head(sum_behaviours)
##   Family     variable  N        value         sd          se         ci
## 1     G8     Fighting 50  0.157977831 0.58253445 0.082382813 0.16555446
## 2     G8     Grooming 50  0.003784713 0.06611479 0.009350043 0.01878961
## 3     G8      Hunting 50  0.157977831 0.58253445 0.082382813 0.16555446
## 4     G8 Not.Fighting 50 -0.007098363 0.33806726 0.047809930 0.09607765
## 5     G8 Not.Grooming 50  0.202045803 1.30151612 0.184062175 0.36988679
## 6     G8  Not.Hunting 50 -0.007098363 0.33806726 0.047809930 0.09607765
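
To make the columns of this output more concrete, here is a minimal base R sketch of what each one represents for a single Family/variable group. The numbers in x are made up for illustration, and the formulas reflect my understanding of how summarySE() computes se and ci (with the default 95% confidence level), so treat it as a sketch rather than the package's exact code:

```r
# Toy measurements for one Family/variable group (made-up numbers)
x <- c(0.2, -0.1, 0.4, 0.0, 0.3)

n  <- sum(!is.na(x))              # N: number of non-missing values
m  <- mean(x, na.rm = TRUE)       # mean, reported in the column named after measurevar
s  <- sd(x, na.rm = TRUE)         # sd: sample standard deviation
se <- s / sqrt(n)                 # se: standard error of the mean
ci <- se * qt(0.975, df = n - 1)  # ci: half-width of the 95% confidence interval

c(N = n, mean = m, sd = s, se = se, ci = ci)
```

Note that ci is a half-width, which is why error bars are drawn from value - ci to value + ci.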

You now have a table with the mean, standard deviation, standard error, and confidence interval for each combination of Family and variable. This is the data you need to produce the plot according to the example from the R-Cookbook:

library(ggplot2)
ggplot(sum_behaviours, aes(x = variable, y = value, fill = Family)) + 
  geom_bar(position=position_dodge(), stat="identity") +
  geom_errorbar(aes(ymin = value - ci, ymax = value + ci),
                width=.2, position=position_dodge(.9)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
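
Since your original data is not available here, the following is a self-contained sketch of the whole pipeline on a small made-up data frame. The object toy and its contents are hypothetical stand-ins for behaviours. It also uses pivot_longer(), which the tidyr documentation recommends over the superseded gather(), and geom_col(), which is equivalent to geom_bar(stat = "identity"):

```r
library(tidyr)
library(Rmisc)
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for `behaviours`: one column per behaviour (wide format)
toy <- data.frame(
  Family       = rep(c("G8", "v4"), each = 10),
  Swimming     = rnorm(20),
  Not.Swimming = rnorm(20)
)

# Wide -> long: one row per Family/behaviour/measurement
long_toy <- pivot_longer(toy, cols = -Family,
                         names_to = "variable", values_to = "value")

# One summary row per Family/variable combination
sum_toy <- summarySE(as.data.frame(long_toy), measurevar = "value",
                     groupvars = c("Family", "variable"), na.rm = TRUE)

# Dodged bars with 95% confidence interval error bars
p <- ggplot(sum_toy, aes(x = variable, y = value, fill = Family)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = value - ci, ymax = value + ci),
                width = 0.2, position = position_dodge(0.9))
print(p)
```

The as.data.frame() call is only a precaution, because pivot_longer() returns a tibble and summarySE() was written with plain data frames in mind.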
