R – longer object length is not a multiple of shorter object length

Look at these lines from your output:

#predict with test data
RFestimated <- predict(model1, dataTest)

[1] 118.7794
> length(RFestimated)
[1] 8352
> length(model1$y)
[1] 33405

Their lengths differ: 8352 predictions against 33405 values in model1$y. How is an element-wise subtraction supposed to work on those? Think about what you are asking R to do:

a <- c(1,2,3)
b <- c(4,5)
a-b
[1] -3 -3 -1
Warning message:
In a - b : longer object length is not a multiple of shorter object length
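
The warning comes from R's recycling rule: the shorter vector is repeated until it matches the longer one. Here is a minimal sketch of that rule; the vectors are made up for illustration and are not taken from your data.

```r
# Recycling is silent when the longer length is a multiple of the shorter one.
a <- c(1, 2, 3, 4)
b <- c(4, 5)
res_even <- a - b   # b is reused as 4, 5, 4, 5

# With lengths 3 and 2, b is cut off mid-cycle, so R warns.
# Capture the warning text instead of printing it.
msg <- tryCatch(
  c(1, 2, 3) - b,
  warning = function(w) conditionMessage(w)
)
```

In your case 33405 is not a multiple of 8352, which is why you get the warning. Had it been a multiple, R would have recycled silently and returned an equally meaningless result.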

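You need to evaluate the RMSE either on the training data or on the test data, but you are mixing the two: the predictions come from dataTest, while model1$y holds the responses the model was trained on. That is, either this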

RFestimated <- predict(model1, dataTrain)
qqnorm((RFestimated - model1$y)/sd(RFestimated-model1$y))

would work, or this:

RFestimated <- predict(model1, dataTest)
qqnorm((RFestimated - dataTest$y)/sd(RFestimated-dataTest$y))

The first option tells you how well the model fits the sample it was trained on; the second gives you its performance on the test data (assuming the response column in dataTest is named y).
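
As a sketch of the two evaluations side by side, the snippet below uses a plain lm fit on the built-in mtcars data instead of your random forest, so it runs without extra packages. The rmse helper, the 22/10 split, and the variable names are assumptions for illustration and not part of your code. The length check turns the recycling warning into a hard error.

```r
# Hypothetical helper: RMSE that refuses vectors of different lengths.
rmse <- function(predicted, observed) {
  stopifnot(length(predicted) == length(observed))
  sqrt(mean((predicted - observed)^2))
}

set.seed(1)
idx <- sample(nrow(mtcars), 22)   # arbitrary train/test split (assumption)
train <- mtcars[idx, ]
test <- mtcars[-idx, ]

fit <- lm(mpg ~ wt + hp, data = train)   # stand-in for model1

rmse_train <- rmse(predict(fit, train), train$mpg)   # in-sample fit
rmse_test <- rmse(predict(fit, test), test$mpg)      # held-out performance

# Mixing the two sets now fails instead of merely warning.
mixed <- tryCatch(
  rmse(predict(fit, test), train$mpg),
  error = function(e) "length mismatch"
)
```

Pairing each set of predictions with the responses from the same data frame is what keeps the lengths aligned.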
