Principal Components Analysis: Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric

You can convert a character vector to numeric values by going via a factor: each unique value then gets a unique integer code. In this example there are four distinct values, so the codes are 1 to 4, assigned in the sorted (here alphabetical) order of the values:

> d = data.frame(country=c("foo","bar","baz","qux"),x=runif(4),y=runif(4))
> d
  country          x         y
1     foo 0.84435112 0.7022875
2     bar 0.01343424 0.5019794
3     baz 0.09815888 0.5832612
4     qux 0.18397525 0.8049514
> d$country = as.numeric(as.factor(d$country))
> d
  country          x         y
1       3 0.84435112 0.7022875
2       1 0.01343424 0.5019794
3       2 0.09815888 0.5832612
4       4 0.18397525 0.8049514
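
If you want to check which code went to which label before converting, you can look at the factor levels. This is only a sketch: the vector `country` and the lookup table `mapping` are names made up for illustration, mirroring the example data above.

```r
# Hypothetical example data, mirroring the country column above
country <- c("foo", "bar", "baz", "qux")
f <- factor(country)

# Levels are stored in sorted order; the integer code of a value
# is its position in this vector
levels(f)

# Lookup table: each label next to the code as.numeric() will return for it
mapping <- data.frame(label = levels(f), code = seq_along(levels(f)))
mapping
```

Keeping a table like `mapping` around lets you translate the codes back to labels later.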

You can then run prcomp:

> prcomp(d)
Standard deviations:
[1] 1.308665216 0.339983614 0.009141194

Rotation:
               PC1          PC2          PC3
country -0.9858920  0.132948161 -0.101694168
x       -0.1331795 -0.991081523 -0.004541179
y       -0.1013910  0.009066471  0.994805345

Whether this makes sense for your application is up to you: the integer codes are arbitrary labels, but prcomp will treat them as ordinary numbers. Maybe you just want to drop the first column, prcomp(d[,-1]), and work with the numeric data, which seems to be what the other “answers” are trying to achieve.
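
If you go the drop-the-column route, here is a minimal sketch that selects the numeric columns by type rather than by position, assuming a data frame shaped like `d` above. The names `num_cols`, `pca` and `scores` are just for illustration, and `set.seed` is there only to make the random example reproducible.

```r
set.seed(1)  # only so the random example data is reproducible
d <- data.frame(country = c("foo", "bar", "baz", "qux"),
                x = runif(4), y = runif(4))

# Logical vector marking the columns that are already numeric
num_cols <- vapply(d, is.numeric, logical(1))

# Run the PCA on the numeric columns only
pca <- prcomp(d[, num_cols, drop = FALSE])

# Put the labels back next to the scores so rows can still be identified
scores <- data.frame(country = d$country, pca$x)
scores
```

This way the country names are kept as row labels for the results instead of being fed into the PCA as numbers.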
