How to count the number of observations in R like Stata command count

The with function will let you use shorthand column references and sum will count TRUE results from the expression(s). As @mnel pointed out, you can also do: The benefit of that is that you can do: And, the behaviour matches Stata’s count almost exactly (syntax notwithstanding).
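As a minimal sketch of the with()/sum() idea described above, the following counts rows that match a condition. The data frame `df` and its columns `age` and `sex` are made-up illustrative names, not taken from the original answer.

```r
# Hypothetical example data; `df`, `age`, and `sex` are illustrative names.
df <- data.frame(
  age = c(23, 35, 41, NA, 29),
  sex = c("F", "M", "F", "M", "F")
)

# with() lets the expression refer to columns by name;
# sum() counts the TRUE results. na.rm = TRUE ignores comparisons
# that evaluate to NA because of missing values.
n_women_under_40 <- with(df, sum(sex == "F" & age < 40, na.rm = TRUE))

# Counting all observations, with no condition, is just the row count.
n_total <- nrow(df)
```

Whether missing values should be counted or ignored depends on the analysis, so the `na.rm = TRUE` choice here is an assumption.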

Categories R

Adding a regression line on a ggplot

In general, to supply your own formula you should use the arguments x and y, which correspond to the values you provided in ggplot(); in this case x will be interpreted as x.plot and y as y.plot. You can find more information about smoothing methods and formulas on the help page for stat_smooth(), the default stat used by geom_smooth(). If you are using the same x and …
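A short sketch of what this might look like, assuming the ggplot2 package is installed. The data frame `dat` is simulated for illustration; only the column names x.plot and y.plot come from the text above.

```r
library(ggplot2)

# Simulated data for illustration only.
set.seed(1)
dat <- data.frame(x.plot = 1:20)
dat$y.plot <- 2 * dat$x.plot + rnorm(20)

# The formula is written in terms of the aesthetics x and y,
# which map to x.plot and y.plot through aes().
p <- ggplot(dat, aes(x = x.plot, y = y.plot)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# print(p) would draw the scatter plot with a fitted straight line.
```

Setting `se = FALSE` only hides the confidence band; it is a presentation choice rather than part of the technique.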


duplicate ‘row.names’ are not allowed error

Then tell read.table not to use row.names: and now your rows will simply be numbered. Also look at read.csv, a wrapper for read.table that already sets the sep="," and header=TRUE arguments, so that your call simplifies to
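The sketch below reproduces one common cause of this error and one way around it, under the assumption that the file has a trailing comma on each data row. The inline `csv_text` stands in for a real file and is purely illustrative.

```r
# Illustrative data: each data row has a trailing comma, so it has one more
# field than the header. read.table then treats the first column as row names.
csv_text <- "a,b,c\n1,2,3,\n1,5,6,\n"

# The first column contains the value 1 twice, so this call fails with
# "duplicate 'row.names' are not allowed". The error is captured, not raised.
err <- tryCatch(
  read.table(text = csv_text, header = TRUE, sep = ","),
  error = function(e) e
)

# row.names = NULL asks for automatic row numbering instead.
fixed <- read.table(text = csv_text, header = TRUE, sep = ",", row.names = NULL)

# read.csv already sets header = TRUE and sep = ",".
fixed_csv <- read.csv(text = csv_text, row.names = NULL)
```

With input shaped like this, the column names may end up shifted by one position, so it is worth inspecting `names(fixed)` after reading.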


Remove rows with all or some NAs (missing values) in data.frame

Also check complete.cases: na.omit is nicer for just removing all NAs. complete.cases allows partial selection by including only certain columns of the data frame: Your solution can’t work. If you insist on using is.na, then you have to do something like: but using complete.cases is much clearer, and faster.
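The three approaches mentioned above can be sketched as follows. The data frame `df` and its columns are invented for illustration, and the is.na variant is just one possible way to write such a filter.

```r
# Illustrative data with missing values in x and y.
df <- data.frame(
  id = 1:4,
  x  = c(1, NA, 3, NA),
  y  = c("a", "b", NA, NA)
)

# Drop every row that contains at least one NA.
all_complete <- na.omit(df)

# Equivalent selection using complete.cases on the whole data frame.
same_rows <- df[complete.cases(df), ]

# Partial selection: only require id and x to be non-missing.
x_complete <- df[complete.cases(df[, c("id", "x")]), ]

# An is.na-based variant: keep rows where x and y are not both NA.
check_cols <- c("x", "y")
not_all_na <- df[rowSums(is.na(df[, check_cols])) < length(check_cols), ]
```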


Convert a list to a data frame

Since R 4.0.0, the stringsAsFactors parameter of data.frame() effectively defaults to FALSE. Assuming your list of lists is called l: In R versions before 4.0.0, the above converts all character columns to factors; to avoid this, you can add a parameter to the data.frame() call:
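One common way to do this conversion is sketched below. It assumes every inner list has the same length and holds values of a single type; the contents of `l` are made up for illustration.

```r
# Illustrative list of lists; each inner list becomes one row.
l <- list(
  list("a", "b", "c"),
  list("d", "e", "f")
)

# Flatten the list, reshape it row by row into a matrix, then convert.
# stringsAsFactors = FALSE keeps character columns as character
# even on R versions older than 4.0.0.
df <- data.frame(
  matrix(unlist(l), nrow = length(l), byrow = TRUE),
  stringsAsFactors = FALSE
)
```

Because unlist() coerces everything to a single type, this approach is less suitable when the inner lists mix numbers and strings.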


Correlation between multiple variables of a data frame

I have a data.frame of 10 variables in R. Let's call them var1, var2, …, var10. I want to find the correlation of var1 with each of var2, var3, …, var10. How can we do that? The cor function can find the correlation between two variables at a time; using it that way, I had to write a separate cor call for each analysis.
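One possible approach, sketched here on simulated data, relies on the fact that cor() also accepts a matrix or data frame as its second argument. Only the names var1 to var10 come from the question; the values are random.

```r
# Simulated data frame with columns var1 ... var10 (illustration only).
set.seed(42)
df <- as.data.frame(matrix(rnorm(100 * 10), ncol = 10))
names(df) <- paste0("var", 1:10)

# Correlate var1 against all remaining columns in a single call.
others <- df[, names(df) != "var1"]
cors <- cor(df$var1, others)

# Alternatively, compute the full correlation matrix and take one row.
cors_from_matrix <- cor(df)["var1", names(others)]
```

The first call returns a matrix with one row and one column per remaining variable; the second returns a named vector with the same values.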


Remove NA values from a vector

Trying ?max, you’ll see that it actually has an na.rm argument, set by default to FALSE. (That’s the common default for many other R functions, including sum(), mean(), etc.) Setting na.rm=TRUE does just what you’re asking for: If you do want to remove all of the NAs, use this idiom instead: A final note: Other functions (e.g. table(), lm(), and sort()) have NA-related arguments that use different …
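A small sketch of the behaviour described above, using a made-up vector `x`:

```r
# Illustrative vector with one missing value.
x <- c(3, NA, 7, 1)

# With the default na.rm = FALSE, the NA propagates to the result.
max_default <- max(x)

# na.rm = TRUE ignores missing values for this one call.
max_no_na <- max(x, na.rm = TRUE)

# To drop the NAs from the vector itself, subset with is.na().
x_clean <- x[!is.na(x)]

# sort() uses a different argument: na.last controls where NAs go
# (by default they are removed).
x_sorted <- sort(x, na.last = TRUE)
```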
