Unsure how to pass column variable to mean() in R

I'm new to R and am writing a program. I have a multi-column data frame and want to calculate the mean of one column. I want to pass mean() the name of the column to use for the calculation. I tried passing it a character string built with the $ symbol. It seems R does not accept a character string here and wants a logical or numeric value instead. So I am stuck. Is there another way to do this? Any suggestions would be appreciated. Code and results are below.

> ## df.final is the name of the dataframe

> car.type        <- "ford"
> col.name        <- paste("df.final","$", car.type, sep = "")

> print(col.name)
[1] "df.final$ford"

> mean(col.name, na.rm = TRUE)
[1] NA
Warning message:
In mean.default(col.name, na.rm = TRUE) :
argument is not numeric or logical: returning NA

> mean(df.final$ford, na.rm = TRUE)
[1] 3.14

Answers


(df.final <- data.frame(ford = sample(0:100, 5), toyota = sample(0:50, 5)))
#   ford toyota
# 1   42      5
# 2   30     46
# 3   45     29
# 4   69     48
# 5   18     14
col.name
# [1] "df.final$ford"
typeof(col.name)
# [1] "character"

Currently, col.name is a character vector, so taking its mean makes no sense. Let's parse it into an expression:

temp <- parse(text = col.name)
temp
# expression(df.final$ford)
typeof(temp)
# [1] "expression"
mean(temp)
# [1] NA
# Warning message:
# In mean.default(temp) : argument is not numeric or logical: returning NA

Hmm. R still isn't happy, because taking the mean of an expression doesn't make sense either. Let's evaluate our expression.

temp <- eval(parse(text = col.name))
temp
# [1] 42 30 45 69 18
typeof(temp)
# [1] "integer"
mean(temp)
# [1] 40.8

Much better. So mean(eval(parse(text = col.name)), na.rm = TRUE) does the trick for your example. You might also check out the useful function do.call (see ?do.call):

do.call(mean, args = list(x = temp, na.rm = TRUE))
# [1] 40.8
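
As a small sketch of why do.call can be handy here: it also accepts the function name as a character string, so both the function and its arguments can be chosen at run time. The variable fun.name and the fixed data values below are assumptions for illustration (the values copy the sampled output shown earlier so the result is reproducible).

```r
# Example data with fixed values (copied from the sampled output above)
df.final <- data.frame(ford   = c(42, 30, 45, 69, 18),
                       toyota = c(5, 46, 29, 48, 14))
temp <- df.final$ford

# fun.name is a hypothetical variable holding the name of the function to call
fun.name <- "mean"

# Build the argument list once; names match the arguments of mean()
args <- list(x = temp, na.rm = TRUE)

# do.call accepts either a function or a string naming one
result <- do.call(fun.name, args)
result
# [1] 40.8
```

Changing fun.name to "median" would reuse the same argument list with a different summary function.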

You can use [ or [[ to access columns by name:

df.final <- data.frame(ford=c(1, 2, NA), toyota=c(3, 2, 1))
car.type <- "ford"
mean(df.final[,car.type], na.rm=TRUE)
# [1] 1.5
mean(df.final[[car.type]], na.rm=TRUE)
# [1] 1.5
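
If the column name changes from call to call, the [[ approach can be wrapped in a small function. The following is only a sketch: col_mean is a hypothetical helper name, not something from base R, and the check on the column name is an added assumption about how you would want a bad name handled.

```r
df.final <- data.frame(ford = c(1, 2, NA), toyota = c(3, 2, 1))

# Hypothetical helper: mean of one column, selected by its name as a string
col_mean <- function(df, col.name) {
  # Stop early if the requested column does not exist
  stopifnot(col.name %in% names(df))
  mean(df[[col.name]], na.rm = TRUE)
}

car.type <- "ford"
col_mean(df.final, car.type)
# [1] 1.5

# The same helper applied to several column names at once
means <- sapply(c("ford", "toyota"), function(nm) col_mean(df.final, nm))
means
#   ford toyota 
#    1.5    2.0 
```

Because the column is looked up with [[, no string has to be parsed or evaluated as code.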

Just to mention, you can also use eval() and parse():

> mean(eval(parse(text=col.name)), na.rm=TRUE)
[1] 1.5
