Lag with missing data

I have a dataset of state-level approval ratings. I need to lag one of the variables by two years.

The data is annual and spans 1970 to 2008. Obviously, lagging will cost me some observations (e.g., 1970 has no 1968 value to draw on), and I'm fine with losing those.

However, when I run the lag with diff, I get an error saying the replacement does not match the data:

> df$lagvar <- diff(df$var, lag=2)
Error in `$<-.data.frame`(`*tmp*`, "lagvar", value = c(-0.4262501,  : 
replacement has 230 rows, data has 232

I've searched around, but cannot find a solution. Any ideas on how to get around this?

Answers


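diff does not pad its result with leading NAs: with lag=2 the output is two elements shorter than the input, which is why 230 rows cannot be assigned to a 232-row data frame. You have to add the NAs yourself.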

df$lagvar <- c(NA, NA, diff(df$var, lag=2))
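Note that diff returns differences (x[t] - x[t-2]), not the lagged values themselves. If what you actually want is the value from two years earlier, a minimal sketch in base R could look like the following. The vector x is hypothetical toy data, not your dataset.

```r
# Hypothetical toy series standing in for df$var
x <- c(10, 12, 15, 11, 18, 20)

# Lagged values: x[t-2], padded with two leading NAs
lag2 <- c(NA, NA, head(x, -2))

# Lag-2 differences: x[t] - x[t-2], padded the same way
diff2 <- c(NA, NA, diff(x, lag = 2))

print(data.frame(x, lag2, diff2))
```

Both results have the same length as x, so either one can be assigned to a data frame column directly.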

You could write a simple wrapper function to do it for you. Something like this, perhaps:

# Like diff(), but pads the front with NAs so the result
# has as many elements as the input.
mydiff <- function(x, ...) {
  d <- diff(x, ...)
  c(rep(NA, NROW(x) - NROW(d)), d)
}
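Since the data is state level, the column probably stacks several states on top of each other, in which case a plain diff would subtract across state boundaries. Here is a rough sketch of applying the wrapper within each state using ave. It assumes columns named state and year, which are guesses and not taken from the question, and the data frame is made-up toy data.

```r
# Same wrapper as above, repeated so this example runs on its own
mydiff <- function(x, ...) {
  d <- diff(x, ...)
  c(rep(NA, NROW(x) - NROW(d)), d)
}

# Hypothetical toy panel: two states, four years each
df <- data.frame(
  state = rep(c("A", "B"), each = 4),
  year  = rep(1970:1973, times = 2),
  var   = c(1, 2, 4, 7, 10, 20, 40, 70)
)

# Rows must be in year order within each state for the lag to make sense
df <- df[order(df$state, df$year), ]

# ave() applies the function separately to each state's values
df$diffvar <- ave(df$var, df$state, FUN = function(x) mydiff(x, lag = 2))

print(df)
```

Because the wrapper computes the padding from the lengths, it also copes with a state that has fewer than three rows, where it simply returns all NAs.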
