Lag with missing data

I have a dataset on state-level approval ratings. I need to lag one of the variables by two years.

The data is annual and spans 1970 to 2008. Obviously, if I lag the data I will lose some observations (e.g., 1970 won't be able to find the 1968 data). I'm fine with losing those observations.

However, when I run the lag with diff, I get an error saying that the replacement does not match the data:

> df$lagvar <- diff(df$var, lag=2)
Error in `$<`(`*tmp*`, "lagvar", value = c(-0.4262501,  : 
replacement has 230 rows, data has 232

I've searched around, but cannot find a solution. Any ideas on how to get around this?


diff returns a vector that is shorter than its input (here by 2, because lag=2) and does not pad the result with leading NAs. You have to add those yourself.

df$lagvar <- c(NA, NA, diff(df$var, lag=2))
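To see where the two missing rows go, here is a small sketch on a made-up vector (x is illustrative only and stands in for df$var):

```r
# Hypothetical toy series standing in for df$var
x <- c(10, 12, 15, 11, 18)

# diff() with lag = 2 computes x[i] - x[i - 2], so the result is 2 elements shorter
d <- diff(x, lag = 2)
length(x)
length(d)

# Pad the front with NAs so the result lines up with the original rows
padded <- c(NA, NA, d)
padded
```

The padded vector has the same length as x, which is what the assignment into the data frame requires.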

You could write a simple wrapper function to do it for you. Something like this, perhaps:

mydiff <- function(x, ...) {
  d <- diff(x, ...)
  # pad the front with NAs so the result has as many elements as x
  c(rep(NA, NROW(x) - NROW(d)), d)
}

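Note that diff returns differences (x[t] - x[t-2]), not the earlier values themselves. If what you want is the shifted values, a similar sketch might look like the following. Here mylag, toy, and the state column are hypothetical names used for illustration, and the rows are assumed to be sorted by year within each state:

```r
# Hypothetical helper: shift x back by k positions, padding the front with NA
mylag <- function(x, k = 1) {
  n <- length(x)
  stopifnot(k >= 0, k <= n)
  c(rep(NA, k), x[seq_len(n - k)])
}

# Made-up panel: two states, four years each, already sorted by year
toy <- data.frame(state = rep(c("A", "B"), each = 4),
                  var   = c(1, 2, 3, 4, 10, 20, 30, 40))

# Apply the lag within each state so values do not leak across states
toy$lagvar <- ave(toy$var, toy$state, FUN = function(v) mylag(v, 2))
toy
```

Grouping with ave keeps the first two years of each state as NA instead of filling them with the previous state's last two years.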
