Using stringr to split vectors: unexpected result length
I'm messing up something simple when using stringr to manipulate character vectors. I have a data frame like this:
library(stringr)
d1 <- data.frame(x = str_c(rpois(10, lambda=5), rpois(10, lambda=10), sep = "_"))
and I want everything after the underscore as a separate variable. This use of str_sub results in a vector of length 20, and I'm at a loss to explain why.
d1$y <- str_sub(d1$x, str_locate(d1$x, fixed("_"))+1)
Error in $<-.data.frame(*tmp*, "y", value = c("_12", "_7", "_15", : replacement has 20 rows, data has 10
Could someone show me how to write the str_sub call correctly?
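This is what you want. Check the output of str_locate to see why your version wasn't working: it returns a two-column matrix (start and end) rather than a vector, so adding 1 gives 20 values, and str_sub recycles its arguments to that length. Take only the first column: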
d1$y = str_sub(d1$x, str_locate(d1$x, fixed("_"))[,1] + 1, -1)
Or in base R:
d1$y = sub("^[^_]*_", "", d1$x)
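To see the shape issue directly, here is a minimal sketch on a small fixed vector. The vector x and the names loc, after_stringr and after_base are made up for illustration and stand in for d1$x; the calls themselves are the ones used above.

```r
library(stringr)

# Hypothetical fixed input, standing in for d1$x
x <- c("5_12", "3_7", "6_15")

# str_locate returns one row per string, with "start" and "end" columns
loc <- str_locate(x, fixed("_"))
dim(loc)  # rows = number of strings, columns = 2

# Use only the start column so there is one position per string
after_stringr <- str_sub(x, loc[, "start"] + 1, -1)

# Base R equivalent: drop everything up to and including the first underscore
after_base <- sub("^[^_]*_", "", x)

# Both approaches should give one result per input string
length(after_stringr) == length(x)
identical(after_stringr, after_base)
```

Indexing the column by name ("start") rather than by position does the same thing as [,1] and makes the intent a little easier to read.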