Select Data According to a Partial Match
Let's say I have the following data frames and want to merge them.
df1 = data.frame(zipcoide=c(90001,90002,90003,66062,90005)) df1 df2 = data.frame(sfc_code=c(900,660,800,400,500,100,300,350,310,450)) df2
SCF Codes are apparently zipcode prefixes and I want to match the sfc_code with the zipcode.
Basically, if I'm given a list of scf codes, I want to select all those zipcodes which have that scf code.
So in this example, I want to end up with
90001 90002 90003 90005
I figure I could use the sqldf package to write a query to select based on " LIKE' %900% " but was looking for something a little more "elegant."
You want to return the all the zipcodes whose first 3 digits appear in your sfc_codes list:
df1[ as.numeric(substr( df1$zipcoide , 1 , 3 ) ) %in% df2$sfc_code , ] #  90001 90002 90003 66062 90005
Probably not the best example because all zip codes are in that sfc_code list!
But if we remove 660 then we get:
df2 = data.frame(sfc_code=c(900,800,400,500,100,300,350,310,450)) df1[ as.numeric(substr( df1$zipcoide , 1 , 3 ) ) %in% df2$sfc_code , ] #  90001 90002 90003 90005