How do I do a negative / nomatch / inverse search in data.table?
What happens if I want to select all the rows in a data.table that do not contain a particular value in the key variable using binary search? By the way, what is the correct jargon for what I want to do? Is it "nojoin"? Is it "negative selection"?
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
setkey(DT,x)
Let's do a positive selection for all rows where x=="a", but using binary search.
That's beautiful, but I want the opposite of that. I want all the rows that are not "a", in other words where x!="a".
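A minimal sketch of that keyed lookup, with the table rebuilt so the snippet runs on its own. The name `only_a` is only illustrative. Because DT is keyed on x, a character value in i is looked up through the key rather than by scanning the column:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Keyed (binary search) lookup: rows whose key column x equals "a"
only_a <- DT["a"]
print(only_a)
```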
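For reference, the ordinary way to write that negative selection would be something like the sketch below. The table is rebuilt so the snippet is self-contained, and `not_a_scan` is just an illustrative name:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Logical comparison on the column: every value of x is tested against "a"
not_a_scan <- DT[x != "a"]
print(not_a_scan)
```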
The above line works, but it uses vector scanning. I want to use binary search. I was expecting the following to work, but alas...
The above two do not work and trying to play with nomatch got me nowhere.
The idiom is this:
DT[-DT["a", which=TRUE]]
   x y v
1: b 1 4
2: b 3 5
3: b 6 6
4: c 1 7
5: c 3 8
6: c 6 9
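The idiom can be read in two steps, sketched below on the question's example table. The intermediate names `rows_a` and `not_a` are only for illustration and are not part of the idiom:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Step 1: binary search on the key, but return row numbers instead of rows
rows_a <- DT["a", which = TRUE]
print(rows_a)

# Step 2: a negative integer subset drops exactly those rows
not_a <- DT[-rows_a]
print(not_a)
```

The lookup itself still goes through the key; only the final step is a plain subset by row number.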
- The mailing list posting Return Select/Join that does NOT match?
- The previous question non-joins with data.tables
- Matthew Dowle's answer to Porting set operations from R's data frames to data tables: How to identify duplicated rows?
Update: new in v1.8.3 is not-join syntax. Farrel's first expectation (! rather than -) has been implemented:
DT[-DT["a",which=TRUE,nomatch=0],...]   # old idiom
DT[!"a",...]                            # same result, now preferred.
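A small sketch comparing the two forms on the question's example table, assuming data.table v1.8.3 or later is installed. The names `old_way`, `new_way`, and `only_c` are illustrative:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Old idiom: look up the matching row numbers, then drop them
old_way <- DT[-DT["a", which = TRUE]]

# Not-join: the ! prefix on i selects the rows that do not match
new_way <- DT[!"a"]

# Both should contain the same rows (the "b" and "c" groups)
print(all.equal(old_way$v, new_way$v))

# The not-join also accepts several key values at once
only_c <- DT[!c("a", "b")]
print(only_c)
```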
See the NEWS item for more detailed info and example.