Regular expression in R
I am having some trouble with regular expressions in R. I use str_extract from the stringr library, and my problem is this:
library(stringr)
test = "word1 something word2 something word3 something word3"
temp = str_extract(test, 'word2.+word3')
print(temp)
## "word2 something word3 something word3"
The problem is that I want the match to stop at the first word3; I don't want the last part of the string. Any ideas? Thank you very much.
And what if I have
test="word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"
and I want to get a vector of length 2 like this: "word2 something1 word3", "word2 something4 word3"? Thanks again.
Change your regex line to:
temp = str_extract(test, 'word2.+?word3')
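Notice that I added ? after the +, which makes the .+ non-greedy (lazy): it matches as few characters as possible, so the match ends at the first word3. Without the ?, .+ is greedy and matches as many characters as possible, so the match runs on to the last word3.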
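For a side-by-side check, here is a small sketch that runs both patterns on the string from the question. The variable names greedy and lazy are just illustrative, not anything stringr requires.

```r
library(stringr)

test <- "word1 something word2 something word3 something word3"

# Greedy: .+ takes as many characters as it can,
# so the match extends to the last word3 in the string.
greedy <- str_extract(test, "word2.+word3")

# Lazy: .+? takes as few characters as it can,
# so the match ends at the first word3 after word2.
lazy <- str_extract(test, "word2.+?word3")

print(greedy)  # "word2 something word3 something word3"
print(lazy)    # "word2 something word3"
```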
To extract all the occurrences, use:
temp = str_extract_all(test,'word2.+?word3')
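One thing to watch for: str_extract_all returns a list with one element per input string, not a plain character vector. A minimal sketch of getting the 2-element vector asked for in the question, assuming a single input string (the names matches and result are illustrative):

```r
library(stringr)

test <- "word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

# str_extract_all returns a list: one character vector per input string.
matches <- str_extract_all(test, "word2.+?word3")

# There is only one input string here, so take the first list element
# to get an ordinary character vector.
result <- matches[[1]]

print(result)
# "word2 something1 word3" "word2 something4 word3"
print(length(result))
# 2
```

If you pass several strings at once, you would keep the list (one vector of matches per string) or flatten it with unlist(matches), depending on what you need.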