Regex - Match an end HTML tag if the start tag is not present
I want to match an ending HTML tag such as </EM> only if there is no starting <EM> tag anywhere before it, i.e. before any preceding tags or text. My sample string is:
ddd d<STRONG>dfdsdsd dsdsddd<EM>ss</EM>r and</EM>and strong</STRONG>
In this string, the output should be only the second </EM>, because it lacks a starting <EM>. I have tried
but it doesn't seem to work. Please help, thanks.
I am not sure regex is best suited for this kind of task, since tags can always be nested.
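As a rough illustration of the nesting concern, here is a minimal sketch of a non-regex-only pass, written in Python purely for illustration. The function name `unmatched_closing_tags` is hypothetical, and the sketch assumes attribute-free, uppercase tags exactly as in the sample string; it is not a general HTML parser. It tokenizes the start and end tags and keeps a depth counter, so a closing tag is reported only when no start tag is currently open.

```python
import re


def unmatched_closing_tags(html, tag="EM"):
    """Return (start, end) spans of closing tags that have no open start tag.

    Hypothetical helper: assumes plain <TAG> / </TAG> with no attributes.
    """
    token = re.compile(r"<(/?)%s>" % re.escape(tag))
    depth = 0          # number of start tags currently open
    unmatched = []
    for m in token.finditer(html):
        if m.group(1) == "":
            depth += 1                  # start tag: open one more level
        elif depth > 0:
            depth -= 1                  # end tag closing an open start tag
        else:
            unmatched.append(m.span())  # end tag with nothing left to close
    return unmatched


sample = "ddd d<STRONG>dfdsdsd dsdsddd<EM>ss</EM>r and</EM>and strong</STRONG>"
spans = unmatched_closing_tags(sample)
for start, end in spans:
    print(start, sample[start:end])
```

On the sample string this reports only the second </EM>. Because of the counter, nested input such as `<EM><EM>a</EM></EM></EM>` flags only the last closing tag.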
Anyhow, a C# regex like:
would only match the second </EM> tag.
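- ?! is a negative look*ahead*, which explains why both </EM> tags are found. A look-ahead placed in front of xxx inspects the text starting at the position where xxx begins, so (?!=<EM>.*)xxx actually means "match xxx if the text at that position does not start with =<EM>.*". Since </EM> never starts with =, the check always passes. I am also not sure you wanted to include an = in there.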
- ?<! is a negative look*behind*, which is better suited to what you want to do. It would not work with the Java regex engine, though, since this look-behind does not have an obvious maximum length.
However, with a .NET regex engine, as tested on RETester, it does work.
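The look-ahead behaviour described above can be checked with a short sketch. Python's `re` module is used here only for illustration; it is not the engine discussed in the answer. The look-behind pattern below is an assumption written for this example (the same idea with ?<! in place of ?!), not the regex from the answer. Python's `re`, like the Java engine mentioned above, rejects a look-behind with no fixed length, so the sketch only shows that the pattern fails to compile there.

```python
import re

sample = "ddd d<STRONG>dfdsdsd dsdsddd<EM>ss</EM>r and</EM>and strong</STRONG>"

# Negative look-ahead: the check applies to the text starting where </EM>
# begins. That text never starts with "=<EM>", so both closing tags match.
lookahead_hits = re.findall(r"(?!=<EM>.*)</EM>", sample)
print(lookahead_hits)  # ['</EM>', '</EM>']

# Negative look-behind with an unbounded ".*" (illustrative pattern only).
# Python's re requires fixed-width look-behind and raises re.error here.
try:
    re.compile(r"(?<!<EM>.*)</EM>")
    lookbehind_supported = True
except re.error as exc:
    lookbehind_supported = False
    print("look-behind rejected:", exc)
```

An engine that accepts variable-length look-behind, such as the .NET one, would compile the second pattern instead of raising an error.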