Regex operation cost

Bottom line: speed REALLY matters to me, down to every millisecond, so I want to know which method is the fastest.

In my program, I have several cases with different flags (flag[1] - flag[7]). To decide how to handle the output, I must match the input against one of several patterns (pattern[1] - pattern[7]). So here is the question: is it better to match the string against pattern[1], handle it if it matches, and otherwise try pattern[2] and so on (doing the matching up to 7 times), OR to put ALL the patterns into one regex, like:

"^(pattern[1]|pattern[2]|...)$"

see if it matches, and if it does, split the string to get the flag value (it is always at the end) and handle it accordingly?

So, bottom line: 7 separate matches versus 1 match and a split.

Note: based on the provided data, I will order the 7 patterns so that the one most likely to occur is tried first.

I want to keep this question time-oriented, but for suggestions and decision making, the probability of the string being accepted on the first match is roughly 91.3%.

Answers


I'm not entirely clear on what your search criteria are. You hint that the match string is always at the end. So here are some simple time tests to give a general idea. The tests search for two strings, the first of which is not present in the target, the second of which is present.

string.IndexOf 240 nanoseconds (to find string anywhere in string, not just at end)
string.EndsWith 210 nanoseconds
Regex.Match 1,285 nanoseconds
precompiled Regex 648 nanoseconds

The test code is below. It uses a little benchmarking utility I wrote that removes the timing test overhead (the bracketing loops, etc.) from the results. I'm not a regex expert, so hopefully my search pattern is comparable to the string tests.

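using System.Text.RegularExpressions;

// Bench is my own benchmarking utility (not shown here).
// Time() passes the iteration count c to the lambda.
string s = "zzzzzzzzzzzzzzzzzzzzzzzsomething";
string search1 = "thinker";   // not present in s
string search2 = "something"; // present at the end of s

// Plain substring search: try search1 first, fall back to search2.
int pos = 0;
new Bench().Time("string.IndexOf", (c) => {
    for (int i = 0; i < c; i++) {
        if ((pos = s.IndexOf(search1)) < 0) {
            pos = s.IndexOf(search2);
        }
    }
});

// Suffix test only: try search1 first, fall back to search2.
bool found = false;
new Bench().Time("string.EndsWith", (c) => {
    for (int i = 0; i < c; i++) {
        if (!(found = s.EndsWith(search1))) {
            found = s.EndsWith(search2);
        }
    }
});

// One regex with both alternatives, anchored at the end of the string.
// Regex.Escape keeps the pattern valid if the search strings contain metacharacters.
string pattern = "(" + Regex.Escape(search1) + "|" + Regex.Escape(search2) + ")$";
Match match = null;
new Bench().Time("Regex.Match", (c) => {
    for (int i = 0; i < c; i++) {
        match = Regex.Match(s, pattern);
    }
});

// Same pattern, constructed once with RegexOptions.Compiled.
Regex regex = new Regex(pattern, RegexOptions.Compiled);
new Bench().Time("precompiled", (c) => {
    for (int i = 0; i < c; i++) {
        match = regex.Match(s);
    }
});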
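
To relate this back to the two strategies in the question, here is a minimal sketch of both: trying precompiled patterns one at a time in order of likelihood, and a single combined regex. The three patterns, the FlagMatcher class and its method names are hypothetical stand-ins, since the real pattern[1] - pattern[7] are not shown. Note that the combined version checks which named group matched rather than splitting the string, which is a swap from what the question describes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FlagMatcher
{
    // Hypothetical stand-ins for pattern[1]..pattern[7] (the real ones are
    // not in the question). Put the most likely pattern first.
    private static readonly string[] Sources =
    {
        @"[a-z]+-A",
        @"[0-9]+-B",
        @"[a-z]+[0-9]+-C",
    };

    // Strategy 1: one anchored, precompiled regex per pattern.
    private static readonly Regex[] Individual = BuildIndividual();

    // Strategy 2: a single regex, ^(?:(?<p0>...)|(?<p1>...)|...)$
    private static readonly Regex Combined = BuildCombined();

    private static Regex[] BuildIndividual()
    {
        var result = new Regex[Sources.Length];
        for (int i = 0; i < Sources.Length; i++)
        {
            result[i] = new Regex("^(?:" + Sources[i] + ")$", RegexOptions.Compiled);
        }
        return result;
    }

    private static Regex BuildCombined()
    {
        var parts = new string[Sources.Length];
        for (int i = 0; i < Sources.Length; i++)
        {
            // Wrap each alternative in its own named group: p0, p1, ...
            parts[i] = "(?<p" + i + ">" + Sources[i] + ")";
        }
        return new Regex("^(?:" + string.Join("|", parts) + ")$", RegexOptions.Compiled);
    }

    // Returns the index of the first pattern that matches, or -1 if none does.
    public static int MatchSequential(string input)
    {
        for (int i = 0; i < Individual.Length; i++)
        {
            if (Individual[i].IsMatch(input))
            {
                return i;
            }
        }
        return -1;
    }

    // Runs the combined regex once, then reports which named group matched.
    public static int MatchCombined(string input)
    {
        Match m = Combined.Match(input);
        if (!m.Success)
        {
            return -1;
        }
        for (int i = 0; i < Sources.Length; i++)
        {
            if (m.Groups["p" + i].Success)
            {
                return i;
            }
        }
        return -1;
    }
}
```

Both methods return the same index for the same input, so they can be swapped freely and timed against your real patterns and data (for example with System.Diagnostics.Stopwatch). Which one is faster will depend on the patterns and on how often the first pattern matches, so I would measure rather than guess.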
