RegularExpressionValidator slow on multiline textbox (textarea)

I have a multiline textbox (textarea) and I want to verify that it contains a particular string. I tried:

<asp:RegularExpressionValidator runat="server" ControlToValidate="txtTemplate" ValidationExpression="^(.\s*)*Content(.\s*)*$" Text="content" ErrorMessage="Must contain: Content" />

Using ^(.\s*)*$ on its own seems to pass for a textarea, so I tried to sandwich my criteria between two of these. But that seems to lock up both IE and Chrome.

This should be simple; I think I'm making it tougher than it needs to be.

Answers


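If the validation is done only on the server (runat="server" by itself doesn't guarantee that: the validator also runs in the browser unless you set EnableClientScript="false", which is why IE and Chrome are locking up), the simplest solution is probably to use this regex: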

(?s)^.*Content.*$

(?s) turns on Singleline mode, which allows the . metacharacter to match all characters including linefeeds. If you want it to run on the client as well, use this:

^[\s\S]*Content[\s\S]*$

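That's because JavaScript originally had no equivalent for Singleline mode (also known as DOT_ALL, DOTALL, dot-matches-all, single-line, or /s mode). ES2018 added an s flag, but a flag can't be embedded in the pattern string itself. JavaScript doesn't recognize inline modifiers like (?s) and (?i), either.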
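To see the difference in a browser console or Node, here is a minimal sketch. The helper matchesWholeValue and the sample text are made up for illustration. The helper assumes the client-side check builds the regex from the pattern string alone and requires the match to cover the whole value; treat that as my assumption, not as the validator's documented behavior.

```javascript
// Hypothetical helper (an assumption, not the validator's actual code):
// build the regex from a plain pattern string, so no flags can be attached,
// and require the match to span the entire value.
function matchesWholeValue(pattern, value) {
  const match = new RegExp(pattern).exec(value);
  return match !== null && match[0] === value;
}

// Made-up textarea content spanning several lines.
const sampleText = "Dear customer,\nContent goes here.\nRegards";

// Without a dot-all mode, "." stops at line breaks, so this cannot match.
const dotResult = matchesWholeValue("^.*Content.*$", sampleText);

// [\s\S] matches any character, line breaks included.
const classResult = matchesWholeValue("^[\\s\\S]*Content[\\s\\S]*$", sampleText);

// The inline (?s) modifier is rejected when the pattern is compiled.
let inlineModifierRejected = false;
try {
  new RegExp("(?s)^.*Content.*$");
} catch (err) {
  inlineModifierRejected = err instanceof SyntaxError;
}

console.log({ dotResult, classResult, inlineModifierRejected });
```

Only the [\s\S] version should report a match for the multi-line sample.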

Watch out for constructs like (.\s*)*, where an expression with quantifiers (*, +, etc.) is enclosed in a group which is itself controlled by a quantifier. If the regex fails to achieve a match right away, it goes back and tries to match by different paths (i.e., by using different parts of the regex to match different parts of the string), which can get very expensive, performance-wise. This regex is especially bad because . and \s can match many of the same characters, which dramatically increases the number of paths it has to explore before giving up.

The phenomenon is commonly known as catastrophic backtracking, and it usually manifests in cases where there's no possibility of a match. I would expect your validator to work fine when the sequence Content is present.
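If you want to watch it happen without freezing your browser, here is a rough sketch that times the original pattern against a flat alternative on small inputs that do not contain Content. The helper timeTest and the input sizes are my own choices for illustration. The timings are coarse and will vary by machine and engine, so look at the trend, not the exact numbers.

```javascript
// The nested-quantifier pattern from the question and a flat replacement.
const nestedPattern = /^(.\s*)*Content(.\s*)*$/;
const flatPattern = /^[\s\S]*Content[\s\S]*$/;

// Hypothetical helper: time a single test() call in milliseconds.
function timeTest(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

// None of these inputs contains "Content". A run of spaces can be split
// between "." and "\s*" in many ways, and a backtracking engine tries
// them all before it gives up.
const timings = [];
for (const spaces of [12, 16, 20]) {
  const input = "x" + " ".repeat(spaces);
  timings.push({
    spaces,
    nested: timeTest(nestedPattern, input),
    flat: timeTest(flatPattern, input),
  });
}

console.log(timings);
```

Keep the space counts small. Each extra space roughly doubles the work for the nested pattern, so an input not much longer than these can hang the page.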

By the way, if you want to match only on the complete word Content, you should add word boundaries, like so:

(?s)^.*\bContent\b.*$

That will prevent false positives on words like MalContent and Contentious. \b works differently in different regex flavors. In .NET it's Unicode-aware unless you specify ECMAScript mode. In JavaScript it's supposed to recognize only the ASCII letters and digits as word characters; in most browsers it does, but don't take it for granted.
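Here is a small sketch of the word-boundary version in its client-side form, using [\s\S] in place of (?s). The sample strings are made up. The last check shows the ASCII-only behavior of \b in JavaScript; I would expect .NET's Unicode-aware \b to reject that same input, but I have not included a .NET run here.

```javascript
// Whole-word match for "Content" anywhere in possibly multi-line text.
const wholeWord = /^[\s\S]*\bContent\b[\s\S]*$/;

// Made-up sample inputs.
const samples = [
  "Some Content here",           // standalone word
  "line one\nContent\nline three", // standalone word on its own line
  "MalContent",                  // no boundary before "C"
  "Contentious",                 // no boundary after "t"
];

const results = samples.map((text) => ({ text, ok: wholeWord.test(text) }));
console.log(results);

// JavaScript's \b counts only ASCII letters, digits and underscore as word
// characters, so "é" is treated as a non-word character and a boundary is
// found between "é" and "C".
const accentedOk = wholeWord.test("éContent");
console.log({ accentedOk });
```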

