Regular Expression: Why am I getting matches here when I expect none?

I have a regular expression that looks for two or three consecutive uppercase letters, beginning with P, M, C, or E and ending in T. Executed in PHP, it looks like this:

<?php

# The string to match against
$DT = 'Sat, 26 Nov 2011 21:04:19 GMT';

# Returns "MT" as a match
preg_match('/[PMCE][A-Z]?T/', $DT, $matches);

# I've also tried this -- returns "M" as a match
preg_match('/P|M|C|E[A-Z]?T/', $DT, $matches);

The second character is marked as optional with the ? but shouldn't it only be capable of returning PT, MT, CT, ET, or P*T, M*T, C*T, E*T?

I thought this regular expression should not match the string above. I've already worked around it with non-regex methods, but I'd like to know what the heck I'm doing wrong. How is it possible that "MT" is a match for either of those expressions?

In English, I thought they both read: "The character P, M, C, or E, possibly followed by any A-Z character, followed by a T."

Answers


preg_match('/[PMCE][A-Z]?T/', $DT, $matches);


preg_match('/P|M|C|E[A-Z]?T/', $DT, $matches);

Both of these are matching inside the "GMT": the first matches "MT" and the second matches "M". If you want the abbreviation to be its own word, make the pattern match a leading space, like this:

preg_match('/ [PMCE][A-Z]?T/', $DT, $matches);
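A quick check of the space-prefixed pattern, as a sketch rather than a recommendation. The "PST" date string and the bare "PST" input are made-up test values, not from the question. It shows two side effects worth knowing about: the matched text includes the leading space, and an abbreviation at the very start of a string is not found.

```php
<?php
$pattern = '/ [PMCE][A-Z]?T/';

// No match: the only candidate is "MT", and it is preceded by "G", not a space.
var_dump(preg_match($pattern, 'Sat, 26 Nov 2011 21:04:19 GMT')); // int(0)

// Match, but the matched text includes the leading space.
preg_match($pattern, 'Sat, 26 Nov 2011 21:04:19 PST', $m);
var_dump($m[0]); // string(4) " PST"

// No match: nothing precedes the abbreviation at the start of the string.
var_dump(preg_match($pattern, 'PST')); // int(0)
```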

Alternation (|) binds more loosely than everything else in the pattern, so the P|M|C|E[A-Z]?T expression translates to something like P or M or C or E[A-Z]?T, which is why it's quite happy to match the single "M".

If you want your second regex to behave more like the first then you'll need to group the or-ed characters: (P|M|C|E)[A-Z]?T should do it, but I prefer your original version anyway.
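To see the precedence difference side by side, here is a small sketch that runs the ungrouped pattern, a grouped version, and the character-class version against the date string from the question. The variable names are mine, and the non-capturing group (?:...) is used in place of plain parentheses only so that no extra capture is created.

```php
<?php
$dt = 'Sat, 26 Nov 2011 21:04:19 GMT';

// Alternation has the lowest precedence, so this reads as
// "P" or "M" or "C" or "E[A-Z]?T".
preg_match('/P|M|C|E[A-Z]?T/', $dt, $ungrouped);

// Grouping limits the alternation to the first character only.
preg_match('/(?:P|M|C|E)[A-Z]?T/', $dt, $grouped);

// The character class from the question behaves the same as the group.
preg_match('/[PMCE][A-Z]?T/', $dt, $class);

echo $ungrouped[0], "\n"; // M
echo $grouped[0], "\n";   // MT
echo $class[0], "\n";     // MT
```

Note that the grouped and character-class versions still find "MT" inside "GMT"; grouping fixes the precedence problem, not the mid-word match.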


The second character is marked as optional with the ? but shouldn't it only be capable of returning PT, MT, CT, ET, or P*T, M*T, C*T, E*T?

Sure, but it's returning MT, which, as you say, is a possible match. I think your problem is that you don't expect preg_match to start a match attempt in the middle of the timezone identifier. If you don't want that, you have to say so:

preg_match('/\b[PMCE][A-Z]?T/', $DT, $matches);

\b matches a word boundary, so the match can no longer begin at the M inside GMT.
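A leading \b only constrains where the match starts. As a sketch, assuming you also want the abbreviation to end at a word boundary (so that "PATH" does not yield "PAT"), a trailing \b can be added as well. The helper name findZoneAbbreviation and the non-GMT test strings are hypothetical, not from the question.

```php
<?php
// Hypothetical helper: returns the abbreviation if $dt contains one
// as a whole word, or null if it does not.
function findZoneAbbreviation(string $dt): ?string
{
    // \b on both sides: the match must begin and end at a word boundary.
    if (preg_match('/\b[PMCE][A-Z]?T\b/', $dt, $m) === 1) {
        return $m[0];
    }
    return null;
}

var_dump(findZoneAbbreviation('Sat, 26 Nov 2011 21:04:19 GMT')); // NULL
var_dump(findZoneAbbreviation('Sat, 26 Nov 2011 21:04:19 PST')); // string(3) "PST"
var_dump(findZoneAbbreviation('Sat, 26 Nov 2011 21:04:19 ET'));  // string(2) "ET"
var_dump(findZoneAbbreviation('PATH'));                          // NULL
```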

