Regular Expression: Why am I getting matches here when I expect none?

I have a regular expression that looks for two or three consecutive uppercase letters, beginning with P, M, C or E and ending in T. The regular expression, executed in PHP, looks like this:

<?php

# The string to match against
$DT = 'Sat, 26 Nov 2011 21:04:19 GMT';

# Returns "MT" as a match
preg_match('/[PMCE][A-Z]?T/', $DT, $matches);

# I've also tried this -- returns "M" as a match
preg_match('/P|M|C|E[A-Z]?T/', $DT, $matches);

The second character is marked as optional with the ? but shouldn't it only be capable of returning PT, MT, CT, ET, or P*T, M*T, C*T, E*T?

This regular expression should not be matching the above string, I thought? I've actually already worked around this with non-regular-expression methods, but I'd like to know what the heck I'm doing wrong. How is it possible that "MT" is a match to either of those expressions?

In English, I thought they both read: "The character P, M, C or E, possibly followed by any A-Z character, followed by a T."

Answers


preg_match('/[PMCE][A-Z]?T/', $DT, $matches);


preg_match('/P|M|C|E[A-Z]?T/', $DT, $matches);

Both of these are matching inside the GMT: the first matches the "MT" and the second matches just the "M". If you want the zone to be its own word, make the pattern match the preceding space as well, like this:

preg_match('/ [PMCE][A-Z]?T/', $DT, $matches);
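A quick way to check the leading-space version is to run it against the original string and against one that should match. This is only a sketch: the PST string is a made-up sample, not something from the question, and the variable names are illustrative.

```php
<?php
// Pattern from the answer: a literal space, then the 2-3 letter zone.
$pattern = '/ [PMCE][A-Z]?T/';

// Original string from the question: no space is followed by P, M, C or E.
$gmt = preg_match($pattern, 'Sat, 26 Nov 2011 21:04:19 GMT', $gmtMatches);

// Hypothetical sample string with a zone that should match.
$pst = preg_match($pattern, 'Sat, 26 Nov 2011 21:04:19 PST', $pstMatches);

var_dump($gmt);           // int(0)
var_dump($pstMatches[0]); // string(4) " PST"
```

Note that the space is part of the match, so the result would need a trim(), and a zone at the very start of the string would not be found.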

The P|M|C|E[A-Z]?T expression translates to something like P or M or C or E[A-Z]?T, which is why it's quite happy to match the single "M".

If you want your second regex to behave more like the first then you'll need to group the or-ed characters: (P|M|C|E)[A-Z]?T should do it, but I prefer your original version anyway.
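The difference is easy to see by running both forms against the string from the question. The sketch below uses a non-capturing group, (?:...), which is my substitution for the plain parentheses above; either form gives the same overall match in $matches[0].

```php
<?php
$DT = 'Sat, 26 Nov 2011 21:04:19 GMT';

// Alternation has the lowest precedence, so this pattern is really four
// alternatives: "P", "M", "C", or "E[A-Z]?T".
preg_match('/P|M|C|E[A-Z]?T/', $DT, $ungrouped);

// Grouping limits the alternation to the first character only.
preg_match('/(?:P|M|C|E)[A-Z]?T/', $DT, $grouped);

var_dump($ungrouped[0]); // string(1) "M"
var_dump($grouped[0]);   // string(2) "MT"
```

The grouped form still finds "MT" inside "GMT", exactly like the character-class version, because nothing in either pattern says where the match has to start.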


The second character is marked as optional with the ? but shouldn't it only be capable of returning PT, MT, CT, ET, or P*T, M*T, C*T, E*T?

Sure, but it's returning MT, which, as you say, is a possible match. I think your problem is that you don't expect preg_match to start a match attempt in the middle of the timezone identifier. If you don't want that, you have to say so in the pattern:

preg_match('/\b[PMCE][A-Z]?T/', $DT, $matches);

\b matches a word boundary, so the match can no longer begin between the G and the M of GMT.
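Here is a small sketch of how the boundary behaves. The trailing \b in the second pattern is my own addition, not part of the answer above, and the PST and PATH strings are made-up samples; they are there to show that a leading boundary alone still accepts the start of a longer word.

```php
<?php
$leading = '/\b[PMCE][A-Z]?T/';   // pattern from the answer
$both    = '/\b[PMCE][A-Z]?T\b/'; // assumption: trailing \b added for illustration

// The only boundary near "GMT" is before the G, which is not in [PMCE].
$gmt = preg_match($leading, 'Sat, 26 Nov 2011 21:04:19 GMT');

// Hypothetical sample with a zone that should match.
$pst = preg_match($leading, 'Sat, 26 Nov 2011 21:04:19 PST', $pstMatches);

// A leading boundary alone still matches the start of a longer word.
$pathLeading = preg_match($leading, 'PATH=/usr/bin', $pathMatches);
$pathBoth    = preg_match($both, 'PATH=/usr/bin');

var_dump($gmt);            // int(0)
var_dump($pstMatches[0]);  // string(3) "PST"
var_dump($pathMatches[0]); // string(3) "PAT"
var_dump($pathBoth);       // int(0)
```

Unlike the leading-space approach, \b is zero-width, so the match contains only the letters.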

