RegEx to exclude number using PHP

RegEx to exclude academic title

I want to split a paragraph string into an array of sentences, using a regular expression that splits on the dot character (.). My remaining problem is with numbers that contain a dot.

Here is an example:

In this year 2013. Hello Mr. Andre, your money is Rp 40.000.

The correct output would be:

Array ( [0] => In this year 2013 [1] => Hello Mr. Andre, your money is Rp 40.000 )

The title problem (Mr.) was already solved in my previous question. I've tried adding a regex for numbers, but it still doesn't work.

My non-working code:

$titles_number=array('(^[0-9]*)','(?<!Mr)', '(?<!Mrs)', '(?<!Ms)');

Can I do this in one go (one regex that handles both problems)? Tell me if I can't. Thanks in advance.


This will be easier to accomplish with preg_match_all():

    $subject, $result, PREG_PATTERN_ORDER);


  • [^\s.] matches the next character that is neither whitespace nor a dot (i.e., it skips over the dot and any whitespace between sentences)
  • [^.]* gobbles up any non-dot characters
  • \. matches a dot IF...
  • (?<=Prof\.|Dr\.|Mr\.|Mrs\.|Ms\.) ...it's part of an honorific...
  • (?=\d) ...or part of a number
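
For reference, here is a minimal sketch of how the pieces listed above could fit together in a complete call. The pattern is assembled from that component list and may differ in detail from the original; the wrapper function split_sentences() and the sample $subject are only illustrative names.

```php
<?php
// Hypothetical helper: the pattern below is assembled from the components
// described above and is an assumption, not necessarily the exact original.
function split_sentences(string $text): array
{
    $pattern = '/[^\s.][^.]*'                      // first non-space, non-dot char, then non-dots
             . '(?:\.'                             // a dot that does NOT end the sentence, because...
             . '(?:(?<=Prof\.|Dr\.|Mr\.|Mrs\.|Ms\.)' // ...it follows an honorific
             . '|(?=\d))'                          // ...or a digit comes right after it
             . '[^.]*)*/';                         // keep consuming up to the next dot

    preg_match_all($pattern, $text, $result, PREG_PATTERN_ORDER);

    // With PREG_PATTERN_ORDER, $result[0] holds all full-pattern matches.
    return $result[0];
}

$subject = 'In this year 2013. Hello Mr. Andre, your money is Rp 40.000.';
print_r(split_sentences($subject));
```

For the sample string from the question, this yields the two sentences "In this year 2013" and "Hello Mr. Andre, your money is Rp 40.000", without the trailing dots.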


  1. (?<=Prof\.|Dr\.|Mr\.|Mrs\.|Ms\.) is legal because the alternation is at the top level. That is, it acts like several discrete lookbehinds, each with a fixed length. That's why I had to repeat the \. in every branch instead of using (?<=(?:Prof|Dr|Mr|Mrs|Ms)\.).

  2. \.(?=\d) seems to be sufficient for identifying a dot that's part of a number. If you really have to check for digits before and after the dot, you can use (?=(?<=\d\.)\d) instead.

  3. If this is for anything more serious than a homework problem, you should discard regexes and look for a natural-language processing library. Crude as all this is, it's very close to the limit of what you can do with regexes.
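
To illustrate note 2, the following sketch compares the two number checks on a string where two sentences were fused without a space. It assumes the same overall pattern shape as described in the bullet list; the helper split_with_number_check() and the sample text are made up for the comparison.

```php
<?php
// Hypothetical helper: same pattern shape as described above, with the
// "dot is part of a number" check passed in so both variants can be compared.
function split_with_number_check(string $text, string $numberCheck): array
{
    $pattern = '/[^\s.][^.]*(?:\.(?:(?<=Prof\.|Dr\.|Mr\.|Mrs\.|Ms\.)|'
             . $numberCheck
             . ')[^.]*)*/';

    preg_match_all($pattern, $text, $result, PREG_PATTERN_ORDER);
    return $result[0];
}

$text = 'No spaces here.2 sentences were fused. Total: Rp 40.000.';

// Loose check: any dot followed by a digit is treated as part of a number.
$loose = split_with_number_check($text, '(?=\d)');

// Strict check: the dot must have a digit on both sides.
$strict = split_with_number_check($text, '(?=(?<=\d\.)\d)');

print_r($loose);
print_r($strict);
```

The loose check keeps "No spaces here.2 sentences were fused" as a single sentence, while the strict check splits it at the dot because no digit precedes it. Both variants keep "Rp 40.000" intact.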

