Why am I getting “sre_constants.error: bad character in group name” with ASCII group names?

This is my regex:

(?P=<streetname>[a-zæøå ]+)(?:[ ]+)(?P=<housenumber>\d+)(?:[ ]+),(?:[ ]+)(?P=<postalcode>\d{1,4})(?:[ ]+)(?P=<city>[a-zæøå ]+)

All the group names contain only ASCII characters, so why the error?

Traceback (most recent call last):
  File "addrtools.py", line 46, in <module>
  File "addrtools.py", line 43, in main
    extract_address('Testaddress 15B, 1234 Oslo')
  File "addrtools.py", line 35, in extract_address
    match = re.match(pat_full, string)
  File "/Users/tomas/.pythonbrew/pythons/Python-2.7.3/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "/Users/tomas/.pythonbrew/pythons/Python-2.7.3/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character in group name

I have confirmed that pat_full does indeed contain the above regular expression. Also, my document is encoded in UTF-8 and is set to UTF-8 mode (# --*-- Encoding: UTF-8 --*--).


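You are using (?P=...) where you meant (?P<...>...). (?P=name) is a backreference: it "matches whatever text was matched by the earlier group named name". Because each of your groups starts with (?P=, the parser reads everything up to the next closing parenthesis as the group name, and <streetname>[a-zæøå ]+ is not a valid identifier, hence "bad character in group name". Even with a valid name it would fail, because no group such as streetname has been defined earlier in the pattern.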

Remove the = to make them actual named groups:

>>> re.compile(r'(?P<streetname>[a-zæøå ]+)(?:[ ]+)(?P<housenumber>\d+)(?:[ ]+),(?:[ ]+)(?P<postalcode>\d{1,4})(?:[ ]+)(?P<city>[a-zæøå ]+)')
<_sre.SRE_Pattern object at 0x102e6a620>

This is probably what you meant to do in the first place. :-)
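
To make the difference between the two forms concrete, here is a minimal sketch. It is written for Python 3 (where str patterns are Unicode), not the Python 2.7 from the traceback. The sample input 'testaddress 15 , 1234 oslo' and the names PAT_FULL, DOUBLED, address, is_doubled and compile_failed are made up for illustration; the input is chosen to fit the pattern exactly as written (lowercase only, at least one space on each side of the comma).

```python
import re

# (?P<name>...) defines a named group (pattern taken from the answer above).
PAT_FULL = re.compile(
    r'(?P<streetname>[a-zæøå ]+)(?:[ ]+)(?P<housenumber>\d+)(?:[ ]+),(?:[ ]+)'
    r'(?P<postalcode>\d{1,4})(?:[ ]+)(?P<city>[a-zæøå ]+)'
)

# Hypothetical sample input that fits the pattern as written.
match = PAT_FULL.match('testaddress 15 , 1234 oslo')
address = match.groupdict()

# (?P=name) is a backreference: it matches the same text that an earlier
# named group matched, so that group has to be defined first.
DOUBLED = re.compile(r'(?P<word>[a-z]+) (?P=word)')
is_doubled = DOUBLED.match('oslo oslo') is not None

# The form from the question, (?P=<name>...), is rejected at compile time.
try:
    re.compile(r'(?P=<streetname>[a-z ]+)')
    compile_failed = False
except re.error:
    compile_failed = True
```

Here address maps each group name to the text it captured, is_doubled is true because the second word repeats the first, and compile_failed is true because the group name is invalid.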
