Apostrophe within Python lookbehind assertion

I'm trying to use a Python regular expression to get the first token of a character-separated string. I don't want to treat backslashed separators as real separators, so I'm using a negative lookbehind assertion. When the separator is a comma, it works without problem.

>>> import re
>>> re.match("(.*?)(?<!\\\\),.*", "Hello\, world!,This is a comma separated string,Third value").groups(1)[0]
'Hello\\, world!'

Whereas the exact same code, with the comma replaced by an apostrophe, does not work at all.

>>> import re
>>> re.match("(.*?)(?<!\\\\)'.*", "Hello\' world!'This is an apostrophe separated string'Third value").groups(1)[0]
'Hello'
>>>

I'm using Python 2.7.2, but I see the same behavior with Python 3 (tested on Ideone). The Python re documentation does not indicate that ' is a special character, so I'm really wondering: why is my ' treated differently?

(Please, no comments: Who would want to have an apostrophe-separated file. Well... I do...)

Answers


print(repr("\'"),repr("\,"))

Results in:

"'" '\\,'

As you can see, "\'" doesn't actually have a backslash in it, whereas "\," does. Hence, when you change it to "\\'" in the input string, the pattern matches as expected, producing:

Hello\' world!
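
For reference, here is a minimal sketch of that fix. It assumes you are free to write the pattern as a raw string and to double the backslash in the input literal; the variable names are only illustrative.

```python
import re

# Raw-string pattern: lazily capture everything up to the first
# apostrophe that is not immediately preceded by a backslash.
pattern = r"(.*?)(?<!\\)'.*"

# "\\'" puts a real backslash in front of the first apostrophe,
# unlike "\'", which is just an apostrophe.
text = "Hello\\' world!'This is an apostrophe separated string'Third value"

first_token = re.match(pattern, text).group(1)
print(first_token)  # Hello\' world!
```

Using a raw string for the pattern also means the lookbehind needs only two backslashes instead of four.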

"\'" is actually an escape sequence:

\' Single quote (')
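
A quick way to check this yourself is to compare string lengths. This is only an illustrative sketch; the variable names are made up for the example.

```python
# "\'" is a recognized escape sequence, so the backslash is consumed.
escaped_quote = "\'"
# Writing the backslash as "\\" keeps it in the resulting string.
kept_backslash = "\\'"

print(len(escaped_quote), len(kept_backslash))  # 1 2
print(escaped_quote == "'")                     # True
# A raw string keeps the backslash as well.
print(r"\'" == kept_backslash)                  # True
```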

The comparison below is true

>>> ord("\'") == ord("'")
True

because "\'" is equivalent to "'". It makes sense that \' is an escape sequence, since it lets you put an apostrophe inside a single-quoted string:

>>> 'i\'ll'
"i'll"
