Need regex to extract fields from string

I need to extract the title, location, and price from a string like this:

10' Starcraft pop up camper (Newport) $5500

It should be obvious which are which.

However, there are also cases like this:

10' (approx.) Starcraft pop up camper (Drigg's Town, PA) $5500


When I use a simple regex, I can match the first string correctly, but not the second:

^(?<title>.+?) \((?<area>.+?)\) \$(?<price>[\d]+)$


I'm pretty sure lookaheads or backreferences can handle this, but I don't know how. Can someone help me out with an explanation? (And maybe a reference to an easy-to-read article on the subject.)

Answers


With only two examples to go on, the best I can suggest is to change the lazy quantifier in the title capturing group to a greedy one:

^(?<title>.+) \((?<area>.+?)\) \$(?<price>[\d]+)$
           ^^
          Here

Effectively, the area capturing group will now capture the text inside the last pair of parentheses (provided that it is followed by text that the price capturing group can match).

The greedy quantifier in title consumes as much text as possible, which forces the area capturing group to match at the last possible position.
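
To see the difference in practice, here is a minimal sketch in Python. Two things are assumptions on my part: the choice of Python itself (the question does not say which regex flavor is in use), and the helper name `extract`, which is purely illustrative. Python writes named groups as `(?P<name>...)` rather than `(?<name>...)`; the patterns are otherwise the ones from the question and from the fix above.

```python
import re

# Python spells named groups (?P<name>...); the rest of each pattern is unchanged.
LAZY_TITLE = re.compile(r"^(?P<title>.+?) \((?P<area>.+?)\) \$(?P<price>[\d]+)$")
GREEDY_TITLE = re.compile(r"^(?P<title>.+) \((?P<area>.+?)\) \$(?P<price>[\d]+)$")

SECOND = "10' (approx.) Starcraft pop up camper (Drigg's Town, PA) $5500"


def extract(pattern, text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.match(text)
    return match.groupdict() if match else None


lazy = extract(LAZY_TITLE, SECOND)
greedy = extract(GREEDY_TITLE, SECOND)

# The lazy title stops at the first " (", so area swallows everything up to the last ")".
print("lazy:  ", lazy)
# The greedy title backs off only as far as the last " (", leaving area with the real location.
print("greedy:", greedy)
```

With the lazy title the string still matches, but title ends up as just `10'` and area runs from `approx.` all the way to `PA`. With the greedy title, title is `10' (approx.) Starcraft pop up camper` and area is `Drigg's Town, PA`.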


Another way is to make sure the sub-pattern in the area capturing group cannot match ( or ):

^(?<title>.+) \((?<area>[^()]+)\) \$(?<price>[\d]+)$
           ^^           ^^^^^^
          Here           Here

I also removed the lazy quantifier from area, since it is now redundant: the only place the literal parentheses can match is immediately before and after the text captured by the area capturing group.
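
A quick way to check this variant is the sketch below, again assuming Python (hence `(?P<name>...)` for named groups). The names `NO_PARENS`, `NO_PARENS_LAZY_TITLE` and `extract` are illustrative only. The second pattern is not part of the suggestion above; it is there only to show that once area excludes parentheses, the quantifier on title no longer changes the result for these two samples.

```python
import re

# area is restricted to characters other than "(" and ")".
NO_PARENS = re.compile(r"^(?P<title>.+) \((?P<area>[^()]+)\) \$(?P<price>[\d]+)$")
# Same pattern with a lazy title, for comparison only.
NO_PARENS_LAZY_TITLE = re.compile(r"^(?P<title>.+?) \((?P<area>[^()]+)\) \$(?P<price>[\d]+)$")

SAMPLES = [
    "10' Starcraft pop up camper (Newport) $5500",
    "10' (approx.) Starcraft pop up camper (Drigg's Town, PA) $5500",
]


def extract(pattern, text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.match(text)
    return match.groupdict() if match else None


results = [extract(NO_PARENS, sample) for sample in SAMPLES]
results_lazy_title = [extract(NO_PARENS_LAZY_TITLE, sample) for sample in SAMPLES]

for sample, fields in zip(SAMPLES, results):
    print(sample, "->", fields)
```

Both patterns give the same fields for both sample strings, because area can only sit between the last pair of parentheses before the price.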


Both solutions above assume that area will never contain parentheses. The pattern gets slightly more complicated if you want to allow that.
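
To illustrate that limitation, here is a small sketch, assuming Python and a made-up input string (`NESTED` is hypothetical, not from the question) in which the location itself contains parentheses. It only shows how the two patterns above behave on such input; it does not attempt the more complicated pattern.

```python
import re

GREEDY_TITLE = re.compile(r"^(?P<title>.+) \((?P<area>.+?)\) \$(?P<price>[\d]+)$")
NO_PARENS = re.compile(r"^(?P<title>.+) \((?P<area>[^()]+)\) \$(?P<price>[\d]+)$")

# Hypothetical input: the intended area is "Newport (north side)".
NESTED = "Pop up camper (Newport (north side)) $5500"


def extract(pattern, text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.match(text)
    return match.groupdict() if match else None


greedy_result = extract(GREEDY_TITLE, NESTED)
no_parens_result = extract(NO_PARENS, NESTED)

# The first pattern matches, but splits the string at the inner " (".
print("greedy title:", greedy_result)
# The second pattern finds no parenthesis-free area directly before the price.
print("no parens:   ", no_parens_result)
```

The first pattern returns a match with the wrong split (title `Pop up camper (Newport`, area `north side)`), while the second returns no match at all, which is at least easier to detect.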

