Scrape a price off a website

I'm trying to scrape a price from a web page using PHP and Regexes. The price will be in the format £123.12 or $123.12 (i.e., pounds or dollars).

I'm loading the contents using libcurl, and the output then goes into preg_match_all. So it looks a bit like this:

$contents = curl_exec($curl);

preg_match_all('/(?:\$|£)[0-9]+(?:\.[0-9]{2})?/', $contents, $matches);

So far so simple. The problem is, PHP isn't matching anything at all - even when there are prices on the page. I've narrowed it down to there being a problem with the '£' character - PHP doesn't seem to like it.

I think this might be a charset issue. But whatever I do, I can't seem to get PHP to match it! Anyone have any ideas?

(Edit: I should note that if I try the same regex and page content in the Regex Test Tool, it works fine.)

Answers


Have you tried putting a \ in front of the £?

preg_match_all('/(\$|\£)[0-9]+(\.[0-9]{2})?/', $contents, $matches);

I have tried this expression in .NET with \£ and it works. I only edited it to remove the "?:" non-capturing markers.

See my comment on this post about the possibility that cURL is giving you the content in a bad encoding.
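If encoding is the culprit, one way to test the idea is to normalise the page to UTF-8 before matching and to use the /u modifier. This is only a sketch: it assumes the mbstring extension is loaded, PHP 7 or later, and that the page is either UTF-8 or ISO-8859-1. The function name find_prices is made up for illustration.

```php
<?php
// Hypothetical helper: normalise $contents to UTF-8, then match prices.
// Assumes the input is UTF-8 or ISO-8859-1; other charsets need their own handling.
function find_prices(string $contents): array
{
    // If the bytes are not valid UTF-8, assume ISO-8859-1 (an assumption, not a detection).
    if (!mb_check_encoding($contents, 'UTF-8')) {
        $contents = mb_convert_encoding($contents, 'UTF-8', 'ISO-8859-1');
    }
    // \x{00A3} is the pound sign; /u makes PCRE treat pattern and subject as UTF-8.
    preg_match_all('/[$\x{00A3}][0-9]+(?:\.[0-9]{2})?/u', $contents, $matches);
    return $matches[0];
}

// Fixed test text, once as UTF-8 and once converted to ISO-8859-1.
$utf8   = "Was \u{00A3}10.50, now \$9.99 or \u{00A3}8";
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

print_r(find_prices($utf8));
print_r(find_prices($latin1));
```

Writing the pound sign as \x{00A3} keeps the pattern independent of whatever encoding the PHP source file itself is saved in, which may be part of why the literal '£' fails to match.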


Maybe the pound sign is being sent as its HTML entity (&pound;) instead? I think you should try your regexp with some sort of regex coaching program (i.e., match it against fixed text locally).

I'd change the regexp like this: '/(?:\$|£)\d+(?:\.\d{2})?/'
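To try the entity theory against fixed text locally, one option is to decode entities before matching. A rough sketch, assuming the decoded text should be UTF-8; the sample markup is made up:

```php
<?php
// Made-up sample: a named entity, a numeric entity, and a dollar price.
$html = '<span>&pound;123.12</span> <span>&#163;45</span> <span>$6.00</span>';

// html_entity_decode turns &pound; and &#163; into the actual character (as UTF-8 here).
$text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// \x{00A3} is the pound sign; /u makes PCRE treat pattern and subject as UTF-8.
preg_match_all('/(?:\$|\x{00A3})\d+(?:\.\d{2})?/u', $text, $matches);
print_r($matches[0]);
```

If the same pattern finds nothing in the raw $html but does find prices in $text, the page is using entities rather than the literal character.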

