"Unexpected end of regular expression" error in Python
I am trying to scrape stock prices from Yahoo! Finance into a local database as per a tutorial by Chris Reeves, and I keep getting the above error when trying to execute this code. Can anyone tell me what is wrong here? Thanks.
    from threading import Thread
    import urllib
    import re
    import MySQLdb

    gmap = {}

    def th(ur):
        base = "http://finance.yahoo.com/q?s="+ur
        regex = '<span id="yfs_l84_'+ur.lower()+'">(.+?)</span>'
        pattern = re.compile(regex)
        htmltext = urllib.urlopen(base).read()
        results = re.findall(pattern, htmltext)
        try:
            gmap[ur] = results[0]
        except:
            print "Got an error"

    symbolslist = open("multithread/stocks.txt").read()
    symbolslist = symbolslist.replace(" ","").split(",")

    print symbolslist

    threadlist = []

    for u in symbolslist:
        t = Thread(target=th,args=(u,))
        t.start()
        threadlist.append(t)

    for b in threadlist:
        b.join()
This is the exact error that I'm getting:
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
        self.run()
      File "C:\Python27\lib\threading.py", line 763, in run
        self.__target(*self.__args, **self.__kwargs)
      File "multithread/threads.py", line 11, in th
        pattern = re.compile(regex)
      File "C:\Python27\lib\re.py", line 190, in compile
        return _compile(pattern, flags)
      File "C:\Python27\lib\re.py", line 242, in _compile
        raise error, v # invalid expression
    error: unexpected end of regular expression
Answers
Alas, you didn't show us the important part: the output of print symbolslist. Something in that list produces an invalid regular expression when it is pasted into the <span ...> boilerplate.
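To find the offending entry, one option is to try compiling the pattern for each symbol and report the ones that fail. This is only a sketch: find_bad_symbols is a hypothetical helper, and "ms[ft" is a made-up malformed entry standing in for whatever is actually in stocks.txt. The exact error text depends on the Python version.

```python
import re

def find_bad_symbols(symbols):
    """Return (symbol, error message) pairs for symbols that break the regex."""
    bad = []
    for symbol in symbols:
        # Same unescaped pattern as in the question.
        regex = '<span id="yfs_l84_' + symbol.lower() + '">(.+?)</span>'
        try:
            re.compile(regex)
        except re.error as exc:
            bad.append((symbol, str(exc)))
    return bad

# Hypothetical sample; "ms[ft" opens a character set that is never closed.
sample = ["aapl", "goog", "ms[ft"]
for symbol, message in find_bad_symbols(sample):
    # repr() makes stray whitespace or newlines visible too.
    print("%r -> %s" % (symbol, message))
```

Running this over the real symbolslist should point at the entry, if any, that re.compile cannot handle.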
You can probably fix it by escaping the symbol in the line that builds regex, like so:

    regex = '<span id="yfs_l84_' + re.escape(ur.lower()) + '">(.+?)</span>'
                                   ^^^^^^^^^^          ^
However, if that works, it would probably only be hiding the real problem. The real problem is probably that you have some kind of nonsense in symbolslist.
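A way to deal with the list itself is to clean and validate it before starting any threads. The sketch below assumes that valid tickers contain only letters, digits, '.', '-', '^' and '='; that character set, the VALID_SYMBOL name and the load_symbols helper are all assumptions for illustration, not something from the tutorial.

```python
import re

# Assumption: tickers use only letters, digits, '.', '-', '^' and '='.
VALID_SYMBOL = re.compile(r'^[A-Za-z0-9.\-^=]+$')

def load_symbols(text):
    """Split comma-separated text into (good, rejected) symbol lists."""
    good, rejected = [], []
    for raw in text.split(","):
        symbol = raw.strip()  # drops spaces and trailing newlines
        if not symbol:
            continue          # ignore empty entries, e.g. after a final comma
        if VALID_SYMBOL.match(symbol):
            good.append(symbol)
        else:
            rejected.append(symbol)
    return good, rejected

# Hypothetical file contents with one malformed entry.
good, rejected = load_symbols("aapl, goog ,ms[ft,\n")
print(good)
print(rejected)
```

Even with a clean list, re.escape is still worth keeping, because a '.' in an otherwise valid symbol is a regex metacharacter that matches any character.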