python gdata, extracting numbers from output
I'm still new to Python, and I've been working with a script that reads system info from my Raspberry Pi, such as the CPU temperature, and imports it into a Google Docs spreadsheet. My goal is to extract the number from the output, which has the form temp=54.1'C. I need the number alone to be able to graph the data over time...
I'm using:
import gdata.spreadsheet.service
import os
import subprocess
import re

email = 'myemail@gmail.com'
password = 'mypassword'
spreadsheet_key = 'sjdaf;ljaslfjasljdgasjdflasdjfgkjvja'
worksheet_id = '1'

def temp():
    command = "/opt/vc/bin/vcgencmd measure_temp"
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output = proc.stdout.read()
    return output

def main():
    spr_client = gdata.spreadsheet.service.SpreadsheetsService()
    spr_client.email = email
    spr_client.password = password
    spr_client.ProgrammaticLogin()
    dict = {}
    dict['temp'] = temp()
    entry = spr_client.InsertRow(dict, spreadsheet_key, worksheet_id)

if __name__ == '__main__':
    try:
        main()
    except:
        print "Insert Row Failed!"
The above gives the standard result. I've tried tinkering with re.findall(), but I can't find either the right placement or the right combination of arguments (r, '/d+', s and other things) to get it to return only the number 54.1... I basically end up with "Insert Row Failed!"
Any guidance would be appreciated. Thanks!
Answers
You were on the right track using re; your best bet (assuming the decimal can be arbitrary, etc.) is something like this:
import re
import subprocess

def temp():
    command = "/opt/vc/bin/vcgencmd measure_temp"
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output = proc.stdout.read()
    # Build the regex. Use () to capture the group; we want any number of
    # digits \d or decimal points \. that is preceded by temp= and
    # followed by 'C
    temp_regex = re.compile(r'temp=([\d\.]*)\'C')
    matches = re.findall(temp_regex, output)  # now matches = ['54.1']
    temp = float(matches[0])
    return temp
The regex captures any combination of digits and decimal points (e.g. 12.34.56 would get matched); you could restrict it to allow only a single decimal point, but that's more work than it appears to be worth if you can trust that the data you're getting back is well-formed. If you do want the pattern to be more precise, you could compile the regex like this (at least one digit before the decimal point and exactly one after it):
temp_regex = re.compile(r'temp=(\d+\.\d)\'C')
Again, we capture the expression using the parentheses (captured groups are returned by findall), but this time we increase the specificity of what we're looking for. This will capture any number like 123.4 but not .4 and not 123. If you find that you need to broaden it out a bit but still want only one decimal point:
temp_regex = re.compile(r'temp=(\d+\.\d+)\'C')
That will capture any number with at least one digit preceding and following the decimal point, so 1234.5678 would match but 1234. would not and .1234 would not.
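To make the difference between the three patterns concrete, here is a small sketch that runs each one against a few sample strings. The sample strings and the names `samples`, `patterns` and `captured` are made up for illustration; only the first sample mirrors the output format described in the question.

```python
import re

# Hypothetical sample strings; only the first mirrors the format from the question.
samples = ["temp=54.1'C", "temp=123.45'C", "temp=12.34.56'C", "temp=54'C"]

# The three patterns discussed above, with the decimal point escaped.
patterns = {
    "loose": re.compile(r"temp=([\d.]*)'C"),         # any mix of digits and dots
    "one_digit": re.compile(r"temp=(\d+\.\d)'C"),    # exactly one digit after the dot
    "many_digits": re.compile(r"temp=(\d+\.\d+)'C"), # one or more digits after the dot
}

def captured(pattern_name, text):
    """Return the list of captured groups, as re.findall does."""
    return patterns[pattern_name].findall(text)

for text in samples:
    print(text, {name: captured(name, text) for name in patterns})
```

The loose pattern accepts every sample, including the malformed 12.34.56, while the stricter ones return an empty list for anything that does not fit their shape.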
As an alternative to using re.findall(), you might use re.match(), which returns a match object, or None if the start of the string doesn't match the pattern. Then your usage would look something like this (calling the module-level function directly rather than pre-compiling the pattern):
match = re.match(r'temp=(\d+\.\d+)\'C', output)
if match:
    temp = float(match.group(1))  # get the first matching group captured by ()
else:
    pass  # You should add some error handling here
One of the things this makes clearer than the way I had re.findall() above is that if nothing is captured, you have an issue, and you need to figure out how to handle it.
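One possible way to fill in that error handling is to wrap the parsing in a small function that raises an exception when nothing matches. This is only a sketch: `parse_temp` and `TEMP_PATTERN` are hypothetical names, and the bytes check is an assumption added for Python 3, where reading from a subprocess pipe returns bytes rather than str.

```python
import re

# Hypothetical helper names; the pattern is the one from the answer above.
TEMP_PATTERN = re.compile(r"temp=(\d+\.\d+)'C")

def parse_temp(output):
    """Return the temperature as a float, or raise ValueError if it is missing."""
    # On Python 3 a subprocess pipe yields bytes, so decode before matching.
    if isinstance(output, bytes):
        output = output.decode("utf-8")
    match = TEMP_PATTERN.match(output)
    if match is None:
        raise ValueError("no temperature found in %r" % (output,))
    return float(match.group(1))
```

The caller can then catch ValueError and decide whether to skip the spreadsheet row or log the unexpected output, instead of failing silently.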
You can look at other ways to vary that up at Regular-Expressions.info, easily the best site I've found on the web for a quick resource on the topic.
Well, I've already spent too much time messing with this. I couldn't seem to get the output = proc.stdout.read() to give me anything. I tried dozens of combinations of re with no luck.
Then I started looking at the replace() method. And it might not be the slickest way to go, but I know the output will always be in the form of "temp=XX.X'C" (with X being numbers), so I just ended up doing this:
def temp():
    command = "/opt/vc/bin/vcgencmd measure_temp"
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output = proc.stdout.read()
    output1 = output.replace("temp=", "")
    output2 = output1.replace("'C", "")
    return output2
and it worked! It shows up in the Google Spreadsheet as a number just as I needed.
Thanks for the help on this anyway, I'll keep trying to implement re in other applications and maybe I'll find out why I couldn't get it to work with this.
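For reference, the replace() approach can be separated from the subprocess call so that it is easy to try on a plain string. The sketch below assumes the output always has the form temp=XX.X'C, possibly followed by a newline; `strip_temp` is a hypothetical name, and the conversion to float is an addition so that the function returns a number rather than a string.

```python
def strip_temp(output):
    """Remove the fixed 'temp=' prefix and "'C" suffix, then convert to float."""
    text = output.strip()  # drop surrounding whitespace such as a trailing newline
    text = text.replace("temp=", "").replace("'C", "")
    return float(text)     # raises ValueError if anything non-numeric is left

print(strip_temp("temp=54.1'C\n"))
```

Unlike the original version, this fails loudly with a ValueError if the command ever returns something other than the expected format.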