python gdata, extracting numbers from output

I'm still new to Python and I've been working with a script to get system info from my Raspberry Pi, like CPU temperature and such, and import it into a Google Docs spreadsheet. My goal is to extract the numbers from the output, which is in the form temp=54.1'C. I need the numbers alone to be able to graph the data over time.

I'm using:

import gdata.spreadsheet.service
import os
import subprocess
import re

email = 'myemail@gmail.com'
password = 'mypassword'

spreadsheet_key = 'sjdaf;ljaslfjasljdgasjdflasdjfgkjvja'
worksheet_id = '1'

def temp():
   command = "/opt/vc/bin/vcgencmd measure_temp"
   proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
   output = proc.stdout.read()
   return output

def main():
   spr_client = gdata.spreadsheet.service.SpreadsheetsService()
   spr_client.email = email
   spr_client.password = password
   spr_client.ProgrammaticLogin()

   dict = {}
   dict['temp'] = temp()

   entry = spr_client.InsertRow(dict, spreadsheet_key, worksheet_id)

if __name__ == '__main__':
      try:
         main()
      except:
         print "Insert Row Failed!"

The code above gives the standard result (the full temp=54.1'C string). I've tried tinkering with re.findall(), but can't get either the right placement or the right combination of arguments (r, '/d+', s and other things) to make it return only the number 54.1. I basically end up with "Insert Row Failed!".

Any guidance would be appreciated. Thanks!

Answers


You were on the right track using re; your best bet (assuming the decimal can be arbitrary, etc.) is something like this:

import re
import subprocess

def temp():
    command = "/opt/vc/bin/vcgencmd measure_temp"
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output = proc.stdout.read()

    # Build the regex. Use () to capture the group; we want any number of
    # digits \d or decimal points \. that is preceded by temp= and
    # followed by 'C
    temp_regex = re.compile(r'temp=([\d\.]*)\'C')
    matches = re.findall(temp_regex, output)   # now matches = ['54.1']

    temp = float(matches[0])
    return temp

The regex captures any combination of digits and decimal points (e.g. 12.34.56 would get matched); you could restrict it if necessary to only allow a single decimal point, but that's more work than it appears to be worth, if you can trust that the data you're getting back is well-formed. If you do want the pattern to be stricter, you could compile the regex like this (for at least one digit preceding the decimal point and exactly one following it):

temp_regex = re.compile(r'temp=(\d+\.\d)\'C')

Again, we capture the expression using the parentheses (captured groups are returned by findall), but this time, increase the specificity of what we're looking for. This will capture any number like 123.4 but not .4 and not 123. If you find that you need to broaden it out a bit but still want only one decimal point:

temp_regex = re.compile(r'temp=(\d+\.\d+)\'C')

That will capture any number with at least one digit preceding and following the decimal point, so 1234.5678 would match but 1234. would not and .1234 would not.
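A quick way to check patterns like these is to run them against a few sample strings. The sketch below writes the decimal point as `\.` (an unescaped `.` matches any character, not just a literal dot) and uses made-up inputs; only the first sample mirrors the format from the question, and the dictionary keys are just labels chosen for this example.

```python
import re

# The three patterns discussed above, with the decimal point escaped.
patterns = {
    "permissive": re.compile(r"temp=([\d\.]*)'C"),
    "one_decimal": re.compile(r"temp=(\d+\.\d)'C"),
    "strict": re.compile(r"temp=(\d+\.\d+)'C"),
}

# Made-up sample strings; only the first mirrors the real output format.
samples = ["temp=54.1'C", "temp=12.34.56'C", "temp='C", "temp=1234.5678'C"]

# For each pattern, collect what findall() returns for each sample.
table = {
    name: [regex.findall(s) for s in samples]
    for name, regex in patterns.items()
}

print(table["permissive"])   # [['54.1'], ['12.34.56'], [''], ['1234.5678']]
print(table["one_decimal"])  # [['54.1'], [], [], []]
print(table["strict"])       # [['54.1'], [], [], ['1234.5678']]

# Why the backslash matters: without it, "." accepts any character,
# so a value with no decimal point at all still matches.
unescaped = re.compile(r"temp=(\d+.\d+)'C")
print(unescaped.findall("temp=1234'C"))            # ['1234']
print(patterns["strict"].findall("temp=1234'C"))   # []
```

Note that the permissive pattern returns an empty string for temp='C, and float('') raises ValueError, so the stricter patterns fail earlier and more visibly on malformed input.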

As an alternative to using re.findall(), you might use re.match(), which returns a match object (or None if the start of the string does not match). Then your usage would look something like this (using the module-level function directly, rather than pre-compiling the pattern):

match = re.match(r'temp=(\d+\.\d+)\'C', output)
if match:
    temp = float(match.group(1))   # get the first matching group captured by ()
else:
    pass   # You should add some error handling here

One of the things this makes clearer than the way I had re.findall() above is that if nothing is captured, you have an issue, and you need to figure out how to handle it.
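One possible way to handle the no-match case is to raise an exception that includes the raw output, so the failure is easy to diagnose. This is only a sketch: parse_temp and TEMP_PATTERN are hypothetical names, and it assumes output is already a text string (under Python 3, subprocess returns bytes, which would need decoding first).

```python
import re

# Hypothetical module-level pattern: digits, a literal dot, digits.
TEMP_PATTERN = re.compile(r"temp=(\d+\.\d+)'C")

def parse_temp(output):
    """Return the temperature as a float, or raise ValueError."""
    match = TEMP_PATTERN.match(output)
    if match is None:
        # Include the raw text so the caller can see what went wrong.
        raise ValueError("unexpected vcgencmd output: %r" % (output,))
    return float(match.group(1))

print(parse_temp("temp=54.1'C\n"))   # 54.1
```

Raising instead of silently passing means an empty or unexpected reading never reaches the spreadsheet as a bogus value.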


You can look at other ways to vary that up at Regular-Expressions.info, easily the best site I've found on the web for a quick resource on the topic.
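A debugging note on the script in the question: because main() is wrapped in a bare except, any error at all (a regex mistake, a failed float() conversion, a login problem) shows up as "Insert Row Failed!". A minimal sketch of printing the real traceback first is below; main() here is a stand-in that fails on purpose, and run() is a hypothetical wrapper name.

```python
import traceback

def main():
    # Stand-in for the question's main(); it fails the same way
    # float() would on an empty regex capture.
    return float("")

def run():
    try:
        main()
        return True
    except Exception:
        # Show the underlying error before the generic message.
        traceback.print_exc()
        print("Insert Row Failed!")
        return False

run()
```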


Well, I've already spent too much time messing with this. I couldn't seem to get the output = proc.stdout.read() to give me anything. I tried dozens of combinations of re with no luck.

Then I started looking at the replace() method. And it might not be the slickest way to go, but I know the output will always be in the form of "temp=XX.X'C" (with X being numbers), so I just ended up doing this:

def temp():
   command = "/opt/vc/bin/vcgencmd measure_temp"
   proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
   output = proc.stdout.read()
   output1 = output.replace("temp=","")
   output2 = output1.replace("'C","")
   return output2

and it worked! It shows up in the Google Spreadsheet as a number just as I needed.

Thanks for the help on this anyway, I'll keep trying to implement re in other applications and maybe I'll find out why I couldn't get it to work with this.
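One caveat with the replace() approach: command output typically ends with a newline, which replace() leaves in place. A minimal sketch of the same idea with the whitespace stripped and a sanity check added is below; strip_temp is a hypothetical helper name, and it assumes the input is a text string in the temp=XX.X'C form.

```python
def strip_temp(output):
    """Remove the temp= prefix, the 'C suffix and surrounding whitespace."""
    cleaned = output.replace("temp=", "").replace("'C", "").strip()
    # Sanity check: raises ValueError if what is left is not a number.
    float(cleaned)
    return cleaned

print(repr(strip_temp("temp=54.1'C\n")))   # '54.1'
```

The function still returns a string, as in the version above, so what gets passed to the spreadsheet is unchanged apart from the stripped newline.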

