Unescaping XML text from ElementTree in Python

I'm trying to pull the text of an escaped node out of an XML document. The raw text for the node looks like this:

<Notes>{&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0, &quot;Guide&quot;: 0,     
&quot;Sample&quot;: 0, &quot;Triangle8&quot;: 0, &quot;Triangle5&quot;: 0,     
&quot;Triangle4&quot;: 0, &quot;Triangle7&quot;: 0, &quot;Triangle6&quot;: 0,     
&quot;Triangle1&quot;: 0, &quot;Triangle3&quot;: 0, &quot;Triangle2&quot;: 0}</Notes> 

I'm pulling the text out as follows:

import xml.etree.ElementTree as ET

infile = ET.parse("C:/userfiles/EXP011/SESAME_60/SESAME_60_runinfo.xml")
r = infile.getroot()
# Namespace prefix in ElementTree's "{uri}tag" notation
XMLNS = "{http://example.com/foo/bar/runinfo_v4_3}"
x = r.find(".//" + XMLNS + "Notes")
print(x.text)

I expected to get:

{"Phase": 0, "Flipper": 0, "Guide": 0,
"Sample": 0, "Triangle8": 0, "Triangle5": 0,     
"Triangle4": 0, "Triangle7": 0, "Triangle6": 0,     
"Triangle1": 0, "Triangle3": 0, "Triangle2": 0}

but, instead, I got:

 {&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0, &quot;Guide&quot;: 0,      
 &quot;Sample&quot;: 0, &quot;Triangle8&quot;: 0, &quot;Triangle5&quot;: 0,   
 &quot;Triangle4&quot;: 0, &quot;Triangle7&quot;: 0, &quot;Triangle6&quot;: 0, 
 &quot;Triangle1&quot;: 0, &quot;Triangle3&quot;: 0, &quot;Triangle2&quot;: 0}

How do I get the unescaped string?

Answers


Use HTMLParser.HTMLParser().unescape() (this is the Python 2 spelling; in Python 3.4 and later, use html.unescape() instead):

In [8]: import HTMLParser    

In [11]: HTMLParser.HTMLParser().unescape('&quot;')
Out[11]: u'"'
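
For Python 3, a minimal sketch of the same idea might look like the following. It assumes the string returned by x.text still contains the literal &quot; sequences shown in the question; the sample string here is a shortened, made-up stand-in for that text, and the json.loads step is an optional extra that only works because the unescaped notes happen to be valid JSON.

```python
import html
import json

# Assumed sample: a shortened stand-in for the text ElementTree returned in the question.
escaped = '{&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0, &quot;Guide&quot;: 0}'

# html.unescape (Python 3.4+) converts named and numeric character references.
unescaped = html.unescape(escaped)
print(unescaped)

# Optional: the unescaped text is valid JSON here, so it can be parsed into a dict.
notes = json.loads(unescaped)
print(notes["Phase"])
```

In the question's code, the same call would be applied as html.unescape(x.text).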

saxutils.unescape() handles &lt;, &gt; and &amp; by default, but it does not handle &quot; unless you pass it an extra entities mapping:

In [9]: import xml.sax.saxutils as saxutils

In [10]: saxutils.unescape('&quot;')
Out[10]: '&quot;'    
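
If you would rather stay with saxutils, its unescape() function accepts an optional second argument, a dict of extra replacements. The sketch below is one possible way to use it; the extra_entities name and the sample string are made up for illustration.

```python
from xml.sax import saxutils

# Extra mappings are passed as a dict; &lt;, &gt; and &amp; are always handled.
extra_entities = {"&quot;": '"', "&apos;": "'"}

# Hypothetical sample input mixing &quot; with a default entity (&amp;).
text = saxutils.unescape('&quot;Phase&quot;: 0 &amp; more', extra_entities)
print(text)
```

This only replaces the entities listed in the dict, so html.unescape() is the broader option when the text may contain other named or numeric references.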
