How do I safely handle Unicode user input in Scala (especially XML entities)?

On my website I have a form that takes in some textual user input. All works fine for "normal" characters. However, when Unicode characters are input... well, the plot thickens.

User inputs something like

やっぱ死にかけてる

This arrives at the server as text containing XML numeric entity references:

&#12420;&#12387;&#12401;&#27515;&#12395;&#12363;&#12369;&#12390;&#12427;?

Now, when I want to serve this back to the client in HTML, how do I do it?

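If I simply output the string as it is, I open the door to a script injection attack. If I try to encode it with scala.xml.Text, the ampersands are escaped and it gets converted to:

&amp;#12420;&amp;#12387;&amp;#12401;&amp;#27515;&amp;#12395;&amp;#12363;&amp;#12369;&amp;#12390;&amp;#12427;?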

Is there a better ready-made solution in Scala which can detect entity refs and not escape them, yet escape XML tags?

Answers


Parse the string containing entity references as a fragment of XML; the parser resolves the references into real Unicode characters. To output those characters safely in XML, you can be paranoid and emit numeric entity references for them again, as the escape function below does:

scala> import xml.parsing.ConstructingParser
import xml.parsing.ConstructingParser

scala> import io.Source
import io.Source

scala> val d = ConstructingParser.fromSource(Source.fromString("<dummy>&#12420;</dummy>"), true).document
d: scala.xml.Document = <dummy>や</dummy>

scala> val t = d(0).text
t: String = や

scala> import xml._
import xml._

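scala> def escape(xmlText: String): NodeSeq = {
     |   // Non-ASCII and control characters become numeric entity references;
     |   // everything else becomes a Text node, which is escaped on output.
     |   // Note: this works per UTF-16 Char, so a character outside the BMP
     |   // would be emitted as two separate surrogate references.
     |   def escapeChar(c: Char): xml.Node =
     |     if (c > 0x7F || Character.isISOControl(c))
     |       xml.EntityRef("#" + Integer.toString(c, 10))
     |     else
     |       xml.Text(c.toString)
     |
     |   new xml.Group(xmlText.map(escapeChar(_)))
     | }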
escape: (xmlText: String)scala.xml.NodeSeq

scala> <foo>{escape(t)}</foo>                            
res3: scala.xml.Elem = <foo>&#12420;</foo>
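
If you would rather not round-trip through the XML parser, a string-only approach may also work. The following is a minimal sketch, not part of scala.xml: the names EntityAwareEscape, escapeKeepingNumericRefs and escapeNonAscii are hypothetical, and it assumes that only numeric references (such as &#12420; or &#x3084;) need to be preserved. Named entities in the input, for example &amp;, would be escaped a second time. It uses only the standard library and iterates by code point, so characters outside the BMP come out as a single reference.

```scala
object EntityAwareEscape {
  // Either an existing numeric character reference (decimal or hex),
  // or a single character that is special in HTML/XML.
  private val Token = """&#(\d{1,7}|[xX][0-9a-fA-F]{1,6});|[&<>"']""".r

  // Escapes markup characters but leaves numeric references untouched.
  def escapeKeepingNumericRefs(input: String): String =
    Token.replaceAllIn(input, m => {
      val out = m.matched match {
        case "&"  => "&amp;"
        case "<"  => "&lt;"
        case ">"  => "&gt;"
        case "\"" => "&quot;"
        case "'"  => "&#39;"
        case ref  => ref // already a numeric reference, keep as-is
      }
      // Guard against '$' and '\' being interpreted in the replacement.
      java.util.regex.Matcher.quoteReplacement(out)
    })

  // String-based counterpart of the "paranoid" escape: every non-ASCII
  // or control code point becomes a decimal numeric reference.
  // It does not escape <, > or &; apply escapeKeepingNumericRefs first.
  def escapeNonAscii(text: String): String = {
    val sb = new StringBuilder
    var i = 0
    while (i < text.length) {
      val cp = text.codePointAt(i)
      if (cp > 0x7F || Character.isISOControl(cp)) sb.append("&#" + cp + ";")
      else sb.append(cp.toChar)
      i += Character.charCount(cp)
    }
    sb.toString
  }
}
```

A possible usage is EntityAwareEscape.escapeNonAscii(EntityAwareEscape.escapeKeepingNumericRefs(userInput)): the first call neutralises tags and stray ampersands while keeping the incoming references, and the second converts any raw non-ASCII characters into references as well.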
