Navigating HTML with Scala
I want to open an HTML source with Scala and then navigate it. I am new to Scala, so my question is: what is the best class to use for navigation, one that has methods like getFirstChild?
import scala.io.Source

// get html
val html = Source.fromURL("https://www.google.com")
// now what?
An HTML document that is well-formed (XHTML) is also an XML document, so you can use Scala's XML capabilities to work with it. Here is an article which gives a basic overview of the XML processing capabilities of Scala. Of course, there are plenty of Java/Scala libraries that simplify the standard Scala mechanism.
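To make the XML route concrete, here is a minimal sketch of navigating markup with scala.xml. It assumes the scala-xml module is on the classpath and that the input is well-formed; the sample page, the object name XmlNavigation, and the value names are all made up for illustration.

```scala
import scala.xml.{Elem, XML}

object XmlNavigation {
  // A small, well-formed XHTML snippet (hypothetical sample data).
  val page: String =
    """<html>
      |  <head><title>Demo</title></head>
      |  <body>
      |    <p class="intro">Hello</p>
      |    <a href="https://www.scala-lang.org">Scala</a>
      |  </body>
      |</html>""".stripMargin

  val root: Elem = XML.loadString(page)

  // `\` selects direct children by label.
  val title: String = (root \ "head" \ "title").text

  // Rough equivalent of "getFirstChild": `child` also contains whitespace
  // text nodes, so pick the first node that is an element.
  val firstBodyChild: Option[Elem] =
    (root \ "body").head.child.collectFirst { case e: Elem => e }

  // `\\` searches all descendants; "@href" selects an attribute.
  val linkTargets: Seq[String] =
    (root \\ "a").map(a => (a \ "@href").text)

  def main(args: Array[String]): Unit = {
    println(title)
    println(firstBodyChild.map(_.label))
    println(linkTargets)
  }
}
```

This only works when the markup parses as XML, so it suits XHTML or HTML you control rather than arbitrary pages.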
AFAIK, Scala doesn't have direct support for HTML (although it has built-in support for XML). For example:
scala> import scala.io.Source
scala> import scala.xml.XML
scala> val html = Source.fromURL("https://www.google.com")
scala> XML.loadString(html.toString)
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
  at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
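This is because not all HTML documents are well-formed XML documents. (Note also that html.toString returns a description of the Source iterator rather than the page text; html.mkString gives the actual content, though parsing a typical real-world page as XML is still likely to fail.) To overcome this, you can use any HTML processing library in Java. Check for an example here: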
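Separately from the linked example, here is a minimal sketch of the Java-library route using jsoup. jsoup is my choice for illustration, not a library named above, and the sketch assumes org.jsoup:jsoup is on the classpath. The sample markup, the object name JsoupNavigation, and the value names are hypothetical.

```scala
import org.jsoup.Jsoup

object JsoupNavigation {
  // Deliberately sloppy HTML: unclosed <p> and <br>, unquoted class attribute.
  // An XML parser would reject this; an HTML parser is expected to repair it.
  val html: String =
    """<html><body><p class=intro>Hello<br><a href="https://www.scala-lang.org">Scala</a></body></html>"""

  val doc = Jsoup.parse(html)

  // Rough equivalent of "getFirstChild": first child element of <body>.
  val firstChild = doc.body().child(0)
  val firstTag: String = firstChild.tagName()

  // CSS selectors find elements anywhere in the tree.
  val link = doc.select("a[href]").first()
  val href: String = link.attr("href")
  val linkText: String = link.text()

  def main(args: Array[String]): Unit = {
    println(firstTag)
    println(href)
    println(linkText)
  }
}
```

To work on a live page instead of an inline string, jsoup can also fetch and parse in one step with Jsoup.connect(url).get().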