HTML/XML Parser for Java

Which HTML parsers have the following features?

  • Fast
  • Thread-safe
  • Reliable and bug-free
  • Parses HTML and XML
  • Handles erroneous HTML
  • Has a DOM implementation
  • Supports HTML4, JavaScript, and CSS tags
  • Relatively simple, object-oriented API

Which parser do you think is best?

Thank you.

Answers


Check out Web Harvest. It is both a library you can embed and a standalone data extraction tool, which sounds like exactly what you need. You write XML script files that tell the scraper what information to extract and where to find it. The bundled GUI is handy for testing scripts quickly.

Check out the project's samples page to see if it's a good fit for what you are trying to do.
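
For comparison, here is a minimal sketch of the DOM side of the question using only the parser that ships with the JDK (javax.xml.parsers), not Web Harvest. It assumes the input is well-formed XML or XHTML: this parser throws an exception on erroneous HTML, so it does not meet the "handles erroneous HTML" requirement on its own. The class name DomLinkExtractor and the method extractLinks are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of any library.
public class DomLinkExtractor {

    /** Parses well-formed markup into a DOM tree and returns the href of every <a> element that has one. */
    public static List<String> extractLinks(String markup) throws Exception {
        // Create a fresh factory and builder per call: neither is guaranteed
        // to be thread-safe, so they are not shared between threads here.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Throws SAXParseException if the markup is not well-formed.
        Document doc = builder.parse(new InputSource(new StringReader(markup)));

        // getElementsByTagName walks the tree in document order.
        NodeList anchors = doc.getElementsByTagName("a");
        List<String> links = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><a href=\"/one\">One</a>"
                + "<p><a href=\"/two\">Two</a></p></body></html>";
        System.out.println(extractLinks(page));
    }
}
```

If your pages are real-world HTML with unclosed tags and the like, you would still need a lenient parser or a cleanup step in front of this.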

