This is the JTidy package, a Java port of the HTML Tidy C library.
This is the NekoHTML library, a Java implementation that works well for HTML fragment parsing.
This is a wrapper around the SWT implementation of a Mozilla XPCOM wrapper. It renders the HTML as a browser would and generates an enhanced DOM document.