Posted to dev@tika.apache.org by Ken Krugler <kk...@transpac.com> on 2009/09/25 02:17:52 UTC

Html parser questions

Hi all,

[Resending with an image instead of the HTML example. The previous
attempt was rejected by Apache.org as spam... weird.]

I'm doing a comparison of the Tika HtmlParser with the original Nutch  
HTML parsing code.

I've run into some issues, and would like input before filing any Jira
requests/bugs.

As an example of a test document: