Posted to dev@tika.apache.org by Ken Krugler <kk...@transpac.com> on 2009/09/25 02:17:52 UTC
Html parser questions
Hi all,
[Resending with an image instead of the HTML example - previous
attempt was rejected by Apache.org as being spam...weird]
I'm doing a comparison of the Tika HtmlParser with the original Nutch
HTML parsing code.
I've run into some issues, and would like input before filing any Jira
requests/bugs.
As an example of a test document: