Posted to dev@tika.apache.org by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/08/16 23:20:21 UTC

[jira] Updated: (TIKA-480) BoilerpipeContentHandler needs to emit full set of standard elements

     [ https://issues.apache.org/jira/browse/TIKA-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ken Krugler updated TIKA-480:
-----------------------------

    Attachment: TIKA-480.patch

> BoilerpipeContentHandler needs to emit full set of standard elements
> --------------------------------------------------------------------
>
>                 Key: TIKA-480
>                 URL: https://issues.apache.org/jira/browse/TIKA-480
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 0.7
>            Reporter: Ken Krugler
>            Assignee: Ken Krugler
>             Fix For: 0.8
>
>         Attachments: TIKA-480.patch
>
>
> Currently, BoilerpipeContentHandler calls the provided delegate ContentHandler with only:
> <p>xxx</p>
> for each block of text. Without the wrapper elements around these paragraphs, handlers such as BodyContentHandler can't be used.
> In addition, BoilerpipeContentHandler currently emits the <p> element with a null attributes value, which causes an NPE (NullPointerException) in BodyContentHandler.
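
The two problems described above can be illustrated with a minimal sketch that uses only the JDK's SAX classes. This is not the contents of TIKA-480.patch. The class name FullDocumentEmitter and its methods are hypothetical, and the choice of html/head/body in the XHTML namespace as the "wrappers" is an assumption about what downstream handlers expect. The sketch emits the enclosing elements around each <p> block and passes an empty Attributes object instead of null, which is what the SAX ContentHandler contract asks for when an element has no attributes.

```java
import java.util.Arrays;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tika: shows one way to emit a complete
// document around the text blocks instead of bare <p> elements.
public class FullDocumentEmitter {
    // Assumption: downstream handlers match on elements in the XHTML namespace.
    private static final String XHTML = "http://www.w3.org/1999/xhtml";
    // Empty, never null: SAX expects an empty Attributes object when there are none.
    private static final Attributes EMPTY = new AttributesImpl();

    public static void emit(ContentHandler delegate, List<String> blocks) throws SAXException {
        delegate.startDocument();
        delegate.startPrefixMapping("", XHTML);
        start(delegate, "html");
        start(delegate, "head");
        end(delegate, "head");
        start(delegate, "body");
        for (String block : blocks) {
            start(delegate, "p");
            char[] chars = block.toCharArray();
            delegate.characters(chars, 0, chars.length);
            end(delegate, "p");
        }
        end(delegate, "body");
        end(delegate, "html");
        delegate.endPrefixMapping("");
        delegate.endDocument();
    }

    private static void start(ContentHandler h, String name) throws SAXException {
        h.startElement(XHTML, name, name, EMPTY);
    }

    private static void end(ContentHandler h, String name) throws SAXException {
        h.endElement(XHTML, name, name);
    }

    // Records the event stream as a string so the output can be inspected.
    public static String trace(List<String> blocks) {
        final StringBuilder sb = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (atts == null) {
                    throw new IllegalStateException("null attributes for <" + localName + ">");
                }
                sb.append('<').append(localName).append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                sb.append("</").append(localName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                sb.append(ch, start, length);
            }
        };
        try {
            emit(recorder, blocks);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(Arrays.asList("xxx")));
    }
}
```

In this sketch the recording handler rejects a null Attributes argument, standing in for a delegate that dereferences the attributes without a null check.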
