Posted to dev@tika.apache.org by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/08/16 22:55:19 UTC

[jira] Created: (TIKA-480) BoilerpipeContentHandler needs to emit full set of standard elements

BoilerpipeContentHandler needs to emit full set of standard elements
--------------------------------------------------------------------

                 Key: TIKA-480
                 URL: https://issues.apache.org/jira/browse/TIKA-480
             Project: Tika
          Issue Type: Bug
    Affects Versions: 0.7
            Reporter: Ken Krugler
            Assignee: Ken Krugler
             Fix For: 0.8


Currently BoilerpipeContentHandler will call the provided delegate ContentHandler with:

<p>xxx</p>

for each block of text. But without the standard wrapper elements (such as <html> and <body>) around these <p> elements, handlers like BodyContentHandler can't be used.

In addition, the current BoilerpipeContentHandler emits each <p> element with a null attributes value, which causes an NPE in BodyContentHandler.
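
The behavior asked for above can be sketched with plain SAX calls. This is only an illustration, not the actual Tika change: it uses the JDK's org.xml.sax classes instead of Tika's own, and the names WrappedParagraphEmitter, emit, render and Recorder are hypothetical. It assumes the "standard elements" are an XHTML <html>/<head>/<body> wrapper, and it passes an empty AttributesImpl instead of null for every element.

```java
import java.util.Arrays;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tika: wraps text blocks in a full XHTML skeleton.
final class WrappedParagraphEmitter {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Empty attribute set passed to every startElement call instead of null.
    private static final Attributes EMPTY = new AttributesImpl();

    // Emits <html><head/><body><p>block</p>...</body></html> to the delegate.
    static void emit(ContentHandler delegate, List<String> blocks) throws SAXException {
        delegate.startDocument();
        delegate.startPrefixMapping("", XHTML);
        start(delegate, "html");
        start(delegate, "head");
        end(delegate, "head");
        start(delegate, "body");
        for (String block : blocks) {
            start(delegate, "p");
            delegate.characters(block.toCharArray(), 0, block.length());
            end(delegate, "p");
        }
        end(delegate, "body");
        end(delegate, "html");
        delegate.endPrefixMapping("");
        delegate.endDocument();
    }

    private static void start(ContentHandler h, String name) throws SAXException {
        h.startElement(XHTML, name, name, EMPTY);
    }

    private static void end(ContentHandler h, String name) throws SAXException {
        h.endElement(XHTML, name, name);
    }

    // Stand-in for a downstream handler that dereferences the attributes.
    static final class Recorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            atts.getLength(); // would throw a NullPointerException if atts were null
            out.append('<').append(localName).append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(localName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    // Convenience wrapper: runs emit against a Recorder and returns the recorded markup.
    static String render(List<String> blocks) {
        Recorder recorder = new Recorder();
        try {
            emit(recorder, blocks);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return recorder.out.toString();
    }

    public static void main(String[] args) {
        // prints <html><head></head><body><p>first block</p><p>second block</p></body></html>
        System.out.println(render(Arrays.asList("first block", "second block")));
    }
}
```

With the wrapper in place, a handler that only forwards content found under the <body> element has something to match, and the non-null attributes keep a handler that inspects them from failing.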


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (TIKA-480) BoilerpipeContentHandler needs to emit full set of standard elements

Posted by "Ken Krugler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ken Krugler resolved TIKA-480.
------------------------------

    Resolution: Fixed

SVN 986131



[jira] Updated: (TIKA-480) BoilerpipeContentHandler needs to emit full set of standard elements

Posted by "Ken Krugler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ken Krugler updated TIKA-480:
-----------------------------

    Attachment: TIKA-480.patch
