Posted to dev@tika.apache.org by "Andrzej Bialecki (Updated) (JIRA)" <ji...@apache.org> on 2011/10/07 19:50:30 UTC

[jira] [Updated] (TIKA-748) RTF parser fails to extract the body

     [ https://issues.apache.org/jira/browse/TIKA-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated TIKA-748:
-----------------------------------

    Attachment: test.rtf
    
> RTF parser fails to extract the body
> ------------------------------------
>
>                 Key: TIKA-748
>                 URL: https://issues.apache.org/jira/browse/TIKA-748
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10
>            Reporter: Andrzej Bialecki 
>         Attachments: test.rtf
>
>
> Parsing the attached document with tika-app produces the following output:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="subject" content="tests"/>
> <meta name="Content-Length" content="2235"/>
> <meta name="comment" content="StarWriter"/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser"/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.rtf.RTFParser"/>
> <meta name="Content-Type" content="application/rtf"/>
> <meta name="resourceName" content="test.rtf"/>
> <title>test rft document</title>
> </head>
> <body/></html>
> {noformat}
> The expected result would be a non-empty body containing the text "The quick brown fox jumps over the lazy dog".
