Posted to dev@poi.apache.org by bu...@apache.org on 2013/10/10 10:00:24 UTC

[Bug 55645] New: ChunkNotFoundException when trying to getRtfBody

https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

            Bug ID: 55645
           Summary: ChunkNotFoundException when trying to getRtfBody
           Product: POI
           Version: 3.9
          Hardware: Other
                OS: other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSMF
          Assignee: dev@poi.apache.org
          Reporter: paolo.asioli@gmail.com

Created attachment 30916
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30916&action=edit
Outlook MSG that gives the above error

Hello

I have a message (attached to this bug), saved from Outlook 2010, that throws
ChunkNotFoundException when I call getRtfBody.

Could you please check whether this is a bug? I'm using the latest stable
release, 3.9, on Android 4.0.1.

Thanks a lot !

Paolo

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #8 from Nick Burch <ap...@gagravarr.org> ---
Are you able to use one of the tools like POIFSViewer or POIFSDump to identify
which chunk (POIFS Entry) actually contains your text? That will help us narrow
down what's wrong
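
When reading such a dump, it may help to know that MSG properties are stored in
entries named __substg1.0_ followed by a four-digit hex property ID and a
four-digit hex type. The sketch below is only an illustration of how the
listed names could be matched against the three body properties. The class
MsgBodyChunks and its describe method are hypothetical (not part of POI), and
the IDs 1000, 1009 and 1013 are the commonly documented MAPI IDs for the plain
text, compressed RTF and HTML bodies rather than something stated in this
thread.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: labels entry names taken from a POIFS dump of a .msg file. */
final class MsgBodyChunks {
    private static final String PREFIX = "__substg1.0_";

    // Assumed MAPI property IDs for the three body representations.
    private static final Map<String, String> BODY_PROPERTIES = new LinkedHashMap<>();
    static {
        BODY_PROPERTIES.put("1000", "plain text body");
        BODY_PROPERTIES.put("1009", "compressed RTF body");
        BODY_PROPERTIES.put("1013", "HTML body");
    }

    /** Describes an entry name such as "__substg1.0_10090102". */
    static String describe(String entryName) {
        // A property stream name is the prefix plus 4 hex digits of ID and 4 of type.
        if (entryName == null
                || !entryName.startsWith(PREFIX)
                || entryName.length() < PREFIX.length() + 8) {
            return "not a property stream";
        }
        String id = entryName
                .substring(PREFIX.length(), PREFIX.length() + 4)
                .toUpperCase(Locale.ROOT);
        String body = BODY_PROPERTIES.get(id);
        return body != null ? body : "other property";
    }

    public static void main(String[] args) {
        // Pass the entry names copied from the dump as command-line arguments.
        for (String name : args) {
            System.out.println(name + " -> " + describe(name));
        }
    }
}
```

If none of the top-level entries in the dump is labelled as a body, that would
be consistent with all three getters reporting a missing chunk.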



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

Paolo <pa...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paolo.asioli@gmail.com



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

Paolo <pa...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
Are you sure your Outlook file actually has an RTF part? (Not all of them do.)



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #5 from Nick Burch <ap...@gagravarr.org> ---
Outlook files tend to have one or two out of plain, rtf and html. It's very
rare to have all 3. If your file only has rtf, and you really wanted something
like html, you'd be best off using Apache Tika as that can convert for you
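
Since any given file may carry only some of the three bodies, calling code
usually wants "the first representation that exists" rather than one specific
getter. A minimal sketch of that fallback, using only the JDK: BodyFallback,
BodyGetter and firstAvailable are hypothetical names for illustration, and the
broad catch stands in for the ChunkNotFoundException that the HSMF getters
throw.

```java
import java.util.Optional;

/** Hypothetical helper: returns the first body representation that is present. */
final class BodyFallback {

    /** Stand-in for a getter that throws when its chunk is missing. */
    @FunctionalInterface
    interface BodyGetter {
        String get() throws Exception;
    }

    /** Tries the getters in order and returns the first non-empty result. */
    static Optional<String> firstAvailable(BodyGetter... getters) {
        for (BodyGetter getter : getters) {
            try {
                String body = getter.get();
                if (body != null && !body.isEmpty()) {
                    return Optional.of(body);
                }
            } catch (Exception missing) {
                // Treated as "this representation is absent"; try the next one.
                // Real code would catch only the specific missing-chunk exception.
            }
        }
        return Optional.empty();
    }
}
```

With an HSMF message object named msg, this could be called as
firstAvailable(msg::getHtmlBody, msg::getRtfBody, msg::getTextBody); an empty
result then means none of the three bodies was found.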



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #2 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #1)
> Are you sure your Outlook file actually has an RTF part? (Not all of them do.)

I don't know for sure (I don't know how to read an MSG by hand the way I can an
EML), but Outlook shows some richly formatted text.



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #7 from Paolo <pa...@gmail.com> ---
What kind of additional information do you need? There's an MSG attached that,
in my tests, shows this anomaly (no plain text, no HTML and no RTF, yet when
opened in Outlook it presents formatted text).
I don't know what the problem is, but I think I gave ample information to
investigate...

Please let me know.



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #9 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #8)
> Are you able to use one of the tools like POIFSViewer or POIFSDump to
> identify which chunk (POIFS Entry) actually contains your text? That will
> help us narrow down what's wrong

Thanks for the tip. I'll try that and get back with relevant information.
Cheers



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #3 from Nick Burch <ap...@gagravarr.org> ---
Not all richly formatted text in Outlook is done using RTF! Does your message
have an HTML chunk instead?

Take a look at the Tika Outlook parser if you want a detailed example of using
HSMF to process msg files:
https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #6 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #5)
> Outlook files tend to have one or two out of plain, rtf and html. It's very
> rare to have all 3. If your file only has rtf, and you really wanted
> something like html, you'd be best off using Apache Tika as that can convert
> for you

Maybe I didn't explain myself correctly. The attached example apparently has 
NO plain text
NO HTML
NO RTF
according to Apache POI.

But since I see text when opening in Outlook, I think there may be a problem.

Did you test the MSG attachment to confirm my report?



[Bug 55645] ChunkNotFoundException when trying to getRtfBody

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645

--- Comment #4 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #3)
> Not all richly formatted text in Outlook is done using RTF! Does your
> message have an HTML chunk instead?
> 
> Take a look at the Tika Outlook parser if you want a detailed example of
> using HSMF to process msg files:
> https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java

You're right, but I already try to get the text, HTML and RTF bodies, and it
looks like none of them is present.

So I thought there may be some kind of bug, since Outlook showed some formatted
text.

Here is an extract of my code:

try {
    this.messaggioHTML = msg.getHtmlBody();
    if (MainActivity.DEBUG) {
        android.util.Log.d(MainActivity.TAG, "HTML Body: "
                + this.messaggioHTML);
    }
} catch (ChunkNotFoundException e) {
    android.util.Log.e(MainActivity.TAG,
            "HTML Body: not found");
    this.messaggioHTML = "";
}

try {
    this.messaggioTesto = msg.getTextBody();
    if (MainActivity.DEBUG) {
        android.util.Log.d(MainActivity.TAG, "TXT Body: "
                + this.messaggioTesto);
    }
} catch (ChunkNotFoundException e) {
    android.util.Log.e(MainActivity.TAG,
            "TXT  Body: not found");
    this.messaggioTesto = "";
}

try {
    String messaggioRtf = msg.getRtfBody();

    if (MainActivity.DEBUG) {
        android.util.Log.d(MainActivity.TAG, "RTF Body: "
                + messaggioRtf);
    }

} catch (ChunkNotFoundException e) {
    android.util.Log.e(MainActivity.TAG,
            "RTF  Body: not found");
} catch (Exception e) {
    // Any failure other than a missing chunk is logged with its stack trace.
    android.util.Log.e(MainActivity.TAG,
            "RTF  Body: unexpected error", e);
}
