Posted to dev@poi.apache.org by bu...@apache.org on 2013/10/10 10:00:24 UTC
[Bug 55645] New: ChunkNotFoundException when trying to getRtfBody
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
Bug ID: 55645
Summary: ChunkNotFoundException when trying to getRtfBody
Product: POI
Version: 3.9
Hardware: Other
OS: other
Status: NEW
Severity: normal
Priority: P2
Component: HSMF
Assignee: dev@poi.apache.org
Reporter: paolo.asioli@gmail.com
Created attachment 30916
--> https://issues.apache.org/bugzilla/attachment.cgi?id=30916&action=edit
Outlook MSG that gives the above error
Hello,
I have a message (attached to this bug), saved from Outlook 2010, that throws
ChunkNotFoundException when I call getRtfBody.
Could you please check whether this is a bug? I'm using the latest stable
release, 3.9, on Android 4.0.1.
Thanks a lot!
Paolo
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
Nick Burch <ap...@gagravarr.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO
--- Comment #8 from Nick Burch <ap...@gagravarr.org> ---
Are you able to use one of the tools like POIFSViewer or POIFSDump to identify
which chunk (POIFS Entry) actually contains your text? That will help us narrow
down what's wrong
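When reading a POIFSViewer or POIFSDump listing, the body chunks can be picked out by name. The following is a minimal sketch, assuming the usual MSG stream naming of `__substg1.0_` followed by a four-digit hex MAPI property id and a four-digit hex type, with 1000 for the plain text body, 1009 for the compressed RTF body and 1013 for the HTML body. The class and method names are hypothetical and are not part of POI; the entry names are expected to be copied in from the dump output.

```java
import java.util.Locale;

/** Hypothetical helper: labels MSG stream names that hold a message body. */
final class MsgChunkNames {
    private static final String PREFIX = "__substg1.0_";

    /**
     * Returns a label for body-related streams, or null when the entry
     * name is not one of the three body chunks.
     */
    static String describe(String entryName) {
        if (entryName == null || !entryName.startsWith(PREFIX)
                || entryName.length() < PREFIX.length() + 8) {
            return null;
        }
        String hex = entryName.substring(PREFIX.length()).toUpperCase(Locale.ROOT);
        String propertyId = hex.substring(0, 4);   // e.g. 1000
        String propertyType = hex.substring(4, 8); // 001E = 8-bit string, 001F = Unicode, 0102 = binary
        String kind;
        switch (propertyId) {
            case "1000": kind = "plain text body"; break;
            case "1009": kind = "compressed RTF body"; break;
            case "1013": kind = "HTML body"; break;
            default: return null;
        }
        return kind + " (type " + propertyType + ")";
    }

    public static void main(String[] args) {
        // Pass entry names from the dump as command-line arguments.
        for (String name : args) {
            String label = describe(name);
            System.out.println(name + " -> " + (label == null ? "not a body chunk" : label));
        }
    }
}
```

If none of the entries in the dump gets a label, the visible text is presumably stored somewhere other than these three chunks, which is the detail worth reporting back on the bug.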
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
Paolo <pa...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |paolo.asioli@gmail.com
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
Paolo <pa...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |NEW
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
Nick Burch <ap...@gagravarr.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO
--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
Are you sure your Outlook file actually has an RTF part? (Not all of them do.)
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #5 from Nick Burch <ap...@gagravarr.org> ---
Outlook files tend to have one or two out of plain, rtf and html. It's very
rare to have all 3. If your file only has rtf, and you really wanted something
like html, you'd be best off using Apache Tika as that can convert for you
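Since any one message usually carries only some of the three formats, calling code generally has to try each in turn. The following is a minimal sketch of that fallback, written without a POI dependency so it stands alone: `MissingChunkException` is a hypothetical stand-in for HSMF's ChunkNotFoundException, and `FirstAvailableBody` and `firstAvailable` are illustrative names, not POI API. With HSMF, the readers would presumably wrap getHtmlBody, getTextBody and getRtfBody, and the catch clause would name the real exception.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper: returns the first body format that is actually present. */
final class FirstAvailableBody {

    /** Stand-in for HSMF's ChunkNotFoundException so the sketch compiles on its own. */
    static class MissingChunkException extends Exception {
        MissingChunkException(String chunk) {
            super("chunk not found: " + chunk);
        }
    }

    /**
     * Tries each reader in the map's iteration order and returns the first
     * non-empty body with its label, or null when every format is missing.
     */
    static Map.Entry<String, String> firstAvailable(Map<String, Callable<String>> readers) {
        for (Map.Entry<String, Callable<String>> reader : readers.entrySet()) {
            try {
                String body = reader.getValue().call();
                if (body != null && !body.isEmpty()) {
                    return new AbstractMap.SimpleImmutableEntry<>(reader.getKey(), body);
                }
            } catch (MissingChunkException e) {
                // This format is absent from the message; try the next one.
            } catch (Exception e) {
                // Anything else is a real failure, not a missing format.
                throw new IllegalStateException("failed reading " + reader.getKey(), e);
            }
        }
        return null;
    }
}
```

Passing a LinkedHashMap keeps the preference order explicit, for example HTML first, then plain text, then RTF. A null result corresponds to the situation reported in this bug, where none of the three bodies is found.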
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #2 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #1)
> Are you sure your outlook file actually has a RTF part? (Not all of them do)
I don't know for sure (I don't know how to read an MSG by hand the way I can
an EML), but Outlook shows some richly formatted text.
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #7 from Paolo <pa...@gmail.com> ---
What kind of additional information do you need? There's an MSG attached that,
in my tests, shows this anomaly (no plain text, no HTML and no RTF, yet when
opened in Outlook it presents formatted text).
I don't know what the problem is, but I think I gave ample information to
investigate...
Please let me know.
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #9 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #8)
> Are you able to use one of the tools like POIFSViewer or POIFSDump to
> identify which chunk (POIFS Entry) actually contains your text? That will
> help us narrow down what's wrong
Thanks for the tip. I'll try that and get back with relevant information.
Cheers
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #3 from Nick Burch <ap...@gagravarr.org> ---
Not all richly formatted text in Outlook is done using RTF! Does your message
have an HTML chunk instead?
Take a look at the Tika Outlook parser if you want a detailed example of using
HSMF to process msg files:
https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #6 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #5)
> Outlook files tend to have one or two out of plain, rtf and html. It's very
> rare to have all 3. If your file only has rtf, and you really wanted
> something like html, you'd be best off using Apache Tika as that can convert
> for you
Maybe I didn't explain myself clearly. According to Apache POI, the attached
example has:
NO plain text
NO HTML
NO RTF
But since I see text when opening it in Outlook, I think there may be a problem.
Did you test the MSG attachment to confirm my report?
[Bug 55645] ChunkNotFoundException when trying to getRtfBody
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55645
--- Comment #4 from Paolo <pa...@gmail.com> ---
(In reply to Nick Burch from comment #3)
> Not all richly formatted text in Outlook is done using RTF! Does your
> message have a HTML chunk instead?
>
> Take a look at the Tika Outlook parser if you want a detailed example of
> using HSMF to process msg files:
> https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java
You're right, but I already try to get the text, HTML and RTF bodies, and it
looks like there are none.
So I thought there may be some kind of bug, since Outlook showed some formatted
text.
Here is an extract of my code:
    try {
        this.messaggioHTML = msg.getHtmlBody();
        if (MainActivity.DEBUG) {
            android.util.Log.d(MainActivity.TAG, "HTML Body: "
                    + this.messaggioHTML);
        }
    } catch (ChunkNotFoundException e) {
        android.util.Log.e(MainActivity.TAG,
                "HTML Body: not found");
        this.messaggioHTML = "";
    }
    try {
        this.messaggioTesto = msg.getTextBody();
        if (MainActivity.DEBUG) {
            android.util.Log.d(MainActivity.TAG, "TXT Body: "
                    + this.messaggioTesto);
        }
    } catch (ChunkNotFoundException e) {
        android.util.Log.e(MainActivity.TAG,
                "TXT Body: not found");
        this.messaggioTesto = "";
    }
    try {
        String messaggioRtf = msg.getRtfBody();
        if (MainActivity.DEBUG) {
            android.util.Log.d(MainActivity.TAG, "RTF Body: "
                    + messaggioRtf);
        }
    } catch (ChunkNotFoundException e) {
        android.util.Log.e(MainActivity.TAG,
                "RTF Body: not found");
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }