Posted to dev@tika.apache.org by Joe Gallo <jo...@gmail.com> on 2011/10/06 20:45:57 UTC
HSLFExtractor Bug
I ran into a problem with Tika extraction of ppt files today, and I
think I traced it back to some mistaken code in the HSLFExtractor:
// Repeat the Notes header, if set
if (hf != null && hf.isHeaderVisible() && hf.getHeaderText() != null) {
xhtml.startElement("p", "class", "slide-note-header");
xhtml.characters( hf.getFooterText() ); <----------
Shouldn't this be hf.getHeaderText()? The getFooterText() call here
is returning null and causing an NPE in the XHTMLContentHandler.
xhtml.endElement("p");
}
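
To make the suggestion concrete, here is a minimal, self-contained sketch of the change I have in mind. It is not the actual Tika or POI code: HeadersFooters and Sink are hypothetical stand-ins I made up for the object behind hf and for the XHTMLContentHandler, and Sink simply rejects null text to imitate the NPE described above. The only real point is that the value checked in the guard and the value written out should be the same one.

```java
import java.util.Objects;

public class NotesHeaderSketch {

    /** Hypothetical stand-in for the object behind "hf"; only the three calls quoted above. */
    interface HeadersFooters {
        boolean isHeaderVisible();
        String getHeaderText();
        String getFooterText();
    }

    /** Hypothetical stand-in for the content handler; it rejects null text to imitate the NPE. */
    static final class Sink {
        private final StringBuilder out = new StringBuilder();

        void startElement(String name, String attr, String value) {
            out.append('<').append(name).append(' ')
               .append(attr).append("=\"").append(value).append("\">");
        }

        void characters(String text) {
            out.append(Objects.requireNonNull(text, "text must not be null"));
        }

        void endElement(String name) {
            out.append("</").append(name).append('>');
        }

        @Override
        public String toString() {
            return out.toString();
        }
    }

    /** Repeats the notes header, writing the same text that the guard checked for null. */
    static String renderNotesHeader(HeadersFooters hf) {
        Sink xhtml = new Sink();
        if (hf != null && hf.isHeaderVisible() && hf.getHeaderText() != null) {
            xhtml.startElement("p", "class", "slide-note-header");
            xhtml.characters(hf.getHeaderText()); // was hf.getFooterText()
            xhtml.endElement("p");
        }
        return xhtml.toString();
    }

    /** Builds a fixed-value HeadersFooters for trying the method out. */
    static HeadersFooters stub(boolean visible, String header, String footer) {
        return new HeadersFooters() {
            @Override public boolean isHeaderVisible() { return visible; }
            @Override public String getHeaderText() { return header; }
            @Override public String getFooterText() { return footer; }
        };
    }

    public static void main(String[] args) {
        // Header is set, footer is null: the case that triggered the NPE for me.
        System.out.println(renderNotesHeader(stub(true, "Quarterly notes", null)));
    }
}
```

With the getFooterText() call left in place, the same input would hit the null check inside characters(), which is the failure I am seeing.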
Joe
HSLFExtractor Bug
Posted by Joe Gallo <jo...@gmail.com>.
> I think this was raised in TIKA-727 and fixed in r1177313. Any chance you
> could check with a svn checkout / recent nightly build and verify your
> problem is fixed?
Ah, yes, confirmed -- it is fixed in 1.0-20111006.162438-162.
Re: HSLFExtractor Bug
Posted by Nick Burch <ni...@alfresco.com>.
On Thu, 6 Oct 2011, Joe Gallo wrote:
> I ran into a problem with Tika extraction of ppt files today, and I
> think I traced it back to some mistaken code in the HSLFExtractor:
>
> xhtml.characters( hf.getFooterText() ); <----------
I think this was raised in TIKA-727 and fixed in r1177313. Any chance you
could check with a svn checkout / recent nightly build and verify your
problem is fixed?
Nick