Posted to dev@poi.apache.org by Bruce Ritchie <br...@jivesoftware.com> on 2008/10/23 01:48:19 UTC

3.5 beta 3 - NPE received extracting text from some ppt files

All,

I've begun testing POI 3.5 beta 3 for its ability to extract text from the new MS document formats along with the OpenOffice formats. However, a dozen or so ppt files in my test sample fail with an NPE, as follows:

java.lang.NullPointerException
 at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:322)
 at org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:307)
 at org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:547)
 at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:408)
 at org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
 at org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
 at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:173)
 at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:144)
 at com.jivesoftware.community.search.extractor.POIExtractor.extractText(POIExtractor.java:120)

If it is needed for a fix, I have an example ppt file (~5 MB) that exhibits the above issue, and I can email it to whoever wants to take a look.


Regards,

Bruce Ritchie