Posted to dev@poi.apache.org by bu...@apache.org on 2009/08/12 12:14:34 UTC
DO NOT REPLY [Bug 47685] New: extracting text from xls files fails
https://issues.apache.org/bugzilla/show_bug.cgi?id=47685
Summary: extracting text from xls files fails
Product: POI
Version: 3.2-FINAL
Platform: PC
OS/Version: Windows Vista
Status: NEW
Severity: normal
Priority: P2
Component: HSSF
AssignedTo: dev@poi.apache.org
ReportedBy: christiaan.fluit@aduna-software.com
--- Comment #0 from Christiaan Fluit <ch...@aduna-software.com> 2009-08-12 03:14:31 PDT ---
I have a couple of xls files that result in exceptions when I try to extract
their text. POI 3.2-FINAL gives the following stacktrace:
org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:186)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:328)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:271)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:196)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:178)
    at [proprietary code trace]
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:142)
    at org.apache.poi.hssf.record.RecordInputStream.readByte(RecordInputStream.java:151)
    at org.apache.poi.hssf.record.MMSRecord.<init>(MMSRecord.java:46)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:184)
    ... 25 common frames omitted
POI 3.5-beta5 gives this stacktrace:
org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:71)
    at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:269)
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:248)
    at org.apache.poi.hssf.eventusermodel.HSSFRecordStream.getNextRecord(HSSFRecordStream.java:162)
    at org.apache.poi.hssf.eventusermodel.HSSFRecordStream.nextRecord(HSSFRecordStream.java:93)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:141)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:98)
    at [proprietary code trace]
Caused by: org.apache.poi.hssf.record.RecordFormatException: Not enough data (0) to read requested (1) bytes
    at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:185)
    at org.apache.poi.hssf.record.RecordInputStream.readByte(RecordInputStream.java:193)
    at org.apache.poi.hssf.record.MMSRecord.<init>(MMSRecord.java:46)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:63)
    ... 12 more
Due to the nature of these files, I cannot post them here, but I am willing to
share them with developers looking into this bug.
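Both traces end in the same place: the MMSRecord constructor calls readByte() on a record body that has no bytes left ("Not enough data (0) to read requested (1) bytes"). The sketch below reproduces that failure mode outside POI and shows a guard that tolerates an empty body. It is an illustration only. `TinyRecordReader` and `MenuCounts` are hypothetical names, not POI classes, and the layout of two one-byte counters is an assumption, not something stated in this report.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a record body reader; not a POI class. */
final class TinyRecordReader {
    private final ByteBuffer body;

    TinyRecordReader(byte[] recordBody) {
        this.body = ByteBuffer.wrap(recordBody);
    }

    int remaining() {
        return body.remaining();
    }

    /** Refuses to read past the end of the record body. */
    byte readByte() {
        if (body.remaining() < 1) {
            throw new IllegalStateException(
                "Not enough data (" + body.remaining() + ") to read requested (1) bytes");
        }
        return body.get();
    }
}

/** Hypothetical record with an assumed layout of two one-byte counters. */
final class MenuCounts {
    final byte addMenuCount;
    final byte delMenuCount;

    private MenuCounts(byte add, byte del) {
        this.addMenuCount = add;
        this.delMenuCount = del;
    }

    /** Strict parse: throws when the body is empty or too short. */
    static MenuCounts parseStrict(TinyRecordReader in) {
        return new MenuCounts(in.readByte(), in.readByte());
    }

    /** Lenient parse: treats an empty body as "both counters are zero". */
    static MenuCounts parseLenient(TinyRecordReader in) {
        if (in.remaining() == 0) {
            return new MenuCounts((byte) 0, (byte) 0);
        }
        return parseStrict(in);
    }
}
```

With an empty body, `MenuCounts.parseStrict(new TinyRecordReader(new byte[0]))` throws, while `parseLenient` returns zeroed counters. Whether defaulting to zero is the right behaviour for a real file depends on what the writing application intended.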
--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
DO NOT REPLY [Bug 47685] extracting text from xls files fails
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47685
--- Comment #2 from Andreas <an...@gmx.de> 2009-10-23 02:04:22 UTC ---
I had the same problem with a file created in MS Excel. I could solve the
problem by removing an image that was embedded over two cells.
DO NOT REPLY [Bug 47685] extracting text from xls files fails
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47685
Maxim Valyanskiy <ma...@gmail.com> changed:
          What|Removed                     |Added
----------------------------------------------------------------------------
        Status|NEEDINFO                    |RESOLVED
    Resolution|                            |FIXED
--- Comment #3 from Maxim Valyanskiy <ma...@gmail.com> 2010-04-27 05:30:47 EDT ---
Fixed in r938372
DO NOT REPLY [Bug 47685] extracting text from xls files fails
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47685
Nick Burch <ni...@torchbox.com> changed:
          What|Removed                     |Added
----------------------------------------------------------------------------
        Status|NEW                         |NEEDINFO
--- Comment #1 from Nick Burch <ni...@torchbox.com> 2009-08-12 06:50:08 PDT ---
Without the file, I can only suggest that you dig into the problematic record code
(MMSRecord), compare it to the published Microsoft docs, and see if you can
spot the issue.
Also, it's worth opening the file in a new copy of Office and doing a "Save
As". If the re-saved file opens without issue, then a workaround is probably needed
for whatever software wrote your file not quite according to the spec. If that
doesn't help, then it looks more like a record bug in POI.
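One way to start that comparison is to list the record headers and look for a record whose declared length is shorter than the parsing code expects. This is a minimal sketch with several assumptions: you already have the raw bytes of the workbook stream (extracting it from the OLE2 container is not shown), and each record starts with a 2-byte little-endian id followed by a 2-byte little-endian body length. `RecordHeaderLister` is a hypothetical helper, not part of POI, and the id 0x00C1 used for the MMS record in the sample data should be checked against the Microsoft documentation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that lists record ids and lengths; not a POI class. */
final class RecordHeaderLister {

    /** Returns one "sid=0x.... len=n" line per record header found in the bytes. */
    static List<String> list(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        List<String> out = new ArrayList<>();
        while (buf.remaining() >= 4) {
            int sid = buf.getShort() & 0xFFFF;   // record id, unsigned
            int len = buf.getShort() & 0xFFFF;   // body length, unsigned
            if (len > buf.remaining()) {
                // Declared length runs past the end of the data: report and stop.
                out.add(String.format(Locale.ROOT, "sid=0x%04X len=%d (truncated)", sid, len));
                break;
            }
            out.add(String.format(Locale.ROOT, "sid=0x%04X len=%d", sid, len));
            buf.position(buf.position() + len);  // skip the body
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data: one record with a 2-byte body, one with an empty body.
        byte[] sample = {
            (byte) 0xC1, 0x00, 0x02, 0x00, 0x00, 0x00,
            (byte) 0xC1, 0x00, 0x00, 0x00
        };
        list(sample).forEach(System.out::println);
    }
}
```

A record that shows up with `len=0` where the parsing code reads one or more bytes would match the "Not enough data (0)" message in the 3.5-beta5 trace.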