Posted to dev@poi.apache.org by bu...@apache.org on 2011/02/15 08:24:12 UTC
DO NOT REPLY [Bug 50779] New: RecordFormatException Not enough data
(1) to read requested (2) bytes
https://issues.apache.org/bugzilla/show_bug.cgi?id=50779
Summary: RecordFormatException Not enough data (1) to read
requested (2) bytes
Product: POI
Version: 3.7
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: HSSF
AssignedTo: dev@poi.apache.org
ReportedBy: apptaro@gmail.com
The following error occurs when reading certain Excel files saved with Excel 2003:
Exception in thread "main" org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:65)
    at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:300)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:442)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:263)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:188)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:305)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:286)
    at aflat4.apps.adr.POITest.main(POITest.java:18)
Caused by: org.apache.poi.hssf.record.RecordFormatException: Not enough data (1) to read requested (2) bytes
    at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:216)
    at org.apache.poi.hssf.record.RecordInputStream.readUShort(RecordInputStream.java:267)
    at org.apache.poi.util.StringUtil.readUnicodeLE(StringUtil.java:277)
    at org.apache.poi.hssf.record.common.UnicodeString$ExtRst.<init>(UnicodeString.java:172)
    at org.apache.poi.hssf.record.common.UnicodeString.<init>(UnicodeString.java:438)
    at org.apache.poi.hssf.record.SSTDeserializer.manufactureStrings(SSTDeserializer.java:55)
    at org.apache.poi.hssf.record.SSTRecord.<init>(SSTRecord.java:250)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:57)
    ... 9 more
--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
--- Comment #4 from apptaro@gmail.com 2011-02-15 03:09:13 EST ---
Two test files are attached. Both were created in Japanese Excel 2003.
UnicodeStringFailCase1.xls produces the original error. This is the case where
a CONTINUE record boundary falls inside ExtRst and splits the two bytes of a
Unicode character.
UnicodeStringFailCase2.xls produces the slightly different error below. This is
the case where a CONTINUE record boundary falls inside PhRun and splits the two
bytes of an unsigned short value.
Exception in thread "main" org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:65)
    at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:300)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:442)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:263)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:188)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:305)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:286)
    at aflat4.apps.adr.POITest.main(POITest.java:18)
Caused by: org.apache.poi.hssf.record.RecordFormatException: Not enough data (1) to read requested (2) bytes
    at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:216)
    at org.apache.poi.hssf.record.RecordInputStream.readUShort(RecordInputStream.java:267)
    at org.apache.poi.hssf.record.common.UnicodeString$PhRun.<init>(UnicodeString.java:309)
    at org.apache.poi.hssf.record.common.UnicodeString$PhRun.<init>(UnicodeString.java:297)
    at org.apache.poi.hssf.record.common.UnicodeString$ExtRst.<init>(UnicodeString.java:178)
    at org.apache.poi.hssf.record.common.UnicodeString.<init>(UnicodeString.java:438)
    at org.apache.poi.hssf.record.SSTDeserializer.manufactureStrings(SSTDeserializer.java:55)
    at org.apache.poi.hssf.record.SSTRecord.<init>(SSTRecord.java:250)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:57)
    ... 9 more
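
For readers who want to see the failure mode in isolation, here is a minimal
sketch. It is not POI code: ContinueSplitDemo, GreedyInput and readTwoShorts
are hypothetical names, and the byte arrays stand in for a record followed by a
CONTINUE record. The reader assumes a two-byte value never straddles a chunk
boundary, which seems to be the assumption that both failing files violate.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; this is not POI code. */
public class ContinueSplitDemo {

    /** A payload split into chunks, like a record followed by CONTINUE records. */
    static final class GreedyInput {
        private final List<byte[]> chunks;
        private int chunk = 0;   // index of the current chunk
        private int offset = 0;  // read position inside the current chunk

        GreedyInput(List<byte[]> chunks) {
            this.chunks = chunks;
        }

        private int remaining() {
            return chunks.get(chunk).length - offset;
        }

        /** Reads a little-endian unsigned short, assuming it never straddles chunks. */
        int readUShort() {
            // Move on only when the current chunk is fully consumed.
            if (remaining() == 0 && chunk + 1 < chunks.size()) {
                chunk++;
                offset = 0;
            }
            if (remaining() < 2) {
                throw new IllegalStateException("Not enough data (" + remaining()
                        + ") to read requested (2) bytes");
            }
            byte[] c = chunks.get(chunk);
            int value = (c[offset] & 0xFF) | ((c[offset + 1] & 0xFF) << 8);
            offset += 2;
            return value;
        }
    }

    /** Reads two shorts; returns "ok" or the exception message. */
    static String readTwoShorts(List<byte[]> chunks) {
        GreedyInput in = new GreedyInput(chunks);
        try {
            in.readUShort();
            in.readUShort();
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Break on a value boundary: both reads succeed.
        System.out.println(readTwoShorts(Arrays.asList(
                new byte[] {0x34, 0x12}, new byte[] {0x78, 0x56})));
        // Break inside the second value: one byte is left in the first chunk.
        System.out.println(readTwoShorts(Arrays.asList(
                new byte[] {0x34, 0x12, 0x78}, new byte[] {0x56})));
    }
}
```

With the break on a value boundary the sketch reports "ok"; with the break
inside the second value it reports the same "Not enough data (1) to read
requested (2) bytes" message seen in the traces above.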
--- Comment #8 from apptaro@gmail.com 2011-03-13 23:00:03 EDT ---
As the reporter, I built r1080496, tested it, and confirmed that the bug is
resolved. Thank you for fixing it!
--- Comment #1 from apptaro@gmail.com 2011-02-15 02:31:23 EST ---
This error occurs with some Excel files that have many Unicode character
strings with phonetic data. Details are described here:
http://thread.gmane.org/gmane.comp.jakarta.poi.user/16008/focus=16077
I have an Excel file that causes the error, but I cannot put it here because it
is confidential. I'm trying to create a test file that reproduces the issue.
--- Comment #6 from Yegor Kozlov <ye...@dinom.ru> 2011-03-07 09:20:36 EST ---
Created an attachment (id=26740)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=26740)
junit test to demonstrate the bug
to be included in the poi test collection...
Yegor Kozlov <ye...@dinom.ru> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
--- Comment #7 from Yegor Kozlov <ye...@dinom.ru> 2011-03-11 05:12:54 EST ---
Fixed in r1080496, junit added
My previous comment was not quite correct, I should have read the poi-user
thread more thoroughly.
The fix only applies to the phonetic data. It does seem to be special and can
contain a CONTINUE break between the two bytes of a Unicode character or of a
'short' value.
The trick is to pass a decorated LittleEndianInput to the ExtRst constructor.
This decorated instance properly handles CONTINUE breaks in the middle of
primitive data types.
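
The decorator idea can be sketched roughly as follows. This is an illustration
under stated assumptions, not the code committed in r1080496: LeInput,
ChunkedInput, SplitTolerantInput and the helper methods are hypothetical names,
and LeInput only loosely mirrors the idea of LittleEndianInput. The wrapper
keeps the same interface but assembles each two-byte value from single-byte
reads, so a chunk break in the middle of a value is harmless, while the wrapped
reader stays greedy for all other callers.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the decorator idea; none of these types are POI classes. */
public class SplitTolerantDemo {

    /** Minimal little-endian input interface (an assumption for this sketch). */
    interface LeInput {
        int readUByte();
        int readUShort();
    }

    /** Byte source over several chunks; only single-byte reads roll over. */
    static final class ChunkedInput implements LeInput {
        private final List<byte[]> chunks;
        private int chunk = 0;
        private int offset = 0;

        ChunkedInput(List<byte[]> chunks) { this.chunks = chunks; }

        private int remaining() { return chunks.get(chunk).length - offset; }

        private void skipExhaustedChunks() {
            while (remaining() == 0 && chunk + 1 < chunks.size()) { chunk++; offset = 0; }
        }

        @Override public int readUByte() {
            skipExhaustedChunks();
            if (remaining() == 0) throw new IllegalStateException("No more data");
            return chunks.get(chunk)[offset++] & 0xFF;
        }

        @Override public int readUShort() {
            // Greedy: both bytes must sit in the current chunk.
            skipExhaustedChunks();
            if (remaining() < 2) throw new IllegalStateException(
                    "Not enough data (" + remaining() + ") to read requested (2) bytes");
            int lo = readUByte();
            int hi = readUByte();
            return lo | (hi << 8);
        }
    }

    /** Decorator: same interface, but two-byte values are assembled byte by byte. */
    static final class SplitTolerantInput implements LeInput {
        private final LeInput delegate;

        SplitTolerantInput(LeInput delegate) { this.delegate = delegate; }

        @Override public int readUByte() { return delegate.readUByte(); }

        @Override public int readUShort() {
            int lo = delegate.readUByte();
            int hi = delegate.readUByte();
            return lo | (hi << 8);
        }
    }

    /** Reads charCount UTF-16LE code units from the given input. */
    static String readUnicodeLE(LeInput in, int charCount) {
        StringBuilder sb = new StringBuilder(charCount);
        for (int i = 0; i < charCount; i++) sb.append((char) in.readUShort());
        return sb.toString();
    }

    /** Two katakana characters (A2 30 A4 30) with the chunk break inside the second. */
    static List<byte[]> sample() {
        return Arrays.asList(new byte[] {(byte) 0xA2, 0x30, (byte) 0xA4}, new byte[] {0x30});
    }

    public static void main(String[] args) {
        String s = readUnicodeLE(new SplitTolerantInput(new ChunkedInput(sample())), 2);
        System.out.println(s.equals("\u30A2\u30A4")); // true
    }
}
```

Passing the undecorated ChunkedInput to readUnicodeLE on the same sample throws,
because its greedy readUShort() finds only one byte left in the first chunk.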
Yegor
Arimitsu <is...@infoteria.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ishii@infoteria.com
--- Comment #5 from Yegor Kozlov <ye...@dinom.ru> 2011-03-07 09:18:34 EST ---
Interesting. So far we assumed that for primitive types (short, int, long,
etc.) a continue record break always occurs at the type boundary. Your
attachments clearly demonstrate that it is not always so and a CONTINUE break
can be in the middle of a primitive type.
I know how to fix it, but I'm hesitating whether this behavior should be
default or only applied to this particular case.
Initialization of BIFF records sits on top of the RecordInputStream class, which
greedily reads the primitive types. To properly handle CONTINUE it needs to
read byte by byte and then make sense of the read data. Something like this:

    // Current version. Does not work if CONTINUE occurs between the two bytes.
    public int readUShort() {
        checkRecordPosition(LittleEndian.SHORT_SIZE);
        _currentDataOffset += LittleEndian.SHORT_SIZE;
        return _dataInput.readUShort();
    }

    // Corrected. readByte() rolls over CONTINUE if necessary.
    // Each byte is masked with 0xFF so that a set high bit is not sign-extended.
    public int readUShort() {
        int ch1 = readByte() & 0xFF;
        int ch2 = readByte() & 0xFF;
        return (ch2 << 8) + (ch1 << 0);
    }

Note that there is at least one case where readShort() must be greedy: for
double-byte characters a Continue record break MUST occur at the double-byte
character boundary.
Yegor
--- Comment #3 from apptaro@gmail.com 2011-02-15 03:00:40 EST ---
Created an attachment (id=26659)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=26659)
UnicodeStringFailCase2
--- Comment #2 from apptaro@gmail.com 2011-02-15 02:59:39 EST ---
Created an attachment (id=26658)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=26658)
UnicodeStringFailCase1