Posted to dev@poi.apache.org by bu...@apache.org on 2016/02/03 10:35:47 UTC
[Bug 58963] New: OutOfMemoryError while reading some Excel files
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
Bug ID: 58963
Summary: OutOfMemoryError while reading some Excel files
Product: POI
Version: 3.13-FINAL
Hardware: PC
Status: NEW
Severity: critical
Priority: P2
Component: XSSF
Assignee: dev@poi.apache.org
Reporter: zmircmircea@gmail.com
Created attachment 33518
--> https://bz.apache.org/bugzilla/attachment.cgi?id=33518&action=edit
sample project + file to reproduce the error
Hi!
Today we received an Excel file which can't be added into the system, because
POI triggers OOM while trying to open it.
The xlsx file is ~300KB and the application runs with -Xmx750m. The same happens
with -Xmx2750m, so the heap size is definitely not the cause.
This OOM happens with both 3.13 and 3.14-beta1.
I will attach a sample Maven project + the problematic file called
"eu-triggers-oom.xlsx", available in /src/test/resources.
To replicate the issue, just execute the test from class POIExcelOOMNGTest.
I will also attach a VisualVM memory usage screenshot taken while running the test.
Here is the stacktrace of the test project:
shouldNotThrowOOMWhileReadingExcel(poi.excel.oom.POIExcelOOMNGTest) Time elapsed: 136.287 sec <<< FAILURE!
org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:62)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:465)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:173)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:278)
at poi.excel.oom.POIExcelOOMNGTest.shouldNotThrowOOMWhileReadingExcel(POIExcelOOMNGTest.java:12)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
... 37 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.CharUtil.allocate(CharUtil.java:397)
at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:506)
at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:419)
at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:489)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:2927)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.stripText(Cur.java:3130)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:3143)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.characters(Locale.java:3291)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportCdata(Piccolo.java:992)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXMLNS(PiccoloLexer.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXML(PiccoloLexer.java:1261)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4812)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3479)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1277)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1264)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.SstDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:119)
at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:106)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:465)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:173)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:278)
at poi.excel.oom.POIExcelOOMNGTest.shouldNotThrowOOMWhileReadingExcel(POIExcelOOMNGTest.java:12)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
Dominik Stadler <do...@gmx.at> changed:
What |Removed |Added
----------------------------------------------------------------------------
OS| |All
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #5 from Mircea <zm...@gmail.com> ---
Created attachment 33690
--> https://bz.apache.org/bugzilla/attachment.cgi?id=33690&action=edit
VisualVM memory sampler screenshot
I have just reproduced the error and taken a screenshot from VisualVM's memory
sampler.
It seems that char[] is taking most of the memory.
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #9 from Mircea <zm...@gmail.com> ---
I looked into that link but I couldn't find a working dist version.
I tried what's in build/dist, but without success: the Apache POI classes were missing.
As you probably already have POI set up in your IDE, could you please send me (via
Dropbox, OneDrive etc.) the working dist version?
Otherwise, what should I do to get it working?
I spent quite a lot of time taking the src zip and trying to fix all the
missing dependencies, but there are many which aren't there.
On the other hand, I provided the test case XML and Java code (3 lines).
It's 1000x easier for one of the POI developers to copy those 3 lines + the file into
a project compiled against the latest version of POI.
Then I wouldn't have to spend a lot of time trying to build POI myself.
Thanks a lot. :)
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
Mircea <zm...@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |RESOLVED
Resolution|--- |DUPLICATE
--- Comment #11 from Mircea <zm...@gmail.com> ---
Great! Thanks for the clarification.
I built it with ant, then added the remaining jars.
It seems to work properly now. No more memory issue.
Most probably bug 57031 has taken care of it.
Thanks, Apache POI!
*** This bug has been marked as a duplicate of bug 57031 ***
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #1 from Mircea <zm...@gmail.com> ---
Created attachment 33519
--> https://bz.apache.org/bugzilla/attachment.cgi?id=33519&action=edit
VisualVM screenshot after the test failed - OOM
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #4 from Javen O'Neal <on...@apache.org> ---
Looking through eu-triggers-oom.xlsx, I didn't see any XML bombs likely to
bloat memory. The main memory consumer here is the shared strings table, which
contains 5700 unique values. The rest of the XML files are mostly empty.
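A quick way to repeat this kind of inspection is to list the parts of the xlsx (which is a ZIP container) with their compressed and uncompressed sizes. The following is only a minimal sketch using the JDK's java.util.zip classes; the class name XlsxPartSizes and the default path are assumptions made for illustration, not part of POI or of the attached project.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class XlsxPartSizes {

    /** Returns one line per ZIP entry: name, compressed size, uncompressed size (bytes). */
    public static List<String> describeParts(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                // Both size getters may return -1 when the size is not recorded.
                lines.add(String.format("%-40s %10d -> %10d",
                        e.getName(), e.getCompressedSize(), e.getSize()));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location; pass the real path as the first argument.
        String path = args.length > 0 ? args[0] : "src/test/resources/eu-triggers-oom.xlsx";
        for (String line : describeParts(path)) {
            System.out.println(line);
        }
    }
}
```

A part whose uncompressed size is far larger than its compressed size would be the first place to look for a bomb-like payload.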
To build up a test case to make sure there isn't anything else in the Excel
file, a unit test could be: read in a dictionary of random Norwegian words
(including æ, ø, å--if that matters), generate 5700 "sentences" composed of 10
random words each, and look at memory consumption.
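As a rough, POI-free starting point for such a test, the sentence generation could look like the sketch below. Everything in it is an assumption for illustration: the class name, the ten-word stand-in list (a real test would read a full dictionary), and the heap estimate, which is imprecise because the garbage collector can run at any time. The sentences are random, so uniqueness is very likely but not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SharedStringLoadSketch {

    // Tiny stand-in word list, including words with æ, ø and å.
    static final String[] WORDS = {
        "bjørn", "særlig", "måned", "hus", "vann", "fjell", "øl", "år", "ærlig", "bok"
    };

    /** Builds `count` sentences of `wordsPerSentence` random words each. */
    public static List<String> randomSentences(int count, int wordsPerSentence, long seed) {
        Random rnd = new Random(seed); // fixed seed keeps runs repeatable
        List<String> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder();
            for (int w = 0; w < wordsPerSentence; w++) {
                if (w > 0) {
                    sb.append(' ');
                }
                sb.append(WORDS[rnd.nextInt(WORDS.length)]);
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** Approximate heap currently in use, in bytes. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        List<String> sentences = randomSentences(5700, 10, 42L);
        long after = usedHeapBytes();
        System.out.printf("%d sentences, roughly %d KB of heap%n",
                sentences.size(), (after - before) / 1024);
    }
}
```

Writing these sentences into a workbook and reading it back with POI would be the next step; that part is left out here.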
The SharedStringsTable [1] uses a List<CTRst> and a Map<String, Integer> to store
the strings. The Map makes string lookup faster at the cost of increasing the
memory requirements, but both structures should be able to handle 5700 entries
without OOM'ing.
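To illustrate the trade-off, here is a simplified sketch of a list plus a lookup map. It is not POI's code: it stores plain String values instead of CTRst objects, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StringTableSketch {

    // Index -> string, in insertion order.
    private final List<String> strings = new ArrayList<>();
    // String -> index, so a repeated string is found without scanning the list.
    private final Map<String, Integer> indexByString = new HashMap<>();

    /** Adds the string if it is new and returns its index in the table. */
    public int add(String s) {
        Integer existing = indexByString.get(s);
        if (existing != null) {
            return existing;
        }
        int index = strings.size();
        strings.add(s);
        indexByString.put(s, index);
        return index;
    }

    public String get(int index) {
        return strings.get(index);
    }

    public int uniqueCount() {
        return strings.size();
    }

    public static void main(String[] args) {
        StringTableSketch table = new StringTableSketch();
        for (String s : new String[] {"hus", "fjell", "hus"}) {
            System.out.println(s + " -> index " + table.add(s));
        }
        System.out.println("unique strings: " + table.uniqueCount());
    }
}
```

The map holds one extra entry per unique string, which costs memory but stays small at a few thousand entries.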
What kinds of objects did VisualVM indicate were consuming the largest amount
of memory?
[1]
https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xssf/model/SharedStringsTable.java?view=markup
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #2 from Javen O'Neal <on...@apache.org> ---
By any chance have you used the same file on an older version of POI without
getting an OOM?
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
Kai G <no...@kaigrabfelder.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |nospam@kaigrabfelder.de
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #10 from Nick Burch <ap...@gagravarr.org> ---
(In reply to Mircea from comment #9)
> Otherwise, what should I do to get it working?
> I spent quite a lot of time taking the src zip and trying to fix all the
> missing dependencies, but there are many which aren't there.
Just run "ant jar" from an svn checkout / git checkout / source download, and
all the dependencies you need will be fetched for you on demand.
Otherwise, the nightly builds are available from Jenkins at
https://builds.apache.org/job/POI/lastSuccessfulBuild/artifact/build/dist/ -
grab the POI jars from the bin, and add in any dependencies from the previous
full POI release's bin package.
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #6 from Javen O'Neal <on...@apache.org> ---
> It seems that char[] is taking most of the memory.
Which would make sense because the shared strings table is disproportionately
large, and the rest of the XML nodes are just arrays/pointers+strings.
Andi fixed some OOMs in the current trunk build on bug 57031. Read through
that bug to see if it's relevant to you, and if so, see if you still get OOMs
on a 3.15 beta 1 nightly [1].
[1] https://builds.apache.org/job/POI/lastSuccessfulBuild/artifact/
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #3 from Mircea <zm...@gmail.com> ---
We received the file just today from a complaining user, then we tested it
ourselves with 3.13 and 3.14-beta1, but nothing else.
I am sorry.
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
--- Comment #7 from Mircea <zm...@gmail.com> ---
I'll give it a try.
What should I write in my Maven pom.xml file in order to get the nightly build
instead of 3.13?
I didn't find the nightly builds Maven repository.
[Bug 58963] OutOfMemoryError while reading some Excel files
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58963
Dominik Stadler <do...@gmx.at> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO
--- Comment #8 from Dominik Stadler <do...@gmx.at> ---
Nightly builds are not available via Maven, only as a manual download from
https://builds.apache.org/view/POI/job/POI/lastSuccessfulBuild/artifact/build/dist