Posted to dev@poi.apache.org by bu...@apache.org on 2012/11/27 11:06:03 UTC

[Bug 54211] New: OutofMemory Exception while parsing large xlsx

https://issues.apache.org/bugzilla/show_bug.cgi?id=54211

            Bug ID: 54211
           Summary: OutofMemory Exception while parsing large xlsx
           Product: POI
           Version: 3.8
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: sairampareek@gmail.com
    Classification: Unclassified

I have a 6.33 MB xlsx file that I am trying to read. A single file is read
successfully, but when I try to read multiple (4) files concurrently I get
"java.lang.OutOfMemoryError: Java heap space".

Is there any workaround for this?

OPCPackage pkg = OPCPackage.open(fileNameOnDisk, PackageAccess.READ);
// The error is thrown on the next line, in the ReadOnlySharedStringsTable constructor
ReadOnlySharedStringsTable sst = new ReadOnlySharedStringsTable(pkg);
XSSFReader r = new XSSFReader(pkg);
XMLReader parser =
    XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");


java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:215)
    at java.lang.StringBuffer.toString(StringBuffer.java:585)
    at org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable.endElement(ReadOnlySharedStringsTable.java:211)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable.readFrom(ReadOnlySharedStringsTable.java:143)
    at org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable.<init>(ReadOnlySharedStringsTable.java:112)

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

[Bug 54211] OutofMemory Exception while parsing large xlsx

Posted by bu...@apache.org.

https://issues.apache.org/bugzilla/show_bug.cgi?id=54211

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME
                 OS|                            |All

--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
You need to increase your heap size (for example with the JVM's -Xmx option);
the default JVM heap is very small.
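
To see what heap limit a JVM is actually running with before and after raising
it, something like the following can help. This is only a minimal sketch; the
class name HeapReport and its method are illustrative and not part of POI.

```java
public class HeapReport {

    /** Maximum amount of heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Prints the effective heap ceiling for this JVM process.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as "java -Xmx1g HeapReport" should report a larger limit than a
plain "java HeapReport"; the 1g value is an arbitrary example, not a
recommendation for this particular file.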

If you really can't do that, you'll need to process the shared strings table
differently. The code you're using buffers the whole shared strings table in
memory for quick access when processing the sheets. It's not usually too big,
but it can be a noticeable part of the file size. If you can't hold it all in
RAM, you'll need to process it in a streaming manner and store the id -> string
lookup elsewhere (e.g. on disk, or on another box in a KV store / cache) for
use when handling the sheets.
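
The streaming approach could be sketched roughly as follows. This is not POI
code and not a drop-in replacement for ReadOnlySharedStringsTable: the class
StreamingSharedStrings and its stream() method are hypothetical, and the sketch
assumes the input is the raw sharedStrings.xml part (in POI this would
presumably be the stream returned by XSSFReader.getSharedStringsData()). Each
string is handed to a caller-supplied sink together with its index, so the sink
decides where the id -> string lookup lives. Phonetic runs (rPh) are skipped,
which is an assumption about what the caller wants.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingSharedStrings {

    /** Hypothetical helper: passes each <si> entry to the sink as (index, text). */
    static void stream(InputStream sharedStringsXml, BiConsumer<Integer, String> sink)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is populated
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private int index = 0;          // position of the current <si> in the table
            private boolean inText = false; // inside a <t> we want to keep
            private boolean inPhonetic = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("si".equals(localName)) {
                    text.setLength(0);      // reuse one buffer per entry
                } else if ("rPh".equals(localName)) {
                    inPhonetic = true;
                } else if ("t".equals(localName)) {
                    inText = !inPhonetic;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("t".equals(localName)) {
                    inText = false;
                } else if ("rPh".equals(localName)) {
                    inPhonetic = false;
                } else if ("si".equals(localName)) {
                    // Hand the entry off immediately; nothing accumulates here.
                    sink.accept(index++, text.toString());
                }
            }
        });
        reader.parse(new InputSource(sharedStringsXml));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sst xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
                + "<si><t>alpha</t></si>"
                + "<si><r><t>be</t></r><r><t>ta</t></r></si>"
                + "</sst>";
        // A HashMap stands in for the external store (disk, KV store, cache).
        Map<Integer, String> store = new HashMap<>();
        stream(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), store::put);
        System.out.println(store);
    }
}
```

In real use the sink would write to the disk file or KV store rather than to
an in-memory map, and the sheet handler would look strings up there by index.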
