Posted to dev@poi.apache.org by bu...@apache.org on 2009/08/12 15:18:54 UTC

DO NOT REPLY [Bug 47687] New: Is there any limitation at size of the MS Office document to extract using POI library?

https://issues.apache.org/bugzilla/show_bug.cgi?id=47687

           Summary: Is there any limitation at size of the MS Office
                    document to extract using POI library?
           Product: POI
           Version: 3.2-FINAL
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
        AssignedTo: dev@poi.apache.org
        ReportedBy: ursbijju@gmail.com
                CC: ursbijju@gmail.com


--- Comment #0 from Bijju <ur...@gmail.com> 2009-08-12 06:18:52 PDT ---
We have been extracting many Office documents successfully using POI 3.2, but
one specific large document (>19MB) could not be extracted.

In practical scenarios we will also have documents larger than 500MB (in fact,
there is no restriction on size). Technically, as POI is a Java library, size
should not be a concern when getting a handle on the document. I am using
event-driven logic for document extraction.

However, I have noticed that when the document size is reduced, POI extracts
it; otherwise, it fails. Is there a reason for this? Am I missing a basic
technical point here?

Also, POI treats the HTML content of a Word document as a separate document
rather than as simple text. I need to check more on this. If that is the case,
please let me know the reason for it.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47687] Is there any limitation at size of the MS Office document to extract using POI library?

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47687


Nick Burch <ni...@torchbox.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID


--- Comment #1 from Nick Burch <ni...@torchbox.com> 2009-08-12 06:47:49 PDT ---
Please ask questions on the mailing list. Try checking the list archives too;
your question is almost certainly about needing a bigger Java heap size.

Also, http://poi.apache.org/poifs/embeded.html might be of interest to you WRT
embedded documents.
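
To illustrate the heap size point, here is a minimal sketch that prints the
JVM's maximum heap and compares it against a file's size. It uses only the
standard library, not POI. The class name HeapCheck, the helper
heapLikelySufficient, and the multiplier of 4 are hypothetical and chosen only
for illustration; the real memory needed to load a document depends on its
contents and on how it is read.

```java
import java.io.File;

class HeapCheck {
    // Hypothetical rule of thumb: treat the heap as sufficient only if it is
    // at least `factor` times the size of the file on disk.
    static boolean heapLikelySufficient(long fileBytes, long maxHeapBytes, int factor) {
        return maxHeapBytes / factor >= fileBytes;
    }

    public static void main(String[] args) {
        // Maximum amount of memory the JVM will attempt to use for the heap.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap (MB): " + maxHeap / (1024 * 1024));

        if (args.length > 0) {
            // File.length() returns 0 if the file does not exist.
            long size = new File(args[0]).length();
            System.out.println("File size (MB): " + size / (1024 * 1024));
            // The factor of 4 is an assumption, not a documented POI figure.
            System.out.println("Heap likely sufficient: "
                    + heapLikelySufficient(size, maxHeap, 4));
        }
    }
}
```

If the reported maximum heap is small relative to the document, the heap can
be raised with the JVM's -Xmx option when starting the application, for
example: java -Xmx1g HeapCheck large.doc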
