Posted to dev@poi.apache.org by bu...@apache.org on 2016/11/01 04:40:07 UTC

[Bug 60325] New: Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

            Bug ID: 60325
           Summary: Poor performance in DirectoryNode.createDocument() for
                    NPOIFSFileSystem
           Product: POI
           Version: 3.15-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POIFS
          Assignee: dev@poi.apache.org
          Reporter: luke.quinane@gmail.com
  Target Milestone: ---

Created attachment 34413
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34413&action=edit
Sample project which compares NPOIFSFileSystem and OPOIFSFileSystem

When adding lots of document entries to the file system, the NPOIFSFileSystem
implementation is ~10x slower than OPOIFSFileSystem.

The attached sample program is often stuck with stacks like this:
          at java.nio.Buffer.<init>(Buffer.java:202)
          at java.nio.ByteBuffer.<init>(ByteBuffer.java:281)
          at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:70)
          at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
          at org.apache.poi.poifs.nio.ByteArrayBackedDataSource.read(ByteArrayBackedDataSource.java:49)
          at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:484)
          at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)
          at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)
          at org.apache.poi.poifs.filesystem.NPOIFSMiniStore.getBlockAt(NPOIFSMiniStore.java:71)
          at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)
          at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)
          at org.apache.poi.poifs.filesystem.NDocumentInputStream.readFully(NDocumentInputStream.java:248)
          at org.apache.poi.poifs.filesystem.NDocumentInputStream.read(NDocumentInputStream.java:150)
          at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:125)
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
          at java.io.BufferedInputStream.skip(BufferedInputStream.java:380)
          - locked <0x345> (a java.io.BufferedInputStream)
          at org.apache.poi.poifs.filesystem.NPOIFSDocument.store(NPOIFSDocument.java:126)
          at org.apache.poi.poifs.filesystem.NPOIFSDocument.<init>(NPOIFSDocument.java:84)
          at org.apache.poi.poifs.filesystem.DirectoryNode.createDocument(DirectoryNode.java:422)
          at Test.copyAllEntries(Test.java:83)
          at Test.copyAllEntries(Test.java:77)
          at Test.main(Test.java:49)

This problem crops up while creating MSG files with lots of recipients because
each recipient requires several document entries.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #9 from PJ Fanning <fa...@yahoo.com> ---
Please reopen if v5.2.0 does not help



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #7 from sits <da...@gmail.com> ---
I think it will.  When we upgrade to POI 5.0.0 we will re-test.  Thanks.



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #5 from Nick Burch <ap...@gagravarr.org> ---
If we're able to change how we open NPOIFS from files (see my thread on dev@),
we might be able to mmap in all file cases, and from that we might be able to
mmap bigger blocks.

That may then allow us to change NPOIFSMiniStore to avoid quite as much
wrapping/buffering of the mini blocks too.

(Note the "might" in the above - this is untested and just a guess!)
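
To illustrate the general mechanism being speculated about here, the following
is a minimal sketch using only the JDK. It is not POI code: the class
MappedBlockSketch, its methods, and the 64-byte block size constant are
assumptions chosen for illustration. It maps a whole file once and hands out
per-block views that share the mapped memory, rather than copying or
re-wrapping a byte array for each block.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; none of these names exist in POI. */
class MappedBlockSketch {
    /** Assumed block size for this sketch. */
    static final int MINI_BLOCK_SIZE = 64;

    /** Returns a view of one block that shares memory with the source buffer. */
    static ByteBuffer blockView(ByteBuffer source, int blockIndex, int blockSize) {
        // duplicate() gives independent position/limit over the same content
        ByteBuffer view = source.duplicate();
        view.position(blockIndex * blockSize);
        view.limit((blockIndex + 1) * blockSize);
        return view.slice();
    }

    /** Maps the whole file read-only and reads the first byte of one block. */
    static byte firstByteOfBlock(Path file, int blockIndex) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return blockView(mapped, blockIndex, MINI_BLOCK_SIZE).get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blocks", ".bin");
        tmp.toFile().deleteOnExit();

        // Fill four blocks; every byte holds the index of the block it lives in
        byte[] data = new byte[MINI_BLOCK_SIZE * 4];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / MINI_BLOCK_SIZE);
        }
        Files.write(tmp, data);

        System.out.println("first byte of block 2: " + firstByteOfBlock(tmp, 2));
    }
}
```

Whether this would actually help NPOIFSMiniStore is exactly the open question
raised above; the sketch only shows that block views over a mapped buffer can
be created without copying the underlying bytes.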



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

sits <da...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.sitsky@gmail.com
                 OS|                            |All



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #3 from Dominik Stadler <do...@gmx.at> ---
A quick analysis points more in the direction of NPOIFSMiniStore.getBlockAt(),
because it iterates block by block via an Iterator<ByteBuffer>. For large
documents the offset can be high (in your sample, between 500 and 1000
iterations for each call), so there are many loop iterations with many
it.next() calls to StreamBlockByteBufferIterator, which has to do more work
for each of these steps.

Unfortunately this is quite core to the class, so as far as I can see it is
not easily replaced with something more performant :(.
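
To make the cost pattern described above concrete, here is a minimal,
self-contained sketch. It is not POI's code: the class ChainWalkSketch, its
methods, the END_OF_CHAIN marker value, and the linear 1000-block chain are all
hypothetical. It models a stream whose blocks are linked through a next-block
table, and compares walking the chain from the start on every lookup with
walking it once to build an index.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a chained block stream; not POI's actual code. */
class ChainWalkSketch {
    /** Assumed end-of-chain marker for this sketch. */
    static final int END_OF_CHAIN = -2;

    /** Follows the chain for `offset` hops from startBlock, adding each hop to steps[0]. */
    static int blockAtByWalking(int[] nextBlock, int startBlock, int offset, long[] steps) {
        int block = startBlock;
        for (int i = 0; i < offset; i++) {
            block = nextBlock[block];
            steps[0]++;
        }
        return block;
    }

    /** Walks the chain once and records the block numbers in stream order. */
    static int[] buildIndex(int[] nextBlock, int startBlock) {
        List<Integer> order = new ArrayList<>();
        for (int b = startBlock; b != END_OF_CHAIN; b = nextBlock[b]) {
            order.add(b);
        }
        int[] index = new int[order.size()];
        for (int i = 0; i < index.length; i++) {
            index[i] = order.get(i);
        }
        return index;
    }

    /** Builds the simple chain 0 -> 1 -> ... -> n-1 -> END_OF_CHAIN. */
    static int[] linearChain(int n) {
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            next[i] = (i == n - 1) ? END_OF_CHAIN : i + 1;
        }
        return next;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] next = linearChain(n);

        // Look up every offset once, restarting from the head each time
        long[] steps = {0};
        for (int offset = 0; offset < n; offset++) {
            blockAtByWalking(next, 0, offset, steps);
        }
        System.out.println("total hops, walking from the start each time: " + steps[0]);

        // Alternative: one pass to build an index, then array lookups
        int[] index = buildIndex(next, 0);
        System.out.println("blocks visited to build the index once: " + index.length);
    }
}
```

With 1000 lookups over a 1000-block chain, restarting from the head each time
costs 499500 hops in total, while the index needs a single pass over the 1000
blocks. Caching the chain is only one generic way to avoid the repeated walk;
nothing in this thread says which approach POI itself takes.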



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #4 from Dominik Stadler <do...@gmx.at> ---
Created attachment 34845
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34845&action=edit
Screenshot from JVisualVM sampling



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #6 from PJ Fanning <fa...@yahoo.com> ---
I wonder if https://bz.apache.org/bugzilla/show_bug.cgi?id=65184 helps here



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

Javen O'Neal <on...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Javen O'Neal <on...@apache.org> ---
Do you have any suggestions on how to improve the NPOIFSFileSystem
implementation?



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

--- Comment #2 from Dominik Stadler <do...@gmx.at> ---
Hm, the ByteBuffer.wrap() call seems unlikely to be the time culprit, as it
just populates some members. I'd try a profiler or APM tool here to find out
what is actually using up the time.



[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

Dominik Stadler <do...@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
         Depends on|                            |65184

--- Comment #8 from Dominik Stadler <do...@gmx.at> ---
Please report whether the changes from #65184 improved performance here.


Referenced Bugs:

https://bz.apache.org/bugzilla/show_bug.cgi?id=65184
[Bug 65184] Improve performance of POFSMiniStore getBlockAt


[Bug 60325] Poor performance in DirectoryNode.createDocument() for NPOIFSFileSystem

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60325

Dominik Stadler <do...@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
