Posted to dev@poi.apache.org by bu...@apache.org on 2019/03/06 17:24:34 UTC

[Bug 63240] New: Use of synchronized in DocumentHelper.newDocumentBuilder

https://bz.apache.org/bugzilla/show_bug.cgi?id=63240

            Bug ID: 63240
           Summary: Use of synchronized in
                    DocumentHelper.newDocumentBuilder
           Product: POI
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
          Assignee: dev@poi.apache.org
          Reporter: nikeee@outlook.com
  Target Milestone: ---

We are processing XLSX files in parallel and have hit a performance bottleneck.
Investigating the issue, we found that all threads wait on the synchronized
DocumentHelper.newDocumentBuilder().

It was introduced in this commit:
https://github.com/apache/poi/commit/eabb6a924be24abb879372d0bc967e0d316b2cf8?diff=split

Related Bug:
https://bz.apache.org/bugzilla/show_bug.cgi?id=56814#c10

It seems that the synchronized was added because it has no impact on
single-threaded applications, where it gets optimized away. In multi-threaded
use, however, every thread has to wait for the method to finish.
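
To illustrate the effect, here is a minimal, self-contained sketch. It is not
the POI code: ClassLockDemo, newBuilder and the two latches are hypothetical
names made up for this example. It only shows that a static synchronized method
locks on the Class object, so a second caller is BLOCKED for as long as the
first one is still inside the method.

```java
import java.util.concurrent.CountDownLatch;

public class ClassLockDemo {
    // Hypothetical helpers used only to make the timing deterministic.
    private static final CountDownLatch ENTERED = new CountDownLatch(1);
    private static final CountDownLatch RELEASE = new CountDownLatch(1);

    // Stand-in for a static synchronized factory method.
    // The monitor it takes is ClassLockDemo.class itself.
    static synchronized void newBuilder() {
        ENTERED.countDown();
        try {
            RELEASE.await(); // simulate work while holding the class lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts two callers and reports the state of the second one
    // while the first one still holds the lock.
    static Thread.State observeSecondThread() throws InterruptedException {
        Thread first = new Thread(ClassLockDemo::newBuilder, "thread-1");
        Thread second = new Thread(ClassLockDemo::newBuilder, "thread-2");
        first.start();
        ENTERED.await(); // thread-1 now owns the class monitor
        second.start();

        // Poll (for at most 5 seconds) until thread-2 is waiting for the monitor.
        Thread.State state = second.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(1);
            state = second.getState();
        }

        RELEASE.countDown(); // let both threads finish
        first.join();
        second.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("thread-2 while thread-1 holds the lock: "
                + observeSecondThread());
    }
}
```

With many worker threads the same thing happens to every caller but one, which
matches the BLOCKED state in the thread dump further down.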

This is the related code section:
https://github.com/apache/poi/blame/trunk/src/ooxml/java/org/apache/poi/ooxml/util/DocumentHelper.java#L89-L98

The newDocumentBuilder method does not seem to mutate any global state, as it
only creates a local variable, which it then returns.

Is there any special reason for this synchronized? Can this be removed? Are
there workarounds we can use with the current version of POI?

This is the thread dump/call stack:
> "thread-2" - Thread t@845
>    java.lang.Thread.State: BLOCKED
> 	at org.apache.poi.util.DocumentHelper.newDocumentBuilder(DocumentHelper.java:87)
> 	- waiting to lock <246a71d6> (a java.lang.Class) owned by "thead-1" t@806
> 	at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140)
> 	at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:308)
> 	at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:167)
> 	at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:135)
> 	at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:565)
> 	at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:111)
> 	at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:82)
> 	at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:127)
> 	at org.apache.poi.openxml4j.opc.ZipPackagePart.<init>(ZipPackagePart.java:79)
> 	at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:335)
> 	at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:756)
> 	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:327)
> 	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:184)
> 	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:149)

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63240] Use of synchronized in DocumentHelper.newDocumentBuilder

Posted by bu...@apache.org.

--- Comment #1 from PJ Fanning <fa...@yahoo.com> ---
The DocumentBuilderFactory is package private, so you can only access it if you
create a class in the org.apache.poi.ooxml.util package.

You can then create your own version of the newDocumentBuilder method.
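
A rough sketch of what such a method could look like follows. It is an
assumption-laden example, not POI's implementation: instead of reaching into
the package-private factory, it keeps its own DocumentBuilderFactory per thread,
because the JAXP documentation does not guarantee that a factory is thread safe.
The class name PerThreadDocumentBuilders and the chosen factory settings are
hypothetical, and the helper only benefits code that calls it directly.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public final class PerThreadDocumentBuilders {
    // One factory per thread, so no lock is needed around newDocumentBuilder().
    private static final ThreadLocal<DocumentBuilderFactory> FACTORY =
            ThreadLocal.withInitial(PerThreadDocumentBuilders::createFactory);

    private PerThreadDocumentBuilders() {
    }

    // Assumed settings for illustration; adjust to whatever your parsing needs.
    private static DocumentBuilderFactory createFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("secure processing not supported", e);
        }
        return factory;
    }

    // Unsynchronized: each thread only ever touches its own factory.
    public static DocumentBuilder newDocumentBuilder()
            throws ParserConfigurationException {
        return FACTORY.get().newDocumentBuilder();
    }
}
```

The trade-off is one factory instance per thread in exchange for having no
shared lock on the hot path.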

I think we can change the behaviour in POI but it could be weeks before our
next release and this change would need to be load tested.



[Bug 63240] Use of synchronized in DocumentHelper.newDocumentBuilder

Posted by bu...@apache.org.

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from PJ Fanning <fa...@yahoo.com> ---
updated using https://svn.apache.org/viewvc?view=revision&revision=1854935



[Bug 63240] Use of synchronized in DocumentHelper.newDocumentBuilder

Posted by bu...@apache.org.

Niklas Mollenhauer <ni...@outlook.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uwe@thetaphi.de



[Bug 63240] Use of synchronized in DocumentHelper.newDocumentBuilder

Posted by bu...@apache.org.

--- Comment #2 from Niklas Mollenhauer <ni...@outlook.com> ---
Well, we are only using this indirectly, because it is used in a call of
WorkbookFactory.create (as seen in the provided call stack). I don't think we
can tell POI to use a different class.

Our current workaround is to cache the workbooks. However, this is not a
suitable long-term solution, due to other constraints. Having this issue fixed
in the next release in a few weeks would suffice.
