You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2019/09/26 09:09:30 UTC

[Bug 63774] New: Custom properties "add" method is slow

https://bz.apache.org/bugzilla/show_bug.cgi?id=63774

            Bug ID: 63774
           Summary: Custom properties "add" method is slow
           Product: POI
           Version: 4.0.0-FINAL
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: serhiy.boychenko@cern.ch
  Target Milestone: ---

Dear developers,

We have recently experienced an issue moving to the latest release. In one of
our use cases we are setting a large number of custom properties to an Excel
(XLSX) file and it seems that in recent versions something was changed and this
process became much slower. Consider the following code:


public class PoiTest {
    public static void main(String[] args) throws IOException {
        try (FileInputStream fis = new FileInputStream("test.xlsx")) {
            XSSFWorkbook workbook = new XSSFWorkbook(fis);

            POIXMLProperties.CustomProperties properties =
workbook.getProperties().getCustomProperties();

            IntStream.range(0, 10000).forEach(i -> {
                long ts = System.currentTimeMillis();
                properties.addProperty("property" + i, "value" + 1);
                System.out.println(i + " time taken " +
(System.currentTimeMillis() - ts));
            });
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

You will see that time is constantly increasing. The investigation led me to
conclude that the cause is the call to 'nextPid()' method in 'CustomProperties'
class (defined in 'POIXMLProperties'). If you notice, every time the method is
called, it iterates over the whole structure to find out which is the latest
pid. In our case, since we do the changes sequentially, we have implemented a
workaround in our code base to cache pid, in more general case I am not sure
what should be a solution (if any), but performance after some point
deteriorates so much that it makes application unusable.

Best regards,
Serhiy.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63774] Custom properties "add" method is slow

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63774

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #1 from PJ Fanning <fa...@yahoo.com> ---
Added a solution to track last Pid using r1867597

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63774] Custom properties "add" method is slow

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63774

--- Comment #2 from serhiy <se...@cern.ch> ---
Hello, sorry for replying to a resolved issue, but I still have some concerns
about the proposed solution. I am not really familiar with XSLSX format and
therefore I do not know if "pid" must be unique or not. In case it must be
unique, it is maybe better to use AtomicInteger or something similar, as in
concurrent code execution there is a chance of two different properties ending
up with same pid, otherwise ignore this comment. The second concern I have is
"if(contains(name))" in "add" method, according to my experimentation, this
method is also slow and leads to a bad performance after reaching around 2000
entries in Custom Properties. Therefore I am wondering if the names should be
also cached into a temporary set to improve performance. Sorry once again for
bumping the issue. Best regards, Serhiy.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63774] Custom properties "add" method is slow

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63774

--- Comment #4 from PJ Fanning <fa...@yahoo.com> ---
should be in 4.1.1

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 63774] Custom properties "add" method is slow

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=63774

--- Comment #3 from dimatk <di...@gmail.com> ---
In which release are these changes?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org