You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2016/01/18 15:43:47 UTC

[Bug 58885] New: Performance regression on addition of merged regions

https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

            Bug ID: 58885
           Summary: Performance regression on addition of merged regions
           Product: POI
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: SS Common
          Assignee: dev@poi.apache.org
          Reporter: dtn-asfbugs@corefiling.co.uk

We regressed on performance after fixing bug 58443: adding each merged region
is now linear in the number of merged regions which already exist, which showed
up as quite a sharp performance regression in one of our apps.

See bug 58443 comment 3.

We either need to speed this up or allow bypassing the validation for
applications which are confident they're doing it right.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

--- Comment #1 from Javen O'Neal <on...@apache.org> ---
Would Sheet.addMergedRegion(CellRangeAddress region, boolean
checkForOverlapping) (not implemented) solve this problem?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

Javen O'Neal <on...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|SS Common                   |XSSF

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

--- Comment #2 from Javen O'Neal <on...@apache.org> ---
Adding a merged region *safely* (without creating a potentially corrupted
workbook) is inherently a O(n) operation, as it need to check all merged
regions in the sheet for potential overlap.

A merged region could be added unsafely (potential workbook corruption), but
this should not be the default behavior.

If adding multiple merged regions, it still requires O(n) time for each merged
region addition. The checking could be deferred until a later time before
saving the workbook, but wouldn't save any CPU cycles unless merged regions
were added and removed between opening and saving the workbook.

> which showed up as quite a sharp performance regression in one of our apps

Could I get a rough number of merged regions in a sheet so I could profile
this? I don't think there's much we can do to make the code faster, so we'll
probably need to provide a method that bypasses validation. To make it crystal
clear that using the bypassed-validation option can produce a corrupt workbook,
I was thinking of calling it public int
Sheet.addMergedRegionUnsafe(CellRangeAddress region), as the signature from
comment 1 does not stress the consequences of checkForOverlapping=False.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

--- Comment #3 from David North <dt...@corefiling.co.uk> ---
Thanks Javen.

I agree blindly adding merged regions without the guards is an inherently
unsafe operation, but in our case we're authoring the workbook from scratch in
one run, so I think it's reasonable to give us API to say "we're confident
we're adding non-overlapping regions".

In terms of API, we could perhaps call the alternative method
addMergedRegionUnsafe(CellRangeAddress region) or use an enum as an alternative
to the boolean.

The number of merged regions on the sheet in question is 23,002 - which I'll
admit is quite a lot!

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

Javen O'Neal <on...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |59212

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 58885] Performance regression on addition of merged regions

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58885

Javen O'Neal <on...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Javen O'Neal <on...@apache.org> ---
Fixed in r1727776. Use XSSFSheet.addMergedRegionsUnsafe(CellRangeAddress) if
you want to add a merged region without incurring the cost of checking for
potential intersecting regions. XSSFSheet.validateMergedRegions can be called
after the fact if you want to defer the validation.

Updated changelog in r1727779.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org