You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2009/11/26 16:10:28 UTC

DO NOT REPLY [Bug 48292] New: [PATCH] Support of array formulas

https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

           Summary: [PATCH] Support of array formulas
           Product: POI
           Version: unspecified
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
        AssignedTo: dev@poi.apache.org
        ReportedBy: Petr.Udalau@exigenservices.com


Created an attachment (id=24623)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24623)
Set/remove array formulas patch.

There is first patch with implementation of set/remove array formulas.

In the nearest future I will prepare patches with evaluation of array formulas
and modifications for functions that supports array formulas and tests for all.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #1 from Josh Micich <jo...@gildedtree.com> 2009-11-30 16:25:38 UTC ---
This was originally attachment ID 24632 that was lost as part of the
issues.apache.org data loss on 2009-11-26/27.

Note that this attachment ID has since been re-used.

The original attachment comment was:
Some improvements to the original patch

Thanks for your work on array formulas.  This will be an important addition,
since this area is barely covered by POI at the moment (as I'm sure you're
aware).

There are a few issues with the rest of the patch, so I have re-worked what I
can and attached a new draft of the patch (based on the latest svn r885007).
Some of my fixes might be wrong, so feel free to change anything to improve the
patch.

A very small amount of your first patch was applied in svn r885006.

The biggest outstanding concern is the lack of test cases.  Please write
something that calls the new top-level APIs.  Try to make the tests execute the
newly changed/added code in all the lower classes.

In XSSFCell you wrote "if (isPartOfArrayFormulaGroup() && f == null)"
suggesting that it is possible for "isPartOfArrayFormulaGroup() to return true
with f!=null.  It seems more logical that this condition would represent a
corrupted spreadsheet.

According to OOO docs the formula parsing rules are slightly different for
array formulas.  You should use FormulaType.ARRAY, and we'll probably need to
update FormulaParser accordingly.

ConstantValueParser - it seems that there are now two alternate ways to encode
array element values of type text (with String and UnicodeString).  It would be
much better if such an ambiguity was not introduced into the code.  In
addition, UnicodeStrings can contain a list of FormattingRuns which is
something that is not applicable to array formulas (as far as I understand).
According to the current junit tests, this code is not executed anyway.

FormulaRecordAggregate - field _sharedValueManager can never be null so there
is no need to check for that.  In general don't write null checks for things
that are asserted or documented as being never null - it raises doubts as to
whether the check might really be required.

In CellRangeAddress.valueOf() you added code which presumably allows one to
write valueOf("A") instead of valueOf("A:A") but it is not clear how this is
useful or relevant.

Try to keep classes encapsulated - for example the ArrayRecords list should not
spill out of SharedValueManager, and SharedValueManager should not spill out of
FormulaRecordAggregate.

Please don't use your own internal user-id for @author tags.  Full name in
plain text is best.

I also changed some method names and javadoc on the public API for clarity.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #2 from Josh Micich <jo...@gildedtree.com> 2009-11-30 18:39:51 UTC ---
Created an attachment (id=24645)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24645)
reworked patch

The patch I previously attached doesn't seem to be linked here anymore, so I am
re-submitting it.

This new version of the patch has been rebased to svn r885585.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Josh Micich <jo...@gildedtree.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24645|0                           |1
        is obsolete|                            |
  Attachment #24658|0                           |1
        is obsolete|                            |

--- Comment #6 from Josh Micich <jo...@gildedtree.com> 2009-12-09 00:16:06 UTC ---
Created an attachment (id=24682)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24682)
another draft of the patch

Thanks for all your work so far but I'm afraid it's going to take a few more
iterations before we are able to commit this patch.  Please take a look at my
latest reworked version. I have made several changes which are described below.
 There are still outstanding issues that need to be addressed.  A very small
amount of work has been submitted (svn r888577, svn r888582, svn r888665). 
These changes don't add any functionality; they just resolve existing issues
and make the pending patch simpler.

Firstly, I've added Apache License notices to each new file.  It's important
that you agree to this, because we cannot take code contributions without it.
The tests you provided were in a package "com.exigen...". These have been moved
to the appropriate POI packages.
The tests needed to be converted to junit 3.8 because that's what POI uses
(changing junit versions is probably something we'd decide to do project wide).
 The test classes were re-named to follow POI conventions, though the names
could still be more descriptive.
POI expects tests to be silent (not write to std-out or any log) when
successful.
Test code should not throw exceptions to be caught by other test code during
successful tests.  The only exceptions that should be thrown are ones
deliberately caused in the production code being tested.
One test method for every example spreadsheet cell is probably too many.  Some
simplification was done, but consider using techniques like those in
TestFormulasFromSpreadsheet and TestIndexFunctionFromSpreadsheet etc.

General coding/formatting guidelines:
  - please use unix line endings (LF instead of CRLF)
  - don't mix indent style (tabs or spaces) in the same file
  - always use {} for the bodies of if/else for/while statements
  - don't assign to parameters
  - turn on IDE warnings for unnecessary code

I made significant simplifications to ArrayEval (while keeping all your junit
tests working).
I got rid of ArrayEval.illegalForAggregation because there is a much simpler
way to get #N/A out of formulas with array size errors:  The #N/A value
originates in ArrayFormulaEvaluatorHelper.getArrayValue() (new method), and
just flows up the evaluation stack through standard techniques.
The method ArrayEval.arrayAsArea() is also not needed (due to introduction of
TwoDEval).  TwoDEval simplifies a lot of the code you needed to touch in
existing function implementations.

There are also some problems in your interpretation of how array formulas work.
 I am pretty sure that at least one of the test cases is asserting the wrong
result (testOffsetArray() - ConstantArray.xls C149).  I have added a new test
class TestArrayEvaluationExtra which shows what the correct behaviour should
be.  The new failing test cases are currently disabled.  While investigating, I
have found out that array eval elements cannot be cell references or areas.  So
in this example, OFFSET cannot return an array of cell references.  I've
disabled ArrayEval from doing this, and made other fixes to keep all test cases
working.  The solution will probably involve getting the RVA types of the
tokens correct during the formula parsing phase.  Read the OOO documentation
for an introduction to this.  Unfortunately I think that information is
incomplete, but it is the best resource I know of.  We will probably have to
discover all the missing details for ourselves.

I also found that using the "Evaluate Formula" button (available in Excel 2007)
helped me understand how some of the example formulas should be evaluated.  

The short-circuit-if evaluation stuff is probably not needed.  The first
parameter to IF has type 'V' and I have a feeling that any parameter of type
'V' should always have a scalar ValueEval passed at runtime.  If you pass an
array for a parameter of type 'V' then the whole function gets invoked once for
each item in the array.  You have already written some code (prepareArgsForLoop
etc) that does this.  It's just a question of having it be invoked at the right
time.

There are a few other problems in the code:  Don't instantiate ArrayEval until
you have its contents calculated.   (i.e. don't create half-initialised
objects).  FunctionWithArraySupport.supportArray() looks like it is needed for
answering questions about function parameters which are probably resolvable by
the existing FunctionMetadata.  Similar problems with ArrayMode - it's probably
providing an incomplete solution to something that can perhaps be solved better
using proper RVA token classification.


Please review the re-worked patch and address these issues as best you can.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #19 from Dugas du Villard <cd...@calasys.fr> ---
Hi,

I'm sorry to bother you with this functionnality.
But I realy need it in my project.

What's the news?
Is there a patch or some code I can add in my soft to fix it without waiting
for a new release?

Thanks.
Christophe

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #14 from Dugas du Villard <cd...@calasys.fr> 2011-12-08 15:20:07 UTC ---
Created attachment 28053
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28053
File from Excel 2003 that can not be re-save using POI due to Array Formula In
Cell B37

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #16 from Dugas du Villard <cd...@calasys.fr> 2011-12-09 07:31:41 UTC ---
Comment on attachment 28053
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28053
File from Excel 2003 that can not be re-save using POI due to Array Formula In
Cell B37

Hi,

I have a problem with POI due probably to array formula. It's why I'm posting
the probleme in this bug.

I have attach 2 xls files. One contain array formula (in cell B37..), the other
not. 
I get an error "java.lang.RuntimeException: Failed to find a matching shared
formula record" when I save the file containing array formula using POI
(method: workbook.write) If I save the file using Excel 2010, and then pass it
to POI, it's ok. But if it is saved from Excel 2003, there is a problem.

Is there a fix for this? Is it link to the bug describe here?

Thanks.
Christophe.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #21 from coppertan.au@gmail.com ---
Hi Dugas,

  Did you try applying the patch successfully?  If so, which version of POI did
you use?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #20 from Yegor Kozlov <ye...@dinom.ru> ---
> 
> What's the news?

there is no news which means that the patch is not applied.

The patch needs some work and no one yet volunteered to work on it. 

> Is there a patch or some code I can add in my soft to fix it without waiting
> for a new release?
> 

The patch is attached, go ahead and apply it to trunk. 

Yegor

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #9 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-18 06:38:37 UTC ---
Created an attachment (id=24731)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24731)
last reworked patch

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Petr.Udalau <Pe...@exigenservices.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24623|0                           |1
        is obsolete|                            |

--- Comment #4 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-02 05:54:26 UTC ---
Created an attachment (id=24658)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24658)
Final patch

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Petr.Udalau <Pe...@exigenservices.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24731|0                           |1
        is obsolete|                            |
  Attachment #24732|0                           |1
        is obsolete|                            |

--- Comment #11 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-21 07:36:15 UTC ---
Created an attachment (id=24746)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24746)
last reworked patch with test data

Updated INDEX function and restructured tests.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #12 from Petr.Udalau <Pe...@exigenservices.com> 2010-08-06 09:59:48 EDT ---
Hi,
Our team have made investigation and implement new functionality before. And we
need response about our patch with array formulas in POI. We are waiting since
2009 year. So what about that?

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Josh Micich <jo...@gildedtree.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24659|0                           |1
        is obsolete|                            |

--- Comment #7 from Josh Micich <jo...@gildedtree.com> 2009-12-09 00:20:25 UTC ---
Created an attachment (id=24683)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24683)
added test files

The zip contains 6 files destined for test-data/spreadsheet/ (the standard
location for test sample spreadsheet files).

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #1 from Josh Micich <jo...@gildedtree.com> 2009-11-27 14:35:52 UTC ---
Created an attachment (id=24632)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24632)
Some improvements to the original patch

Thanks for your work on array formulas.  This will be an important addition,
since this area is barely covered by POI at the moment (as I'm sure you're
aware).

There are a few issues with the rest of the patch, so I have re-worked what I
can and attached a new draft of the patch (based on the latest svn r885007). 
Some of my fixes might be wrong, so feel free to change anything to improve the
patch.

A very small amount of your first patch was applied in svn r885006.

The biggest outstanding concern is the lack of test cases.  Please write
something that calls the new top-level APIs.  Try to make the tests execute the
newly changed/added code in all the lower classes. 

In XSSFCell you wrote "if (isPartOfArrayFormulaGroup() && f == null)"
suggesting that it is possible for "isPartOfArrayFormulaGroup() to return true
with f!=null.  It seems more logical that this condition would represent a
corrupted spreadsheet.

According to OOO docs the formula parsing rules are slightly different for
array formulas.  You should use FormulaType.ARRAY, and we'll probably need to
update FormulaParser accordingly.

ConstantValueParser - it seems that there are now two alternate ways to encode
array element values of type text (with String and UnicodeString).  It would be
much better if such an ambiguity was not introduced into the code.  In
addition, UnicodeStrings can contain a list of FormattingRuns which is
something that is not applicable to array formulas (as far as I understand). 
According to the current junit tests, this code is not executed anyway.

FormulaRecordAggregate - field _sharedValueManager can never be null so there
is no need to check for that.  In general don't write null checks for things
that are asserted or documented as being never null - it raises doubts as to
whether the check might really be required.

In CellRangeAddress.valueOf() you added code which presumably allows one to
write valueOf("A") instead of valueOf("A:A") but it is not clear how this is
useful or relevant.

Try to keep classes encapsulated - for example the ArrayRecords list should not
spill out of SharedValueManager, and SharedValueManager should not spill out of
FormulaRecordAggregate.

Please don't use your own internal user-id for @author tags.  Full name in
plain text is best.

I also changed some method names and javadoc on the public API for clarity.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #10 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-18 06:39:18 UTC ---
Created an attachment (id=24732)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24732)
new test data

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #15 from Dugas du Villard <cd...@calasys.fr> 2011-12-08 15:21:25 UTC ---
Created attachment 28054
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28054
28053: File from Excel 2003 that can be re-save using POI No Array Formula in
this one

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Alexander <ho...@uni-wuerzburg.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hoernlein@uni-wuerzburg.de

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #13 from Yegor Kozlov <ye...@dinom.ru> 2010-08-09 06:10:05 EDT ---

Petr,

Thanks again for your contribution. The patch was partially applied in Dec 2009
but there are still issues to address. I think we will need at least one
iteration to sort them out. 

The patch consists of two big parts that were splitted between me and Josh
Micich.

 (1) usermodel API for arrays formulas (me). It is way more complex than just 
setter methods. The API should be consistent and mimic Excel - it should block
changing cells included in multi-cell arrays, prevent data corruption, etc. 
This part is mostly done, see the history in r894469, r893870 and r893625. 

Most of usermodel tests are consolidated in
org.apache.poi.ss.usermodel.BaseTestSheetUpdateArrayFormulas which is used both
by HSSF and XSSF. 

 (2) evaluation of array formulas (Josh). This is the hardest part and it is
still in progress. Let's wait feedback from Josh on how much remains to be
done. 

Regards,
Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #8 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-18 06:36:34 UTC ---
Josh,
We appreciate your time and effort to integrate our stuff.

>Thanks for all your work so far but I'm afraid it's going to take a few more
>iterations before we are able to commit this patch.
We understand it and ready to continue our work to match your requirements.


>The tests needed to be converted to junit 3.8 because that's what POI uses
>(changing junit versions is probably something we'd decide to do project wide).
> The test classes were re-named to follow POI conventions, though the names
>could still be more descriptive.
>POI expects tests to be silent (not write to std-out or any log) when
>successful.
>Test code should not throw exceptions to be caught by other test code during
>successful tests.  The only exceptions that should be thrown are ones
>deliberately caused in the production code being tested.
>One test method for every example spreadsheet cell is probably too many.  Some
>simplification was done, but consider using techniques like those in
>TestFormulasFromSpreadsheet and TestIndexFunctionFromSpreadsheet etc.
We reworked our tests to fit your requirements.

>General coding/formatting guidelines:
>  - please use unix line endings (LF instead of CRLF)
>  - don't mix indent style (tabs or spaces) in the same file
>  - always use {} for the bodies of if/else for/while statements
>  - don't assign to parameters
>  - turn on IDE warnings for unnecessary code
We are in process of improving our code.

>I made significant simplifications to ArrayEval (while keeping all your junit
>tests working).
>I got rid of ArrayEval.illegalForAggregation because there is a much simpler
>way to get #N/A out of formulas with array size errors:  The #N/A value
>originates in ArrayFormulaEvaluatorHelper.getArrayValue() (new method), and
>just flows up the evaluation stack through standard techniques.
>The method ArrayEval.arrayAsArea() is also not needed (due to introduction of
>TwoDEval).  TwoDEval simplifies a lot of the code you needed to touch in
>existing function implementations.
The above is reasonable improvement. Thanks.

>There are also some problems in your interpretation of how array formulas work.
> I am pretty sure that at least one of the test cases is asserting the wrong
>result (testOffsetArray() - ConstantArray.xls C149).  I have added a new test
>class TestArrayEvaluationExtra which shows what the correct behaviour should
>be.  The new failing test cases are currently disabled.  While investigating, I
>have found out that array eval elements cannot be cell references or areas.  So
>in this example, OFFSET cannot return an array of cell references.  I've
>disabled ArrayEval from doing this, and made other fixes to keep all test cases
>working.  The solution will probably involve getting the RVA types of the
>tokens correct during the formula parsing phase.  Read the OOO documentation
>for an introduction to this.  Unfortunately I think that information is
>incomplete, but it is the best resource I know of.  We will probably have to
>discover all the missing details for ourselves.
The issue you have raised is really exists. We spent quite a time researching
it. 
1.    Behavior of function chain based on array of references (could be return
by function offset(), index() etc.) is different in Excel 2003 and Excel 2007
(I mean engines, not formats). For example
=SUM(SUM((OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))) returns 5 in Excel 2003
and #N/A in Excel 2007. In excel 2003
=average((OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})) returns 4 but
=sum(average((OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))) returns 2.5. We think
that reason is in inconsistency of excel concepts. We decided to stick to Excel
2007 logic. First of all we assume that offset() returns array of references –
if create an array formula {=OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})} on two
cells, it would fill it with 4 and 1 correspondingly. The behavior we should
match is the following one: 
a.    =SUM(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))   ---->    4
b.    =SUM(SUM(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})))   ---->    #N/A!
(It is very strange result, result of functions have to depend on only args)
c.    =SQRT(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))  ---->    2
d.    =SUM(SQRT(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})))  ------->  #Value!
e.    =SQRT(SUM(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))) --------->  2
f.    =SUM(SQRT(SUM(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})))) -------->
#N/A!
2.    The logic above mathematically seems strange but need to emulate it. To
do this we add componentError as element on ArrayEval which contains „future”
error to be exposed by aggregation function used second time.
3.    We consider ArrayEval as „universal” container which is transfered via
evaluation stack and need to contain results of any types including references.
Thus OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1})) returns array of references
(ArrayEval contains AreaEvals) and {OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))}
on two cells returns 4 and 1. We have removed your validation from ArrayEval.


>I also found that using the "Evaluate Formula" button (available in Excel 2007)
>helped me understand how some of the example formulas should be evaluated.  
Thanks for the hint

>The short-circuit-if evaluation stuff is probably not needed.  The first
>parameter to IF has type 'V' and I have a feeling that any parameter of type
>'V' should always have a scalar ValueEval passed at runtime.  If you pass an
>array for a parameter of type 'V' then the whole function gets invoked once for
>each item in the array.  You have already written some code (prepareArgsForLoop
>etc) that does this.  It's just a question of having it be invoked at the right
>time.
We need this change because IF first parameter is array. In this case IF
optimization not always accepatble. We do iteration over array but in
non-optimized manner.

>There are a few other problems in the code:  Don't instantiate ArrayEval until
you have its contents calculated.   (i.e. don't create half-initialised
objects).
We’ll look at it.
>  FunctionWithArraySupport.supportArray() looks like it is needed for
>answering questions about function parameters which are probably resolvable by
>the existing FunctionMetadata.  Similar problems with ArrayMode - it's probably
>providing an incomplete solution to something that can perhaps be solved better
>using proper RVA token classification.
Our investigation showed that RVA types from FunctionMetadata is not enough for
decision if function accept array as argument or we need to iterate over it. V
and A types are OK but problem is with R type. For example for IF function is
 1      IF      2       3       R       V R R          
But function logic requires second/third parameters needed to be iterated (as
scalar values). The same is true for COLUMN, CHOOSE, VLOOKUP, HLOOKUP. Thus,
for the moment we stick to our approach.
There is some unnecessary code which is commented. This code is represents that
specificaiotns differs from our interfaces.

There is also problem with determining return type of formula for example for
formulas:
=SUM(IF(A67:B68>2;A67:B68;1))
=SUM(OFFSET(A149:B150,{1,0},{1,0},{1,1},{1,1}))

OFFSET    metadata
78    OFFSET    3    5    R    R V V V V    x

According to the specification: arguments that marked "V" in specification
represented like an array and "R"-arguments represented like an area eval.
Return type of functions is "R" in specification but OFFSET returns single
value for SUM(by preparing args for one itaration of loop) and IF returns array
for SUM.

So, we make one-week investigation and you can see result. Our patch is raw but
set/remove functionality is done, there is some problems with evaluation but we
can't find right specification how to evaluate formulas(and even Excel have
problems with evaluations of array formulas)

We will waiting for your response and your point of view on the problem of
array formulas evaluation in Excel. We have to determine way of evaluation in
POI: we have to imitate Excel calculations or find some rules of calculations
of formulas(may be using RVA specification).

Regards.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #3 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-02 05:53:11 UTC ---
Thanks for reworked patch.
We have made final patch of array formula support and we will waiting for your
review.

Notes:

1) About "if (isPartOfArrayFormulaGroup() && f == null)" in XSSFCell:
I saved Excel document with array formula and when I had opened the xml of
sheet I saw that formula was set only for the first cell of array formula range
and other cells contained only calculated result.

2) About parsing of array formulas:
I changed OperandClassTransformer. But may be It is not solution for this
problem.

3) About CellRangeAddress.valueOf():
This is the case when the array formula is set into one cell. Then the Excel
save the "ref" param of formula like a "A1"(It is only for one-cell array
formula, when we set array formula into several cells "ref" param will contain
range like "A1:B1").

4) About evaluation of array formulas:
There are two types of array formulas:
  1.Some formulas (for example "SQRT({1,2;3,4})") can take array arguments and
have to be evaluated in loop with different args like a simple function.
  2.Functions that have to be evaluated in special mode when it is in array
formula.

5) About optimized CHOOSE and IF:
Optimization IF/ Choose  was made by analyze of  first argument and evaluated
only expression, represented by second argument in true case or by third
argument in false case.
In Array Formula Context first argument may be Boolean array which contains:
-only true values
-only false values
-mixed true and false values.
Method checkBooleanContent() provides analyze, which case we have.
Fist two cases can be optimized in the same manner as usual IF/Choose function.
Third case does not allow such optimization and requires evaluating both
expression and only then  choosing right value from evaluated arrays according 
Boolean array value. It means that we need to restore “non optimized” way to
evaluate such IF/Choose function. Restore of “non optimized” way lead to
ignoring of AttrPtg elements, which provide skipping no needed evaluation
during optimized way.

Due to  that IF/Choose function may be nested,   decision of optimized/non
optimized  is concern only for current IF/Choose function, need to keep in such
function stack and be restored after  finishing of  evaluating current
IF/Choose.

6) There will suitable to have common interface for Function and
FreeRefFunction.

Do you have any ideas and proposals?

P.S. "test/resources" must be a source folder to run tests. TestArrays - main
test.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

Yegor Kozlov <ye...@dinom.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |46989

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #22 from Dan Peebles <pu...@me.com> ---
Any updates? Not having this prevents me from using POI usefully, and it seems
like a pity to let all this effort languish for so long (~5 years now?)

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #17 from Dugas du Villard <cd...@calasys.fr> 2012-03-08 09:06:06 UTC ---
Hi,

What's up about this bug?

Thanks.
Christophe.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #18 from Yegor Kozlov <ye...@dinom.ru> 2012-03-11 07:26:55 UTC ---
Unfortunately this patch is still not applied. I hope to find time to check it
in in the next POI-3.9 cycle.  

Yegor

(In reply to comment #17)
> Hi,
> 
> What's up about this bug?
> 
> Thanks.
> Christophe.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #23 from Yegor Kozlov <ye...@dinom.ru> ---
(In reply to Dan Peebles from comment #22)
> Any updates? Not having this prevents me from using POI usefully, and it
> seems like a pity to let all this effort languish for so long (~5 years now?)

unfortunately it is not that easy.

The patch in its current form is not applicable. It breaks unit tests and
violates the API. I tried to adapt it but had to give up because of lack of
time. We need someone to work it, otherwise it will stay in Bugzilla for good. 
If you are willing to contribute then please do work on it!

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 48292] [PATCH] Support of array formulas

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48292

--- Comment #5 from Petr.Udalau <Pe...@exigenservices.com> 2009-12-02 05:55:21 UTC ---
Created an attachment (id=24659)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24659)
Tests for array formulas

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org