You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@openoffice.apache.org by Lei Wang <le...@apache.org> on 2012/07/02 10:50:44 UTC

[Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Hi all,

There is an old bug i65221(
https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM,
... with empty cells and "" and ="" empty string criteria.

There is some discussion about it, but no conclusion. Some people think
Excel does not do well, so simply mimicking Excel behavior might not be the
best idea. Some people think self-consistency in Calc is more important.
Some people think compatibility with Excel is a huge concern.

IMHO, compatible with Excel is very important. I suggest changing these
formulas behavior according to Excel. Following is my consideration
1)Most people use Excel, compatible with Excel is very important. Excel
does not show self-consistency in some scenarios, Calc doe not show
self-consistency either.
2)Some frequently used function, such as calculating empty cells, can not
be implemented by Calc, while Excel can.
3)Although Excel does not show self-consistency in some scenarios. These
scenarios are all related with pure empty cell. Excel use following rules
  When empty cell is criteria, only cells with value 0 meet it. This is
true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
  When an empty cell in a cell range, it can meet empty string criteria.
This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.

Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by Oliver-Rainer Wittmann <or...@googlemail.com>.
Hi,

[resend to ooo-dev]

On 02.07.2012 10:50, Lei Wang wrote:
> Hi all,
>
> There is an old bug i65221(
> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM,
> ... with empty cells and "" and ="" empty string criteria.
>
> There is some discussion about it, but no conclusion. Some people think
> Excel does not do well, so simply mimicking Excel behavior might not be the
> best idea. Some people think self-consistency in Calc is more important.
> Some people think compatibility with Excel is a huge concern.
>
> IMHO, compatible with Excel is very important. I suggest changing these
> formulas behavior according to Excel. Following is my consideration
> 1)Most people use Excel, compatible with Excel is very important. Excel
> does not show self-consistency in some scenarios, Calc doe not show
> self-consistency either.
> 2)Some frequently used function, such as calculating empty cells, can not
> be implemented by Calc, while Excel can.
> 3)Although Excel does not show self-consistency in some scenarios. These
> scenarios are all related with pure empty cell. Excel use following rules
>    When empty cell is criteria, only cells with value 0 meet it. This is
> true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
>    When an empty cell in a cell range, it can meet empty string criteria.
> This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.
>

first, I am not familar with spreadsheet formulas.

I agree that interoperability with Microsoft Excel is important. But also 
important and from my point of view even more important is the conformance to 
ODF 1.2.
I do not know, if the needed changes would cause an unconformance with ODF 1.2. 
If yes, things will be complicated.

just my thoughts from an ODF perspective.


Best regards, Oliver.

Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by John Hart <jh...@testra.com>.
What would excite me would be an option to vector cell functions to user 
code and translate existing functions into macros. This would make 
compatibility a user solvable problem and open up new markets not served 
by MS. such as PCB design with a spread sheet. I have the macros.



On 7/2/2012 2:53 AM, Oliver-Rainer Wittmann wrote:
> Hi,
>
> On 02.07.2012 10:50, Lei Wang wrote:
>> Hi all,
>>
>> There is an old bug i65221(
>> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
>> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, 
>> DSUM,
>> ... with empty cells and "" and ="" empty string criteria.
>>
>> There is some discussion about it, but no conclusion. Some people think
>> Excel does not do well, so simply mimicking Excel behavior might not 
>> be the
>> best idea. Some people think self-consistency in Calc is more important.
>> Some people think compatibility with Excel is a huge concern.
>>
>> IMHO, compatible with Excel is very important. I suggest changing these
>> formulas behavior according to Excel. Following is my consideration
>> 1)Most people use Excel, compatible with Excel is very important. Excel
>> does not show self-consistency in some scenarios, Calc doe not show
>> self-consistency either.
>> 2)Some frequently used function, such as calculating empty cells, can 
>> not
>> be implemented by Calc, while Excel can.
>> 3)Although Excel does not show self-consistency in some scenarios. These
>> scenarios are all related with pure empty cell. Excel use following 
>> rules
>>    When empty cell is criteria, only cells with value 0 meet it. This is
>> true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
>>    When an empty cell in a cell range, it can meet empty string 
>> criteria.
>> This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.
>>
>
> first, I am not familar with spreadsheet formulas.
>
> I agree that interoperability with Microsoft Excel is important. But 
> also important and from my point of view even more important is the 
> conformance to ODF 1.2.
> I do not know, if the needed changes would cause an unconformance with 
> ODF 1.2. If yes, things will be complicated.
>
> just my thoughts from an ODF perspective.
>
>
> Best regards, Oliver.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
> For additional commands, e-mail: ooo-users-help@incubator.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
For additional commands, e-mail: ooo-users-help@incubator.apache.org


RE: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by "Dennis E. Hamilton" <de...@acm.org>.
+1 with primary concern being compatibility with OpenFormula and ODF 1.2.

The relevant documents are here <http://docs.oasis-open.org/office/v1.2/os/>.  The OpenFormula specification is in Part 2, OpenDocument-v1.2-os-part2.*, where the .odf and .pdf are the most usable.

It appears that there is ambiguity with regard to comparisons when the type of a scalar is not clear (e.g., a comparison where a cell is empty and it is not clear whether it should be understood as a number (0) or a string ("") in processing the condition.  For some comparison operations, there are implementation-dependent cases for mismatched types.

On the other hand, OpenFormula section 4.11.8 Criterion may cover all the cases in question for SUMIF and COUNTIF. I've added some cases to check in a comment on the old issue.

 - Dennis

-----Original Message-----
From: Oliver-Rainer Wittmann [mailto:orwittmann@googlemail.com] 
Sent: Monday, July 02, 2012 02:54
To: ooo-users@incubator.apache.org
Subject: Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Hi,

On 02.07.2012 10:50, Lei Wang wrote:
> Hi all,
>
> There is an old bug i65221(
> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM,
> ... with empty cells and "" and ="" empty string criteria.
>
> There is some discussion about it, but no conclusion. Some people think
> Excel does not do well, so simply mimicking Excel behavior might not be the
> best idea. Some people think self-consistency in Calc is more important.
> Some people think compatibility with Excel is a huge concern.
>
> IMHO, compatible with Excel is very important. I suggest changing these
> formulas behavior according to Excel. Following is my consideration
> 1)Most people use Excel, compatible with Excel is very important. Excel
> does not show self-consistency in some scenarios, Calc doe not show
> self-consistency either.
> 2)Some frequently used function, such as calculating empty cells, can not
> be implemented by Calc, while Excel can.
> 3)Although Excel does not show self-consistency in some scenarios. These
> scenarios are all related with pure empty cell. Excel use following rules
>    When empty cell is criteria, only cells with value 0 meet it. This is
> true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
>    When an empty cell in a cell range, it can meet empty string criteria.
> This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.
>

first, I am not familar with spreadsheet formulas.

I agree that interoperability with Microsoft Excel is important. But also 
important and from my point of view even more important is the conformance to 
ODF 1.2.
I do not know, if the needed changes would cause an unconformance with ODF 1.2. 
If yes, things will be complicated.

just my thoughts from an ODF perspective.


Best regards, Oliver.

---------------------------------------------------------------------
To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
For additional commands, e-mail: ooo-users-help@incubator.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
For additional commands, e-mail: ooo-users-help@incubator.apache.org


Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by Oliver-Rainer Wittmann <or...@googlemail.com>.
Hi,

On 02.07.2012 10:50, Lei Wang wrote:
> Hi all,
>
> There is an old bug i65221(
> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM,
> ... with empty cells and "" and ="" empty string criteria.
>
> There is some discussion about it, but no conclusion. Some people think
> Excel does not do well, so simply mimicking Excel behavior might not be the
> best idea. Some people think self-consistency in Calc is more important.
> Some people think compatibility with Excel is a huge concern.
>
> IMHO, compatible with Excel is very important. I suggest changing these
> formulas behavior according to Excel. Following is my consideration
> 1)Most people use Excel, compatible with Excel is very important. Excel
> does not show self-consistency in some scenarios, Calc doe not show
> self-consistency either.
> 2)Some frequently used function, such as calculating empty cells, can not
> be implemented by Calc, while Excel can.
> 3)Although Excel does not show self-consistency in some scenarios. These
> scenarios are all related with pure empty cell. Excel use following rules
>    When empty cell is criteria, only cells with value 0 meet it. This is
> true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
>    When an empty cell in a cell range, it can meet empty string criteria.
> This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.
>

first, I am not familar with spreadsheet formulas.

I agree that interoperability with Microsoft Excel is important. But also 
important and from my point of view even more important is the conformance to 
ODF 1.2.
I do not know, if the needed changes would cause an unconformance with ODF 1.2. 
If yes, things will be complicated.

just my thoughts from an ODF perspective.


Best regards, Oliver.

---------------------------------------------------------------------
To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
For additional commands, e-mail: ooo-users-help@incubator.apache.org


Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by Regina Henschel <rb...@t-online.de>.
Hi Lei Wang,

I think, that ODF itself is ambiguous and not clear and have written my 
concerns to office-comment
https://lists.oasis-open.org/archives/office-comment/201207/maillist.html

I think AOO should always calculate as defined in ODF. If Excel 
calculates another way, it is possible to define a new function with an 
postfix _add and use it for exchange with Excel. That is done for a lot 
of other functions already.

Kind regards
Regina

Lei Wang schrieb:
> @Dennis,
>
> I agree that conforming to ODFF is first goal. I add comments and
> attachment in https://issues.apache.org/ooo/show_bug.cgi?id=65221
>>From my understanding of ODFF, I give ODFF expect result for some formula
> related with empty cell in the attachment(xls). If there is any mistake,
> please comments.
>
> Open this file in AOO3.4/LO3.5, 16 cases fail.
> There are:
>
> Criterion in COUNTIF/SUMIF
>                             ODFF              AOO3.4/LO3.5
>   criterion as "="     empty cell        empty string cell
>   criterion as "<>"    not empty cell        not empty cell and not empty
> string cell
>   criterion as "<>0"    any cell except        any cell except 0, not
> including empty cell
>                               value 0, including
>                                empty cell
>
> Criterion in criteria for database formula, such as DSUM, DCOUNT
>                                ODFF            AOO3.4/LO3.5
>   criterion is empty    value 0            all cells
>   cell
>   criterion is empty    empty string        all cells
>   string
>   criterion as "="     empty cell        empty cell and empty string cell
>   criterion as "<>"    not empty cell        not empty cell and not empty
> string cell
>
> Also AOO3.4/LO3.5 has self-consistent problem.
> The different result for same criterion between COUNTIF/SUMIF and database
> formulu(DCOUNT/DSUM). These criterion includes:
> criterion is empty cell, criterion is empty string, criterion is "=",
> criterion is "<>0"
>
> While, open this file in Excel 2003, 8 cases fail.
> There are:
>
> Criterion in COUNTIF/SUMIF
>                                 ODFF                  Excel 2003
>   criterion is ""          empty string cell    empty string cell and empty
> cell
>   criterion is empty    empty string cell    empty string cell and empty cell
>   string
>
> Criterion in criteria for database formula, such as DSUM, DCOUNT
>                               ODFF            AOO3.4/LO3.5
>   criterion is empty    value 0            all cells
>   cell
>
> Also Excel 2003 has self-consistent problem.
> The different result for same criterion between COUNTIF/SUMIF and database
> formulu(DCOUNT/DSUM). These criterion includes:
> criterion is empty cell, criterion is empty string
>
> It seems Excel has better conformance for these test cases.
>
> Conform to ODFF is very important. Compatible with Excel is also very
> important, especially for some common used Excel scenario, such as
> COUNTIF(<refrange>, "") to count all cells displayed as empty. Do we have
> any chance to modify ODFF to solve this Excel user problem?
>
>
> On Tue, Jul 3, 2012 at 1:14 AM, Dennis E. Hamilton
> <de...@acm.org>wrote:
>
>> @Andrew,
>>
>> See latest comments on https://issues.apache.org/ooo/show_bug.cgi?id=65221
>>
>> I agree with orw that the first goal should be to align with OpenFormula,
>> and that is specific about the use of Criteria in SUMIF, COUNTIF, etc., as
>> you saw in section 4.11.8.
>>
>> In the case that there needs to be further clarification in OpenFormula,
>> comments can be submitted to the ODF TC for Errata and/or improvements in
>> the next edition, ODF 1.3.  OpenFormula was developed with cross-format
>> interoperability in mind and Microsoft participation was helpful for that.
>>
>> User Guide documentation on these cases is probably also important.
>>
>>   - Dennis
>>
>> -----Original Message-----
>> From: Andrew Pitonyak [mailto:andrew@pitonyak.org]
>> Sent: Monday, July 02, 2012 07:39
>> To: ooo-dev@incubator.apache.org
>> Subject: Re: [Calc][Discuss]Interoperability with Excel for SUMIF,
>> COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and =""
>> empty string criteria.
>>
>> On 02.07.2012 04:50, Lei Wang wrote:
>>> Hi all,
>>>
>>> There is an old bug i65221(
>>> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
>>> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT,
>>> DSUM,
>>> ... with empty cells and "" and ="" empty string criteria.
>>>
>> [ ... ]
>>
>> I am surprised that the ODF standard does not dictate how this should
>> be handled. I was under the impression that function behaviour in an ODF
>> document was well defined. I believe that Robert Weir is far more aware
>> of ODF type issues than I (since I have seen him mentioned by name on
>> similar topics) - not that I intend to unfairly pull him into this
>> discussion.
>>
>> In Version 1.2 of the part-2 for OpenFormula, I don't see it obviously
>> spelled out, but I don't really have time to extensively pursue it.
>>
>> [ ... ]
>>
>> 4.7 Empty Cell
>> An empty cell is neither zero nor the empty string, and an empty cell
>> can be distinguished from cells containing values (including zero and
>> the empty string). An empty cell is not the same as an Error, in
>> particular, it is distinguishable from the Error #N/A (not available).
>>
>> Moving on to section 4.8........
>>
>> The definitions of specific operations and functions that allow
>> references as operands and parameters stipulate any particular
>> limitations there are on forms of references and how empty cells, when
>> permitted, are interpreted.
>>
>> Now, on to 4.11.8, pretty clear here:
>>
>> 4.11.8 Criterion
>> A criterion is a single cell Reference, Number or Text. It is used in
>> comparisons with cell contents.
>> A reference to an empty cell is interpreted as the numeric value 0.
>> A matching expression can be:
>> A Number or Logical value. A matching cell content equals the Number or
>> Logical value.
>> A value beginning with a comparator (<, <=, =, >, >=, <>). 6.4.9
>> For =, if the value is empty it matches empty cells. 4.7
>> For <>, if the value is empty it matches non-empty cells.
>> For <>, if the value is not empty it matches any cell content except
>> the value, including empty cells.
>> Note: "=0" does not match empty cells.
>> For = and <>, if the value is not empty and can not be interpreted as a
>> Number type or one of its subtypes and the host-defined property
>> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, comparison is
>> against the entire cell contents, if false, comparison is against any
>> subpart of the field that matches the criteria. For = and <>, if the
>> value is not empty and can not be interpreted as a Number type or one of
>> its subtypes 3.4 applies.
>> Other Text value. If the host-defined property
>> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, the comparison is
>> against the entire cell contents, if false, comparison is against any
>> subpart of the field that matches the criteria.
>>
>> [ ... ]
>>
>>
>>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: ooo-users-unsubscribe@incubator.apache.org
For additional commands, e-mail: ooo-users-help@incubator.apache.org


Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by Lei Wang <le...@gmail.com>.
@Dennis,

I agree that conforming to ODFF is first goal. I add comments and
attachment in https://issues.apache.org/ooo/show_bug.cgi?id=65221
>From my understanding of ODFF, I give ODFF expect result for some formula
related with empty cell in the attachment(xls). If there is any mistake,
please comments.

Open this file in AOO3.4/LO3.5, 16 cases fail.
There are:

Criterion in COUNTIF/SUMIF
                           ODFF              AOO3.4/LO3.5
 criterion as "="     empty cell        empty string cell
 criterion as "<>"    not empty cell        not empty cell and not empty
string cell
 criterion as "<>0"    any cell except        any cell except 0, not
including empty cell
                             value 0, including
                              empty cell

Criterion in criteria for database formula, such as DSUM, DCOUNT
                              ODFF            AOO3.4/LO3.5
 criterion is empty    value 0            all cells
 cell
 criterion is empty    empty string        all cells
 string
 criterion as "="     empty cell        empty cell and empty string cell
 criterion as "<>"    not empty cell        not empty cell and not empty
string cell

Also AOO3.4/LO3.5 has self-consistent problem.
The different result for same criterion between COUNTIF/SUMIF and database
formulu(DCOUNT/DSUM). These criterion includes:
criterion is empty cell, criterion is empty string, criterion is "=",
criterion is "<>0"

While, open this file in Excel 2003, 8 cases fail.
There are:

Criterion in COUNTIF/SUMIF
                               ODFF                  Excel 2003
 criterion is ""          empty string cell    empty string cell and empty
cell
 criterion is empty    empty string cell    empty string cell and empty cell
 string

Criterion in criteria for database formula, such as DSUM, DCOUNT
                             ODFF            AOO3.4/LO3.5
 criterion is empty    value 0            all cells
 cell

Also Excel 2003 has self-consistent problem.
The different result for same criterion between COUNTIF/SUMIF and database
formulu(DCOUNT/DSUM). These criterion includes:
criterion is empty cell, criterion is empty string

It seems Excel has better conformance for these test cases.

Conform to ODFF is very important. Compatible with Excel is also very
important, especially for some common used Excel scenario, such as
COUNTIF(<refrange>, "") to count all cells displayed as empty. Do we have
any chance to modify ODFF to solve this Excel user problem?


On Tue, Jul 3, 2012 at 1:14 AM, Dennis E. Hamilton
<de...@acm.org>wrote:

> @Andrew,
>
> See latest comments on https://issues.apache.org/ooo/show_bug.cgi?id=65221
>
> I agree with orw that the first goal should be to align with OpenFormula,
> and that is specific about the use of Criteria in SUMIF, COUNTIF, etc., as
> you saw in section 4.11.8.
>
> In the case that there needs to be further clarification in OpenFormula,
> comments can be submitted to the ODF TC for Errata and/or improvements in
> the next edition, ODF 1.3.  OpenFormula was developed with cross-format
> interoperability in mind and Microsoft participation was helpful for that.
>
> User Guide documentation on these cases is probably also important.
>
>  - Dennis
>
> -----Original Message-----
> From: Andrew Pitonyak [mailto:andrew@pitonyak.org]
> Sent: Monday, July 02, 2012 07:39
> To: ooo-dev@incubator.apache.org
> Subject: Re: [Calc][Discuss]Interoperability with Excel for SUMIF,
> COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and =""
> empty string criteria.
>
> On 02.07.2012 04:50, Lei Wang wrote:
> > Hi all,
> >
> > There is an old bug i65221(
> > https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> > interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT,
> > DSUM,
> > ... with empty cells and "" and ="" empty string criteria.
> >
> [ ... ]
>
> I am surprised that the ODF standard does not dictate how this should
> be handled. I was under the impression that function behaviour in an ODF
> document was well defined. I believe that Robert Weir is far more aware
> of ODF type issues than I (since I have seen him mentioned by name on
> similar topics) - not that I intend to unfairly pull him into this
> discussion.
>
> In Version 1.2 of the part-2 for OpenFormula, I don't see it obviously
> spelled out, but I don't really have time to extensively pursue it.
>
> [ ... ]
>
> 4.7 Empty Cell
> An empty cell is neither zero nor the empty string, and an empty cell
> can be distinguished from cells containing values (including zero and
> the empty string). An empty cell is not the same as an Error, in
> particular, it is distinguishable from the Error #N/A (not available).
>
> Moving on to section 4.8........
>
> The definitions of specific operations and functions that allow
> references as operands and parameters stipulate any particular
> limitations there are on forms of references and how empty cells, when
> permitted, are interpreted.
>
> Now, on to 4.11.8, pretty clear here:
>
> 4.11.8 Criterion
> A criterion is a single cell Reference, Number or Text. It is used in
> comparisons with cell contents.
> A reference to an empty cell is interpreted as the numeric value 0.
> A matching expression can be:
> A Number or Logical value. A matching cell content equals the Number or
> Logical value.
> A value beginning with a comparator (<, <=, =, >, >=, <>). 6.4.9
> For =, if the value is empty it matches empty cells. 4.7
> For <>, if the value is empty it matches non-empty cells.
> For <>, if the value is not empty it matches any cell content except
> the value, including empty cells.
> Note: "=0" does not match empty cells.
> For = and <>, if the value is not empty and can not be interpreted as a
> Number type or one of its subtypes and the host-defined property
> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, comparison is
> against the entire cell contents, if false, comparison is against any
> subpart of the field that matches the criteria. For = and <>, if the
> value is not empty and can not be interpreted as a Number type or one of
> its subtypes 3.4 applies.
> Other Text value. If the host-defined property
> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, the comparison is
> against the entire cell contents, if false, comparison is against any
> subpart of the field that matches the criteria.
>
> [ ... ]
>
>
>

Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by Lei Wang <le...@gmail.com>.
@Dennis,

I agree that conforming to ODFF is first goal. I add comments and
attachment in https://issues.apache.org/ooo/show_bug.cgi?id=65221
>From my understanding of ODFF, I give ODFF expect result for some formula
related with empty cell in the attachment(xls). If there is any mistake,
please comments.

Open this file in AOO3.4/LO3.5, 16 cases fail.
There are:

Criterion in COUNTIF/SUMIF
                           ODFF              AOO3.4/LO3.5
 criterion as "="     empty cell        empty string cell
 criterion as "<>"    not empty cell        not empty cell and not empty
string cell
 criterion as "<>0"    any cell except        any cell except 0, not
including empty cell
                             value 0, including
                              empty cell

Criterion in criteria for database formula, such as DSUM, DCOUNT
                              ODFF            AOO3.4/LO3.5
 criterion is empty    value 0            all cells
 cell
 criterion is empty    empty string        all cells
 string
 criterion as "="     empty cell        empty cell and empty string cell
 criterion as "<>"    not empty cell        not empty cell and not empty
string cell

Also AOO3.4/LO3.5 has self-consistent problem.
The different result for same criterion between COUNTIF/SUMIF and database
formulu(DCOUNT/DSUM). These criterion includes:
criterion is empty cell, criterion is empty string, criterion is "=",
criterion is "<>0"

While, open this file in Excel 2003, 8 cases fail.
There are:

Criterion in COUNTIF/SUMIF
                               ODFF                  Excel 2003
 criterion is ""          empty string cell    empty string cell and empty
cell
 criterion is empty    empty string cell    empty string cell and empty cell
 string

Criterion in criteria for database formula, such as DSUM, DCOUNT
                             ODFF            AOO3.4/LO3.5
 criterion is empty    value 0            all cells
 cell

Also Excel 2003 has self-consistent problem.
The different result for same criterion between COUNTIF/SUMIF and database
formulu(DCOUNT/DSUM). These criterion includes:
criterion is empty cell, criterion is empty string

It seems Excel has better conformance for these test cases.

Conform to ODFF is very important. Compatible with Excel is also very
important, especially for some common used Excel scenario, such as
COUNTIF(<refrange>, "") to count all cells displayed as empty. Do we have
any chance to modify ODFF to solve this Excel user problem?


On Tue, Jul 3, 2012 at 1:14 AM, Dennis E. Hamilton
<de...@acm.org>wrote:

> @Andrew,
>
> See latest comments on https://issues.apache.org/ooo/show_bug.cgi?id=65221
>
> I agree with orw that the first goal should be to align with OpenFormula,
> and that is specific about the use of Criteria in SUMIF, COUNTIF, etc., as
> you saw in section 4.11.8.
>
> In the case that there needs to be further clarification in OpenFormula,
> comments can be submitted to the ODF TC for Errata and/or improvements in
> the next edition, ODF 1.3.  OpenFormula was developed with cross-format
> interoperability in mind and Microsoft participation was helpful for that.
>
> User Guide documentation on these cases is probably also important.
>
>  - Dennis
>
> -----Original Message-----
> From: Andrew Pitonyak [mailto:andrew@pitonyak.org]
> Sent: Monday, July 02, 2012 07:39
> To: ooo-dev@incubator.apache.org
> Subject: Re: [Calc][Discuss]Interoperability with Excel for SUMIF,
> COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and =""
> empty string criteria.
>
> On 02.07.2012 04:50, Lei Wang wrote:
> > Hi all,
> >
> > There is an old bug i65221(
> > https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> > interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT,
> > DSUM,
> > ... with empty cells and "" and ="" empty string criteria.
> >
> [ ... ]
>
> I am surprised that the ODF standard does not dictate how this should
> be handled. I was under the impression that function behaviour in an ODF
> document was well defined. I believe that Robert Weir is far more aware
> of ODF type issues than I (since I have seen him mentioned by name on
> similar topics) - not that I intend to unfairly pull him into this
> discussion.
>
> In Version 1.2 of the part-2 for OpenFormula, I don't see it obviously
> spelled out, but I don't really have time to extensively pursue it.
>
> [ ... ]
>
> 4.7 Empty Cell
> An empty cell is neither zero nor the empty string, and an empty cell
> can be distinguished from cells containing values (including zero and
> the empty string). An empty cell is not the same as an Error, in
> particular, it is distinguishable from the Error #N/A (not available).
>
> Moving on to section 4.8........
>
> The definitions of specific operations and functions that allow
> references as operands and parameters stipulate any particular
> limitations there are on forms of references and how empty cells, when
> permitted, are interpreted.
>
> Now, on to 4.11.8, pretty clear here:
>
> 4.11.8 Criterion
> A criterion is a single cell Reference, Number or Text. It is used in
> comparisons with cell contents.
> A reference to an empty cell is interpreted as the numeric value 0.
> A matching expression can be:
> A Number or Logical value. A matching cell content equals the Number or
> Logical value.
> A value beginning with a comparator (<, <=, =, >, >=, <>). 6.4.9
> For =, if the value is empty it matches empty cells. 4.7
> For <>, if the value is empty it matches non-empty cells.
> For <>, if the value is not empty it matches any cell content except
> the value, including empty cells.
> Note: "=0" does not match empty cells.
> For = and <>, if the value is not empty and can not be interpreted as a
> Number type or one of its subtypes and the host-defined property
> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, comparison is
> against the entire cell contents, if false, comparison is against any
> subpart of the field that matches the criteria. For = and <>, if the
> value is not empty and can not be interpreted as a Number type or one of
> its subtypes 3.4 applies.
> Other Text value. If the host-defined property
> HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, the comparison is
> against the entire cell contents, if false, comparison is against any
> subpart of the field that matches the criteria.
>
> [ ... ]
>
>
>

RE: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

Posted by "Dennis E. Hamilton" <de...@acm.org>.
@Andrew,

See latest comments on https://issues.apache.org/ooo/show_bug.cgi?id=65221

I agree with orw that the first goal should be to align with OpenFormula, and that is specific about the use of Criteria in SUMIF, COUNTIF, etc., as you saw in section 4.11.8.  

In the case that there needs to be further clarification in OpenFormula, comments can be submitted to the ODF TC for Errata and/or improvements in the next edition, ODF 1.3.  OpenFormula was developed with cross-format interoperability in mind and Microsoft participation was helpful for that.  

User Guide documentation on these cases is probably also important.

 - Dennis

-----Original Message-----
From: Andrew Pitonyak [mailto:andrew@pitonyak.org] 
Sent: Monday, July 02, 2012 07:39
To: ooo-dev@incubator.apache.org
Subject: Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and ="" empty string criteria.

On 02.07.2012 04:50, Lei Wang wrote:
> Hi all,
>
> There is an old bug i65221(
> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, 
> DSUM,
> ... with empty cells and "" and ="" empty string criteria.
>
[ ... ]

I am surprised that the ODF standard does not dictate how this should 
be handled. I was under the impression that function behaviour in an ODF 
document was well defined. I believe that Robert Weir is far more aware 
of ODF type issues than I (since I have seen him mentioned by name on 
similar topics) - not that I intend to unfairly pull him into this 
discussion.

In Version 1.2 of the part-2 for OpenFormula, I don't see it obviously 
spelled out, but I don't really have time to extensively pursue it.

[ ... ]

4.7 Empty Cell
An empty cell is neither zero nor the empty string, and an empty cell 
can be distinguished from cells containing values (including zero and 
the empty string). An empty cell is not the same as an Error, in 
particular, it is distinguishable from the Error #N/A (not available).

Moving on to section 4.8........

The definitions of specific operations and functions that allow 
references as operands and parameters stipulate any particular 
limitations there are on forms of references and how empty cells, when 
permitted, are interpreted.

Now, on to 4.11.8, pretty clear here:

4.11.8 Criterion
A criterion is a single cell Reference, Number or Text. It is used in 
comparisons with cell contents.
A reference to an empty cell is interpreted as the numeric value 0.
A matching expression can be:
A Number or Logical value. A matching cell content equals the Number or 
Logical value.
A value beginning with a comparator (<, <=, =, >, >=, <>). 6.4.9
For =, if the value is empty it matches empty cells. 4.7
For <>, if the value is empty it matches non-empty cells.
For <>, if the value is not empty it matches any cell content except 
the value, including empty cells.
Note: "=0" does not match empty cells.
For = and <>, if the value is not empty and can not be interpreted as a 
Number type or one of its subtypes and the host-defined property 
HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, comparison is 
against the entire cell contents, if false, comparison is against any 
subpart of the field that matches the criteria. For = and <>, if the 
value is not empty and can not be interpreted as a Number type or one of 
its subtypes 3.4 applies.
Other Text value. If the host-defined property 
HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, the comparison is 
against the entire cell contents, if false, comparison is against any 
subpart of the field that matches the criteria.

[ ... ]



Re: [Calc][Discuss]Interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, DSUM, ... with empty cells and "" and="" empty string criteria.

Posted by Andrew Pitonyak <an...@pitonyak.org>.
On 02.07.2012 04:50, Lei Wang wrote:
> Hi all,
>
> There is an old bug i65221(
> https://issues.apache.org/ooo/show_bug.cgi?id=65221). It is about
> interoperability with Excel for SUMIF, COUNTIF, COUNTBLANK, DCOUNT, 
> DSUM,
> ... with empty cells and "" and ="" empty string criteria.
>
> There is some discussion about it, but no conclusion. Some people 
> think
> Excel does not do well, so simply mimicking Excel behavior might not 
> be the
> best idea. Some people think self-consistency in Calc is more 
> important.
> Some people think compatibility with Excel is a huge concern.
>
> IMHO, compatible with Excel is very important. I suggest changing 
> these
> formulas behavior according to Excel. Following is my consideration
> 1)Most people use Excel, compatible with Excel is very important. 
> Excel
> does not show self-consistency in some scenarios, Calc doe not show
> self-consistency either.
> 2)Some frequently used function, such as calculating empty cells, can 
> not
> be implemented by Calc, while Excel can.
> 3)Although Excel does not show self-consistency in some scenarios. 
> These
> scenarios are all related with pure empty cell. Excel use following 
> rules
>   When empty cell is criteria, only cells with value 0 meet it. This 
> is
> true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT/IF.
>   When an empty cell in a cell range, it can meet empty string 
> criteria.
> This is true for COUNTIF, SUMIF. But not true for DSUM/DCOUNT.


I believe that you should expand your area of concern..... For example, 
I recently found a spreadsheet created and used in another spreadsheet 
program, don't know off hand if it was a Microsoft variant or a 
WordPerfect variant. The document was opened into Calc using OOo for a 
few years. When the document was tried on LibreOffice, it failed. Why? 
Because a few of the "empty" cells contained a single space and LO threw 
an error rather than dealing with it the way that the other programs had 
dealt with it. In this case, the only thing that I can say 
authoritatively is that LO and AOO treated the file differently. I did 
not pursue why LO made the change. My only point is that if you desire 
to argue that the functions in AOO should be changed to match those in 
Excel, you should also verify the behaviour of general numerical methods 
and how they treat cells with a blank.

Off hand, I think that documenting the differences would be very 
valuable even if changes are not made to AOO.

Perhaps even more important than compatibility with Excel is that the 
software that supports ODF natively are consistent.

I am surprised that the ODF standard does not dictate how this should 
be handled. I was under the impression that function behaviour in an ODF 
document was well defined. I believe that Robert Weir is far more aware 
of ODF type issues than I (since I have seen him mentioned by name on 
similar topics) - not that I intend to unfairly pull him into this 
discussion.

In Version 1.2 of the part-2 for OpenFormula, I don't see it obviously 
spelled out, but I don't really have time to extensively pursue it.

The closest I see of say "SUMIF" is this:

The values returned may vary depending upon the 
HOST-USE-REGULAR-EXPRESSIONS or HOST-USE-WILDCARDS or 
HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL properties. 3.4

I did check the host defined behaviors section and that is primarily 
related to things such as locale and case sensitivity.

Well, it does say this:

4.7 Empty Cell
An empty cell is neither zero nor the empty string, and an empty cell 
can be distinguished from cells containing values (including zero and 
the empty string). An empty cell is not the same as an Error, in 
particular, it is distinguishable from the Error #N/A (not available).

Moving on to section 4.8........

The definitions of specific operations and functions that allow 
references as operands and parameters stipulate any particular 
limitations there are on forms of references and how empty cells, when 
permitted, are interpreted.

Now, on to 4.11.8, pretty clear here:

4.11.8 Criterion
A criterion is a single cell Reference, Number or Text. It is used in 
comparisons with cell contents.
A reference to an empty cell is interpreted as the numeric value 0.
A matching expression can be:
A Number or Logical value. A matching cell content equals the Number or 
Logical value.
A value beginning with a comparator (<, <=, =, >, >=, <>). 6.4.9
For =, if the value is empty it matches empty cells. 4.7
For <>, if the value is empty it matches non-empty cells.
For <>, if the value is not empty it matches any cell content except 
the value, including empty cells.
Note: "=0" does not match empty cells.
For = and <>, if the value is not empty and can not be interpreted as a 
Number type or one of its subtypes and the host-defined property 
HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, comparison is 
against the entire cell contents, if false, comparison is against any 
subpart of the field that matches the criteria. For = and <>, if the 
value is not empty and can not be interpreted as a Number type or one of 
its subtypes 3.4 applies.
Other Text value. If the host-defined property 
HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL is true, the comparison is 
against the entire cell contents, if false, comparison is against any 
subpart of the field that matches the criteria.



In your opinion, are you asking that the methods are changed to match 
4.11.8 or so that they will not match 4.11.8?


Sorry, but I am very interested in this.

Andrew Pitonyak