You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@openoffice.apache.org by bu...@apache.org on 2015/06/23 09:56:59 UTC

[Issue 126381] New: OO corrupts formulas when reading Excel spreadsheet

https://bz.apache.org/ooo/show_bug.cgi?id=126381

          Issue ID: 126381
        Issue Type: DEFECT
           Summary: OO corrupts formulas when reading Excel spreadsheet
           Product: Calc
           Version: 4.1.1
          Hardware: PC
                OS: Windows 7
            Status: UNCONFIRMED
          Severity: critical
          Priority: P5
         Component: programming
          Assignee: issues@openoffice.apache.org
          Reporter: jty@mimuw.edu.pl

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126381] OO corrupts formulas when reading Excel spreadsheet

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126381

orcmid <or...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |orcmid@apache.org
             Status|UNCONFIRMED                 |CONFIRMED
          Component|programming                 |open-import

--- Comment #2 from orcmid <or...@apache.org> ---
SUMMARY (the TLDR)
I changed this to be a problem with open-import, since it involves conversion.
I also found that the problem is with the conversion to and/or recording of
OpenFormula incorrectly, leading to errors of various kinds.  The use of
full-row references in these INDEX formulas is not interoperable and involves a
combination of defects.  It doubtless extends to cases beyond use in INDEX.

DEMONSTRATION

I confirmed J.Ty's example XSLX using Microsoft Office 2013 Excel.  It reopens
fine.

 - - - Making an ODS from Excel and interchanging that - - - 

I used Excel 2013 to save the example as an .ODS file. Excel reopens the saved
.ODS just fine.

Note, what is shown to users and what is recorded in the file can be different.
For example, the XSLX cell A4 formula entry "=INDEX(1:1,INDEX(2:2,3))" is
*recorded* in the Excel-produced ODS file as
"of:=INDEX([.1:.1];INDEX([.2:.2];3))". This is perfectly valid in accordance
with section 5.8 References of the OpenFormula Specification part of ODF 1.2. 
It might also be valid to record it as "of:=INDEX([.1];INDEX([.2];3)" since the
start and end rows are the same each time.  Details at 6.14.6 INDEX determine
that the second parameter is the number of the column to select since the data
source is a single row vector.

Using the Excel-produced ODS, LibreOffice 4.3.5.2 *presents* the A4 formula as
"=INDEX(1:1,INDEX(2:2,3))" and the results for all of A4:A6 are correct. 

However, on performing a Save As of the opened file, that LibreOffice saved
version is not reopened correctly.  The re-saved A4:A6 on opening in
LibreOffice *present* the same formulas but the cells show value "Err:502". 

When LibreOffice recorded that  ODS, the A4 formula is *recorded* as
"of:=INDEX(1:1;INDEX(2:2;3))" and similarly for A5:A6.  That is, the formulas
are rewritten in the Save As result to a form where LibreOffice fails.  When
that file is opened in Excel 2013, Excel 2013 reports that the file is
defective.  If Excel is allowed to repair the file, it will have the formulas
dropped and the values preserved.

Using the Excel-produced ODS, Apache OpenOffice 4.1.1 Calc "presents" the A4
formula as "=INDEX(1:1;INDEX(2:2;3))" and similarly for the rest of A4:A6. 
However, AOO shows "#NAME?" for the value of each of those cells. When the
saved version from LibreOffice is opened in AOO, the results are the same
("Err:502") as when LibreOffice opens that file it saved.

 - - - Importing the .XSLX - - -

Now that we see there are serious asymmetries in how OpenFormula (recorded in
the ODS) INDEX (and full row-/column selection) is interpreted by three
different products, where some products can't even read what they have written,
and AOO can't read any of it, let's go back to the import of .XSLX.

On opening of the J.Ty's original XSLX file in LibreOffice, LibreOffice 4.3.5.2
*presents* mangled formulas for A4:A6, showing
"=INDEX($A1:$AMJ1INDEX($A2:$AMJ2))", "=INDEX($A2:$AMJ2)", and
"=INDEX($A1:$AMJ1A5)".  The cell values are shown as "Err:508", "2", and
"Err:509".  It is cleared that the presented formulas are a defective rewriting
of the row references that corrupts the formula as presented.  On saving of
that file to a .ODS from LibreOffice, the A4:A6 formulas as saved are
"of:=INDEX([.$A1:.$AMJ1]INDEX([.$A2:.$AMJ2]))", "of:=INDEX([.$A2:.$AMJ2])", and
"of:=INDEX([.$A1:.$AMJ1][.A5])".  Excel 2013 reports that file as corrupted.

On opening of the J.Ty's original XSLX file in Apache OpenOffice 4.1.1, the
A4:A6 values are "Err:508", 3, and "Err:509" but the *presented* formulas are
the same as presented by LibreOffice.  When the opened file is saved to a .ODS
file by AOO, the formulas in the file are the same as those from LibreOffice
also.  Excel 2013 reports that file as corrupted.

The defect is confirmed.  There appear to be multiple problems involved.

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126381] OO corrupts formulas when reading Excel spreadsheet

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126381

J.Ty. <jt...@mimuw.edu.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Major                       |Critical
                 CC|                            |jty@mimuw.edu.pl

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126381] OO corrupts formulas when reading Excel spreadsheet

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126381

J.Ty. <jt...@mimuw.edu.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Critical                    |Major

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126381] OO corrupts formulas when reading Excel spreadsheet

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126381

--- Comment #1 from J.Ty. <jt...@mimuw.edu.pl> ---
Created attachment 84806
  --> https://bz.apache.org/ooo/attachment.cgi?id=84806&action=edit
an example spreadsheet corrupted by OO

OO corrupts spreadsheets created under Excel, which contain references to whole
rows in the form '1:1'.

E.g., Excel's meaningful =INDEX(1:1,INDEX(2:2,3))
becomes OO's =INDEX($A1:$AMJ1INDEX($A2:$AMJ2)), which makes no sense and
produces an error.

The most pervert form is that Excel's =INDEX(2:2,2)
is turned into OO's =INDEX($A2:$AMJ2), which is syntactically correct, but
produces generally a different value.

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126381] OO corrupts formulas when reading Excel spreadsheet

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126381

brinzing <ol...@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |oliver.brinzing@gmx.de

--- Comment #3 from brinzing <ol...@gmx.de> ---
.

-- 
You are receiving this mail because:
You are the assignee for the issue.