Posted to dev@poi.apache.org by bu...@apache.org on 2014/03/03 01:29:51 UTC

[Bug 56205] New: [PATCH] Upgrade OOXML schema to 3rd edition (transitional)

https://issues.apache.org/bugzilla/show_bug.cgi?id=56205

            Bug ID: 56205
           Summary: [PATCH] Upgrade OOXML schema to 3rd edition
                    (transitional)
           Product: POI
           Version: 3.11-dev
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: POI Overall
          Assignee: dev@poi.apache.org
          Reporter: andreas.beeker@gmx.de

Created attachment 31358
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31358&action=edit
[PATCH] Upgrade OOXML schema to 3rd edition (transitional)

The OOXML schema POI currently uses is outdated and therefore lacks some of the
recently added Office features (e.g. XSSF sheetProtection is missing a few
attributes).

The ECMA site [1] already provides a 4th edition of the schemas.
After experimenting a bit with the schemas, I think it's ok to use the 3rd
edition of the transitional schema, as its main incompatibilities with the 1st
edition are the length and percent definitions. The 4th edition would require
many more changes in the current POI codebase.

Although the test cases pass with some modifications, I'm not sure what
impact this patch would have on users' code, especially when the
xmlbeans classes are used directly.

Therefore I'd like to discuss in this entry whether features which aren't
covered by the 1st edition schema should be created dynamically without a
backing schema, or whether it's ok to potentially break user code which works
with the internal XML representation.


[1] http://www.ecma-international.org/publications/standards/Ecma-376.htm

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 56205] [PATCH] Upgrade OOXML schema to 3rd edition (transitional)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56205

Andreas Beeker <an...@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #31358|0                           |1
        is obsolete|                            |

--- Comment #3 from Andreas Beeker <an...@gmx.de> ---
Created attachment 31383
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31383&action=edit
[PATCH] Upgrade OOXML schema to 3rd edition (transitional)

OK, the SimpleDocument problem was a false positive (or whatever you call
it).
The generated document differs only in a few STOnOff attributes and is
displayed the same in the MS Word View irrespective of the schema version.
When viewed in LibreOffice, however, both versions are broken.
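
As a rough illustration of why differing STOnOff attributes need not change how a document is displayed: the on/off type accepts several lexical forms for the same value. The sketch below assumes the six forms `true`, `false`, `1`, `0`, `on` and `off`. The class and method names are hypothetical and are not part of POI or of the patch.

```java
// Hypothetical helper, not POI API: maps the assumed lexical forms of an
// on/off attribute to a boolean, so "on", "true" and "1" compare as equal.
class OnOffSketch {

    static boolean parseOnOff(String raw) {
        switch (raw.trim()) {
            case "1":
            case "true":
            case "on":
                return true;
            case "0":
            case "false":
            case "off":
                return false;
            default:
                throw new IllegalArgumentException("not an on/off value: " + raw);
        }
    }

    public static void main(String[] args) {
        // Two documents that differ only in the spelling of the flag
        // carry the same meaning.
        System.out.println(parseOnOff("on") == parseOnOff("1"));
    }
}
```

A consumer that normalizes the attribute like this treats both generated documents identically, which would match the observation that Word shows them the same.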

Apart from the POI examples, I did a PNG rendering of themes.pptx, and this
also looks ok, so the unit conversions should be ok.

I've changed some unit calculations to use the dxa call.



[Bug 56205] [PATCH] Upgrade OOXML schema to 3rd edition (transitional)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56205

--- Comment #2 from Andreas Beeker <an...@gmx.de> ---
I thought about comparing the schemas, but the changes are substantial, so
that doesn't work.

So the next attempt was to check the POI examples and their output. XSSF and
XSLF looked ok, but XWPF has some problems (e.g. the output of the
SimpleDocument example doesn't look right).
So the patch needs some rework, especially as most changes were in the XWPF
part.

Apart from the JUnit tests, I haven't checked input processing yet.

There is some information about compatibility in this Office article [1] - note
the line: "... writes files conformant to ISO/IEC 29500 Transitional ...". But
when you look into the details (e.g. [2] as an example for the other affected
percent attributes), you see that although Office is able to read an
alternative format, it writes the legacy format.
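
The "read the alternative format, write the legacy format" behaviour can be sketched as follows. This is only an illustration under assumptions: the newer form is taken to be a number followed by `%`, and the legacy form a bare integer scaled by a fixed factor (1000 units per percent is assumed here, the scale may differ per attribute). All names are hypothetical and not POI API.

```java
// Hypothetical helper, not POI API: accepts both percent notations on input
// but always emits the legacy integer notation on output.
class PercentSketch {

    // Assumed legacy scale: 1000 units per percent. Other attributes may
    // use a different scale, so it is passed in as a parameter below.
    static final int ASSUMED_UNITS_PER_PERCENT = 1000;

    /** Returns the value in percent, whichever notation was used. */
    static double parsePercent(String raw, int unitsPerPercent) {
        String s = raw.trim();
        if (s.endsWith("%")) {
            // Newer notation, e.g. "50%"
            return Double.parseDouble(s.substring(0, s.length() - 1));
        }
        // Legacy notation, e.g. "50000"
        return Integer.parseInt(s) / (double) unitsPerPercent;
    }

    /** Always writes the legacy integer notation. */
    static String toLegacy(double percent, int unitsPerPercent) {
        return Long.toString(Math.round(percent * unitsPerPercent));
    }

    public static void main(String[] args) {
        double p = parsePercent("50%", ASSUMED_UNITS_PER_PERCENT);
        System.out.println(toLegacy(p, ASSUMED_UNITS_PER_PERCENT)); // prints 50000
    }
}
```

Writing only the legacy notation keeps the output readable for consumers that do not understand the `%` form, at the cost of not using the newer syntax.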

As far as I have checked the changes to the length/percent attributes, it
depends on POI whether the resulting file can be read by Office versions
< 2010. E.g. if measurement units are used in length attributes, the file
probably can't be read by versions < 2010 anymore. Therefore we would need to
take care, when populating new attributes, to stick with the legacy format if
possible.
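
To illustrate what "sticking with the legacy format" could mean for a length attribute, here is a minimal sketch that converts a value with a measurement unit (e.g. `2.54cm`) back to a bare integer. It assumes the legacy value is expressed in twentieths of a point (dxa/twips) and that the unit suffixes are `mm`, `cm`, `in`, `pt`, `pc` and `pi`. The class and method names are hypothetical, not taken from the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not POI API: normalizes a length attribute to the
// assumed legacy notation (an integer number of twips, 1 pt = 20 twips).
class LengthSketch {

    private static final Pattern MEASURE =
        Pattern.compile("(-?[0-9]+(?:\\.[0-9]+)?)(mm|cm|in|pt|pc|pi)");

    static long toTwips(String raw) {
        String s = raw.trim();
        Matcher m = MEASURE.matcher(s);
        if (!m.matches()) {
            // No unit suffix: assume the value is already in twips.
            return Long.parseLong(s);
        }
        double value = Double.parseDouble(m.group(1));
        double twipsPerUnit;
        switch (m.group(2)) {
            case "in": twipsPerUnit = 1440.0; break;
            case "cm": twipsPerUnit = 1440.0 / 2.54; break;
            case "mm": twipsPerUnit = 1440.0 / 25.4; break;
            case "pt": twipsPerUnit = 20.0; break;
            default:   twipsPerUnit = 240.0; break; // pc / pi: 1 pica = 12 pt
        }
        return Math.round(value * twipsPerUnit);
    }

    public static void main(String[] args) {
        System.out.println(toTwips("1in"));  // prints 1440
        System.out.println(toTwips("12pt")); // prints 240
        System.out.println(toTwips("720"));  // prints 720
    }
}
```

Rounding to whole twips loses sub-twip precision, which seems an acceptable price if the goal is a file that older Office versions can still read.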

The new "sharedTypes" namespace [3] does not seem to appear in the resulting
file.

So I guess in the end it's a trade-off between
- using a new schema and potentially introducing features which can only be
used in newer Office versions
- and keeping the lowest common denominator, i.e. a schema which only allows
one kind of attribute content

> If we use the newer schemas, how does that change what we output? Will it mean that the files we generate stop being compatible with older office versions?
That depends on whether we use the new features.

> How about input? Will it mean we stop being able to read files generated by older versions of POI, or older versions of office?
The 3rd edition transitional schema should be compatible with the 1st edition -
but certain features, like VML, are being phased out.


[1] http://msdn.microsoft.com/en-US/library/office/gg607163(v=office.14).aspx
[2] http://msdn.microsoft.com/en-us/library/gg548598(v=office.12).aspx
[3] http://schemas.openxmlformats.org/officeDocument/2006/sharedTypes



[Bug 56205] [PATCH] Upgrade OOXML schema to 3rd edition (transitional)

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56205

--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
If we use the newer schemas, how does that change what we output? Will it mean
that the files we generate stop being compatible with older office versions?

How about input? Will it mean we stop being able to read files generated by
older versions of POI, or older versions of office?
