Posted to dev@openoffice.apache.org by Rob Weir <ro...@apache.org> on 2013/06/10 23:09:51 UTC

Re: [DISCUSS] ODF file formats vs Zip

On Mon, Jun 10, 2013 at 12:22 PM, Dennis E. Hamilton
<de...@acm.org> wrote:
> Inspecting a document that contains images with WinZip, it appears that GIF and PNG files are stored uncompressed while SVM files are compressed (with great improvement).  The content.xml files, which can be megabytes long, benefit greatly from compression (9:1 easily).
>
> Apache OpenOffice 3.4.1 and older versions of OpenOffice.org will compress the Thumbnail PNG.  Not sure why, but it is a small file so it shouldn't matter in terms of Save performance.
>

If you really want to try a monster test case, try the spreadsheets
from this old ZDNet article from 2005:

http://www.zdnet.com/blog/ou/performance-analysis-of-openoffice-and-ms-office/120

They are in pre-ODF XML formats, but can easily be converted.  Try it
as DOC and as ODS.

The files themselves look quite reasonable in size, thanks to the ZIP
compression.  But then try unzipping one.  You'll see that
content.xml is much, much larger.
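
To see the same thing without unzipping by hand, something like the
following Python sketch lists each package entry with its ZIP method and
its packed and unpacked sizes.  It uses only the standard zipfile
module.  The function names and the sample package are made up for
illustration; the sample only imitates an ODF layout and is not a valid
document.  Point describe_package() at a real .ods file to inspect it.

```python
import io
import zipfile


def describe_package(source):
    """Return (name, method, packed_size, unpacked_size) for each ZIP entry."""
    methods = {zipfile.ZIP_STORED: "STORED", zipfile.ZIP_DEFLATED: "DEFLATED"}
    rows = []
    with zipfile.ZipFile(source) as package:
        for info in package.infolist():
            method = methods.get(info.compress_type, str(info.compress_type))
            rows.append((info.filename, method, info.compress_size, info.file_size))
    return rows


def build_sample_package():
    """Build an in-memory ZIP that imitates an ODF layout (made-up content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.spreadsheet",
            compress_type=zipfile.ZIP_STORED,
        )
        # Repetitive cell markup, standing in for a large content.xml.
        cell = "<table:table-cell><text:p>Widget</text:p></table:table-cell>"
        package.writestr("content.xml", cell * 5000, compress_type=zipfile.ZIP_DEFLATED)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, method, packed, unpacked in describe_package(build_sample_package()):
        print(f"{name:12} {method:8} {packed:>8} -> {unpacked:>8} bytes")
```

The gap between the last two columns for content.xml is the point:
the package looks small on disk, but the application still has to
parse the full unpacked XML.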

The problem we have with large ODF spreadsheets is our cell-by-cell
table markup is very verbose.  We also lack a "string-pool" structure
in the markup to deal with repeated strings, which are common in
database-like uses of a spreadsheet.
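
To make the string-pool idea concrete, here is a rough Python sketch.
The element names (<cell>, <s>) are hypothetical and are not ODF
markup; ODF has no such structure today, which is the point.  The
sketch only compares the size of markup that repeats every string
inline against markup that stores each unique string once and refers
to it by index.

```python
def pool_strings(cells):
    """Replace each cell string with an index into a list of unique strings."""
    pool = []       # unique strings, in first-seen order
    index_of = {}   # string -> position in pool
    indexed = []    # one pool index per cell
    for value in cells:
        if value not in index_of:
            index_of[value] = len(pool)
            pool.append(value)
        indexed.append(index_of[value])
    return pool, indexed


def inline_size(cells):
    """Characters needed when every cell repeats its string (hypothetical markup)."""
    return sum(len(f"<cell>{value}</cell>") for value in cells)


def pooled_size(pool, indexed):
    """Characters needed with a shared pool plus index references (hypothetical markup)."""
    pool_part = sum(len(f"<s>{value}</s>") for value in pool)
    cell_part = sum(len(f'<cell s="{i}"/>') for i in indexed)
    return pool_part + cell_part


if __name__ == "__main__":
    # A database-like column: few distinct values, many rows.
    cells = ["Northeast", "Southwest", "Northeast"] * 1000
    pool, indexed = pool_strings(cells)
    print(len(pool), "unique strings for", len(cells), "cells")
    print("inline:", inline_size(cells), "pooled:", pooled_size(pool, indexed))
```

ZIP compression hides most of this repetition on disk, but not in
memory or at parse time, where the unpacked markup has to be read in
full.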

Regards,

-Rob



>  - Dennis
>
> PS: I don't know whether uncompressed results are also obtained by attempting compression and reverting to STORED when the compression is unsuccessful.  Some software does that sort of thing.  (I have a recollection that DEFLATE can also produce uncompressed sections on discovery of their incompressibility, but the result won't be the same size as the original.  I don't know if the DEFLATE compression used will produce those.)
>
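
The "try DEFLATE, fall back to STORED" approach mentioned in the PS can
be sketched in Python as below.  This is not AOO's code, and whether
AOO does anything like it is exactly the open question; the function
names are made up for illustration.  It uses the standard zlib module
with a negative wbits value to get a raw DEFLATE stream, the form used
inside ZIP entries.

```python
import os
import zlib


def raw_deflate(data):
    """Compress to a raw DEFLATE stream (no zlib header or checksum)."""
    compressor = zlib.compressobj(level=9, wbits=-15)
    return compressor.compress(data) + compressor.flush()


def deflate_or_store(data):
    """Keep the DEFLATE result only if it is smaller than the input."""
    packed = raw_deflate(data)
    if len(packed) < len(data):
        return "DEFLATED", packed
    return "STORED", data


if __name__ == "__main__":
    noise = os.urandom(4096)   # stands in for already-compressed image data
    text = b"<text:p>Widget</text:p>" * 200
    for label, payload in (("random bytes", noise), ("repetitive XML", text)):
        method, result = deflate_or_store(payload)
        print(label, len(payload), "->", method, len(result))
```

For incompressible input, DEFLATE can fall back to stored blocks, each
of which carries a few bytes of header, so the stream comes out
slightly larger than the original.  That is why a writer doing this
would compare sizes and switch the entry to STORED.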
> -----Original Message-----
> From: Rob Weir [mailto:robweir@apache.org]
> Sent: Monday, June 10, 2013 05:45 AM
> To: users@openoffice.apache.org; Dennis Hamilton
> Subject: Re: [DISCUSS] ODF file formats vs Zip
>
> On Sun, Jun 9, 2013 at 4:01 PM, Dennis E. Hamilton
> <de...@acm.org> wrote:
>> Regina is correct that there are only two compression methods.  As far as I know, there is no way to control which one is used.  (If you save with Password, all files are always compressed.)  Most of the time DEFLATE is used (although there are two files that are not usually compressed, apparently to make metadata mining simpler for non-encrypted packages).
>>
>> There is currently no way to control the compression in AOO.  (The ODF specification simply stipulates the compression that must be used when compression is done, not whether compression is done for parts of unencrypted packages.)
>>
>
> Does anyone know whether AOO is smart enough to not waste time trying
> to compress already compressed files, like PNG images?  This could
> make a big difference in presentations.
>
> -Rob
>
> [ ... ]
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@openoffice.apache.org
> For additional commands, e-mail: users-help@openoffice.apache.org
>
