Posted to dev@tika.apache.org by "Ryan Desmond (JIRA)" <ji...@apache.org> on 2016/02/29 22:22:18 UTC

[jira] [Updated] (TIKA-1880) Tag for number-columns-repeated not correctly used in ODS documents

     [ https://issues.apache.org/jira/browse/TIKA-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Desmond updated TIKA-1880:
-------------------------------
    Description: 
When the ODS parser was first written, it assumed that the `number-columns-repeated` attribute for cells would only be used for blank cells. This is not the case with documents created by (at least) LibreOffice 4.4.7.2.

The note in the Tika source (OpenDocumentContentParser.java#L459):

TODO: The following is not correct, the cell should be repeated not spanned!
 * Code generates a HTML cell, spanning all repeated columns, to make the cell look correct.
 * Problems may occur when both spanning and repeating is given, which is not allowed by spec.
 * Cell spanning instead of repeating  is not a problem, because OpenOffice uses it
 * only for empty cells.
 *

  was:
When the ODS parser was first written, it assumed that the `number-columns-repeated` attribute for cells would only be used for blank cells. This is not the case with documents created by (at least) LibreOffice 4.4.7.2.

The note in the Tika source (OpenDocumentContentParser.java#L459):

```/* TODO: The following is not correct, the cell should be repeated not spanned!
 * Code generates a HTML cell, spanning all repeated columns, to make the cell look correct.
 * Problems may occur when both spanning and repeating is given, which is not allowed by spec.
 * Cell spanning instead of repeating  is not a problem, because OpenOffice uses it
 * only for empty cells.
 */```


> Tag for number-columns-repeated not correctly used in ODS documents
> -------------------------------------------------------------------
>
>                 Key: TIKA-1880
>                 URL: https://issues.apache.org/jira/browse/TIKA-1880
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.12
>            Reporter: Ryan Desmond
>            Priority: Minor
>              Labels: LibreOffice
>
> When the ODS parser was first written, it assumed that the `number-columns-repeated` attribute for cells would only be used for blank cells. This is not the case with documents created by (at least) LibreOffice 4.4.7.2.
> The note in the Tika source (OpenDocumentContentParser.java#L459):
> TODO: The following is not correct, the cell should be repeated not spanned!
>  * Code generates a HTML cell, spanning all repeated columns, to make the cell look correct.
>  * Problems may occur when both spanning and repeating is given, which is not allowed by spec.
>  * Cell spanning instead of repeating  is not a problem, because OpenOffice uses it
>  * only for empty cells.
>  *
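
To make the difference between spanning and repeating concrete, here is a minimal sketch in Java. It is not the Tika code: the class and method names are hypothetical, the cell text is assumed to be already HTML-escaped, and the repeat count is assumed to have been read from the cell's `number-columns-repeated` attribute. It only contrasts the behaviour the TODO describes (one cell with a colspan) with the behaviour it asks for (the cell emitted once per repeated column).

```java
// Hypothetical illustration only; these names do not exist in Tika.
final class RepeatedCellSketch {

    // Behaviour described in the TODO: a single HTML cell spanning all repeated columns.
    static String renderSpanned(String text, int repeat) {
        if (repeat > 1) {
            return "<td colspan=\"" + repeat + "\">" + text + "</td>";
        }
        return "<td>" + text + "</td>";
    }

    // Behaviour the TODO asks for: the same cell written once per repeated column.
    static String renderRepeated(String text, int repeat) {
        int count = Math.max(1, repeat); // a missing or invalid count is treated as 1
        StringBuilder html = new StringBuilder();
        for (int i = 0; i < count; i++) {
            html.append("<td>").append(text).append("</td>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // A non-empty cell repeated three times, as LibreOffice may write it.
        System.out.println(renderSpanned("42", 3));
        System.out.println(renderRepeated("42", 3));
    }
}
```

With spanning, a value that occurs in three adjacent columns is extracted only once; with repeating, it is extracted three times, matching the spreadsheet. A real fix would probably also need to avoid expanding long runs of empty repeated cells, which this sketch does not attempt.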


