Posted to dev@tika.apache.org by "Albert L. (Created) (JIRA)" <ji...@apache.org> on 2011/12/19 16:57:31 UTC

[jira] [Created] (TIKA-816) (XLS/XLSX) Missing date/time in text content.

(XLS/XLSX) Missing date/time in text content.
---------------------------------------------

                 Key: TIKA-816
                 URL: https://issues.apache.org/jira/browse/TIKA-816
             Project: Tika
          Issue Type: Bug
          Components: general
    Affects Versions: 1.0
         Environment: Win7-64 + java version "1.6.0_26"
            Reporter: Albert L.
             Fix For: 1.1


Missing data in text content for XLS and XLSX files.

The date and time are not present for cells with the content "=now()" and "=today()".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Nick Burch (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-816.
-----------------------------

    Resolution: Fixed

As of r1309005 we've upgraded to POI 3.8 Final, which includes the required fixes.
                
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.2
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".


[jira] [Commented] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Albert L. (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172469#comment-13172469 ] 

Albert L. commented on TIKA-816:
--------------------------------

XLS files seem to work when calling text extraction via HSSF from POI v3.8 beta 5.

XLSX files seem to still FAIL when calling text extraction via XSSF from POI v3.8 beta 5.
                
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.1
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".


[jira] [Updated] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-816:
-----------------------------------

    Fix Version/s:     (was: 1.1)
                   1.2

- push out to 1.2
                
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.2
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".


[jira] [Commented] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Nick Burch (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172985#comment-13172985 ] 

Nick Burch commented on TIKA-816:
---------------------------------

Now that POI bug #52369 is fixed, we should get the XLSX fix on the next POI upgrade.

For the XLS side, we weren't formatting formula cells. I've fixed this in r1221119.
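
For readers wondering why the unformatted cells showed up as floating point numbers: Excel stores a date/time as a serial number, with whole days in the integer part and the time of day in the fraction. The sketch below illustrates that conversion using only java.time. It is not the code from r1221119 or from POI; the class name ExcelSerialDate and the method toDateTime are hypothetical, and it assumes the workbook uses the 1900 date system (not the 1904 one) and a serial of 61 or higher.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical helper for illustration only; not part of Tika or POI.
final class ExcelSerialDate {

    // Assumption: 1900 date system. Counting from 1899-12-30 gives the
    // right calendar date for serials >= 61 (1900-03-01 onward); earlier
    // serials are affected by Excel's fictitious 1900-02-29.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);
    private static final long SECONDS_PER_DAY = 86_400L;

    // Converts a raw numeric cell value to a date/time, rounded to the second.
    static LocalDateTime toDateTime(double serial) {
        long days = (long) Math.floor(serial);               // whole days
        long seconds = Math.round((serial - days) * SECONDS_PER_DAY); // time of day
        return EPOCH.atStartOfDay().plusDays(days).plusSeconds(seconds);
    }

    public static void main(String[] args) {
        double raw = 40896.5; // the kind of value an unformatted =NOW() cell yields
        System.out.println("raw value:    " + raw);
        System.out.println("as date/time: " + toDateTime(raw));
    }
}
```

A text extractor additionally has to apply the cell's number format to decide whether a numeric value is a date at all; the sketch skips that step and only shows the arithmetic behind the raw number.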
                
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.1
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".


[jira] [Commented] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Albert L. (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172470#comment-13172470 ] 

Albert L. commented on TIKA-816:
--------------------------------

Bug 52369 - XLSX: text extraction malformed "=NOW()" and "=TODAY()" cells
https://issues.apache.org/bugzilla/show_bug.cgi?id=52369
                
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.1
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".


[jira] [Updated] (TIKA-816) (XLS/XLSX) Improperly formatted date/time in text content.

Posted by "Albert L. (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Albert L. updated TIKA-816:
---------------------------

    Description: 
Improperly formatted text content for XLS and XLSX files.

The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".

  was:
Missing data in text content for XLS and XLSX files.

The date and time are not present for cells with the content "=now()" and "=today()".

        Summary: (XLS/XLSX) Improperly formatted date/time in text content.  (was: (XLS/XLSX) Missing date/time in text content.)
    
> (XLS/XLSX) Improperly formatted date/time in text content.
> ----------------------------------------------------------
>
>                 Key: TIKA-816
>                 URL: https://issues.apache.org/jira/browse/TIKA-816
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.0
>         Environment: Win7-64 + java version "1.6.0_26"
>            Reporter: Albert L.
>             Fix For: 1.1
>
>
> Improperly formatted text content for XLS and XLSX files.
> The date and time are not formatted as date/time data but rather as floating point numbers.  This occurs for cells whose content is "=now()" or "=today()".
