Posted to user@tika.apache.org by Harry Hochheiser <hs...@gmail.com> on 2010/08/11 23:45:24 UTC

Difficulty with Excel Parsing

A question from a Tika novice:

I have a very simple Excel file that I'm trying to parse with the
command-line application. If I parse it in .xlsx format, it comes out
fine, but I get nothing (beyond markup) if I parse the same file saved
as .xls.

Obviously, I'd like to find a way to make this file parse consistently.

My Tika build is a current checkout from SVN.

Any suggestions would be appreciated.

thanks,

harry

Re: Difficulty with Excel Parsing

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Wed, Aug 11, 2010 at 11:45 PM, Harry Hochheiser <hs...@gmail.com> wrote:
> I have a very simple excel file that I'm trying to parse with the
> command line application. If I try it in .xlsx format, it comes out
> fine, but I get nothing (beyond markup) if I parse the same file saved
> as a .xls.

Tika should have no problems parsing .xls files either. Can you file a
bug report [1] about this and attach an example spreadsheet that shows
the problem?

[1] https://issues.apache.org/jira/browse/TIKA

BR,

Jukka Zitting