Posted to solr-user@lucene.apache.org by Roland Everaert <re...@gmail.com> on 2013/10/18 13:36:05 UTC

XLSB files not indexed

Hi,

Can someone tell me whether Tika is supposed to extract data from XLSB files
(the new MS Office format in binary form)?

If so, then it seems that Solr is not able to index them, just as it is not
able to index ODF files (a JIRA issue is already open for ODF:
https://issues.apache.org/jira/browse/SOLR-4809).

Can someone confirm the problem, or tell me what to do to make Solr work
with XLSB files?


Regards,


Roland.

Re: XLSB files not indexed

Posted by Roland Everaert <re...@gmail.com>.
Hi Otis,

In our case, no exception is raised by Tika or Solr. A Lucene document is
created, but the content field contains only a few whitespace characters,
as is the case for ODF files.


Roland.


On Sat, Oct 19, 2013 at 3:54 AM, Otis Gospodnetic <
otis.gospodnetic@gmail.com> wrote:

> Hi Roland,
>
> It looks like:
> Tika - yes
> Solr - no?
>
> Based on http://search-lucene.com/?q=xlsb
>
> ODF != XLSB though, I think...
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
>
> On Fri, Oct 18, 2013 at 7:36 AM, Roland Everaert <re...@gmail.com>
> wrote:
> > Hi,
> >
> > Can someone tell me whether Tika is supposed to extract data from XLSB
> > files (the new MS Office format in binary form)?
> >
> > If so, then it seems that Solr is not able to index them, just as it is
> > not able to index ODF files (a JIRA issue is already open for ODF:
> > https://issues.apache.org/jira/browse/SOLR-4809).
> >
> > Can someone confirm the problem, or tell me what to do to make Solr work
> > with XLSB files?
> >
> >
> > Regards,
> >
> >
> > Roland.
>

Re: XLSB files not indexed

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Roland,

It looks like:
Tika - yes
Solr - no?

Based on http://search-lucene.com/?q=xlsb

ODF != XLSB though, I think...

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm



On Fri, Oct 18, 2013 at 7:36 AM, Roland Everaert <re...@gmail.com> wrote:
> Hi,
>
> Can someone tell me whether Tika is supposed to extract data from XLSB
> files (the new MS Office format in binary form)?
>
> If so, then it seems that Solr is not able to index them, just as it is
> not able to index ODF files (a JIRA issue is already open for ODF:
> https://issues.apache.org/jira/browse/SOLR-4809).
>
> Can someone confirm the problem, or tell me what to do to make Solr work
> with XLSB files?
>
>
> Regards,
>
>
> Roland.
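
The symptom described in this thread (no exception, a document is created,
but the content field holds only whitespace) can be narrowed down by asking
Solr to run the extraction without indexing anything. The sketch below is
only an illustration, not something posted in the thread. It assumes the
ExtractingRequestHandler is mounted at /update/extract and uses its
extractOnly and extractFormat parameters. The core URL, the file name, and
the helper names are hypothetical, and the exact layout of the JSON response
may differ between Solr versions, so the response is returned unparsed
beyond JSON decoding.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(core_url):
    """Return the extract handler URL in extract-only mode.

    Assumes the handler is registered at /update/extract on the given core.
    """
    params = urlencode(
        {"extractOnly": "true", "extractFormat": "text", "wt": "json"}
    )
    return core_url.rstrip("/") + "/update/extract?" + params


def is_effectively_empty(text):
    """True when the extracted text is missing or whitespace only."""
    return text is None or text.strip() == ""


def extract_only(core_url, path):
    """POST the file to Solr and return the decoded JSON response.

    Needs a running Solr instance; nothing is added to the index because
    extractOnly=true is set in the URL.
    """
    with open(path, "rb") as fh:
        body = fh.read()
    request = Request(
        build_extract_url(core_url),
        data=body,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage (core URL and file name are placeholders):
# result = extract_only("http://localhost:8983/solr/collection1", "book.xlsb")
# for key, value in result.items():
#     if isinstance(value, str):
#         print(key, "empty" if is_effectively_empty(value) else "has text")
```

If the extract-only response is already blank, the problem lies in the
extraction step (Tika or the Tika version bundled with Solr) rather than in
the field mapping of the indexing configuration.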