Posted to solr-user@lucene.apache.org by Sithik <si...@gmail.com> on 2014/12/12 07:56:23 UTC

[Hep] tab delimited gz file indexing steps

Team,
I have a gzip-compressed text file (.gz) that holds tab-delimited data. Is it
possible to index this file directly, without uncompressing it myself as a
preprocessing step? If so, can you please tell me the steps and config changes
I should follow?

BTW, I am using Solr 4.10.

Thanks in advance

-Sithik

Re: [Hep] tab delimited gz file indexing steps

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
gzcat may do the job, since it streams the data as it expands it.

Another option is to use DataImportHandler and write a custom FileSystem
data source that does the expansion.
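
The streaming idea can be sketched in Python rather than with gzcat. This is
only an illustration under assumptions: the host, port, and core name
(collection1) are hypothetical, the helper names are made up for this example,
and it assumes the update handler accepts a chunked request body with an
application/csv content type and a separator parameter. Python 3.6 or later is
assumed, so that urllib sends an iterable body with chunked transfer encoding.

```python
import gzip
import urllib.parse
import urllib.request

# Hypothetical values; adjust host, port, and core name for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def build_update_url(base_url=SOLR_UPDATE_URL):
    """Return the update URL with a tab separator and an explicit commit."""
    # urlencode percent-encodes the tab character as %09.
    params = urllib.parse.urlencode({"separator": "\t", "commit": "true"})
    return base_url + "?" + params


def iter_gzip_chunks(path, chunk_size=64 * 1024):
    """Yield decompressed bytes from a .gz file one chunk at a time."""
    with gzip.open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def post_gzip_tsv(path, url=None):
    """Stream the decompressed data to Solr without a temporary file."""
    request = urllib.request.Request(
        url or build_update_url(),
        data=iter_gzip_chunks(path),  # iterable body, sent chunked
        headers={"Content-Type": "application/csv"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling post_gzip_tsv("data.tsv.gz") would then send the expanded rows as they
are read, so the uncompressed file never has to exist on disk.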

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 12 December 2014 at 09:24, Shawn Heisey <ap...@elyograg.org> wrote:
> On 12/11/2014 11:56 PM, Sithik wrote:
>> I have a compressed text file (gz) which holds tab delimited data. Is it
>> possible for me to index this file directly without doing any pre
>> processing of uncompressing the file on my own? if so, can you please tell
>> me the steps/config changes I am supposed to follow.
>
> The CSV update handler can work with tab-delimited files.
>
> https://wiki.apache.org/solr/UpdateCSV#Tab-delimited_importing
>
> I think you probably do need to uncompress it with gzip before Solr can
> do anything with it.
>
> Thanks,
> Shawn
>

Re: [Hep] tab delimited gz file indexing steps

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/11/2014 11:56 PM, Sithik wrote:
> I have a compressed text file (gz) which holds tab delimited data. Is it
> possible for me to index this file directly without doing any pre
> processing of uncompressing the file on my own? if so, can you please tell
> me the steps/config changes I am supposed to follow.

The CSV update handler can work with tab-delimited files.

https://wiki.apache.org/solr/UpdateCSV#Tab-delimited_importing

I think you probably do need to uncompress it with gzip before Solr can
do anything with it.
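
If the uncompress step needs to be scripted, a minimal Python sketch could
look like the following. The function name and paths are hypothetical, and it
does the same thing as running gunzip -c on the file and redirecting the
output.

```python
import gzip
import shutil


def gunzip_to_file(gz_path, out_path):
    """Write the decompressed contents of gz_path to out_path."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        # copyfileobj copies in blocks, so large files are not held in memory.
        shutil.copyfileobj(src, dst)
    return out_path
```

The resulting plain tab-delimited file can then be sent to the CSV update
handler as described on the wiki page above.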

Thanks,
Shawn