Posted to solr-user@lucene.apache.org by nabil Kouici <ko...@yahoo.fr> on 2014/10/09 20:58:14 UTC

Data Import Handler for CSV file





Hi All,

Is it possible to have a DIH in Solr that loads data from a CSV file? I'm currently using the update/csv handler, but it does not meet my needs.

Regards,
NKI.

Re: Data Import Handler for CSV file

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

I think you can define the field names in the first line of the CSV file. Why don't you use curl to index the CSV?

I don't have a full working example with DIH, but the following example indexes every line as a separate Solr document.

You need to add a transformer that splits each line on commas.

<dataConfig>
    <dataSource type="FileDataSource" encoding="UTF-8" name="fds"/>
    <document>
        <entity name="f" processor="FileListEntityProcessor" fileName=".*txt" baseDir="/Volumes/data/Documents" recursive="false" rootEntity="false" dataSource="null" transformer="TemplateTransformer">
            <entity onError="skip" name="jc" processor="LineEntityProcessor" url="${f.fileAbsolutePath}" dataSource="fds" rootEntity="true" transformer="TemplateTransformer">
                <field column="link" template="hello${f.fileAbsolutePath},${jc.rawLine}"/>
                <field column="rawLine" name="rawLine"/>
            </entity>
        </entity>
    </document>
</dataConfig>



On Friday, October 10, 2014 12:26 AM, nabil Kouici <ko...@yahoo.fr> wrote:
Hi Ahmet,

Thank you for this reply. I agree that the CSV update handler is fast, but we always need to specify the columns in the HTTP request. In addition, I can't find documentation on how to use the CSV update handler from SolrJ.

Could you please send me an example of a DIH configuration that loads a CSV file?

Regards,
Nabil.





On Thursday, October 9, 2014 at 21:05, Ahmet Arslan <io...@yahoo.com.INVALID> wrote:



Hi Nabil,

What's wrong with the CSV update handler? It is quite fast.

By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing DIH components.

Ahmet



On Thursday, October 9, 2014 9:58 PM, nabil Kouici <ko...@yahoo.fr> wrote:





Hi All,

Is it possible to have a DIH in Solr that loads data from a CSV file? I'm currently using the update/csv handler, but it does not meet my needs.

Regards,
NKI.

Re: Data Import Handler for CSV file

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
You could always define the parameters in solrconfig.xml on a custom
handler. Then you don't have to pass the same values over and over again.

Regards,
     Alex

RE: Data Import Handler for CSV file

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
Nabil,

Unfortunately, the out-of-the-box functionality of DIH lacks a lot of what the CSV handler has to offer. There is a LineEntityProcessor (see http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor), but it just outputs each line in a field called "rawLine". It is then up to you to write a Transformer that splits it on commas (or better, uses a library like commons-csv to process it).
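
As a rough illustration of the transformer described above, here is a minimal Java sketch that splits the "rawLine" column into named columns. The class name CsvLineTransformer, the column names (id, name, price), and the hand-rolled splitter are all hypothetical. It relies on DIH accepting a plain class that exposes a transformRow(Map<String, Object>) method, and a library such as commons-csv would likely be a sturdier choice for real files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical DIH transformer: copies the comma-separated values found in
 * the "rawLine" column (as emitted by LineEntityProcessor) into named columns.
 * The class name and the column names are illustrative assumptions.
 */
public class CsvLineTransformer {

    // Assumed column order of the CSV file; adjust to match the actual data.
    private static final String[] COLUMNS = {"id", "name", "price"};

    // DIH can invoke a method with this signature by reflection, so the class
    // does not have to extend DIH's Transformer base class.
    public Object transformRow(Map<String, Object> row) {
        Object raw = row.get("rawLine");
        if (raw == null) {
            return row; // nothing to split
        }
        List<String> values = splitLine(raw.toString());
        for (int i = 0; i < COLUMNS.length && i < values.size(); i++) {
            row.put(COLUMNS[i], values.get(i));
        }
        return row;
    }

    // Minimal splitter: commas separate fields, double quotes protect commas,
    // and "" inside a quoted field is a literal quote. Values are trimmed.
    static List<String> splitLine(String line) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // escaped quote
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                out.add(cur.toString().trim());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString().trim());
        return out;
    }
}
```

To try it, the compiled class would need to be on Solr's classpath and be named, by its fully qualified class name, in the transformer attribute of the entity that uses LineEntityProcessor.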

There is an extension available as an old patch that will give LineEntityProcessor the ability to handle delimited and fixed-width files.  However, you'll need to apply the patch yourself and build DIH from source.   See https://issues.apache.org/jira/browse/SOLR-2549 .

James Dyer
Ingram Content Group
(615) 213-4311


Re: Data Import Handler for CSV file

Posted by nabil Kouici <ko...@yahoo.fr>.
Hi Ahmet,
 
Thank you for this reply. I agree that the CSV update handler is fast, but we always need to specify the columns in the HTTP request. In addition, I can't find documentation on how to use the CSV update handler from SolrJ.

Could you please send me an example of a DIH configuration that loads a CSV file?

Regards,
Nabil.



Re: Data Import Handler for CSV file

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi Nabil,

What's wrong with the CSV update handler? It is quite fast.

By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing DIH components.

Ahmet
 
