Posted to solr-user@lucene.apache.org by "Insight 49, LLC" <in...@gmail.com> on 2009/09/13 18:44:17 UTC

CSV Update - Need help mapping csv field to schema's ID

Using http://localhost:8983/solr/update/csv?stream.file, is there any 
way to map one of the csv fields to one's schema unique id?

e.g. a file with 3 fields (sku, product, price):
http://localhost:8983/solr/update/csv?stream.file=products.csv&stream.contentType=text/plain;charset=utf-8&header=true&separator=%2c&encapsulator=%22&escape=%5c&fieldnames=sku,product,price

I would like to add an additional name:value pair for every line, 
mapping the sku field to my schema's id field:

.map={sku.field}:{id}

I would prefer NOT to change the schema by adding a <copyField 
source="sku" dest="id"/>.

I read: http://wiki.apache.org/solr/UpdateCSV, but can't quite get it.

Thanks!

Dan
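
[Editor's sketch] The request in the question above can also be assembled programmatically, so that the separator, encapsulator and escape characters are percent-encoded automatically. This is a minimal sketch using only the parameters that appear in the question's URL; the variable names (BASE_URL, params, url) are illustrative, and nothing here maps sku to id.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the question; nothing new is added.
BASE_URL = "http://localhost:8983/solr/update/csv"

params = {
    "stream.file": "products.csv",
    "stream.contentType": "text/plain;charset=utf-8",
    "header": "true",
    "separator": ",",      # becomes %2C
    "encapsulator": '"',   # becomes %22
    "escape": "\\",        # becomes %5C
    "fieldnames": "sku,product,price",
}

# urlencode percent-encodes every reserved character in the values.
url = BASE_URL + "?" + urlencode(params)
print(url)
```

The encoded form differs from the hand-written URL only in that characters such as ";" and "=" inside values are also escaped; it should decode to the same parameter values.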

Re: CSV Update - Need help mapping csv field to schema's ID

Posted by "Mark A. Matienzo" <ma...@matienzo.org>.
On Tue, Sep 15, 2009 at 2:23 PM, Insight 49, LLC <in...@gmail.com> wrote:
> Want to map each sku field to the schema unique id field using update/csv.

You can set the sku field to be the uniqueKey field in the schema. See
http://wiki.apache.org/solr/SchemaXml#head-bec9b4f189d7f493c42f99b479ed0a8d0dd3d76e
for more info.

Mark Matienzo
Applications Developer, Digital Experience Group
The New York Public Library

Re: CSV Update - Need help mapping csv field to schema's ID

Posted by "Insight 49, LLC" <in...@gmail.com>.
Bump. Can anyone help guide me in the right direction?

Want to map each sku field to the schema unique id field using update/csv.

Thanks. Dan.


Insight 49, LLC wrote:
> Using http://localhost:8983/solr/update/csv?stream.file, is there any 
> way to map one of the csv fields to one's schema unique id?
> 
> e.g. A file with 3 fields (sku, product,price):
> http://localhost:8983/solr/update/csv?stream.file=products.csv&stream.contentType=text/plain;charset=utf-8&header=true&separator=%2c&encapsulator=%22&escape=%5c&fieldnames=sku,product,price 
> 
> 
> I would like to add an additional name:value pair for every line, 
> mapping the sku field to my schema's id field:
> 
> .map={sku.field}:{id}
> 
> I would prefer NOT to change the schema by adding a <copyField 
> source="sku" dest="id"/>.
> 
> I read: http://wiki.apache.org/solr/UpdateCSV, but can't quite get it.
> 
> Thanks!
> 
> Dan
> 

Re: CSV Update - Need help mapping csv field to schema's ID

Posted by "Insight 49, LLC" <in...@gmail.com>.
Darn. I hate when I create work for people.

My need is to take a csv file, use the CSV update handler, but then add 
an additional copyfield (sku from csv to id from schema) to create a 
unique id for each record.

Thanks guys. Terrific work on SOLR.

Dan


Grant Ingersoll wrote:
> 
> On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:
> 
>> Darn... I shouldn't trust my memory.
>> From http://issues.apache.org/jira/browse/SOLR-284
>> '''drop "ext." from parameter names, and revisit naming to try and
>> unify with other update handlers like CSV'''
>>
>> So now map.a=b in CSV is for values but map.a=b in SolrCell is for 
>> fields....
>> perhaps we should change map in SolrCell to fmap?
> 
> That's fine by me.  Just update the docs when you're done.
> 
>>
>> My longer range idea was to pull out some generally useful things like
>> field mapping, etc, such that they could be shared across update
>> handlers.
> 
> See also:
>  SOLR-1032, SOLR-1069 for related things.  We should be able to refactor 
> the field mapping code easy enough.
> 

Re: CSV Update - Need help mapping csv field to schema's ID

Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 16, 2009, at 9:41 AM, Grant Ingersoll wrote:

>
> On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:
>
>>> : .map={sku.field}:{id}
>>>
>>> the map param is for replacing a *value* with a different value ... it's
>>> useful for things like numeric codes in CSV files that you want to replace
>>> with strings in your index.
>>
>> Darn... I shouldn't trust my memory.
>> From http://issues.apache.org/jira/browse/SOLR-284
>> '''drop "ext." from parameter names, and revisit naming to try and
>> unify with other update handlers like CSV'''
>>
>> So now map.a=b in CSV is for values but map.a=b in SolrCell is for fields....
>> perhaps we should change map in SolrCell to fmap?
>
> That's fine by me.  Just update the docs when you're done.

Actually, I can do this now.

>
>>
>> My longer range idea was to pull out some generally useful things like
>> field mapping, etc, such that they could be shared across update
>> handlers.
>
> See also:
> SOLR-1032, SOLR-1069 for related things.  We should be able to  
> refactor the field mapping code easy enough.
>
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>> ---------- Forwarded message ----------
>> From: Chris Hostetter <ho...@fucit.org>
>> Date: Tue, Sep 15, 2009 at 8:12 PM
>> Subject: Re: CSV Update - Need help mapping csv field to schema's ID
>> To: solr-user@lucene.apache.org
>>
>>
>>
>> : I would like to add an additional name:value pair for every line, mapping the
>> : sku field to my schema's id field:
>> :
>> : .map={sku.field}:{id}
>>
>> the map param is for replacing a *value* with a different value ... it's
>> useful for things like numeric codes in CSV files that you want to replace
>> with strings in your index.
>>
>> : I would prefer NOT to change the schema by adding a <copyField source="sku"
>> : dest="id"/>.
>>
>> that's the only solution i can think of unless you want to write an
>> UpdateProcessor.
>>
>>
>> -Hoss
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
> using Solr/Lucene:
> http://www.lucidimagination.com/search
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/


Re: CSV Update - Need help mapping csv field to schema's ID

Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:

>> : .map={sku.field}:{id}
>>
>> the map param is for replacing a *value* with a different value ... it's
>> useful for things like numeric codes in CSV files that you want to replace
>> with strings in your index.
>
> Darn... I shouldn't trust my memory.
> From http://issues.apache.org/jira/browse/SOLR-284
> '''drop "ext." from parameter names, and revisit naming to try and
> unify with other update handlers like CSV'''
>
> So now map.a=b in CSV is for values but map.a=b in SolrCell is for fields....
> perhaps we should change map in SolrCell to fmap?

That's fine by me.  Just update the docs when you're done.

>
> My longer range idea was to pull out some generally useful things like
> field mapping, etc, such that they could be shared across update
> handlers.

See also:
  SOLR-1032, SOLR-1069 for related things.  We should be able to  
refactor the field mapping code easy enough.

>
> -Yonik
> http://www.lucidimagination.com
>
>
> ---------- Forwarded message ----------
> From: Chris Hostetter <ho...@fucit.org>
> Date: Tue, Sep 15, 2009 at 8:12 PM
> Subject: Re: CSV Update - Need help mapping csv field to schema's ID
> To: solr-user@lucene.apache.org
>
>
>
> : I would like to add an additional name:value pair for every line, mapping the
> : sku field to my schema's id field:
> :
> : .map={sku.field}:{id}
>
> the map param is for replacing a *value* with a different value ... it's
> useful for things like numeric codes in CSV files that you want to replace
> with strings in your index.
>
> : I would prefer NOT to change the schema by adding a <copyField source="sku"
> : dest="id"/>.
>
> that's the only solution i can think of unless you want to write an
> UpdateProcessor.
>
>
> -Hoss

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/


Fwd: CSV Update - Need help mapping csv field to schema's ID

Posted by Yonik Seeley <yo...@lucidimagination.com>.
> : .map={sku.field}:{id}
>
> the map param is for replacing a *value* with a different value ... it's
> useful for things like numeric codes in CSV files that you want to replace
> with strings in your index.

Darn... I shouldn't trust my memory.
From http://issues.apache.org/jira/browse/SOLR-284
'''drop "ext." from parameter names, and revisit naming to try and
unify with other update handlers like CSV'''

So now map.a=b in CSV is for values but map.a=b in SolrCell is for fields....
perhaps we should change map in SolrCell to fmap?

My longer range idea was to pull out some generally useful things like
field mapping, etc, such that they could be shared across update
handlers.

-Yonik
http://www.lucidimagination.com


---------- Forwarded message ----------
From: Chris Hostetter <ho...@fucit.org>
Date: Tue, Sep 15, 2009 at 8:12 PM
Subject: Re: CSV Update - Need help mapping csv field to schema's ID
To: solr-user@lucene.apache.org



: I would like to add an additional name:value pair for every line, mapping the
: sku field to my schema's id field:
:
: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace
with strings in your index.

: I would prefer NOT to change the schema by adding a <copyField source="sku"
: dest="id"/>.

that's the only solution i can think of unless you want to write an
UpdateProcessor.


-Hoss

Re: CSV Update - Need help mapping csv field to schema's ID

Posted by "Insight 49, LLC" <in...@gmail.com>.
Thanks guys...

Yonik and Grant commented on this thread in the dev group.

Dan

Chris Hostetter wrote:
> : I would like to add an additional name:value pair for every line, mapping the
> : sku field to my schema's id field:
> : 
> : .map={sku.field}:{id}
> 
> the map param is for replacing a *value* with a different value ... it's
> useful for things like numeric codes in CSV files that you want to replace 
> with strings in your index.
> 
> : I would prefer NOT to change the schema by adding a <copyField source="sku"
> : dest="id"/>.
> 
> that's the only solution i can think of unless you want to write an 
> UpdateProcessor.
> 
> 
> -Hoss
> 


Re: CSV Update - Need help mapping csv field to schema's ID

Posted by Chris Hostetter <ho...@fucit.org>.
: I would like to add an additional name:value pair for every line, mapping the
: sku field to my schema's id field:
: 
: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace 
with strings in your index.
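
[Editor's sketch] To make the value-versus-field distinction concrete, here is a small local emulation of what the paragraph above describes: the value in a field is swapped, while the field name stays the same. The "category" column, the code "7" and the helper apply_field_map are hypothetical and are not part of the original CSV or of Solr; the request parameter at the end assumes the per-field f.<fieldname>.map=value:replacement form described on the UpdateCSV wiki page.

```python
from urllib.parse import urlencode

def apply_field_map(row, field, old, new):
    """Local emulation only: replace `old` with `new` in one field's value.

    Field names are never changed, which is why `map` cannot turn
    a `sku` column into an `id` column.
    """
    mapped = dict(row)
    if mapped.get(field) == old:
        mapped[field] = new
    return mapped

# Hypothetical row carrying a numeric code in a made-up "category" column.
row = {"sku": "A100", "product": "Widget", "category": "7"}
mapped = apply_field_map(row, "category", "7", "hardware")
print(mapped)

# The same intent expressed as request parameters (assumed syntax, see lead-in).
params = urlencode({"stream.file": "products.csv", "f.category.map": "7:hardware"})
print(params)
```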

: I would prefer NOT to change the schema by adding a <copyField source="sku"
: dest="id"/>.

that's the only solution i can think of unless you want to write an 
UpdateProcessor.


-Hoss
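
[Editor's sketch] Neither option named above (copyField or a custom UpdateProcessor) is shown here. As a possible third route that nobody in the thread proposed, the CSV could be rewritten on the client before upload so that every row already carries an id column copied from sku. The function name add_id_column and the sample rows are made up for illustration; the upload request would presumably also need its fieldnames parameter extended with id.

```python
import csv
import io

def add_id_column(csv_text, source="sku", dest="id"):
    """Return CSV text with a `dest` column copied from `source` on every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Reading .fieldnames consumes the header line of the input.
    fieldnames = list(reader.fieldnames) + [dest]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[dest] = row[source]  # duplicate the sku value into the id column
        writer.writerow(row)
    return out.getvalue()

# Made-up sample data with the three columns from the question.
sample = "sku,product,price\nA100,Widget,9.99\nB200,Gadget,19.50\n"
result = add_id_column(sample)
print(result)
```

This leaves the schema untouched, at the cost of an extra pass over the file before it is handed to the CSV update handler.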