Posted to solr-user@lucene.apache.org by James O'Rourke <ja...@bittorrent.com> on 2007/09/03 06:22:44 UTC

updates on the server

Is there a way to pass the Solr server a set of documents containing
only some of their fields, so that only the provided fields are updated
and the remaining fields are left intact? Or do I need to pull those
documents over the wire myself, apply the update manually, and then add
them back to the index?

James


Re: updates on the server

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 6, 2007, at 2:56 PM, Matthew Runo wrote:
> On a related note, it'd be great if we could set up a series of  
> transformations to be done on data when it comes into the index,  
> before being indexed. I guess a custom tokenizer might be the best  
> way to do this though..?
>
> ie:
>
> -Post
> -Data is cleaned up, properly escaped, etc
> -Then data is passed to whatever tokenizer we want to use.

Solr should do more work on the data indexing side, to allow clients  
to more easily hand documents to it and modify them.  XML isn't  
necessarily the prettiest way, and we see other formats being  
supported with the CSV and rich document indexing.

A custom tokenizer or token filter makes great sense for transforming
the data of a single field, but parsing some request data into multiple
fields must be done at a higher level.

	Erik
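
As an illustration of the distinction drawn above, here is a minimal
sketch of a client-side transformation chain that runs before a document
is posted. It is only one possible arrangement, not a Solr feature: the
step functions and the field names ("name", "code", "brand", "sku") are
hypothetical. One step cleans each value on its own, which is the kind
of work a token filter could also do; the other splits one incoming
value into several fields, which has to happen at the document level.

```python
import re
from typing import Callable, Dict, List

Doc = Dict[str, str]
Step = Callable[[Doc], Doc]


def clean_values(doc: Doc) -> Doc:
    """Per-field cleanup: drop control characters, collapse whitespace."""
    cleaned = {}
    for name, value in doc.items():
        value = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", value)
        cleaned[name] = " ".join(value.split())
    return cleaned


def split_product_code(doc: Doc) -> Doc:
    """Document-level step: turn a hypothetical 'brand:sku' value in the
    'code' field into two separate fields."""
    out = dict(doc)
    code = out.pop("code", None)
    if code is not None:
        brand, _, sku = code.partition(":")
        out["brand"] = brand
        out["sku"] = sku
    return out


def run_pipeline(doc: Doc, steps: List[Step]) -> Doc:
    """Apply each transformation in order and return the final document."""
    for step in steps:
        doc = step(doc)
    return doc


if __name__ == "__main__":
    raw = {"id": "1", "name": "  Blue \x07 suede   shoes ", "code": "acme:123"}
    print(run_pipeline(raw, [clean_values, split_product_code]))
```

The resulting dictionary would then be serialized into whatever update
format the client sends to Solr.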


Re: updates on the server

Posted by Matthew Runo <mr...@zappos.com>.
On a related note, it'd be great if we could set up a series of  
transformations to be done on data when it comes into the index,  
before being indexed. I guess a custom tokenizer might be the best  
way to do this though..?

ie:

-Post
-Data is cleaned up, properly escaped, etc
-Then data is passed to whatever tokenizer we want to use.

+--------------------------------------------------------+
  | Matthew Runo
  | Zappos Development
  | mruno@zappos.com
  | 702-943-7833
+--------------------------------------------------------+


On Sep 3, 2007, at 7:10 AM, Erik Hatcher wrote:

>
> On Sep 3, 2007, at 12:22 AM, James O'Rourke wrote:
>> Is there a way to pass the Solr server a set of documents containing
>> only some of their fields, so that only the provided fields are
>> updated and the remaining fields are left intact? Or do I need to
>> pull those documents over the wire myself, apply the update
>> manually, and then add them back to the index?
>
> Currently, Solr cannot update a specific field; you have to re-send
> the entire document to replace the existing one.  However,
> preliminary support for such a capability has been contributed here:
> http://issues.apache.org/jira/browse/SOLR-139 - it is not in its
> final form, so use it at your own risk, given the caveats about
> concurrency listed in that issue.
>
> I'm currently using the patch I posted to that issue in a
> production environment and it's working fine thus far, but it will
> change, at least in its core and likely in its request parameters
> and formatting, before making its debut in Solr's trunk.
>
> 	Erik
>


Re: updates on the server

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 3, 2007, at 12:22 AM, James O'Rourke wrote:
> Is there a way to pass the Solr server a set of documents containing
> only some of their fields, so that only the provided fields are
> updated and the remaining fields are left intact? Or do I need to
> pull those documents over the wire myself, apply the update manually,
> and then add them back to the index?

Currently, Solr cannot update a specific field; you have to re-send
the entire document to replace the existing one.  However, preliminary
support for such a capability has been contributed here:
http://issues.apache.org/jira/browse/SOLR-139 - it is not in its final
form, so use it at your own risk, given the caveats about concurrency
listed in that issue.

I'm currently using the patch I posted to that issue in a production
environment and it's working fine thus far, but it will change, at
least in its core and likely in its request parameters and formatting,
before making its debut in Solr's trunk.

	Erik
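
For readers who take the re-send route described above, the following
is a rough sketch of the client-side merge, not the SOLR-139 patch. It
assumes the existing document has already been fetched from the index,
and that every field is stored (a field that is indexed but not stored
cannot be read back, so it would be lost on re-add). The field names
are hypothetical. The output is an XML <add> message in the form Solr's
update handler accepts.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

FieldValue = Union[str, List[str]]
Doc = Dict[str, FieldValue]


def merge_document(existing: Doc, changes: Doc) -> Doc:
    """Return a copy of the existing document with changed fields replaced."""
    merged = dict(existing)
    merged.update(changes)
    return merged


def build_add_xml(doc: Doc) -> str:
    """Serialize a full document as <add><doc>...</doc></add>.

    A list value produces one <field> element per entry (multi-valued
    field). ElementTree escapes characters such as & and < in the text.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = v
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical document as previously retrieved from the index.
    existing = {"id": "42", "title": "Old title", "tag": ["a", "b"]}
    # Only the field that should change.
    changes = {"title": "New & improved"}
    print(build_add_xml(merge_document(existing, changes)))
```

Because the whole document is replaced, two clients doing this
read-merge-resend cycle on the same document at the same time can
overwrite each other's changes.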