Posted to solr-user@lucene.apache.org by Benson Margulies <bi...@gmail.com> on 2011/07/21 00:05:41 UTC

Updating fields in an existing document

We find ourselves in the following quandary:

At initial index time, we store a value in a field, and we use it for
faceting. So it, seemingly, has to be there as a field.

However, from time to time, something happens that causes us to want
to change this value. As far as we know, this requires us to
completely re-index the document, which is slow.

It struck me that we can't be the only people to go down this road, so
I write to inquire if we are missing something.

Re: Updating fields in an existing document

Posted by Grant Ingersoll <gs...@apache.org>.
This is a pretty low-level issue with inverted indexes (i.e., the underlying data structure used) and not so much the architecture. It is possible, I suppose, to solve it at the architectural level, but in many cases this causes performance problems that are not usually acceptable.
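
To make the data-structure point concrete, here is a toy inverted index in Python. It is only a sketch under simplifying assumptions: the class ToyInvertedIndex and its methods are hypothetical names, and it keeps postings in mutable in-memory sets, whereas Lucene writes immutable segments. The point it illustrates is that a field value lives in a postings list keyed by that value, so changing it means taking the document out of one list and putting it into another, which is modeled here as delete plus re-add of the whole document.

```python
from collections import defaultdict


class ToyInvertedIndex:
    """Hypothetical toy index: maps (field, value) to a set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.stored = {}  # id -> full document, needed to undo old postings

    def delete(self, doc_id):
        # Remove the document from every postings list it appears in.
        doc = self.stored.pop(doc_id, None)
        if doc is None:
            return
        for field, value in doc.items():
            self.postings[(field, value)].discard(doc_id)

    def add(self, doc):
        # "Changing" a document is modeled as delete followed by re-add.
        self.delete(doc["id"])
        self.stored[doc["id"]] = dict(doc)
        for field, value in doc.items():
            self.postings[(field, value)].add(doc["id"])

    def search(self, field, value):
        return sorted(self.postings.get((field, value), set()))


index = ToyInvertedIndex()
index.add({"id": "doc1", "category": "draft", "title": "hello"})
# Changing only "category" still means re-adding every field of doc1.
index.add({"id": "doc1", "category": "final", "title": "hello"})
print(index.search("category", "draft"))
print(index.search("category", "final"))
```

In this toy the old postings can be edited in place; with write-once index files that is not an option, which is one reason the replace is done as a whole-document operation.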

On Jul 20, 2011, at 7:08 PM, Jonathan Rochkind wrote:

> Nope, you're not missing anything, there's no way to alter a document in an index but reindexing the whole document. Solr's architecture would make it difficult (although never say impossible) to do otherwise. But you're right it would be convenient for people other than you. 
> 
> Reindexing a single document ought not to be slow, although if you have many of them at once it could be, or if you end up needing to very frequently commit to an index it can indeed cause problems. 
> ________________________________________
> From: Benson Margulies [bimargulies@gmail.com]
> Sent: Wednesday, July 20, 2011 6:05 PM
> To: solr-user
> Subject: Updating fields in an existing document
> 
> We find ourselves in the following quandary:
> 
> At initial index time, we store a value in a field, and we use it for
> faceting. So it, seemingly, has to be there as a field.
> 
> However, from time to time, something happens that causes us to want
> to change this value. As far as we know, this requires us to
> completely re-index the document, which is slow.
> 
> It struck me that we can't be the only people to go down this road, so
> I write to inquire if we are missing something.

--------------------------
Grant Ingersoll




Re: Updating fields in an existing document

Posted by Chris Hostetter <ho...@fucit.org>.
: As in http://wiki.apache.org/solr/UpdateXmlMessages?

Exactly ... the title is "XML Messages for Updating a Solr Index"

But I do see some confusing usages of "add/update" in the context of
documents that definitely don't belong there -- so I've changed them to
"add/replace".

Thanks for bringing this up.

-Hoss

Re: Updating fields in an existing document

Posted by Benson Margulies <bi...@gmail.com>.
As in http://wiki.apache.org/solr/UpdateXmlMessages?

On Mon, Jul 25, 2011 at 4:10 PM, Chris Hostetter
<ho...@fucit.org> wrote:
> : A followup. The wiki has a whole discussion of the 'update' XML
> : message. But solrj has nothing like it. Does that really exist? Is
> : there a reason to use it? If I just 'add' the document a second time,
> : it will replace?
>
> You should only see "update" in Solr docs used in the context of
> "updating" the index by adding (which might be replacing) or deleting
> documents.  (you'll note there is no "<update>" tag or anything like that
> in the XML syntax)
>
>
> -Hoss
>

Re: Updating fields in an existing document

Posted by Chris Hostetter <ho...@fucit.org>.
: A followup. The wiki has a whole discussion of the 'update' XML
: message. But solrj has nothing like it. Does that really exist? Is
: there a reason to use it? If I just 'add' the document a second time,
: it will replace?

You should only see "update" in Solr docs used in the context of 
"updating" the index by adding (which might be replacing) or deleting 
documents.  (you'll note there is no "<update>" tag or anything like that 
in the XML syntax) 
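
As a small illustration of that vocabulary, the sketch below builds an <add> message and a <delete> message with Python's standard xml.etree.ElementTree. The helper names add_message and delete_message and the field names are hypothetical; only the element names (add, doc, field, delete, id) come from the XML update format. Nothing here talks to a Solr server; it only produces the message text.

```python
import xml.etree.ElementTree as ET


def add_message(fields):
    """Build an <add> message holding one <doc> (hypothetical helper)."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", {"name": name}).text = str(value)
    return ET.tostring(add, encoding="unicode")


def delete_message(doc_id):
    """Build a <delete> message that removes one document by id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


# Field names "id" and "category" are made up for this example.
add_xml = add_message({"id": "doc1", "category": "final"})
delete_xml = delete_message("doc1")
print(add_xml)
print(delete_xml)
```

Both messages "update" the index in the sense used above, yet neither uses an <update> element.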


-Hoss

Re: Updating fields in an existing document

Posted by Marc SCHNEIDER <ma...@gmail.com>.
Yes, that's it: if you add the same document twice (i.e., with the same id),
the second add will replace the first.
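
A minimal Python model of that behavior, assuming "id" is the unique key field: the index is stood in for by a plain dict, and the names add and facet_counts are hypothetical helpers, not Solr or SolrJ calls. It shows that re-adding a document with an existing id leaves one copy, and that facet counts over the changed field follow the new value.

```python
from collections import Counter


def add(index, doc):
    # Same id -> the earlier document is overwritten, not duplicated.
    index[doc["id"]] = doc


def facet_counts(index, field):
    # Count how many documents carry each value of the given field.
    return Counter(doc[field] for doc in index.values())


index = {}
add(index, {"id": "42", "status": "open"})
add(index, {"id": "43", "status": "open"})
add(index, {"id": "42", "status": "closed"})  # replaces the first doc 42

counts = facet_counts(index, "status")
print(len(index), dict(counts))
```

Note that the replacement document has to carry every field again, not just the one that changed.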

On Thu, Jul 21, 2011 at 7:46 PM, Benson Margulies <bi...@gmail.com> wrote:

> A followup. The wiki has a whole discussion of the 'update' XML
> message. But solrj has nothing like it. Does that really exist? Is
> there a reason to use it? If I just 'add' the document a second time,
> it will replace?
>
> On Wed, Jul 20, 2011 at 7:08 PM, Jonathan Rochkind <ro...@jhu.edu>
> wrote:
> > Nope, you're not missing anything, there's no way to alter a document in
> an index but reindexing the whole document. Solr's architecture would make
> it difficult (although never say impossible) to do otherwise. But you're
> right it would be convenient for people other than you.
> >
> > Reindexing a single document ought not to be slow, although if you have
> many of them at once it could be, or if you end up needing to very
> frequently commit to an index it can indeed cause problems.
> > ________________________________________
> > From: Benson Margulies [bimargulies@gmail.com]
> > Sent: Wednesday, July 20, 2011 6:05 PM
> > To: solr-user
> > Subject: Updating fields in an existing document
> >
> > We find ourselves in the following quandary:
> >
> > At initial index time, we store a value in a field, and we use it for
> > faceting. So it, seemingly, has to be there as a field.
> >
> > However, from time to time, something happens that causes us to want
> > to change this value. As far as we know, this requires us to
> > completely re-index the document, which is slow.
> >
> > It struck me that we can't be the only people to go down this road, so
> > I write to inquire if we are missing something.
> >
>

Re: Updating fields in an existing document

Posted by Benson Margulies <bi...@gmail.com>.
A followup. The wiki has a whole discussion of the 'update' XML
message. But solrj has nothing like it. Does that really exist? Is
there a reason to use it? If I just 'add' the document a second time,
it will replace?

On Wed, Jul 20, 2011 at 7:08 PM, Jonathan Rochkind <ro...@jhu.edu> wrote:
> Nope, you're not missing anything, there's no way to alter a document in an index but reindexing the whole document. Solr's architecture would make it difficult (although never say impossible) to do otherwise. But you're right it would be convenient for people other than you.
>
> Reindexing a single document ought not to be slow, although if you have many of them at once it could be, or if you end up needing to very frequently commit to an index it can indeed cause problems.
> ________________________________________
> From: Benson Margulies [bimargulies@gmail.com]
> Sent: Wednesday, July 20, 2011 6:05 PM
> To: solr-user
> Subject: Updating fields in an existing document
>
> We find ourselves in the following quandary:
>
> At initial index time, we store a value in a field, and we use it for
> faceting. So it, seemingly, has to be there as a field.
>
> However, from time to time, something happens that causes us to want
> to change this value. As far as we know, this requires us to
> completely re-index the document, which is slow.
>
> It struck me that we can't be the only people to go down this road, so
> I write to inquire if we are missing something.
>

RE: Updating fields in an existing document

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Nope, you're not missing anything: there's no way to alter a document in an index other than reindexing the whole document. Solr's architecture would make it difficult (although never say impossible) to do otherwise. But you're right that it would be convenient, and not only for you.

Reindexing a single document ought not to be slow, although it can be if you have many of them at once, and needing to commit to an index very frequently can indeed cause problems.
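
One way to keep commits infrequent, sketched in Python under assumptions: group the documents to be reindexed into a few <add> messages and send a single <commit/> at the end. The function name batched_add_messages, the batch size, and the sample fields are hypothetical, and the sketch only builds the message strings rather than sending them anywhere.

```python
import xml.etree.ElementTree as ET


def batched_add_messages(docs, batch_size):
    """Yield one <add> message per batch of documents (hypothetical helper)."""
    for start in range(0, len(docs), batch_size):
        add = ET.Element("add")
        for fields in docs[start:start + batch_size]:
            doc = ET.SubElement(add, "doc")
            for name, value in fields.items():
                ET.SubElement(doc, "field", {"name": name}).text = str(value)
        yield ET.tostring(add, encoding="unicode")


# Five made-up documents whose "category" value needs to change.
docs = [{"id": str(n), "category": "final"} for n in range(5)]

messages = list(batched_add_messages(docs, batch_size=2))
messages.append("<commit/>")  # one commit after all batches, not one per doc

for message in messages:
    print(message)
```

A suitable batch size depends on document size and setup; the value 2 is only there to keep the output short.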
________________________________________
From: Benson Margulies [bimargulies@gmail.com]
Sent: Wednesday, July 20, 2011 6:05 PM
To: solr-user
Subject: Updating fields in an existing document

We find ourselves in the following quandary:

At initial index time, we store a value in a field, and we use it for
faceting. So it, seemingly, has to be there as a field.

However, from time to time, something happens that causes us to want
to change this value. As far as we know, this requires us to
completely re-index the document, which is slow.

It struck me that we can't be the only people to go down this road, so
I write to inquire if we are missing something.