Posted to user@nutch.apache.org by Savannah Beckett <sa...@yahoo.com> on 2010/09/10 07:29:44 UTC

How to Update Value of One Field of a Document in Index?

I use Nutch to crawl and index to Solr.  My code is working.  Now, I want to 
update the value of one of the fields of a document in the Solr index after the 
document has already been indexed, and I have only the document id.  How do I do 
that?  

Thanks.


      

Re: How to Update Value of One Field of a Document in Index?

Posted by "Grijesh.singh" <pi...@gmail.com>.
There is no way to update a single field in Solr. You have to reindex the
entire document.

You can get the document from the index, create XML containing its existing
fields plus your updated field, and post that XML to Solr.
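
A minimal sketch of that read-modify-repost approach, assuming the stored fields have already been fetched into a plain dict. The field names (id, title, score) and the helper name build_add_xml are hypothetical; only the <add><doc><field name="..."> shape of the update message comes from Solr.

```python
import xml.etree.ElementTree as ET


def build_add_xml(stored_fields, changes):
    """Return a Solr <add> message holding every stored field plus the changes."""
    merged = dict(stored_fields)
    merged.update(changes)  # the updated field overrides the old value

    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in merged.items():
        # A multi-valued field is written as one <field> element per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document, as it might come back from a query on its id.
existing = {"id": "doc-1", "title": "Old page", "score": 3}
xml_message = build_add_xml(existing, {"score": 7})
print(xml_message)
```

The resulting string would be POSTed to the update handler and followed by a commit. Any field that is indexed but not stored will be missing from the fetched dict and therefore lost in the new document.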

-----
Grijesh

RE: How to Update Value of One Field of a Document in Index?

Posted by Markus Jelsma <ma...@buyways.nl>.
The MoreLikeThis component actually can accept external input:

http://wiki.apache.org/solr/MoreLikeThisHandler#Using_ContentStreams
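
A rough sketch of what such a request could look like, assuming a MoreLikeThisHandler is registered at /mlt and that the text to compare lives in a field called content (both are assumptions, as is the localhost URL). It only builds the URL; whether stream.body is accepted depends on how the Solr instance is configured.

```python
from urllib.parse import urlencode


def mlt_url(base_url, text, similarity_fields, rows=5):
    """Build a MoreLikeThisHandler URL that passes raw text as a content stream."""
    params = {
        "stream.body": text,  # text of the document that is not indexed yet
        "mlt.fl": ",".join(similarity_fields),
        "mlt.interestingTerms": "list",
        "rows": rows,
    }
    return base_url.rstrip("/") + "/mlt?" + urlencode(params)


# Hypothetical base URL and field name.
url = mlt_url("http://localhost:8983/solr", "apache solr search", ["content"])
print(url)
```

For long documents the text would normally be sent as a POST body instead of a URL parameter.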
 
-----Original message-----
From: Jonathan Rochkind <ro...@jhu.edu>
Sent: Fri 10-09-2010 18:59
To: solr-user@lucene.apache.org; 
Subject: RE: How to Update Value of One Field of a Document in Index?

"More like this" is intended to be run at query time. For what reasons are you thinking you want to (re-)index each document based on the results of MoreLikeThis?  You're right that that's not what the component is intended for. 

Jonathan
Re: How to Update Value of One Field of a Document in Index?

Posted by Zachary Chang <ch...@yahoo.de>.
  Hi Savannah,

If you *only want to boost* documents based on the information you
calculate from the MoreLikeThis results (i.e. a numeric measure), you
might want to take a look at the ExternalFileField type. This field type
reads its contents from a file which contains key-value pairs, e.g. the
document ids and the corresponding measure values.
If some values change, you still have to regenerate the whole file
(instead of the whole index). But of course, this file can be generated
from a DB, which might be updated incrementally.

For setup and usage e.g. see: 
http://dev.tailsweep.com/solr-external-scoring/
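
A small sketch of regenerating such a file, assuming the measures are already available as a mapping from document id to number (for instance read from a DB). The ids, values and the helper name external_file_lines are made up; the "key=value" line layout follows the key-value description above.

```python
def external_file_lines(measures):
    """Render {document id: numeric measure} as sorted "key=value" lines."""
    return ["%s=%s" % (doc_id, float(value))
            for doc_id, value in sorted(measures.items())]


# Hypothetical measures calculated from the MoreLikeThis results.
measures = {"doc-2": 0.5, "doc-1": 3, "doc-3": 1.25}
lines = external_file_lines(measures)
print("\n".join(lines))
```

The lines would then be written to the external file that the field type reads; the exact file name and location depend on the field name and the Solr setup.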

Zachary

On 10.09.2010 19:57, Savannah Beckett wrote:
> I want to do MoreLikeThis to find documents that are similar to the document
> that I am indexing.  Then I want to calculate the average of one of the fields
> of all those documents and input this average into a field of the document that
> I am indexing.  From my research, it seems that MoreLikeThis can only be used to
> find similarity of document that is already in the index.  So, I think I need to
> index it first, and then use MoreLikeThis to find similar documents in the index
> and then reindex that document.  Any better way?  I try not to reindex a
> document because it's not efficient.  I don't have to use MoreLikeThis.
> Thanks.

Re: How to Update Value of One Field of a Document in Index?

Posted by Savannah Beckett <sa...@yahoo.com>.
I want to use MoreLikeThis to find documents that are similar to the document 
that I am indexing.  Then I want to calculate the average of one of the fields 
across all those documents and store this average in a field of the document that 
I am indexing.  From my research, it seems that MoreLikeThis can only find 
documents similar to a document that is already in the index.  So, I think I need to 
index it first, then use MoreLikeThis to find similar documents in the index, 
and then reindex that document.  Is there a better way?  I would rather not reindex a 
document because it's not efficient.  I don't have to use MoreLikeThis.
Thanks.
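
To make the averaging step concrete, here is a minimal sketch. It assumes the similar documents have already been retrieved as a list of dicts; the field name rating and the sample values are hypothetical.

```python
def average_field(similar_docs, field):
    """Average a numeric field over the documents that have a value for it."""
    values = [float(doc[field]) for doc in similar_docs
              if doc.get(field) is not None]
    if not values:
        return None  # no similar document carries the field
    return sum(values) / len(values)


# Hypothetical MoreLikeThis results; the third document lacks the field.
similar = [{"id": "a", "rating": 4}, {"id": "b", "rating": 2}, {"id": "c"}]
avg = average_field(similar, "rating")
print(avg)
```

Documents without the field are skipped rather than counted as zero, which is a design choice and not something the thread prescribes.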



________________________________
From: Jonathan Rochkind <ro...@jhu.edu>
To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
Sent: Fri, September 10, 2010 9:58:20 AM
Subject: RE: How to Update Value of One Field of a Document in Index?

"More like this" is intended to be run at query time. For what reasons are you 
thinking you want to (re-)index each document based on the results of 
MoreLikeThis?  You're right that that's not what the component is intended for. 


Jonathan

RE: How to Update Value of One Field of a Document in Index?

Posted by Jonathan Rochkind <ro...@jhu.edu>.
"More like this" is intended to be run at query time. For what reasons are you thinking you want to (re-)index each document based on the results of MoreLikeThis?  You're right that that's not what the component is intended for. 

Jonathan
________________________________________
From: Savannah Beckett [savannah_beckett30@yahoo.com]
Sent: Friday, September 10, 2010 11:18 AM
To: solr-user@lucene.apache.org
Subject: Re: How to Update Value of One Field of a Document in Index?

Thanks.  I am trying to use MoreLikeThis in Solr to find similar documents in
the solr index and use the data from these similar documents to modify a field
in each document that I am indexing.  I found that MoreLikeThis in Solr only
works when the document is in the index, is it true?  If so, I may have to wait
til the indexing is finished, then run my own command to do MoreLikeThis to each
document in the index, and then reindex each document?  It sounds like it's not
efficient.  Is there a better way?
Thanks.




Re: How to Update Value of One Field of a Document in Index?

Posted by Savannah Beckett <sa...@yahoo.com>.
Thanks.  I am trying to use MoreLikeThis in Solr to find similar documents in 
the Solr index and use the data from these similar documents to modify a field 
in each document that I am indexing.  I found that MoreLikeThis in Solr only 
works when the document is in the index.  Is that true?  If so, I may have to wait 
until the indexing is finished, then run my own command to run MoreLikeThis on each 
document in the index, and then reindex each document.  That doesn't sound 
efficient.  Is there a better way?
Thanks.




________________________________
From: Liam O'Boyle <li...@intelligencebank.com>
To: solr-user@lucene.apache.org
Cc: user@nutch.apache.org
Sent: Thu, September 9, 2010 11:06:36 PM
Subject: Re: How to Update Value of One Field of a Document in Index?

Hi Savannah,

You can only reindex the entire document; if you only have the ID,
then do a search to retrieve the rest of the data, then reindex.  This
assumes that all of the fields you need to index are stored (so that
you can retrieve them) and not just indexed.

Liam


Re: How to Update Value of One Field of a Document in Index?

Posted by Luis Cappa Banda <lu...@gmail.com>.
Hello.

You should be able to get the current document that you want to update,
change your notes value to include the new ones added by the user, and then
send an update request to Solr to delete the old document (found by the
id that you include in the POST request) and add the new document with the
changes made. Try developing a small Java application with SolrJ,
for example. Depending on the number of update requests that your
system/application will make, you may or may not want to send a commit
after each update. You can also configure a periodic auto-commit to
update the index automatically.
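
A rough sketch of the steps above, written in Python rather than SolrJ and without any network calls. It assumes the current document has already been fetched as a dict with a multi-valued notes field; the helper names and the localhost URL are hypothetical. The commit=true request parameter asks the update handler to commit right away.

```python
def with_added_notes(current_doc, new_notes):
    """Return a copy of the document whose notes include the new entries."""
    updated = dict(current_doc)
    updated["notes"] = list(current_doc.get("notes", [])) + list(new_notes)
    return updated


def update_url(base_url, commit_now):
    """Build the update handler URL, optionally asking for an immediate commit."""
    url = base_url.rstrip("/") + "/update"
    return (url + "?commit=true") if commit_now else url


# Hypothetical stored document and a note typed in by the user.
doc = {"id": "doc-1", "body": "long text", "notes": ["first"]}
new_doc = with_added_notes(doc, ["second"])
print(new_doc["notes"])
print(update_url("http://localhost:8983/solr/", commit_now=True))
```

Because new_doc keeps the same id, adding it replaces the old document. With many small updates it is usually cheaper to leave commit_now off and rely on auto-commit.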

Re: How to Update Value of One Field of a Document in Index?

Posted by Erick Erickson <er...@gmail.com>.
OK, thanks.

On Wed, Apr 27, 2011 at 9:29 AM, Steven A Rowe <sa...@syr.edu> wrote:
>> There's the "limited join" patch, see:
>> https://issues.apache.org/jira/browse/SOLR-2272
>> that hasn't been applied  yet
>
> Correction: Yonik committed this feature in r1096978.
>
>

RE: How to Update Value of One Field of a Document in Index?

Posted by Steven A Rowe <sa...@syr.edu>.
> There's the "limited join" patch, see:
> https://issues.apache.org/jira/browse/SOLR-2272
> that hasn't been applied  yet

Correction: Yonik committed this feature in r1096978.


Re: How to Update Value of One Field of a Document in Index?

Posted by Erick Erickson <er...@gmail.com>.
(2) isn't viable. Updating a multiValued field is the same as any other field, a
    delete followed by an add of the entire document.
(1) could work. The problem here is how you need to search. If you need
    to search your notes, they would be separate from the document. In other
    words, you couldn't form a query like
    "+body:(body text of interest) +notes:(stuff I put in my notes)"
    You could search for each independently, but not both together.
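
One possible reading of "search for each independently" is to run the body query and the notes query separately and combine the ids on the client. This is only a sketch of that idea, not something proposed in the thread; it assumes both cores share the same id values, and the sample ids are made up.

```python
def ids_matching_both(body_hits, notes_hits):
    """Keep ids found by both searches, in the order of the body results."""
    notes_ids = set(notes_hits)
    return [doc_id for doc_id in body_hits if doc_id in notes_ids]


# Hypothetical id lists returned by the two separate queries.
body_hits = ["doc-3", "doc-1", "doc-7"]
notes_hits = ["doc-7", "doc-3", "doc-9"]
both = ids_matching_both(body_hits, notes_hits)
print(both)
```

This ignores relevance scores from the notes query and needs every matching id from both sides, so it is only practical for small result sets.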

There's the "limited join" patch, see:
https://issues.apache.org/jira/browse/SOLR-2272
that hasn't been applied yet that *might* help. A variant of (1) is that
there's really no need to have two separate cores. Documents in Solr don't
need to all have the same fields. So you could have, say, two types of
documents, "fulldoc" and "notesdoc". Fulldocs wouldn't have a "notes" field,
and notesdocs wouldn't have a "body" field. You'd have to take some care that
the <uniqueKey> was different for the two different types of documents. That
might work with 2272.
Warning: I haven't played with that patch, so caveat emptor.
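
A sketch of the two document types living in one core, under stated assumptions: the type and parent_id fields and the "full:" / "notes:" id prefixes are hypothetical conventions for keeping the <uniqueKey> values distinct, not anything Solr or the patch requires.

```python
def make_fulldoc(doc_id, name, body):
    """Build a "fulldoc": it carries the body and no notes field."""
    return {"id": "full:" + doc_id, "type": "fulldoc",
            "name": name, "body": body}


def make_notesdoc(doc_id, notes):
    """Build a "notesdoc": it carries the notes and no body field."""
    return {"id": "notes:" + doc_id, "type": "notesdoc",
            "parent_id": doc_id, "notes": list(notes)}


full = make_fulldoc("42", "report.txt", "very large body text")
notes = make_notesdoc("42", ["check section 3"])
print(full["id"], notes["id"])
```

Re-adding the small notesdoc then replaces only that document; the large fulldoc is never resent.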

Best
Erick

On Wed, Apr 27, 2011 at 12:35 AM, Peter Spam <ps...@mac.com> wrote:
> My schema: id, name, checksum, body, notes, date
>
> I'd like for a user to be able to add notes to the notes field, and not have to re-index the document (since the body field may contain 100MB of text).  Some ideas:
>
> 1) How about creating another core which only contains id, checksum, and notes?  Then, "updating" (delete followed by add) wouldn't be that painful?
>
> 2) What about using a multiValued field?  Could you just keep adding values as the user enters more notes?
>
>
> Pete

Re: How to Update Value of One Field of a Document in Index?

Posted by Peter Spam <ps...@mac.com>.
My schema: id, name, checksum, body, notes, date

I'd like for a user to be able to add notes to the notes field, and not have to re-index the document (since the body field may contain 100MB of text).  Some ideas:

1) How about creating another core which only contains id, checksum, and notes?  Then, "updating" (delete followed by add) wouldn't be that painful?

2) What about using a multiValued field?  Could you just keep adding values as the user enters more notes?


Pete

On Sep 9, 2010, at 11:06 PM, Liam O'Boyle wrote:

> Hi Savannah,
> 
> You can only reindex the entire document; if you only have the ID,
> then do a search to retrieve the rest of the data, then reindex.  This
> assumes that all of the fields you need to index are stored (so that
> you can retrieve them) and not just indexed.
> 
> Liam

Re: How to Update Value of One Field of a Document in Index?

Posted by Liam O'Boyle <li...@intelligencebank.com>.
Hi Savannah,

You can only reindex the entire document; if you only have the ID,
then do a search to retrieve the rest of the data, then reindex.  This
assumes that all of the fields you need to index are stored (so that
you can retrieve them) and not just indexed.
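
A minimal sketch of the lookup step and the stored-field check, assuming the unique key field is called id and that you know which fields the schema defines. The helper names, the field list and the localhost URL are hypothetical, and the request itself is left out.

```python
from urllib.parse import urlencode


def select_url(base_url, doc_id):
    """Build a query URL that fetches one document by its unique key."""
    escaped = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": 'id:"%s"' % escaped, "fl": "*", "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def missing_fields(retrieved_doc, schema_fields):
    """List schema fields absent from the retrieved document.

    A field can be absent because it is not stored or because the document
    never had a value for it; either way it cannot be copied into the
    reindexed document from the search result alone.
    """
    return [name for name in schema_fields if name not in retrieved_doc]


url = select_url("http://localhost:8983/solr", "doc-1")
gaps = missing_fields({"id": "doc-1", "title": "Old page"},
                      ["id", "title", "content"])
print(url)
print(gaps)
```

If gaps is not empty, those values have to come from the original source (here, a fresh Nutch crawl) rather than from the index.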

Liam

On Fri, Sep 10, 2010 at 3:29 PM, Savannah Beckett
<sa...@yahoo.com> wrote:
>
> I use nutch to crawl and index to Solr.  My code is working.  Now, I want to
> update the value of one of the fields of a document in the solr index after the
> document was already indexed, and I have only the document id.  How do I do
> that?
>
> Thanks.
>
>
>