Posted to solr-user@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2009/03/18 20:43:00 UTC
Re: Question about incremental index update
Victor,
Daily updates (or hourly, or even more frequent) are not going to be a problem. I don't follow your question about the document ID and using the URL.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
> From: "Huang, Zijian(Victor)" <zi...@etrade.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 18, 2009 2:51:59 PM
> Subject: Question about incremental index update
>
> Hi:
> Is it easy to do daily incremental index updates in Solr, assuming the
> index is around 1G? In terms of giving a document an ID to facilitate
> index updates, is using the URL a good way to do so?
>
> Thanks
>
>
> Victor
Re: Question about incremental index update
Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Thu, Mar 19, 2009 at 2:14 AM, Huang, Zijian(Victor) <zijian.huang@etrade.com> wrote:
>
> I mean the document ID in the Solr XML doc format. The Solr wiki
> tells me that I can update a particular doc by its ID if I assigned
> one previously. I am wondering whether using the URL as the doc ID
> would be a good thing to do.
>
There's the uniqueKey in schema.xml. If you send another document with the
same uniqueKey then it will replace the document already existing in the
index.
--
Regards,
Shalin Shekhar Mangar.
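
To make the uniqueKey behaviour above concrete, here is a minimal sketch (not taken from the thread) that builds a Solr XML <add> message in which the page URL is the document ID. It assumes the uniqueKey field in schema.xml is named "id"; the "title" field and the example.com URL are hypothetical. Posting the same message again with changed field values would, per the explanation above, replace the earlier document rather than add a second one.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, key_field="id"):
    """Build a Solr XML <add> message from dicts of field name -> value.

    key_field is assumed to match the uniqueKey declared in schema.xml.
    """
    add = ET.Element("add")
    for doc in docs:
        if key_field not in doc:
            raise ValueError(f"document is missing the key field {key_field!r}")
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


# Hypothetical page: the URL doubles as the uniqueKey value.
message = build_add_message([
    {"id": "http://example.com/page.html", "title": "Example page"},
])
print(message)
```

The resulting string is what would be sent to Solr's update handler, followed by a commit; the HTTP call is left out here to keep the sketch self-contained.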
Re: Question about incremental index update
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Victor,
Yes, if you use the same ID (and a URL could serve as a Document ID), Solr will update the Document.
Note that Solr doesn't do crawling/web page fetching, but Nutch and Droids do.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
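
As an illustration of the "same ID means update" behaviour described above, here is a toy in-memory model. It is not Solr code: ToyIndex and its fields are hypothetical names, and the only thing it is meant to show is that adding a document whose ID (here a URL) already exists replaces the stored document instead of creating a duplicate.

```python
class ToyIndex:
    """In-memory stand-in for an index with a unique key; not Solr code."""

    def __init__(self, key_field="id"):
        self.key_field = key_field
        self.docs = {}

    def add(self, doc):
        """Store doc; return True if it replaced a document with the same key."""
        key = doc[self.key_field]
        replaced = key in self.docs
        self.docs[key] = dict(doc)
        return replaced


index = ToyIndex()
# Monday's crawl of a page, then Tuesday's crawl of the same URL.
first = index.add({"id": "http://example.com/a", "body": "Monday's text"})
second = index.add({"id": "http://example.com/a", "body": "Tuesday's text"})
# A URL that was not seen before is simply added.
third = index.add({"id": "http://example.com/b", "body": "a new page"})
```

After these three calls the toy index holds two documents, and the entry for http://example.com/a carries Tuesday's text. A daily incremental run can therefore re-send changed pages under their URLs without deleting the old versions first.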
----- Original Message ----
> From: "Huang, Zijian(Victor)" <zi...@etrade.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 18, 2009 4:44:30 PM
> Subject: RE: Question about incremental index update
>
> Hi, Otis:
> So does Solr already have some kind of built-in library that can
> automatically detect the differences between two sets of crawled
> documents and update the index to the newer one?
> I mean the document ID in the Solr XML doc format. The Solr wiki
> tells me that I can update a particular doc by its ID if I assigned
> one previously. I am wondering whether using the URL as the doc ID
> would be a good thing to do.
>
> Thanks
>
> Vic
RE: Question about incremental index update
Posted by "Huang, Zijian(Victor)" <zi...@etrade.com>.
Hi, Otis:
So does Solr already have some kind of built-in library that can
automatically detect the differences between two sets of crawled
documents and update the index to the newer one?
I mean the document ID in the Solr XML doc format. The Solr wiki
tells me that I can update a particular doc by its ID if I assigned
one previously. I am wondering whether using the URL as the doc ID
would be a good thing to do.
Thanks
Vic
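
The replies in this thread point to replacing documents by ID rather than to a built-in comparison of two crawls, so the comparison asked about here would have to happen on the client side. A minimal sketch of one way to do it, assuming each crawl is available as a mapping from URL to page text (the function names and sample data are hypothetical): pages whose content hash changed, or that are new, are re-sent under their URL, and URLs that disappeared are candidates for deletion.

```python
import hashlib


def fingerprint(text):
    """Content hash used to decide whether a page changed between crawls."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def diff_crawls(previous, current):
    """Compare two crawls given as {url: page_text} mappings.

    Returns (to_send, to_delete): URLs that are new or changed, and URLs
    that were present in the previous crawl but are missing now.
    """
    prev_fp = {url: fingerprint(text) for url, text in previous.items()}
    to_send = sorted(
        url for url, text in current.items()
        if prev_fp.get(url) != fingerprint(text)
    )
    to_delete = sorted(url for url in previous if url not in current)
    return to_send, to_delete


# Hypothetical crawls from two consecutive days.
yesterday = {
    "http://example.com/a": "one",
    "http://example.com/b": "two",
    "http://example.com/c": "three",
}
today = {
    "http://example.com/a": "one",
    "http://example.com/b": "two, now changed",
    "http://example.com/d": "four",
}
to_send, to_delete = diff_crawls(yesterday, today)
```

Here to_send contains the changed page b and the new page d, while to_delete contains c. Only the documents in to_send need to be posted again under their URLs.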