Posted to solr-user@lucene.apache.org by Joe Zhang <sm...@gmail.com> on 2012/12/03 06:45:39 UTC

duplicated URL sent from Nutch to solr index

Dear list,

I just want to confirm an expected behavior of solr:

Assuming we have "<uniqueKey>id</uniqueKey>" in schema.xml for solr, when
we send the same URL from nutch to solr multiple times, would there be ONLY
ONE entry for that URL, with the content (if changed) and timestamp
updated?


Thanks!

Joe

Re: duplicated URL sent from Nutch to solr index

Posted by Xi Shen <da...@gmail.com>.
Then the URL must be identical each time for the existing entry to be
updated rather than duplicated.


On Mon, Dec 3, 2012 at 2:34 PM, Joe Zhang <sm...@gmail.com> wrote:

> Sorry I didn't make it perfectly clear. The "id" field is the URL.



-- 
Regards,
David Shen

http://about.me/davidshen
https://twitter.com/#!/davidshen84

Re: duplicated URL sent from Nutch to solr index

Posted by Joe Zhang <sm...@gmail.com>.
Sorry I didn't make it perfectly clear. The "id" field is the URL.

On Sun, Dec 2, 2012 at 11:33 PM, Joe Zhang <sm...@gmail.com> wrote:

> Thanks!

Re: duplicated URL sent from Nutch to solr index

Posted by Joe Zhang <sm...@gmail.com>.
Thanks!

On Sun, Dec 2, 2012 at 11:20 PM, Xi Shen <da...@gmail.com> wrote:

> If the value of the "id" field is the same, the old entry will be updated;
> if it is new, a new entry will be created and indexed.
>
> This is my experience. :)

Re: duplicated URL sent from Nutch to solr index

Posted by Xi Shen <da...@gmail.com>.
If the value of the "id" field is the same, the old entry will be updated;
if it is new, a new entry will be created and indexed.

This is my experience. :)
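
To make the behavior concrete, here is a minimal sketch in Python. It is
only a toy in-memory model of the overwrite-by-key behavior described
above, not Solr or Nutch code. The class name ToyIndex, the example URLs,
and the field names "content" and "tstamp" are assumptions chosen for
illustration.

```python
# Toy in-memory model of uniqueKey overwrite semantics.
# This is NOT Solr code; ToyIndex and the field names are hypothetical.
class ToyIndex:
    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = {}  # maps unique key value -> latest document

    def add(self, doc):
        # A document whose key already exists replaces the old one entirely;
        # a document with a new key is stored as a new entry.
        self.docs[doc[self.unique_key]] = dict(doc)

    def count(self):
        return len(self.docs)


index = ToyIndex()
# The same URL is sent twice, with changed content and a newer timestamp.
index.add({"id": "http://example.com/", "content": "old text",
           "tstamp": "2012-12-01T00:00:00Z"})
index.add({"id": "http://example.com/", "content": "new text",
           "tstamp": "2012-12-03T00:00:00Z"})
# A different URL creates a second entry.
index.add({"id": "http://example.com/other", "content": "another page",
           "tstamp": "2012-12-03T00:00:00Z"})

print(index.count())                                # 2
print(index.docs["http://example.com/"]["content"])  # new text
```

In this model the second add replaces the whole document, so any field
that is not sent again is gone afterwards. The key also has to match
exactly: "http://example.com" and "http://example.com/" would count as
two different entries.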


On Mon, Dec 3, 2012 at 1:45 PM, Joe Zhang <sm...@gmail.com> wrote:

> Dear list,
>
> I just want to confirm an expected behavior of solr:
>
> Assuming we have "<uniqueKey>id</uniqueKey>" in schema.xml for solr, when
> we send the same URL from nutch to solr multiple times, would there be ONLY
> ONE entry for that URL, with the content (if changed) and timestamp
> updated?
>
>
> Thanks!
>
> Joe
>



-- 
Regards,
David Shen

http://about.me/davidshen
https://twitter.com/#!/davidshen84