Posted to solr-user@lucene.apache.org by Luís Portela Afonso <me...@gmail.com> on 2013/08/22 23:56:29 UTC

SOLR Prevent solr of modifying fields when update doc

Hi,

How can I prevent Solr from updating some fields when it updates a doc?
The problem is that I have a UUID in a field named uuid, but it is not the unique key. When an RSS source updates a feed, Solr updates the doc with the same link but generates a new uuid. This is not what I want, because I use this id to relate feeds to a user.

Can someone help me?

Many Thanks

Re: SOLR Prevent solr of modifying fields when update doc

Posted by Luis Portela Afonso <me...@gmail.com>.
Hi, right now I'm using the link field that comes in every RSS entry as my
uniqueKey.
That was the best solution I found, because across many updated documents
this was the only field that never changed.

Now I'm facing another problem. When I search for a document by that id or
link (which is my uniqueKey), I'm not able to get a unique result.
I can't successfully search on a field that holds a URL in Solr.
I think that is because I'm encoding the URL that I'm searching for, but
Solr doesn't decode it.

Thanks for the concern and help
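
A minimal sketch of one way to look up a document by an exact URL value. It assumes the uniqueKey field is named link and is an untokenized string field, and that the Solr version in use provides the term query parser ({!term f=...}); none of this is confirmed in the thread. The idea is to pass the raw URL to the field as a single term, so that characters such as ':' and '/' are not read as query syntax, and to URL-encode the whole request parameter exactly once.

```python
from urllib.parse import urlencode, parse_qs


def build_link_query(link: str, field: str = "link") -> str:
    """Return a URL-encoded query string that matches one exact link value."""
    # {!term f=<field>} hands the value to the field as a single raw term,
    # so ':' and '/' inside the URL are not parsed as query operators.
    params = {"q": "{!term f=%s}%s" % (field, link), "wt": "json"}
    # Encode exactly once; the HTTP layer decodes it once on arrival.
    return urlencode(params)


# Hypothetical feed link, used only for illustration.
query_string = build_link_query("http://example.com/feed?item=1&lang=en")
print(query_string)
```

The resulting string would be appended to the select handler URL of the core. Encoding the link a second time before building the query is a likely way to end up searching for a value that is not in the index.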


Re: SOLR Prevent solr of modifying fields when update doc

Posted by Erick Erickson <er...@gmail.com>.
bq:  but the uniqueId is generated by me. But when solr indexes and there
is an update in a doc, it deletes the doc and creates a new one, so it
generates a new UUID.

right, this is why I was saying that a UUID field may not fit your use
case. The _point_ of a UUID field is to generate a unique entry for every
added document, there's no concept of "only generate the UUID once per
<uniqueKey> indexed" which seems to be what you want.

So I'd do something like just use the <uniqueKey> field rather than a
separate UUID field. That doesn't change by definition. What advantage do
you think you get from the UUID field over just using your <uniqueKey>
field?

Best,
Erick
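
The behaviour described above can be illustrated with a toy in-memory model. This is not Solr code: ToyIndex and its add method are hypothetical names, and the model only assumes what the thread states, namely that re-adding a document with the same unique key replaces the old one and that the UUID step produces a fresh value for every added document.

```python
import uuid


class ToyIndex:
    """In-memory stand-in for an index keyed by a unique key (not Solr)."""

    def __init__(self):
        self.docs = {}

    def add(self, doc: dict) -> dict:
        stored = dict(doc)
        # Mimics a UUID-generating update step: a fresh value on every add
        # when the incoming document does not carry one.
        stored.setdefault("uuid", str(uuid.uuid4()))
        # Same unique key means the old document is replaced wholesale.
        self.docs[stored["link"]] = stored
        return stored


index = ToyIndex()
first = index.add({"link": "http://example.com/a", "title": "v1"})
second = index.add({"link": "http://example.com/a", "title": "v2"})
print(first["uuid"] != second["uuid"])  # the uuid changed on re-add
print(first["link"] == second["link"])  # the unique key did not
```

Only the unique key survives the replacement, which is why it is the safer value to store in an external database.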



Re: SOLR Prevent solr of modifying fields when update doc

Posted by Luis Portela Afonso <me...@gmail.com>.
Hi,

The uuid, which was being used as the id of a document, is generated by
Solr using an update chain.
I just use the recommended method to generate UUIDs.

I think an atomic update is not suitable for me, because I want Solr to
index the feeds, not me. I don't want to send information to Solr; I want
it to index the feeds every 15 minutes, for example, and it's doing that now.

Lance, I don't understand what you mean by the software that I use to
index.
I just use Solr. I have a configuration with two entities: one that selects
my RSS sources from a database, and then the main entity that gets
information from a URL and processes it.

Thank you all for the answers.
Much appreciated


Re: SOLR Prevent solr of modifying fields when update doc

Posted by Greg Preston <gp...@marinsoftware.com>.
But there is an API for sending a delta over the wire, and server side it
does a read, overlay, delete, and insert.  And only the fields you sent
will be changed.

*Might require your unchanged fields to all be stored, though.


-Greg
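
A minimal sketch of the request body such a delta (atomic) update could use, assuming Solr's JSON update format with the "set" modifier. The field names link and title and the helper atomic_update are hypothetical; the sketch only builds the body and does not contact a server.

```python
import json


def atomic_update(unique_key: str, key_value: str, changes: dict) -> str:
    """Build a JSON body that sets only the given fields on one document."""
    # The unique key identifies which existing document to modify.
    doc = {unique_key: key_value}
    for field, value in changes.items():
        # {"set": value} asks for just this field to be replaced.
        doc[field] = {"set": value}
    return json.dumps([doc])


body = atomic_update("link", "http://example.com/a", {"title": "Updated title"})
print(body)
```

The body would be POSTed to the update handler of the core with a JSON content type. Fields that are not listed, such as uuid, are carried over from the existing document, which is why they generally need to be stored.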



Re: SOLR Prevent solr of modifying fields when update doc

Posted by Lance Norskog <go...@gmail.com>.
Solr does not by default generate unique IDs. It uses what you give as 
your unique field, usually called 'id'.

What software do you use to index data from your RSS feeds? Maybe that 
is creating a new 'id' field?

There is no partial update, Solr (Lucene) always rewrites the complete 
document.



Re: SOLR Prevent solr of modifying fields when update doc

Posted by Greg Preston <gp...@marinsoftware.com>.
Perhaps an atomic update that only changes the fields you want to change?

-Greg



Re: SOLR Prevent solr of modifying fields when update doc

Posted by Luís Portela Afonso <me...@gmail.com>.
Hi, thanks for the answer, but the uniqueId is generated by me. When Solr indexes and a doc is updated, it deletes the doc and creates a new one, so it generates a new UUID.
That doesn't work for me: I want Solr to update only some fields, because the UUID is the key I use to map the doc to a user in my database.

Right now I'm using information that comes from the source and never changes as my uniqueId: for example the guid, which exists in some RSS feeds, or, if it doesn't exist, the link.

I think there isn't any simple solution for me, because from what I have read, when a doc is updated, Solr deletes the old one and creates a new one, right?



Re: SOLR Prevent solr of modifying fields when update doc

Posted by Erick Erickson <er...@gmail.com>.
Well, not much in the way of help because you can't do what you
want AFAIK. I don't think UUID is suitable for your use-case. Why not
use your <uniqueId>?

Or generate something yourself...

Best
Erick
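
One possible reading of "generate something yourself" is to derive the identifier deterministically from data that never changes, so that re-indexing the same entry always yields the same value. The sketch below is an assumption, not something proposed in the thread: stable_feed_id is a hypothetical helper that builds a name-based UUID (version 5) from the guid when the feed provides one, and from the link otherwise.

```python
import uuid
from typing import Optional


def stable_feed_id(link: str, guid: Optional[str] = None) -> str:
    """Return the same UUID string every time for the same feed entry."""
    # Prefer the feed's guid when present; fall back to the link.
    source = guid if guid else link
    # uuid5 hashes the namespace and name, so equal input gives equal output.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, source))


# Hypothetical link, used only for illustration.
print(stable_feed_id("http://example.com/a"))
print(stable_feed_id("http://example.com/a"))  # identical to the line above
```

Because the value depends only on the guid or link, it could be computed either before indexing or on the database side, and the two would agree without Solr having to preserve anything across updates.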

