Posted to solr-user@lucene.apache.org by yo tomi <yo...@gmail.com> on 2020/07/17 07:32:18 UTC
AtomicUpdate on SolrCloud is not working
Hi, All
When I perform an atomic update on SolrCloud with the following setting, it
does not work properly.
---
<updateRequestProcessorChain name="skip-empty">
<processor class="solr.DistributedUpdateProcessorFactory"/>
<processor class="TrimFieldUpdateProcessorFactory" />
<processor class="RemoveBlankFieldUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
When I changed it as follows, it worked as expected.
---
<updateRequestProcessorChain name="skip-empty">
<processor class="TrimFieldUpdateProcessorFactory" />
<processor class="RemoveBlankFieldUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
I thought the latter setting and the post-processor approach would give
the same result, but the SOLR-8030 bug makes me reluctant to use
post-processors.
Even with the latter setting, is there any possibility of hitting
SOLR-8030? Looking at the source code, the tlog that comes from the leader
to the replica seems to be processed correctly by the
UpdateRequestProcessor, so I thought the latter setting would not be
affected by the bug. Does anyone know the most appropriate way to
configure atomic updates on SolrCloud?
Thanks,
Yoshiaki
Re: AtomicUpdate on SolrCloud is not working
Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/19/2020 1:37 AM, yo tomi wrote:
> I have no choice but to use post-processors.
> However, the SOLR-8030 bug makes me reluctant to use them.
Can you explain why you need the trim field and remove blank field
processors to be post processors? When I think about these
functionalities, they should work fully as expected even when executed
as "pre" processors.
Thanks,
Shawn
Re: AtomicUpdate on SolrCloud is not working
Posted by yo tomi <yo...@gmail.com>.
Hi Jörn & Shawn
"Does not work properly" means that the pre-processors
(TrimFieldUpdateProcessorFactory and
RemoveBlankFieldUpdateProcessorFactory) do not trim values or remove blank
values for string fields.
Example: given the following schema:
---
<field name="id" type="string" multiValued="false" indexed="true"
required="true" stored="true"/>
<field name="title" type="string" uninvertible="false" indexed="true"
stored="true"/>
---
I update the following documents using the "Documents" screen of the Solr admin UI:
---
{
"id": "1",
"title": {"set": " test "}
},
{
"id": "2",
"title": {"set": ""}
}
---
Then the following is indexed when the processors are configured as pre-processors:
---
{
"id": "1",
"title": " test "
},
{
"id": "2",
"title": ""
}
---
When they are configured as post-processors:
---
{
"id": "1",
"title": "test"
},
{
"id": "2"
}
---
I have no choice but to use post-processors.
However, the SOLR-8030 bug makes me reluctant to use them.
By the way, the Solr version is 8.4.
Best,
Yoshiaki
Re: AtomicUpdate on SolrCloud is not working
Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/17/2020 1:32 AM, yo tomi wrote:
> When I perform an atomic update on SolrCloud with the following setting, it
> does not work properly.
As Jörn Franke already mentioned, you haven't said exactly what "does
not work properly" actually means in your situation. Without that
information, it will be very difficult to provide any real help.
Atomic update functionality is currently implemented in
DistributedUpdateProcessorFactory.
> ---
> <updateRequestProcessorChain name="skip-empty">
> <processor class="solr.DistributedUpdateProcessorFactory"/>
> <processor class="TrimFieldUpdateProcessorFactory" />
> <processor class="RemoveBlankFieldUpdateProcessorFactory" />
> <processor class="solr.LogUpdateProcessorFactory" />
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> ---
> When I changed it as follows, it worked as expected.
> ---
> <updateRequestProcessorChain name="skip-empty">
> <processor class="TrimFieldUpdateProcessorFactory" />
> <processor class="RemoveBlankFieldUpdateProcessorFactory" />
> <processor class="solr.LogUpdateProcessorFactory" />
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> ---
The effective result difference between these configurations is that
atomic updates will happen first with the first config, and in the
second, atomic updates will happen second to last -- just before
RunUpdateProcessorFactory.
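To illustrate why that ordering matters for the trim and remove-blank processors, here is a minimal toy model in Python. It is not Solr code. It assumes that both processors only act on plain string values and ignore anything else, such as the {"set": ...} map that an atomic update carries until the distributed step resolves it. The function names (trim, remove_blank, apply_atomic, run) are hypothetical and exist only for this sketch.

```python
# Toy model (not Solr code) of processor ordering around an atomic "set".
# Assumption: trim / remove-blank only act on plain string values and ignore
# anything else, such as the {"set": ...} map of an atomic update.

def trim(doc):
    # Strip whitespace from string values; leave non-string values untouched.
    return {k: v.strip() if isinstance(v, str) else v for k, v in doc.items()}

def remove_blank(doc):
    # Drop fields whose value is an empty string.
    return {k: v for k, v in doc.items() if v != ""}

def apply_atomic(doc):
    # Stand-in for the distributed step: resolve {"set": x} to the value x.
    # (A real atomic update also merges with the stored document; omitted here.)
    return {k: v["set"] if isinstance(v, dict) and "set" in v else v
            for k, v in doc.items()}

def run(chain, doc):
    # Apply each step of the chain in order.
    for step in chain:
        doc = step(doc)
    return doc

updates = [{"id": "1", "title": {"set": " test "}},
           {"id": "2", "title": {"set": ""}}]

# Trim / remove-blank before the atomic step ("pre") versus after it ("post").
as_pre = [run([trim, remove_blank, apply_atomic], d) for d in updates]
as_post = [run([apply_atomic, trim, remove_blank], d) for d in updates]
print(as_pre)
print(as_post)
```

Under these assumptions, as_pre keeps the untrimmed and empty titles, while as_post has the trimmed title for document 1 and no title field for document 2.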
Also, with the first config, most of the update processors are going to
be executed on the machine with the shard leader (after the update is
distributed) and if there is more than one NRT replica, they will be
executed multiple times. With the second config, most of the processors
will be executed on the machine that actually receives the update
request. For the purposes of this discussion, remember that when a TLOG
replica is elected leader, it is effectively an NRT replica.
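The split between processors that run on the receiving node and those that run after distribution can be sketched with another small toy model in Python. It is not Solr code. It assumes, as described above, that when the distributed processor is not declared it is placed just before RunUpdateProcessorFactory; falling back to the end of the chain when no run processor is declared is a further assumption of this sketch. The helper split_chain is hypothetical.

```python
# Toy model (not Solr code): split a chain into "pre" and "post" processors
# around DistributedUpdateProcessorFactory, inserting it before
# RunUpdateProcessorFactory when it is not declared explicitly.
DISTRIB = "solr.DistributedUpdateProcessorFactory"
RUN = "solr.RunUpdateProcessorFactory"


def split_chain(processors):
    """Return (pre, post): processors before and after the distributed step."""
    chain = list(processors)
    if DISTRIB not in chain:
        # Assumed default: implicit insertion immediately before RUN,
        # or at the end of the chain if RUN is not declared.
        idx = chain.index(RUN) if RUN in chain else len(chain)
        chain.insert(idx, DISTRIB)
    i = chain.index(DISTRIB)
    return chain[:i], chain[i + 1:]


first_config = [
    DISTRIB,
    "TrimFieldUpdateProcessorFactory",
    "RemoveBlankFieldUpdateProcessorFactory",
    "solr.LogUpdateProcessorFactory",
    RUN,
]
# Same chain without the explicit distributed processor entry.
second_config = first_config[1:]

pre1, post1 = split_chain(first_config)
pre2, post2 = split_chain(second_config)
print("first config:  pre =", pre1, "post =", post1)
print("second config: pre =", pre2, "post =", post2)
```

In this model the first config has no "pre" processors at all, while in the second config everything except the run processor ends up on the "pre" side.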
Does that information help you determine why it doesn't do what you expect?
> I thought the latter setting and the post-processor approach would give
> the same result, but the SOLR-8030 bug makes me reluctant to use
> post-processors.
> Even with the latter setting, is there any possibility of hitting
> SOLR-8030?
See this part of the reference guide for a bunch of gory details about
DistributedUpdateProcessorFactory:
https://cwiki.apache.org/confluence/display/SOLR/UpdateRequestProcessor#UpdateRequestProcessor-DistributedUpdates
In SOLR-8030, the general consensus among committers is that you should
configure almost all update processors as "pre" processors -- placed
before DistributedUpdateProcessorFactory in the config. When done this
way, updates are usually faster and less likely to yield inconsistent
results.
There may be situations where having them as "post" processors is
correct, but that won't happen very often. The second config above does
implicitly use "pre" for most of the processors.
Thanks,
Shawn
Re: AtomicUpdate on SolrCloud is not working
Posted by Issei Nishigata <du...@gmail.com>.
I have the same problem on Solr 8.
I think it's because, with the first approach,
TrimFieldUpdateProcessorFactory and RemoveBlankFieldUpdateProcessorFactory
are not taking effect.
On SolrCloud, TrimFieldUpdateProcessorFactory,
RemoveBlankFieldUpdateProcessorFactory and other processors
only run on the first node that receives an update request.
Consequently, it's necessary to execute TrimFieldUpdateProcessorFactory and
RemoveBlankFieldUpdateProcessorFactory
after the document has been passed to the replica node by the
DistributedUpdateProcessor,
so we need to use the second approach that he described; otherwise it won't
operate properly.
But even with this approach, both he and I are worried about whether it
will trigger SOLR-8030.
I would also like to know about this. Does anyone have any comments?
Best,
Issei
On Fri, Jul 17, 2020 at 18:34, Jörn Franke <jo...@gmail.com> wrote:
> What does "not work correctly" mean?
>
> Have you checked that all fields are stored or doc values?
Re: AtomicUpdate on SolrCloud is not working
Posted by Jörn Franke <jo...@gmail.com>.
What does "not work correctly" mean?
Have you checked that all fields are stored or doc values?
> On 17.07.2020 at 11:26, yo tomi <yo...@gmail.com> wrote:
>
> Hi All
>
> Sorry, the settings above were swapped.
> Actually, the following setting does not work properly.
> ---
> <updateRequestProcessorChain name="skip-empty">
> <processor class="TrimFieldUpdateProcessorFactory" />
> <processor class="RemoveBlankFieldUpdateProcessorFactory" />
> <processor class="solr.LogUpdateProcessorFactory" />
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> ---
> And the following works as expected.
> ---
> <updateRequestProcessorChain name="skip-empty">
> <processor class="solr.DistributedUpdateProcessorFactory"/>
> <processor class="TrimFieldUpdateProcessorFactory" />
> <processor class="RemoveBlankFieldUpdateProcessorFactory" />
> <processor class="solr.LogUpdateProcessorFactory" />
> <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> ---
>
> Thanks,
> Yoshiaki
Re: AtomicUpdate on SolrCloud is not working
Posted by yo tomi <yo...@gmail.com>.
Hi All
Sorry, the settings above were swapped.
Actually, the following setting does not work properly.
---
<updateRequestProcessorChain name="skip-empty">
<processor class="TrimFieldUpdateProcessorFactory" />
<processor class="RemoveBlankFieldUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
And the following works as expected.
---
<updateRequestProcessorChain name="skip-empty">
<processor class="solr.DistributedUpdateProcessorFactory"/>
<processor class="TrimFieldUpdateProcessorFactory" />
<processor class="RemoveBlankFieldUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
Thanks,
Yoshiaki