Posted to solr-user@lucene.apache.org by Derek Poh <dp...@globalsources.com> on 2018/09/18 07:13:30 UTC

TolerantUpdateProcessorFactory maxErrors=-1 issue

Hi

I am using CSV formatted index updates to index a tab-delimited file.

I have defined "TolerantUpdateProcessorFactory" with "maxErrors=-1" in 
the solrconfig.xml to skip any document update error and proceed to 
update the remaining documents without failing.
However, it does not seem to be working: there is a document in the 
tab-delimited file with more fields than expected, and this caused the 
indexing to abort instead.

This is how I start the indexing,
curl -o /apps/search/logs/indexing.log \
  "http://localhost:8983/solr/$collection/update?update.chain=$updateChainName&commit=true&separator=%09&encapsulator=^&fieldnames=$fieldnames$splitOptions" \
  --data-binary "@/apps/search/feed/$csvFilePath/$csvFileName" \
  -H 'Content-type:application/csv'

This is how the TolerantUpdateProcessorFactory is defined in the 
solrconfig.xml,
<updateRequestProcessorChain name="exhibitor-product-chain">
   <processor class="solr.CloneFieldUpdateProcessorFactory">
     <str name="source">P_SupplierId</str>
     <str name="source">P_TradeShowId</str>
     <str name="source">P_ProductId</str>
     <str name="dest">id</str>
   </processor>
   <processor class="solr.ConcatFieldUpdateProcessorFactory">
     <str name="fieldName">id</str>
     <str name="delimiter"></str>
   </processor>
   <processor class="solr.TolerantUpdateProcessorFactory">
      <int name="maxErrors">-1</int>
   </processor>
   <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
     <null name="ttlFieldName"/>
     <null name="ttlParamName"/>
     <int name="autoDeletePeriodSeconds">43200</int>
     <str name="expirationFieldName">P_TradeShowOnlineEndDateUTC</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Solr version is 6.6.2.

Derek

----------------------
CONFIDENTIALITY NOTICE 

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part. 

This e-mail and any reply to it may be monitored for security, legal, regulatory compliance and/or other appropriate reasons.

Re: TolerantUpdateProcessorFactory maxErrors=-1 issue

Posted by Derek Poh <dp...@globalsources.com>.
Hi Tomas

I moved TolerantUpdateProcessorFactory to the beginning of the chain and 
reloaded the collection.
The indexing process still aborts.

On 22/9/2018 4:28 AM, Tomás Fernández Löbbe wrote:
> Hi Derek,
> I suspect you need to move the TolerantUpdateProcessorFactory to the
> beginning of the chain

Re: TolerantUpdateProcessorFactory maxErrors=-1 issue

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Hi Derek,
I suspect you need to move the TolerantUpdateProcessorFactory to the
beginning of the chain
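
A minimal sketch of what that reordering might look like, assuming the
chain from the original post is otherwise left unchanged. The only
difference is that the TolerantUpdateProcessorFactory entry comes first, so
that errors raised by the processors after it pass through it. This is an
illustration of the suggestion, not a confirmed fix.

```xml
<!-- Sketch only: same processors as in the original post, with
     TolerantUpdateProcessorFactory moved to the first position. -->
<updateRequestProcessorChain name="exhibitor-product-chain">
  <processor class="solr.TolerantUpdateProcessorFactory">
    <int name="maxErrors">-1</int>
  </processor>
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">P_SupplierId</str>
    <str name="source">P_TradeShowId</str>
    <str name="source">P_ProductId</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter"></str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <null name="ttlFieldName"/>
    <null name="ttlParamName"/>
    <int name="autoDeletePeriodSeconds">43200</int>
    <str name="expirationFieldName">P_TradeShowOnlineEndDateUTC</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The collection would need to be reloaded after editing solrconfig.xml for
the new order to take effect.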

On Thu, Sep 20, 2018 at 6:17 PM Derek Poh <dp...@globalsources.com> wrote:

> Does anyone have any idea what could be the cause of this?

Re: TolerantUpdateProcessorFactory maxErrors=-1 issue

Posted by Derek Poh <dp...@globalsources.com>.
Does anyone have any idea what could be the cause of this?

On 19/9/2018 11:40 AM, Derek Poh wrote:
> In addition, I tried with maxErrors=3 and with only 1 error document, 
> the indexing process still gets aborted.
>
> Could it be the way I defined the TolerantUpdateProcessorFactory in 
> solrconfig.xml?

Re: TolerantUpdateProcessorFactory maxErrors=-1 issue

Posted by Derek Poh <dp...@globalsources.com>.
In addition, I tried with maxErrors=3 and with only 1 error document, and 
the indexing process still gets aborted.

Could it be the way I defined the TolerantUpdateProcessorFactory in 
solrconfig.xml?
