Posted to dev@lucene.apache.org by "Ludovic Boutros (JIRA)" <ji...@apache.org> on 2017/01/04 08:25:58 UTC

[jira] [Commented] (SOLR-8030) Transaction log does not store the update chain (or req params?) used for updates

    [ https://issues.apache.org/jira/browse/SOLR-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797577#comment-15797577 ] 

Ludovic Boutros commented on SOLR-8030:
---------------------------------------

Thank you [~hossman].

The point is that for main operations, the Document Update Processors do not have access to the Solr request.
The parameters are stored in the commands (add, delete, commit).
I don't know whether the parameters could also be stored in the command for merge and rollback operations.

This way we would not have to worry about request parameters.
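
To illustrate the idea of keeping the parameters with the command, here is a minimal, self-contained sketch. It does not use Solr's real classes or its transaction log format: LoggedAdd, SymmetricReplaySketch, the chain names "default" and "tagging", and the "tag" parameter are all hypothetical. It only shows that if each logged command carries its chain name and parameters, replay can apply the same logic as the original update.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical log entry: the chain name and params travel with the command.
class LoggedAdd {
    final String docId;
    final String chain;
    final Map<String, String> params;

    LoggedAdd(String docId, String chain, Map<String, String> params) {
        this.docId = docId;
        this.chain = chain;
        this.params = params;
    }
}

class SymmetricReplaySketch {
    // Hypothetical chains: each maps a logged command to the value that gets indexed.
    static final Map<String, Function<LoggedAdd, String>> CHAINS = new LinkedHashMap<>();
    static {
        CHAINS.put("default", cmd -> cmd.docId);
        CHAINS.put("tagging", cmd -> cmd.docId + ":" + cmd.params.getOrDefault("tag", "none"));
    }

    // Replays each entry with the chain and params recorded in it,
    // falling back to the default chain only when the name is unknown.
    static List<String> replay(List<LoggedAdd> log) {
        List<String> indexed = new ArrayList<>();
        for (LoggedAdd cmd : log) {
            Function<LoggedAdd, String> chain =
                CHAINS.getOrDefault(cmd.chain, CHAINS.get("default"));
            indexed.add(chain.apply(cmd));
        }
        return indexed;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("tag", "urgent");

        List<LoggedAdd> log = new ArrayList<>();
        log.add(new LoggedAdd("doc1", "tagging", params));
        log.add(new LoggedAdd("doc2", "default", new LinkedHashMap<>()));

        System.out.println(replay(log)); // prints [doc1:urgent, doc2]
    }
}
```

With a replay that always used the default chain and an empty request, doc1 would come back without its tag, which is the kind of asymmetry discussed here.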

I agree with [~dsmiley] that the log replay is too complicated.
But I do not agree with [~dsmiley] that this should not be fixed because of very specific use cases.

I think the log replay must be symmetric by default.
This is the natural behavior of all database systems that I can think of.
If you need something else you can always check the REPLAY flag in your processor.
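
A rough sketch of what such a check could look like, written against stand-in types so that it runs on its own. SketchCommand and ReplayAwareProcessor are hypothetical; the REPLAY bit value is assumed to mirror Solr's UpdateCommand.REPLAY, and in a real processor the test would presumably be made on the command's flags inside processAdd.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a Solr update command; only the flags field matters here.
class SketchCommand {
    // Assumption: same bit as Solr's UpdateCommand.REPLAY.
    static final int REPLAY = 0x02;

    final String docId;
    final int flags;

    SketchCommand(String docId, int flags) {
        this.docId = docId;
        this.flags = flags;
    }

    boolean isReplay() {
        return (flags & REPLAY) != 0;
    }
}

// Hypothetical processor that opts out of its custom logic during log replay.
class ReplayAwareProcessor {
    final List<String> enriched = new ArrayList<>();

    void processAdd(SketchCommand cmd) {
        if (cmd.isReplay()) {
            return; // replayed command: skip the custom logic
        }
        enriched.add(cmd.docId); // live update: apply the custom logic
    }
}

class ReplayFlagSketch {
    public static void main(String[] args) {
        ReplayAwareProcessor processor = new ReplayAwareProcessor();
        processor.processAdd(new SketchCommand("doc1", 0));
        processor.processAdd(new SketchCommand("doc2", SketchCommand.REPLAY));
        System.out.println(processor.enriched); // prints [doc1]
    }
}
```

The default stays symmetric; only processors that really need different behavior during replay would add the check.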

Currently, your index can be easily corrupted because your update processor logic is not applied during log replay.


> Transaction log does not store the update chain (or req params?) used for updates
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-8030
>                 URL: https://issues.apache.org/jira/browse/SOLR-8030
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.3
>            Reporter: Ludovic Boutros
>         Attachments: SOLR-8030.patch
>
>
> Transaction Log does not store the update chain, or any other details from the original update request such as the request params, used during updates.
> Therefore tLog uses the default update chain, and a synthetic request, during log replay.
> If we implement custom update logic with multiple distinct update chains that use custom processors after DistributedUpdateProcessor, or if the default chain uses processors whose behavior depends on other request params, then log replay may be incorrect.
> Potentially problematic scenarios (need test cases):
> * DBQ where the main query string uses local param variables that refer to other request params
> * custom Update chain set as {{default="true"}} using something like StatelessScriptUpdateProcessorFactory after DUP where the script depends on request params.
> * multiple named update chains with diff processors configured after DUP and specific requests sent to diff chains -- ex: ParseDateProcessor w/ custom formats configured after DUP in some special chains, but not in the default chain


