Posted to jira@kafka.apache.org by "Jose Armando Garcia Sancio (Jira)" <ji...@apache.org> on 2021/07/13 03:48:00 UTC

[jira] [Updated] (KAFKA-13073) Simulation test fails due to inconsistency in MockLog's implementation

     [ https://issues.apache.org/jira/browse/KAFKA-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jose Armando Garcia Sancio updated KAFKA-13073:
-----------------------------------------------
    Labels: kip-500  (was: )

> Simulation test fails due to inconsistency in MockLog's implementation
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-13073
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13073
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller, replication
>    Affects Versions: 3.0.0
>            Reporter: Jose Armando Garcia Sancio
>            Assignee: Jose Armando Garcia Sancio
>            Priority: Major
>              Labels: kip-500
>             Fix For: 3.0.0
>
>
> We are getting the following error on trunk:
> {code:java}
> RaftEventSimulationTest > canRecoverAfterAllNodesKilled STANDARD_OUT
>     timestamp = 2021-07-12T16:26:55.663, RaftEventSimulationTest:canRecoverAfterAllNodesKilled =
>       java.lang.RuntimeException:
>         Uncaught exception during poll of node 1
>
>                                   |-------------------jqwik-------------------
>     tries = 25                    | # of calls to property
>     checks = 25                   | # of not rejected calls
>     generation = RANDOMIZED       | parameters are randomly generated
>     after-failure = PREVIOUS_SEED | use the previous seed
>     when-fixed-seed = ALLOW       | fixing the random seed is allowed
>     edge-cases#mode = MIXIN       | edge cases are mixed in
>     edge-cases#total = 108        | # of all combined edge cases
>     edge-cases#tried = 4          | # of edge cases tried in current run
>     seed = 8079861963960994566    | random seed to reproduce generated values
>
>     Sample
>     ------
>       arg0: 4002
>       arg1: 2
>       arg2: 4{code}
> I think there are a couple of issues here:
>  # The {{ListenerContext}} for {{KafkaRaftClient}} uses the value returned by {{ReplicatedLog::startOffset()}} to determine the log start offset and when to load a snapshot, while the {{MockLog}} implementation uses {{logStartOffset}}, which could be a different value.
>  # {{MockLog}} doesn't implement {{ReplicatedLog::maybeClean}} so the log start offset is always 0.
>  # The snapshot id validation for {{MockLog}} and {{KafkaMetadataLog}}'s {{createNewSnapshot}} throws an exception when the snapshot id is less than the log start offset.
> Solutions:
> To fix the error quoted above we only need to fix item 3, but I think we should fix all of the issues enumerated in this Jira.
> For 1. we should change the {{MockLog}} implementation so that it uses {{startOffset}} both externally and internally.
> For 2. I will file another issue to track this implementation.
> For 3. I think this validation is too strict. I think it is safe to simply ignore any attempt by the state machine to create a snapshot with an id less than the log start offset. We should return {{Optional.empty()}} when the snapshot id is less than the log start offset. This tells the user that it doesn't need to generate a snapshot for that offset.
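
The behavior proposed for item 3 could look roughly like the sketch below. It is only an illustration under stated assumptions: {{LogSketch}} and {{SnapshotIdSketch}} are hypothetical stand-ins, not Kafka's {{MockLog}}, {{KafkaMetadataLog}}, or snapshot id types, and the comparison on the id's offset is an assumption about what "less than the log start offset" means. The sketch also reads the start offset through a single accessor, in the spirit of item 1.

```java
import java.util.Optional;

// Hypothetical stand-in for a snapshot id (an offset plus an epoch).
// This is NOT Kafka's actual type; it exists only for this sketch.
final class SnapshotIdSketch {
    final long offset;
    final int epoch;

    SnapshotIdSketch(long offset, int epoch) {
        this.offset = offset;
        this.epoch = epoch;
    }
}

// Hypothetical stand-in for a replicated log; NOT Kafka's MockLog.
final class LogSketch {
    private final long startOffset;

    LogSketch(long startOffset) {
        this.startOffset = startOffset;
    }

    // Single source of truth for the log start offset, used both by
    // callers and by the validation below.
    long startOffset() {
        return startOffset;
    }

    // Instead of throwing, return Optional.empty() when the requested
    // snapshot id is below the log start offset. An empty result tells the
    // caller that no snapshot needs to be generated for that offset.
    // For brevity the "writer" is represented by the id itself (assumption).
    Optional<SnapshotIdSketch> createNewSnapshot(SnapshotIdSketch id) {
        if (id.offset < startOffset()) {
            return Optional.empty();
        }
        return Optional.of(id);
    }
}
```

With a start offset of 10, a request for a snapshot at offset 5 yields an empty {{Optional}} instead of an exception, while a request at offset 10 or later yields a value.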



--
This message was sent by Atlassian Jira
(v8.3.4#803005)