You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Jun Rao <ju...@gmail.com> on 2014/05/01 20:26:44 UTC

Re: Review Request 20745: Patch for KAFKA-1397

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20745/#review41904
-----------------------------------------------------------


1. Should we restore the commented out test in ServerShutdownTest.scala?


core/src/main/scala/kafka/controller/ReplicaStateMachine.scala
<https://reviews.apache.org/r/20745/#comment75556>

    Could you explain why we need to send metadata if the topic is being deleted?



core/src/main/scala/kafka/controller/TopicDeletionManager.scala
<https://reviews.apache.org/r/20745/#comment75583>

    Do we need to change any comment? Also, I think those comments could probably be made clearer if we (1) separate out the normal cases with other cases such as broker failures and other conflicting concurrent operations; (2) describe the kind of requests sent to the brokers in each step.



core/src/main/scala/kafka/controller/TopicDeletionManager.scala
<https://reviews.apache.org/r/20745/#comment75577>

    It seems it's useful to extend the ShutdownableThread a bit. Currently, it has shutdown() which initiates the shutdown (by setting the flag and doing the interrupt if necessary) and waits until the thread completes the shutdown. It would be useful to add initiateShutdown() and waitShutdownComplete() (We can get rid of awaitShutdown()). Then, here, the shutdown sequence would be:
    
    1. deleteTopicThread.inititateShutdown()
    2. resumeTopicDeletionThread()
    3. deleteTopicThread.waitShutdownCompelte().
    
    This way, we make sure that deleteTopicThread can do at most 1 more loop after the first step is completed. 
    
    Also, perhaps it's clearer if we just add comment on each of the steps, instead of commenting everything at the beginning.
     



core/src/main/scala/kafka/controller/TopicDeletionManager.scala
<https://reviews.apache.org/r/20745/#comment75581>

    Perhaps it's better to call this TopicPartitionToBeDeleted? This method is not doing the same type of check as isTopicDeletionInProgress().



core/src/main/scala/kafka/controller/TopicDeletionManager.scala
<https://reviews.apache.org/r/20745/#comment75557>

    It seems this test is not necessary since the outer loop will detect that too.



core/src/main/scala/kafka/server/KafkaApis.scala
<https://reviews.apache.org/r/20745/#comment75584>

    This doesn't seem to be needed any more.



core/src/main/scala/kafka/server/KafkaServer.scala
<https://reviews.apache.org/r/20745/#comment75585>

    This doesn't seem to be needed any more.



core/src/main/scala/kafka/server/ReplicaManager.scala
<https://reviews.apache.org/r/20745/#comment75586>

    Do we still need to do this now that we make sure that if a replica is deleted, the controller will never send a LeaderAndIsrRequest to this replica until the topic is deleted?



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75587>

    We can just use the default larger timeout. This applies to other waitUntilTrue() usage. The only exception is that if we want to test sth that won't happen, using a smaller timeout will save time.



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75588>

    1000ms mentioned in the message is not consistent with the actual timeout.



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75589>

    This is not going to be reliable since we can't be sure whether the controller has completed the deletion before the failover.



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75590>

    We can just use the default timeout in waitUntilTrue.



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75591>

    This test seems to be the same as testPartitionReassignmentDuringDeleteTopic().



core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
<https://reviews.apache.org/r/20745/#comment75592>

    If we want the add partition operation to block, should we shutdown the follower before calling addPartitions()?


- Jun Rao


On April 30, 2014, 9:55 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20745/
> -----------------------------------------------------------
> 
> (Updated April 30, 2014, 9:55 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1397
>     https://issues.apache.org/jira/browse/KAFKA-1397
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> KAFKA-1397: Fix delete topic tests and deadlock
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/controller/ControllerChannelManager.scala 919aeb26f93d2fc34d873cfb3dbfab7235a9d635 
>   core/src/main/scala/kafka/controller/KafkaController.scala 933de9dd324c7086efe6aa610335ef370d9e9c12 
>   core/src/main/scala/kafka/controller/ReplicaStateMachine.scala 0e47dac8cbf65a86d053a3371a18af467afd70ae 
>   core/src/main/scala/kafka/controller/TopicDeletionManager.scala e4bc2439ce1933c7c7571d255464ee678226a6cb 
>   core/src/main/scala/kafka/log/LogManager.scala ac67f081e6219fd2181479e7a2bb88ea6044e6cc 
>   core/src/main/scala/kafka/server/KafkaApis.scala 0b668f230c8556fdf08654ce522a11847d0bf39b 
>   core/src/main/scala/kafka/server/KafkaServer.scala c208f83bed7fb91f07fae42f2b66892e6d46fecc 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 11c20cee83fda9a492156674d351a0096b13fd99 
>   core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala 9c29e144bba2c9bafa91941b6ca5c263490693b3 
>   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 5c487968014b56490eb2bc876cef1c52efd1cdad 
>   core/src/test/scala/unit/kafka/server/SimpleFetchTest.scala b1c4ce9f66aa86c68f2e6988f67546fa63ed1f9b 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 384c74e7c3985abff864e61dea5e530dbd189d5d 
> 
> Diff: https://reviews.apache.org/r/20745/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 20745: Patch for KAFKA-1397

Posted by Timothy Chen <tn...@apache.org>.

> On May 1, 2014, 6:26 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/controller/TopicDeletionManager.scala, lines 394-396
> > <https://reviews.apache.org/r/20745/diff/4/?file=571881#file571881line394>
> >
> >     It seems this test is not necessary since the outer loop will detect that too.

It's actually necessary since isRunning might been set to false while it woke up from the condition, in that case we don't want it to still try to acquire the controller lock but to bail early.


> On May 1, 2014, 6:26 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/server/ReplicaManager.scala, lines 128-136
> > <https://reviews.apache.org/r/20745/diff/4/?file=571885#file571885line128>
> >
> >     Do we still need to do this now that we make sure that if a replica is deleted, the controller will never send a LeaderAndIsrRequest to this replica until the topic is deleted?

We need this as described in the comment that if a broker that was shutdown during the topic deletion and brought back up, the stopReplicaRequest(deletePartition=true) without this extra log manager cleanup will never delete the log and the folders associated with as ReplicaManager doesn't hold that information anymore as we don't send the leader and isr request again as part of the retry.


> On May 1, 2014, 6:26 p.m., Jun Rao wrote:
> > core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala, lines 86-94
> > <https://reviews.apache.org/r/20745/diff/4/?file=571886#file571886line86>
> >
> >     This is not going to be reliable since we can't be sure whether the controller has completed the deletion before the failover.

Ok removing the test for now.


> On May 1, 2014, 6:26 p.m., Jun Rao wrote:
> > core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala, lines 239-240
> > <https://reviews.apache.org/r/20745/diff/4/?file=571886#file571886line239>
> >
> >     This test seems to be the same as testPartitionReassignmentDuringDeleteTopic().

Just the order of requests, but as well cannot gurantee what is processed first as they're just asynchronous. Will remove.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20745/#review41904
-----------------------------------------------------------


On April 30, 2014, 9:55 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20745/
> -----------------------------------------------------------
> 
> (Updated April 30, 2014, 9:55 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1397
>     https://issues.apache.org/jira/browse/KAFKA-1397
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> KAFKA-1397: Fix delete topic tests and deadlock
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/controller/ControllerChannelManager.scala 919aeb26f93d2fc34d873cfb3dbfab7235a9d635 
>   core/src/main/scala/kafka/controller/KafkaController.scala 933de9dd324c7086efe6aa610335ef370d9e9c12 
>   core/src/main/scala/kafka/controller/ReplicaStateMachine.scala 0e47dac8cbf65a86d053a3371a18af467afd70ae 
>   core/src/main/scala/kafka/controller/TopicDeletionManager.scala e4bc2439ce1933c7c7571d255464ee678226a6cb 
>   core/src/main/scala/kafka/log/LogManager.scala ac67f081e6219fd2181479e7a2bb88ea6044e6cc 
>   core/src/main/scala/kafka/server/KafkaApis.scala 0b668f230c8556fdf08654ce522a11847d0bf39b 
>   core/src/main/scala/kafka/server/KafkaServer.scala c208f83bed7fb91f07fae42f2b66892e6d46fecc 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 11c20cee83fda9a492156674d351a0096b13fd99 
>   core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala 9c29e144bba2c9bafa91941b6ca5c263490693b3 
>   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 5c487968014b56490eb2bc876cef1c52efd1cdad 
>   core/src/test/scala/unit/kafka/server/SimpleFetchTest.scala b1c4ce9f66aa86c68f2e6988f67546fa63ed1f9b 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 384c74e7c3985abff864e61dea5e530dbd189d5d 
> 
> Diff: https://reviews.apache.org/r/20745/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>