Posted to dev@kafka.apache.org by "Ewen Cheslack-Postava (JIRA)" <ji...@apache.org> on 2015/12/11 23:39:46 UTC

[jira] [Resolved] (KAFKA-1589) Strengthen System Tests

     [ https://issues.apache.org/jira/browse/KAFKA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ewen Cheslack-Postava resolved KAFKA-1589.
------------------------------------------
    Resolution: Won't Fix
      Assignee: Ewen Cheslack-Postava  (was: Gwen Shapira)

This refers to the old system tests. The new ducktape system tests already address many of these issues. (The only one probably not well addressed is the unit test piece, since unit tests are difficult to write for system test functionality. Some of this is handled by the sanity_checks tests.)

> Strengthen System Tests
> -----------------------
>
>                 Key: KAFKA-1589
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1589
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Ewen Cheslack-Postava
>              Labels: newbie
>             Fix For: 0.10.0.0
>
>
> Although the system test code is part of the open source repository, not much attention is paid to this module today. As a result, we keep breaking the system tests with changes to the admin tools or with library upgrades that change APIs, such as ZooKeeper's. And when the system tests break, hang, etc., it is also hard to debug the issue. We need to treat the system test suite as a full part of the open source code.
> Based on my personal experience troubleshooting system tests, I would propose at least the following enhancements to the system tests.
> 1. Add unit tests for all system util test tools, for example:
> kafka_system_test_utils.get_controller_attributes
> kafka_system_test_utils.get_leader_for
> 2. Add exception handling logic to the Python test framework to clean up the testbed upon failures, so that subsequent test cases are not affected.
> 3. Remove timing-based mechanisms such as "sleep(5000) to wait for metadata to be propagated" wherever possible to avoid transient failures.
> After those enhancements, we should probably also include a very small subset of the system test cases (say, one from each suite) in the patch review process along with the unit tests.
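
For illustration only, items 2 and 3 of the quoted description might look roughly like the sketch below. It is not code from the Kafka system tests or from ducktape: `poll_until`, `FakeTestbed`, and `run_test_case` are hypothetical names made up for this example, and the "metadata propagated" check is a stand-in for whatever real condition a test would query.

```python
import time


def poll_until(condition, timeout_sec, backoff_sec=0.1, err_msg="condition not met"):
    """Hypothetical helper: poll condition() until it is truthy or time runs out.

    Replaces a fixed sleep with a bounded wait that returns as soon as the
    condition holds and fails loudly if it never does.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(err_msg)
        time.sleep(backoff_sec)


class FakeTestbed:
    """Hypothetical stand-in for a real testbed (brokers, ZooKeeper, clients)."""

    def __init__(self):
        self.running = False
        self.metadata_polls = 0

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def metadata_propagated(self):
        # Assumption for the example: metadata becomes visible on the third poll.
        self.metadata_polls += 1
        return self.metadata_polls >= 3


def run_test_case(testbed, test_body):
    """Run test_body against testbed and always tear the testbed down."""
    testbed.start()
    try:
        # Wait on the actual condition instead of sleeping for a fixed time.
        poll_until(testbed.metadata_propagated, timeout_sec=5, backoff_sec=0.01,
                   err_msg="metadata was not propagated in time")
        test_body(testbed)
    finally:
        # Cleanup runs even if the wait or the test body raises, so the next
        # test case starts from a clean testbed.
        testbed.stop()
```

The `finally` clause is what keeps one failing test case from affecting the next, and the timeout turns a hang into an error message that names the condition that was never met.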



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)