Posted to issues@kudu.apache.org by "Marton Greber (Jira)" <ji...@apache.org> on 2023/03/22 15:10:00 UTC

[jira] [Updated] (KUDU-3464) Failed mini-cluster creation leaves chronyd open in Python test infra

     [ https://issues.apache.org/jira/browse/KUDU-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Greber updated KUDU-3464:
--------------------------------
    Description: 
*Description:*
While adding support for extra startup flags to the Python test infra, I was experimenting with negative tests. One example is a test class that specifies a wrong flag name:
{code:python}
class TestKuduTestStartupFlagsMasterWrongFlagName(KuduTestBase, CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("non_existent_flag","1")]
        extra_tserver_flags=[("tablet_apply_pool_overload_threshold_ms", "1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsMasterWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_master_wrong_flag_name(self):
        pass

class TestKuduTestStartupFlagsTserverWrongFlagName(KuduTestBase, CompatUnitTest):
    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
                .setUpClass(extra_master_flags, extra_tserver_flags)

    def test_startup_flags_tserver_wrong_flag_name(self):
        pass
{code}
By themselves, these tests run fine. Running them one after the other results in the following error:
{code:bash}
2023-03-22T14:48:01Z Fatal error : Another chronyd may already be running (pid=377244), check /tmp/kudutest-0/minicluster-data/chrony.0/chronyd.pid
Could not open connection to daemon
{code}
The run times out with the above error when control flow reaches the second test:
{code:python}
_ ERROR at setup of TestKuduTestStartupFlagsTserverWrongFlagName.test_startup_flags_tserver_wrong_flag_name _
Exception: Error in response: {'code': 'TIMED_OUT', 'message': 'failed to start NTP server 0: failed to contact chronyd in 1.000s'}

During handling of the above exception, another exception occurred:

self = <class 'kudu.tests.test_common.TestKuduTestStartupFlagsTserverWrongFlagName'>

    @classmethod
    def setUpClass(self):
        extra_master_flags=[("check_expired_table_interval_seconds","1")]
        extra_tserver_flags=[("non_existent_flag","1")]
        error_msg = 'RUNTIME_ERROR'
        with self.assertRaisesRegex(self, Exception, error_msg):
            super(TestKuduTestStartupFlagsTserverWrongFlagName, self)\
>               .setUpClass(extra_master_flags, extra_tserver_flags)
E           TypeError: _formatMessage() missing 1 required positional argument: 'standardMsg'

kudu/tests/test_common.py:82: TypeError
{code}
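A side note on the TypeError at the end of the traceback: it looks like a separate artifact of calling assertRaisesRegex on the class itself inside a classmethod (there is no TestCase instance there, so unittest's failure formatting receives the class where it expects an instance). A minimal sketch of one way to check the error message in class-level setup without the assert* helpers, assuming a plain regex match is enough; expect_error and failing_setup are hypothetical names, not part of the Kudu test infra:

```python
import re


def failing_setup():
    # Hypothetical stand-in for a cluster start that fails.
    raise Exception("Error in response: {'code': 'TIMED_OUT'}")


def expect_error(func, pattern):
    """Call func and require that it raises an Exception whose text matches pattern."""
    try:
        func()
    except Exception as exc:
        if re.search(pattern, str(exc)) is None:
            # Surface the mismatch as a normal assertion failure.
            raise AssertionError(
                '"%s" does not match "%s"' % (pattern, exc)) from exc
        return exc
    raise AssertionError("no exception raised")
```

With such a helper, a mismatch between the expected and the actual error would show up as an AssertionError naming both strings instead of the TypeError above.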
I suspect that when the first cluster creation fails, chronyd is not properly disposed of.

(After the test run exits, the referenced /tmp/kudutest-0 location is properly cleaned up.)
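
To illustrate the kind of cleanup the suspicion above points at, here is a rough sketch of stopping already-started helper processes when a later startup step fails. It is not the actual mini-cluster code: HelperProcesses and start_cluster are hypothetical names, and a sleeping Python subprocess stands in for chronyd.

```python
import subprocess
import sys


class HelperProcesses:
    """Hypothetical tracker for helper daemons started during cluster setup."""

    def __init__(self):
        self.procs = []

    def start(self, argv):
        proc = subprocess.Popen(argv)
        self.procs.append(proc)
        return proc

    def stop_all(self):
        # Stop helpers in reverse start order; escalate to kill on timeout.
        for proc in reversed(self.procs):
            if proc.poll() is None:
                proc.terminate()
                try:
                    proc.wait(timeout=5)
                except subprocess.TimeoutExpired:
                    proc.kill()
                    proc.wait()
        self.procs.clear()


def start_cluster(helpers, fail=False):
    """Start a helper, then simulate server startup; clean up on failure."""
    try:
        # Stand-in for the NTP helper (assumption: any long-running child).
        helpers.start([sys.executable, "-c", "import time; time.sleep(60)"])
        if fail:
            # Stand-in for a server rejecting an unknown startup flag.
            raise RuntimeError("RUNTIME_ERROR: simulated bad startup flag")
    except Exception:
        helpers.stop_all()
        raise
```

The point of the sketch is only the try/except around startup: if the helper stays alive after a failed start, the next attempt in the same directory finds the stale pid file.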

*Consequences:*

If developers writing Python tests get a flag name or value wrong more than once in their code, they get "Could not open connection to daemon" errors, which are not very helpful at first.

However, for properly written test code this bug has no negative effect.
        Summary: Failed mini-cluster creation leaves chronyd open in Python test infra  (was: Failed mini-cluster creation leaves chronyd running in Python test infra)

> Failed mini-cluster creation leaves chronyd open in Python test infra
> ---------------------------------------------------------------------
>
>                 Key: KUDU-3464
>                 URL: https://issues.apache.org/jira/browse/KUDU-3464
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Marton Greber
>            Priority: Minor
>              Labels: client
>


