Posted to issues-all@impala.apache.org by "Csaba Ringhofer (JIRA)" <ji...@apache.org> on 2018/06/22 11:03:00 UTC

[jira] [Updated] (IMPALA-7197) Make process handling during tests more robust

     [ https://issues.apache.org/jira/browse/IMPALA-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer updated IMPALA-7197:
------------------------------------
    Description: 
tests/common/impala_cluster.py contains some helper classes for starting/stopping/listing Impala daemons. These work well in the general case, but do not check for things that could go wrong. Some examples:
- start() just starts the process and does not wait for it to become responsive - this means that a crash/freeze during startup will not be noticed at startup, but only in a later (possibly unrelated) test/operation that tries to reach the daemon
- kill() does not wait for the process to terminate, which may mislead tests that list/count Impala daemons (this is a possible cause of IMPALA-7175)
- kill() does not check if the PID it kills is actually an Impala daemon - the process may have crashed and the OS may have given the same PID to a new process (this is very unlikely, but the results are potentially really weird)
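
As a rough illustration of the first point, start() could be followed by a bounded wait for readiness. The sketch below is Python 3 and polls a TCP port until it accepts a connection. wait_for_port and its parameters are hypothetical names, not part of impala_cluster.py, and an accepted connection is only a weak signal that the daemon is really serving requests.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Poll until a TCP connection to (host, port) succeeds.

    Hypothetical helper: returns True once the port accepts a connection,
    or False if timeout_s elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Raises OSError (refused, timed out, ...) while nothing listens.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would treat a False return as a startup failure and report it right away, instead of letting it surface in a later test. Which host and port to probe depends on the cluster configuration and is left open here.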

It is probably not possible to make these functions completely synchronous, but reducing the chance of such issues can save us from some long investigations in the future.

Note that the Impala processes are detached from the test process, so no notification is sent if they are terminated.
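
Because the daemons are not children of the test process, waiting on them with waitpid() is not an option, so polling is one way to address the two kill() points. A minimal, Linux-only sketch (Python 3, reads /proc) follows. kill_and_wait, read_cmdline, is_running and the expected_binary parameter are hypothetical, not existing helpers. The identity check only narrows the PID-reuse window, since the PID could still be recycled between the check and the signal.

```python
import os
import signal
import time


def read_cmdline(pid):
    """Return the argv of pid as a list of strings, or [] if unavailable."""
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as f:
            raw = f.read()
    except OSError:
        return []
    return [a.decode(errors="replace") for a in raw.split(b"\0") if a]


def is_running(pid):
    """Return True if pid exists and is not a zombie or dead process."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            stat = f.read()
    except OSError:
        return False
    # The state letter follows the parenthesised command name.
    state = stat.rsplit(")", 1)[1].split()[0]
    return state not in ("Z", "X")


def kill_and_wait(pid, expected_binary, timeout_s=10.0, interval_s=0.1):
    """SIGKILL pid only if it still runs expected_binary, then wait for exit.

    Returns True once the process is gone, False on timeout. Raises
    RuntimeError if the PID now belongs to a different program.
    """
    argv = read_cmdline(pid)
    if not argv:
        return True  # already gone (or only a zombie is left)
    if os.path.basename(argv[0]) != expected_binary:
        raise RuntimeError("PID %d is %r, not %r" % (pid, argv[0], expected_binary))
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return True  # exited between the check and the signal
    deadline = time.monotonic() + timeout_s
    while is_running(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
    return True
```

With something like this, a test that lists or counts daemons after a kill would only proceed once the process is really gone, or fail with a clear timeout.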



  was:
tests/common/impala_cluster.py contains some helper classes for starting/stopping/listing impala daemons. These work well in the general case, but do not check for things that could go wrong. Some examples:
-start() just starts the process and does not wait for it to be actually responsive - this means that a crash/freeze during startup will not be noticed during the startup, but in a later (possible unrelated) test/operation that tries to reach the daemon
- kill() does not wait for the process to terminate, which may mislead tests that list/count Impala daemons (this is a possible cause of IMPALA-7175)
- kill() does not check if the PID it kills is actually an Impala daemon - the process may have crashed and the OS may have given the same PID to a new process (this is very unlikely, but the results are potentially really weird)

It is probably not possible to make these functions completely synchronous, but reducing the chance for issues can save us from some long investigations in the future.

Note that the impala processes are detached from the test the process, so no notification is sent if they are terminated. 




> Make process handling during tests more robust
> ----------------------------------------------
>
>                 Key: IMPALA-7197
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7197
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>            Reporter: Csaba Ringhofer
>            Priority: Major


