Posted to issues@kudu.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2016/10/03 18:50:20 UTC

[jira] [Commented] (KUDU-1669) Java ITClient test can orphan processes

    [ https://issues.apache.org/jira/browse/KUDU-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543116#comment-15543116 ] 

Jean-Daniel Cryans commented on KUDU-1669:
------------------------------------------

The patch I pushed to fix the handling of InterruptedException uncovered another error:

{noformat}
17:05:56.829 [WARN - main] (MiniKuduCluster.java:388) Could not delete path /home/jenkins-slave/workspace/kudu-2/build/test-tmp/ts-0-1475514351023
java.io.IOException: Unable to delete directory /home/jenkins-slave/workspace/kudu-2/build/test-tmp/ts-0-1475514351023/wals/0af627e9d2654c7d96ce4582b769d91f.
        at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1541)
        at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
        at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
        at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
        at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
        at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
        at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
        at org.apache.kudu.client.MiniKuduCluster.shutdown(MiniKuduCluster.java:383)
{noformat}

This happened in TestPartitionPruner, which doesn't kill its processes until the very end.

> Java ITClient test can orphan processes
> ---------------------------------------
>
>                 Key: KUDU-1669
>                 URL: https://issues.apache.org/jira/browse/KUDU-1669
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.0.0
>            Reporter: Adar Dembo
>            Assignee: Jean-Daniel Cryans
>         Attachments: org.apache.kudu.client.ITClient-output.txt.gz
>
>
> I saw this with ITClient, but it can happen to any test that uses a MiniCluster, restarts a process, and interrupts its own threads. ITClient may be the only test that does all three.
> Here's the interesting stack trace:
> {noformat}
> 21:18:12.316 [ERROR - Thread-8] (ITClient.java:134) Couldn't restart a master
> java.lang.InterruptedException: sleep interrupted
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.kudu.client.MiniKuduCluster.configureAndStartProcess(MiniKuduCluster.java:239)
>         at org.apache.kudu.client.MiniKuduCluster.restartDeadProcessOnPort(MiniKuduCluster.java:282)
>         at org.apache.kudu.client.MiniKuduCluster.restartDeadMasterOnPort(MiniKuduCluster.java:256)
>         at org.apache.kudu.client.BaseKuduTest.restartLeaderMaster(BaseKuduTest.java:431)
>         at org.apache.kudu.client.ITClient$ChaosThread.restartMaster(ITClient.java:222)
>         at org.apache.kudu.client.ITClient$ChaosThread.run(ITClient.java:161)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Because of this thread interruption, the newly restarted process never makes it into BaseKuduTest.masterProcesses, which means it isn't destroyed when the test class cleans up. All sorts of bad stuff can happen then. It's possible the process is completely orphaned on the test machine (though I imagine we'd kill it eventually). I noticed this because in one of my precommit test runs ITClient left behind a test directory, presumably because the orphaned master continued to write files even after the test directory was cleaned up.
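
To illustrate the failure mode described above, here is a minimal sketch of one way to avoid it. It assumes the process can be recorded before the startup wait. The class TrackedProcessStarter and its methods start, tracked, and shutdown are hypothetical names made up for this example; this is not the MiniKuduCluster code and not necessarily what the actual patch does. The idea is that an interrupt during the sleep can no longer leave a running process that cleanup doesn't know about, and that cleanup waits for processes to exit before anyone tries to delete their directories.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical helper for illustration only; not part of Kudu. */
final class TrackedProcessStarter {
    // Every launched process is recorded here so shutdown() can always find it.
    private final List<Process> processes = new CopyOnWriteArrayList<>();

    /**
     * Starts the command, then waits startupWaitMs for it to come up.
     * The process is registered BEFORE the blocking sleep, so if the calling
     * thread is interrupted the process is still tracked and gets cleaned up.
     */
    Process start(List<String> command, long startupWaitMs)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        processes.add(process);      // track first
        Thread.sleep(startupWaitMs); // may throw InterruptedException
        return process;
    }

    /** Returns a snapshot of the processes currently tracked. */
    List<Process> tracked() {
        return new ArrayList<>(processes);
    }

    /**
     * Destroys every tracked process and waits for each one to exit.
     * Only after this returns is it reasonable to delete the processes'
     * data directories, since nothing should be writing to them anymore.
     */
    void shutdown() throws InterruptedException {
        for (Process p : processes) {
            p.destroy();
        }
        for (Process p : processes) {
            p.waitFor();
        }
        processes.clear();
    }
}
```

With this ordering, a thread that gets interrupted in start() still sees the InterruptedException, but the half-started process stays in the tracked list and is destroyed by shutdown() like any other.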


