Posted to issues@beam.apache.org by "yifan zou (JIRA)" <ji...@apache.org> on 2019/04/18 20:05:00 UTC

[jira] [Comment Edited] (BEAM-7109) Thread leaking in Portable Python Precommit

    [ https://issues.apache.org/jira/browse/BEAM-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821459#comment-16821459 ] 

yifan zou edited comment on BEAM-7109 at 4/18/19 8:04 PM:
----------------------------------------------------------

I reproduced it on a non-deploy node, jenkins-14. You can use the commands below to ssh into the VM and run the test. For now I will not launch Jenkins on that node, so that it stays available for the investigation.

*gcloud auth login*

*gcloud compute --project "apache-beam-testing" ssh --zone "us-central1-b" "apache-beam-jenkins-14"*

*sudo su jenkins*

*cd /home/jenkins/jenkins-slave/workspace/testspace/beam  // I already cloned the git repo*

*./gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g -Dorg.gradle.jvmargs=-Xmx4g :portablePythonPreCommit*

 

// Other commands I used:

sysctl -a | grep kernel.pid_max (get kernel.pid_max)

ps -eLf (list active threads)

ps -eLf | wc -l (count active threads)

top -u jenkins (list running processes)
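
To see which process owns the leaked threads, the `ps -eLf` output can be grouped by command line. Below is a minimal sketch, not something I ran on jenkins-14. It assumes the default procps `ps -eLf` column layout (UID PID PPID LWP C NLWP STIME TTY TIME CMD), where each row is one thread; the function names are only illustrative. Each thread consumes an ID from the same space that kernel.pid_max limits, which is why the total is compared against that value.

```python
import subprocess
from collections import Counter


def count_threads_by_command(ps_output):
    """Count threads per command line from `ps -eLf` output (one row per thread)."""
    counts = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Assumed layout: UID PID PPID LWP C NLWP STIME TTY TIME CMD
        fields = line.split(None, 9)
        if len(fields) < 10:
            continue  # skip malformed or truncated rows
        counts[fields[9]] += 1
    return counts


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Read the kernel limit that `sysctl kernel.pid_max` reports (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def main():
    """Print the total thread count and the ten commands with the most threads."""
    ps_output = subprocess.run(
        ["ps", "-eLf"], capture_output=True, text=True, check=True
    ).stdout
    counts = count_threads_by_command(ps_output)
    total = sum(counts.values())
    print(f"threads in use: {total} / kernel.pid_max: {read_pid_max()}")
    for cmd, n in counts.most_common(10):
        print(f"{n:6d}  {cmd[:100]}")
```

Calling `main()` on the VM before and after a test run should show whether the count for a given command keeps growing between runs.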

 



> Thread leaking in Portable Python Precommit 
> --------------------------------------------
>
>                 Key: BEAM-7109
>                 URL: https://issues.apache.org/jira/browse/BEAM-7109
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>            Reporter: yifan zou
>            Assignee: Ankur Goenka
>            Priority: Critical
>         Attachments: threadDump.txt
>
>
> Beam Jenkins constantly breaks due to errors such as "Unable to create new native thread". The most recent build worker failure happened on [apache-beam-jenkins-8] ([https://builds.apache.org/computer/apache-beam-jenkins-8/builds]). Checking the thread count on that VM shows: 
> Thread limit: kernel.pid_max = 32768 
> Threads in use: 32411
>  
> Dumping the thread usage (see [^threadDump.txt]) exposed thread leaks in some Python tests. Based on the execution history of jenkins-8, [beam_PreCommit_Portable_Python_Commit] ([https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit]) is the suspect. We ran this test multiple times on a plain node and observed that some threads started by +_apache_beam.runners.worker.sdk_worker_main_+ were not torn down after the tests completed. The stale threads eventually accumulated and exhausted the VM's kernel thread quota. 
>  
> cc: [~alanmyrvold], [~jasonkuster], [~altay]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)