Posted to issues@beam.apache.org by "yifan zou (JIRA)" <ji...@apache.org> on 2019/04/18 20:05:00 UTC
[jira] [Comment Edited] (BEAM-7109) Thread leaking in Portable Python Precommit
[ https://issues.apache.org/jira/browse/BEAM-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821459#comment-16821459 ]
yifan zou edited comment on BEAM-7109 at 4/18/19 8:04 PM:
----------------------------------------------------------
I reproduced it on a non-deploy node, jenkins-14. You can use the commands below to ssh into the VM and run the test. I will not launch Jenkins on that node for now, so that it stays available for the investigation.
*gcloud auth login*
*gcloud compute --project "apache-beam-testing" ssh --zone "us-central1-b" "apache-beam-jenkins-14"*
*sudo su jenkins*
*cd /home/jenkins/jenkins-slave/workspace/testspace/beam // I already cloned the git repo*
*./gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g -Dorg.gradle.jvmargs=-Xmx4g :portablePythonPreCommit*
// Other commands I used:
sysctl -a | grep kernel.pid_max (get kernel.pid_max)
ps -eLf (list active threads)
ps -eLf | wc -l (count active threads)
top -u jenkins (list running processes)
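
The ps and sysctl commands above give a machine-wide total. To see which processes hold the threads, the same information can be read from /proc. The following is a minimal sketch, not part of the original investigation: it assumes a Linux host with procfs mounted, and the function names (parse_status, thread_counts, top_offenders) are made up for illustration.

```python
import os


def parse_status(status_text):
    """Extract (name, thread_count) from the text of /proc/<pid>/status."""
    name, threads = "?", 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "Threads":
            threads = int(value.strip())
    return name, threads


def thread_counts(proc_root="/proc"):
    """Return {pid: (name, thread_count)} for each process under proc_root."""
    counts = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no procfs available (e.g. not a Linux host)
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        path = os.path.join(proc_root, entry, "status")
        try:
            with open(path, errors="replace") as f:
                counts[int(entry)] = parse_status(f.read())
        except OSError:
            continue  # process exited while we were scanning
    return counts


def top_offenders(counts, n=10):
    """Sort processes by thread count, largest first, and keep the top n."""
    ranked = sorted(counts.items(), key=lambda item: item[1][1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    counts = thread_counts()
    total = sum(threads for _, threads in counts.values())
    try:
        with open("/proc/sys/kernel/pid_max") as f:
            limit = f.read().strip()
    except OSError:
        limit = "unknown"
    print(f"total threads: {total} (kernel.pid_max = {limit})")
    for pid, (name, threads) in top_offenders(counts):
        print(f"{threads:6d}  {pid:7d}  {name}")
```

Running it before and after :portablePythonPreCommit and comparing the two listings should show whether leftover Python worker processes account for the growth.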
> Thread leaking in Portable Python Precommit
> --------------------------------------------
>
> Key: BEAM-7109
> URL: https://issues.apache.org/jira/browse/BEAM-7109
> Project: Beam
> Issue Type: Bug
> Components: testing
> Reporter: yifan zou
> Assignee: Ankur Goenka
> Priority: Critical
> Attachments: threadDump.txt
>
>
> Beam Jenkins builds constantly break with odd errors such as "Unable to create new native thread". The recent build worker failure happened on [apache-beam-jenkins-8] ([https://builds.apache.org/computer/apache-beam-jenkins-8/builds]). Checking the thread count on that VM shows:
> Thread limit: kernel.pid_max = 32768
> Actually in use: 32411
>
> Dumping the thread usage (see [^threadDump.txt]) exposed thread leaks in some Python tests. Based on the execution history of jenkins-8, [beam_PreCommit_Portable_Python_Commit] ([https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit]) is the suspect. We ran this test multiple times on a plain node and observed that some threads started by +_apache_beam.runners.worker.sdk_worker_main_+ were not torn down after the tests completed. The stale threads accumulated until they exhausted the VM's kernel thread quota.
>
> cc: [~alanmyrvold], [~jasonkuster], [~altay]
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)