Posted to common-issues@hadoop.apache.org by "Jonathan Hung (JIRA)" <ji...@apache.org> on 2019/01/25 03:24:00 UTC

[jira] [Comment Edited] (HADOOP-15711) Fix branch-2 builds

    [ https://issues.apache.org/jira/browse/HADOOP-15711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751846#comment-16751846 ] 

Jonathan Hung edited comment on HADOOP-15711 at 1/25/19 3:23 AM:
-----------------------------------------------------------------

The qbt runs show fatal errors in the logs, such as:
{noformat}
---------------  T H R E A D  ---------------

Current thread (0x00007f3cc031d800):  VMThread [stack: 0x00007f3ca0dce000,0x00007f3ca0ecf000] [id=23500]

Stack: [0x00007f3ca0dce000,0x00007f3ca0ecf000],  sp=0x00007f3ca0ecdb10,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x966c25]
V  [libjvm.so+0x49b96e]
V  [libjvm.so+0x872b51]
V  [libjvm.so+0x96b69a]
V  [libjvm.so+0x96baf2]
V  [libjvm.so+0x7da992]

VM_Operation (0x00007f3c95bafad0): RevokeBias, mode: safepoint, requested by thread 0x00007f3cc0744800
{noformat}
I suspected this might be related to [https://bugs.openjdk.java.net/browse/JDK-6869327], so I tried adding {{-XX:+UseCountedLoopSafepoints}} to one of the runs, but it didn't seem to make any difference.
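
When a JVM flag appears to have no effect, one thing worth ruling out is that it never reached the forked test JVM. The following is a minimal sketch, not part of the branch-2 build: the class name {{JvmFlagCheck}} and its {{hasFlag}} helper are hypothetical, and it assumes the check is run inside the same JVM that executes the tests. It uses the standard {{RuntimeMXBean.getInputArguments()}} call to list the arguments the JVM was started with.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports whether a given
// command-line flag was passed to the JVM this code is running in.
public class JvmFlagCheck {

    // Returns true if the exact flag string appears in the argument list.
    public static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        String flag = "-XX:+UseCountedLoopSafepoints";
        // Arguments passed to this JVM (excluding arguments to main()).
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM input arguments: " + jvmArgs);
        System.out.println(flag + " present: " + hasFlag(jvmArgs, flag));
    }
}
```

If the flag is reported as absent, the option was probably set somewhere that the test JVM does not inherit; if it is present, the flag itself is likely not relevant to the crash.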

I then tried porting HADOOP-14816 (and HADOOP-15610) to a test branch forked off branch-2 and got results similar to those reported in HDFS-12711. Here's a test run: [https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/39/] (run with openjdk8). So at least it appears the unit tests run to completion with openjdk8.



> Fix branch-2 builds
> -------------------
>
>                 Key: HADOOP-15711
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15711
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Priority: Critical
>         Attachments: HADOOP-15711.001.branch-2.patch
>
>
> Branch-2 builds have been disabled for a while: https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> A test run here causes hdfs tests to hang: https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/4/
> Running hadoop-hdfs tests locally reveals some errors such as:
> {noformat}
> [ERROR] testComplexAppend2(org.apache.hadoop.hdfs.TestFileAppend2)  Time elapsed: 0.059 s  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1164)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1128)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:174)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1172)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:403)
>         at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:234)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1080)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:883)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:514)
>         at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:473)
>         at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:489)
>         at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend2(TestFileAppend2.java:543)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> I was able to get more tests passing locally by increasing the max user process count on my machine, but the error suggests that there's an issue in the tests themselves. I'm not sure whether the error seen locally is the same reason the Jenkins builds are failing; I wasn't able to confirm because the Jenkins builds lack output.
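
Since "unable to create new native thread" usually points at a per-user process or thread limit being hit, it can help to watch how many threads a test JVM actually holds. The following is a rough sketch only: {{ThreadCountProbe}} is a hypothetical class, not something in the Hadoop test tree, and it assumes it would be called from a test's setup or teardown. It relies on the standard {{ThreadMXBean}} counters.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Hadoop: reports thread counts for the
// current JVM so that thread growth across tests can be spotted.
public class ThreadCountProbe {

    private static final ThreadMXBean THREADS =
            ManagementFactory.getThreadMXBean();

    // Number of live threads (daemon and non-daemon) right now.
    public static int liveThreads() {
        return THREADS.getThreadCount();
    }

    // Highest live thread count since JVM start (or since the peak was reset).
    public static int peakThreads() {
        return THREADS.getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreads());
        System.out.println("peak threads: " + peakThreads());
    }
}
```

A live count that keeps climbing from one test to the next would be consistent with tests not shutting down their clusters or executors, though it would not by itself show that this is what fails on Jenkins.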


