Posted to hdfs-dev@hadoop.apache.org by "Anu Engineer (JIRA)" <ji...@apache.org> on 2017/06/23 21:53:00 UTC

[jira] [Created] (HDFS-12029) Data node process crashes after kernel upgrade

Anu Engineer created HDFS-12029:
-----------------------------------

             Summary:  Data node process crashes after kernel upgrade
                 Key: HDFS-12029
                 URL: https://issues.apache.org/jira/browse/HDFS-12029
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
            Reporter: Anu Engineer
            Priority: Critical


 We have seen that upgrading the Linux kernel to address a specific CVE
 ( https://access.redhat.com/security/vulnerabilities/stackguard ) can cause a datanode crash.

We observed this issue while upgrading the kernel from version 3.10.0-514.6.2 to 3.10.0-514.21.2.

The original kernel fix is here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1be7107fbe18eed3e319a6c3e83c78254b693acb

The datanode fails with the following fatal error report:

{noformat}

# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
# SIGBUS (0x7) at pc=0x00007f458d078b7c, pid=13214, tid=139936990349120 
# 
# JRE version: (8.0_40-b25) (build ) 
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b25 mixed mode linux-amd64 compressed oops) 
# Problematic frame: 
# j java.lang.Object.<clinit>()V+0 
# 
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again 
# 
# An error report file with more information is saved as: 
# /tmp/hs_err_pid13214.log 
# 
# If you would like to submit a bug report, please visit: 
# http://bugreport.java.com/bugreport/crash.jsp 
# 
{noformat}

The root cause is a failure in jsvc. Passing a stack size greater than 1 MB to jsvc mitigates the crash, for example:

{code}
# Pass a stack size above 1 MB to jsvc; other jsvc options are omitted here.
exec "$JSVC" \
  -Xss2m \
  org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
{code}
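
If the stack size should not be hard-coded, the start script could read it from the environment. The following is only a sketch of that idea: the variable name JSVC_STACK_SIZE and the helper jsvc_stack_opt are hypothetical and are not existing Hadoop settings, and the 2m default simply mirrors the value used above.

```shell
#!/bin/sh
# Hypothetical helper: print the -Xss option to hand to jsvc.
# JSVC_STACK_SIZE is an assumed variable name, not an existing Hadoop setting.
jsvc_stack_opt() {
  _size="${JSVC_STACK_SIZE:-2m}"   # default mirrors the -Xss2m mitigation
  _digits="${_size%[kKmMgG]}"      # strip one optional unit suffix
  case "$_digits" in
    ''|*[!0-9]*)
      # Reject anything that is not digits plus an optional k/m/g suffix.
      echo "invalid stack size: $_size" >&2
      return 1
      ;;
  esac
  printf '%s\n' "-Xss$_size"
}

# Possible use in the start script (left commented out in this sketch):
#   exec "$JSVC" "$(jsvc_stack_opt)" \
#     org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
```

The helper only validates the syntax of the value; it does not check that the value is above the 1 MB threshold mentioned above.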

This JIRA tracks potential fixes for this problem. A larger stack size might increase the datanode's memory usage, and we don't yet have data on how that impacts other applications running on the datanode.




