Posted to user@hadoop.apache.org by Yue Cheng <yu...@vt.edu> on 2014/08/13 00:23:54 UTC

Hadoop 2.4 failed to launch job on aws s3n

Hi,

I deployed Hadoop 2.4 on AWS EC2, using the S3 native file system (s3n) as a
replacement for HDFS. I tried several example apps, and all of them failed
with the following stack trace. (An older thread from Jul 24 about this was
left unresolved, so I am attaching the DEBUG output here.)

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar wordcount s3n://mybkt/wc/ s3n://mybkt/out

14/08/12 21:57:35 DEBUG util.Shell: setsid exited with exit code 0
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
with annotation
@org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate
of successful kerberos logins and latency (milliseconds)], about=,
type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
with annotation
@org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate
of failed kerberos logins and latency (milliseconds)], about=,
type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with
annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time,
value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
related metrics
14/08/12 21:57:36 DEBUG util.KerberosName: Kerberos krb5 configuration not
found, setting default realm to empty
14/08/12 21:57:36 DEBUG security.Groups:  Creating new Groups object
14/08/12 21:57:36 DEBUG util.NativeCodeLoader: Trying to load the
custom-built native-hadoop library...
14/08/12 21:57:36 DEBUG util.NativeCodeLoader: Failed to load native-hadoop
with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
14/08/12 21:57:36 DEBUG util.NativeCodeLoader:
java.library.path=/home/ubuntu/hadoop-2.4.0/lib
14/08/12 21:57:36 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/08/12 21:57:36 DEBUG security.JniBasedUnixGroupsMappingWithFallback:
Falling back to shell based
14/08/12 21:57:36 DEBUG security.JniBasedUnixGroupsMappingWithFallback:
Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
14/08/12 21:57:36 DEBUG security.Groups: Group mapping
impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback;
cacheTimeout=300000; warningDeltaMs=5000
14/08/12 21:57:36 DEBUG security.UserGroupInformation: hadoop login
14/08/12 21:57:36 DEBUG security.UserGroupInformation: hadoop login commit
14/08/12 21:57:36 DEBUG security.UserGroupInformation: using local
user:UnixPrincipal: ubuntu
14/08/12 21:57:36 DEBUG security.UserGroupInformation: UGI loginUser:ubuntu
(auth:SIMPLE)
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.https-only=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
storage-service.internal-error-retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
http.connection-manager.factory-class-name=org.jets3t.service.utils.RestUtils$ConnManagerFactory
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.connection-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.socket-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.stale-checking-enabled=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.useragent=null
14/08/12 21:57:36 DEBUG utils.RestUtils: Setting user agent string:
JetS3t/0.9.0 (Linux/3.13.0-29-generic; amd64; en; JVM 1.7.0_55)
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
http.protocol.expect-continue=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.connection-manager-timeout=0
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.proxy-autodetect=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.s3-endpoint=
s3.amazonaws.com
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: About to attempt auto proxy
detection under Java version:1.7.0_55-b14
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Sun Plugin reported java
version not 1.3.X, 1.4.X, 1.5.X or 1.6.X - trying failover detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Using failover proxy
detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Plugin Proxy Config List
Property:null
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: No configured plugin proxy
list
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
s3service.default-storage-class=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
s3service.server-side-encryption=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
http.connection-manager.factory-class-name=org.jets3t.service.utils.RestUtils$ConnManagerFactory
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.connection-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.socket-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.stale-checking-enabled=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.useragent=null
14/08/12 21:57:36 DEBUG utils.RestUtils: Setting user agent string:
JetS3t/0.9.0 (Linux/3.13.0-29-generic; amd64; en; JVM 1.7.0_55)
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
http.protocol.expect-continue=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.connection-manager-timeout=0
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.proxy-autodetect=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.s3-endpoint=
s3.amazonaws.com
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: About to attempt auto proxy
detection under Java version:1.7.0_55-b14
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Sun Plugin reported java
version not 1.3.X, 1.4.X, 1.5.X or 1.6.X - trying failover detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Using failover proxy
detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Plugin Proxy Config List
Property:null
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: No configured plugin proxy
list
14/08/12 21:57:36 DEBUG service.Jets3tProperties: devpay.user-token=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties: devpay.product-token=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties:
httpclient.requester-pays-buckets-enabled=false
14/08/12 21:57:36 DEBUG security.UserGroupInformation: PrivilegedAction
as:ubuntu (auth:SIMPLE)
from:org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
14/08/12 21:57:36 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider :
org.apache.hadoop.mapred.YarnClientProtocolProvider
14/08/12 21:57:36 DEBUG service.AbstractService: Service:
org.apache.hadoop.mapred.ResourceMgrDelegate entered state INITED
14/08/12 21:57:36 DEBUG service.AbstractService: Service:
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state INITED
14/08/12 21:57:37 INFO client.RMProxy: Connecting to ResourceManager at /
172.31.20.187:8032
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedAction
as:ubuntu (auth:SIMPLE)
from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:130)
14/08/12 21:57:37 DEBUG ipc.YarnRPC: Creating YarnRPC for
org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
14/08/12 21:57:37 DEBUG ipc.HadoopYarnProtoRPC: Creating a
HadoopYarnProtoRpc proxy for protocol interface
org.apache.hadoop.yarn.api.ApplicationClientProtocol
14/08/12 21:57:37 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@7d66036e
14/08/12 21:57:37 DEBUG ipc.Client: getting client out of cache:
org.apache.hadoop.ipc.Client@71cebfd2
14/08/12 21:57:37 DEBUG service.AbstractService: Service
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
14/08/12 21:57:37 DEBUG service.AbstractService: Service
org.apache.hadoop.mapred.ResourceMgrDelegate is started
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedAction
as:ubuntu (auth:SIMPLE)
from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:330)
14/08/12 21:57:37 DEBUG security.UserGroupInformation:
PrivilegedActionException as:ubuntu (auth:SIMPLE)
cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No
AbstractFileSystem for scheme: s3n
14/08/12 21:57:37 INFO mapreduce.Cluster: Failed to use
org.apache.hadoop.mapred.YarnClientProtocolProvider due to error: Error in
instantiating YarnClient
14/08/12 21:57:37 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider :
org.apache.hadoop.mapred.LocalClientProtocolProvider
14/08/12 21:57:37 DEBUG mapreduce.Cluster: Cannot pick
org.apache.hadoop.mapred.LocalClientProtocolProvider as the
ClientProtocolProvider - returned null protocol
14/08/12 21:57:37 DEBUG security.UserGroupInformation:
PrivilegedActionException as:ubuntu (auth:SIMPLE)
cause:java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and the correspond server
addresses.
java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and the correspond server
addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
    at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Here are my config files:
yarn-site.xml:
<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>172.31.20.187:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>172.31.20.187:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>172.31.20.187:8030</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/ubuntu/hdfs/tmp</value>
    </property>

</configuration>

mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>640</value>
        <description>Larger resource limit for maps.</description>
    </property>

    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx768m</value>
        <description>Heap-size for child jvms of maps.</description>
    </property>

    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>640</value>
        <description>Larger resource limit for reduces.</description>
    </property>

    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx768m</value>
        <description>Heap-size for child jvms of reduces.</description>
    </property>

    <property>
        <name>mapreduce.jobtracker.address</name>
        <value>172.31.20.187:8021</value>
    </property>

</configuration>

I also followed this wiki page to configure access to AWS S3 in
core-site.xml:
https://wiki.apache.org/hadoop/AmazonS3

core-site.xml:
    <property>
        <name>fs.defaultFS</name>
        <value>s3n://mybkt</value>
    </property>

    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>

    <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>123</value>
    </property>

    <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>456</value>
    </property>
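
The exception in the log comes from the FileContext API, which looks up a
file system through a property named fs.AbstractFileSystem.<scheme>.impl
rather than through the fs.s3n.* keys. As a rough check, the sketch below
parses a site file like the fragment above and reports whether that property
is set for the scheme used in fs.defaultFS. The function names are
hypothetical, and the check only looks at the one file it is given: it does
not consult core-default.xml, and it cannot tell whether a given Hadoop
release actually ships an implementation class for the scheme.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse Hadoop-style <property> entries into a dict of name -> value."""
    text = xml_text.strip()
    # The fragment quoted in the post has no <configuration> root element,
    # so wrap it to make it well-formed before parsing.
    if "<configuration" not in text:
        text = "<configuration>" + text + "</configuration>"
    root = ET.fromstring(text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_abstract_fs_key(props):
    """Return the fs.AbstractFileSystem.<scheme>.impl key if it is not set.

    Returns None when the key is present or when fs.defaultFS has no scheme.
    """
    default_fs = props.get("fs.defaultFS", "")
    if "://" not in default_fs:
        return None
    scheme = default_fs.split("://", 1)[0]
    key = "fs.AbstractFileSystem.%s.impl" % scheme
    return None if key in props else key
```

For the core-site.xml fragment above, missing_abstract_fs_key(read_properties(...))
returns the key for the s3n scheme, which matches the "No AbstractFileSystem
for scheme: s3n" message in the log.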

I also tried Hadoop v1, where the s3n file system works for wordcount, but it
does not seem to work with Hadoop v2.

Please help.

Thanks,
Yue