Posted to common-dev@hadoop.apache.org by "Prabhu Joseph (JIRA)" <ji...@apache.org> on 2019/06/04 10:54:00 UTC

[jira] [Created] (HADOOP-16347) MapReduce job tasks fails on S3

Prabhu Joseph created HADOOP-16347:
--------------------------------------

             Summary: MapReduce job tasks fails on S3
                 Key: HADOOP-16347
                 URL: https://issues.apache.org/jira/browse/HADOOP-16347
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.3.0
            Reporter: Prabhu Joseph


MapReduce job tasks fail on S3. A few tasks fail with the exception below, and a few hang and then time out. Listing files on S3 works fine from the Hadoop client.
 
*Exception from a failed task:*
 
{code}
2019-05-30 20:23:05,424 ERROR [IPC Server handler 19 on 35791] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1559246386193_0001_m_000000_0 - exited : org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on qe-cloudstorage-bucket: com.amazonaws.SdkClientException: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:204)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
        at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:314)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:406)
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:310)
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:285)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:444)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:350)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:161)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:117)
        at org.apache.hadoop.examples.terasort.TeraOutputFormat.getOutputCommitter(TeraOutputFormat.java:152)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:606)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1116)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1066)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
        at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:5129)
        at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5103)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4352)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:445)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
        ... 22 more
Caused by: javax.net.ssl.SSLException: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
        at org.wildfly.openssl.OpenSSLEngine.unwrap(OpenSSLEngine.java:543)
        at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
        at org.wildfly.openssl.OpenSSLSocket.runHandshake(OpenSSLSocket.java:319)
        at org.wildfly.openssl.OpenSSLSocket.startHandshake(OpenSSLSocket.java:210)
        at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396)
        at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355)
        at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
        at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
        at com.amazonaws.http.conn.$Proxy17.connect(Unknown Source)
        at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
        at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
        at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1238)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
        ... 37 more

{code}
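 
The failed attempt dies while S3AFileSystem is still being initialized: FileOutputCommitter resolves the output path, Path.getFileSystem() creates the S3A filesystem, and initialize() calls verifyBucketExists()/doesBucketExist(), whose HEAD request fails the TLS handshake with "certificate verify failed". A minimal sketch for exercising just that path outside MapReduce, on one of the task hosts, is below (the class name and the listing loop are illustrative, not taken from the job):

{code}
// Sketch only: exercise the same S3A initialization path that fails in the task
// attempts. Run it on a node that hosted a failed attempt and compare with the
// listing that works from the client host.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3AInitCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path root = new Path("s3a://qe-cloudstorage-bucket/");
    // getFileSystem() -> S3AFileSystem.initialize() -> verifyBucketExists() ->
    // doesBucketExist(), the exact call that raises the SSL error in the trace above.
    FileSystem fs = root.getFileSystem(conf);
    for (FileStatus status : fs.listStatus(root)) {
      System.out.println(status.getPath());
    }
  }
}
{code}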
 
 
*Thread dump of a hanging MapTask:*
 

{code}
"main" #1 prio=5 os_prio=0 tid=0x00007ff424064800 nid=0x15109 waiting on condition [0x00007ff42c3fc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doPauseBeforeRetry(AmazonHttpClient.java:1679)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.pauseBeforeRetry(AmazonHttpClient.java:1653)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1191)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:5129)
at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5103)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4352)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:445)
at org.apache.hadoop.fs.s3a.S3AFileSystem$$Lambda$19/350413251.execute(Unknown Source)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:314)
at org.apache.hadoop.fs.s3a.Invoker$$Lambda$20/253767021.execute(Unknown Source)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:406)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:310)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:285)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:444)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:350)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:161)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:117)
at org.apache.hadoop.examples.terasort.TeraOutputFormat.getOutputCommitter(TeraOutputFormat.java:152)
at org.apache.hadoop.mapred.Task.initialize(Task.java:606)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
{code}
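
The hanging task is not deadlocked: the main thread is sleeping in the AWS SDK's doPauseBeforeRetry(), so the SDK keeps backing off and retrying the same getBucketRegionViaHeadRequest call that fails the TLS handshake, and the attempt only ends when it times out. Since the handshake frames in the failed task go through org.wildfly.openssl, one way to rule the OpenSSL provider in or out is to pin S3A to the stock JSSE implementation and re-run the check above. This is only a diagnostic sketch; the fs.s3a.ssl.channel.mode property is assumed to be present in this 3.3.0 build, not confirmed from the job configuration:

{code}
// Sketch only: same init check as above, but with the S3A TLS channel pinned to
// the plain JSSE implementation instead of wildfly-openssl. The property name is
// an assumption about this 3.3.0 build, not something taken from the job config.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3AJsseCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.ssl.channel.mode", "default_jsse"); // bypass the OpenSSL engine
    Path root = new Path("s3a://qe-cloudstorage-bucket/");
    FileSystem fs = root.getFileSystem(conf);            // initialize() + doesBucketExist()
    System.out.println("bucket reachable: " + fs.exists(root));
  }
}
{code}

If the JSSE run succeeds on the same host where the OpenSSL-backed run fails, that points at the OpenSSL handshake path (for example, the CA bundle the native library picks up) rather than at S3 or the job itself.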



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org