Posted to dev@uniffle.apache.org by GitBox <gi...@apache.org> on 2022/09/01 08:29:47 UTC

[GitHub] [incubator-uniffle] kaijchen opened a new issue, #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

kaijchen opened a new issue, #196:
URL: https://github.com/apache/incubator-uniffle/issues/196

   https://github.com/apache/incubator-uniffle/runs/8129274621?check_suite_focus=true#step:4:1722
   https://github.com/apache/incubator-uniffle/runs/8129689956?check_suite_focus=true#step:4:1722
   
   ```
   Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 56.754 s <<< FAILURE! - in org.apache.uniffle.server.ShuffleFlushManagerOnKerberizedHdfsTest
   Error:  clearTest  Time elapsed: 51.62 s  <<< FAILURE!
   org.opentest4j.AssertionFailedError: Unexpected flush process
   	at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
   	at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
   	at org.apache.uniffle.server.ShuffleFlushManagerTest.waitForFlush(ShuffleFlushManagerTest.java:343)
   	at org.apache.uniffle.server.ShuffleFlushManagerOnKerberizedHdfsTest.clearTest(ShuffleFlushManagerOnKerberizedHdfsTest.java:120)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
   	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
   	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
   	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
   	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
   	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
   	at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
   	at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
   	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
   	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
   	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
   	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
   	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
   	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
   	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
   	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
   	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
   	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
   	at java.util.ArrayList.forEach(ArrayList.java:1259)
   	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
   	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
   	at java.util.ArrayList.forEach(ArrayList.java:1259)
   	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
   	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
   	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
   	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
   	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
   	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
   	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
   	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
   	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
   	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
   	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
   	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
   	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
   	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
   	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
   	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53)
   	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:150)
   	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:124)
   	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
   	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
   	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
   	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi closed issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
jerqi closed issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest
URL: https://github.com/apache/incubator-uniffle/issues/196




[GitHub] [incubator-uniffle] zuston commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1257374629

   OK. I think I will fix it in the next few days.




[GitHub] [incubator-uniffle] zuston commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1233996276

   The stack trace is as follows:
   
   ```
   2022-09-01 07:31:40,131 ERROR [FlushEventThreadPool] server.ShuffleFlushManager (ShuffleFlushManager.java:flushToFile(209)) - Exception happened when process flush shuffle data for ShuffleDataFlushEvent: eventId=0, appId=complexWriteTest_appId1, shuffleId=1, startPartition=0, endPartition=1
   java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]; Host Details : local host is: "fv-az489-314/10.1.1.91"; destination host is: "localhost":37279; 
   	at org.apache.uniffle.storage.common.HdfsStorage.newWriteHandler(HdfsStorage.java:113)
   	at org.apache.uniffle.storage.common.AbstractStorage.lambda$getOrCreateWriteHandler$2(AbstractStorage.java:50)
   	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
   	at org.apache.uniffle.storage.common.AbstractStorage.getOrCreateWriteHandler(AbstractStorage.java:50)
   	at org.apache.uniffle.server.ShuffleFlushManager.flushToFile(ShuffleFlushManager.java:168)
   	at org.apache.uniffle.server.ShuffleFlushManager.lambda$null$0(ShuffleFlushManager.java:100)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]; Host Details : local host is: "fv-az489-314/10.1.1.91"; destination host is: "localhost":37279; 
   	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
   	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1435)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1345)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
   	at com.sun.proxy.$Proxy68.getFileInfo(Unknown Source)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:796)
   	at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
   	at com.sun.proxy.$Proxy69.getFileInfo(Unknown Source)
   	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1649)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1440)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
   	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1437)
   	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
   	at org.apache.uniffle.storage.handler.impl.HdfsShuffleWriteHandler.initialize(HdfsShuffleWriteHandler.java:89)
   	at org.apache.uniffle.storage.handler.impl.HdfsShuffleWriteHandler.<init>(HdfsShuffleWriteHandler.java:81)
   	at org.apache.uniffle.storage.common.HdfsStorage.newWriteHandler(HdfsStorage.java:108)
   	... 8 more
   Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]
   	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:755)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
   	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:718)
   	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:811)
   	at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
   	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1381)
   	... 31 more
   Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]
   	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
   	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:406)
   	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:614)
   	at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:410)
   	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:798)
   	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:794)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
   	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:793)
   	... 34 more
   Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)
   	at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:772)
   	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
   	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
   	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
   	... 43 more
   Caused by: KrbException: Server not found in Kerberos database (7) - Server not found in Kerberos database
   	at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
   	at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:226)
   	at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:237)
   	at sun.security.krb5.internal.CredentialsUtil.serviceCredsSingle(CredentialsUtil.java:477)
   	at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:340)
   	at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:314)
   	at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:169)
   	at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:490)
   	at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
   	... 46 more
   Caused by: KrbException: Identifier doesn't match expected value (906)
   	at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
   	at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
   	at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
   	at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
   	... 54 more
   ```




[GitHub] [incubator-uniffle] zuston commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1260793401

   I reproduced this failure; it is caused by a DNS lookup failure. I will try to fix it by adding the hostname to the /etc/hosts mapping file.
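
A minimal sketch of how the DNS hypothesis above could be checked on a CI runner, assuming a plain JDK and no Hadoop or Uniffle classes. The class name `HostResolutionCheck` and its helper methods are hypothetical, not part of the Uniffle code base, and the printed /etc/hosts line is only a suggestion of what such a mapping might look like.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not part of Uniffle.
class HostResolutionCheck {

    /** Returns true if the given hostname resolves to at least one address. */
    static boolean resolves(String hostname) {
        try {
            return InetAddress.getAllByName(hostname).length > 0;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /** Formats an "<ip><TAB><hostname>" line in the style used by /etc/hosts. */
    static String hostsEntry(String ip, String hostname) {
        return ip + "\t" + hostname;
    }

    public static void main(String[] args) {
        try {
            // getLocalHost() throws if the machine's own hostname cannot be resolved.
            InetAddress local = InetAddress.getLocalHost();
            String name = local.getHostName();
            System.out.println("local hostname: " + name + " -> " + local.getHostAddress());
            System.out.println("forward lookup ok: " + resolves(name));
            System.out.println("possible /etc/hosts line: "
                + hostsEntry(local.getHostAddress(), name));
        } catch (UnknownHostException e) {
            // This branch corresponds to the kind of lookup failure described above.
            System.out.println("local hostname does not resolve: " + e.getMessage());
        }
    }
}
```

On a host whose local hostname has no mapping, the program should take the `UnknownHostException` branch. The fix actually applied to the CI workflow may differ from this sketch.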




[GitHub] [incubator-uniffle] jerqi commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1301941797

   @zuston The Kerberos test still seems unstable, so I will reopen this issue.




[GitHub] [incubator-uniffle] zuston commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1233937144

   Let me take a look. Oh, this test case was introduced by me, and it looks like some of the test cases are not stable.




[GitHub] [incubator-uniffle] jerqi commented on issue #196: Flaky test ShuffleFlushManagerOnKerberizedHdfsTest

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #196:
URL: https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1256062111

   > The stacktrace is as follow
   > 
   
   It occurred again:
   https://github.com/apache/incubator-uniffle/actions/runs/3111709653/jobs/5044304701
   This test is still flaky.

