Posted to dev@slider.apache.org by "David.Serafini" <Da...@target.com> on 2017/08/03 21:37:48 UTC

Slider jobs randomly failing with keytab error

I've had several jobs running under Slider (v0.91 on Hadoop v2.7.3) for several months, and lately they've started failing with this error:

java.io.IOException: Resource hdfs://littleredns/user/SVDFE001/.slider/keytabs/security/SVDFE001.keytab changed on src filesystem (expected 1501528458148, was 1501616648926
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:255)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
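
The wording of the message ("expected ..., was ...") suggests a timestamp comparison: a modification time recorded when the resource was registered is checked against the modification time read from the source filesystem at download time. Below is a minimal Python sketch of that kind of check. It is a simplified model, not Hadoop's actual code, and the names `ResourceChangedError` and `verify_unchanged` are hypothetical.

```python
class ResourceChangedError(IOError):
    """Hypothetical stand-in for the IOException in the trace above."""


def verify_unchanged(path, expected_ms, actual_ms):
    """Raise if the modification time seen now differs from the recorded one.

    Both values are assumed to be epoch milliseconds.
    """
    if actual_ms != expected_ms:
        raise ResourceChangedError(
            f"Resource {path} changed on src filesystem "
            f"(expected {expected_ms}, was {actual_ms})"
        )


# Values taken from the error above.
keytab = "hdfs://littleredns/user/SVDFE001/.slider/keytabs/security/SVDFE001.keytab"
try:
    verify_unchanged(keytab, 1501528458148, 1501616648926)
except ResourceChangedError as err:
    print(err)
```

Under this model the comparison is an exact equality, so any difference between the two values is enough to fail the download, whether or not the file's contents changed.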


The weird thing is that the keytab file hasn't changed. 'hdfs dfs -ls' confirms that the last modification time was well before the error (days, in some cases).
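
To compare the numbers in the error against the 'hdfs dfs -ls' listing, it helps to decode them. The sketch below assumes both values are milliseconds since the Unix epoch in UTC; the helper name `epoch_ms_to_utc` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def epoch_ms_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime.

    Integer arithmetic is used so the millisecond part is exact.
    """
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)


expected = epoch_ms_to_utc(1501528458148)  # "expected" value from the error
actual = epoch_ms_to_utc(1501616648926)    # "was" value from the error

print(expected.isoformat())  # 2017-07-31T19:14:18.148000+00:00
print(actual.isoformat())    # 2017-08-01T19:44:08.926000+00:00
print(actual - expected)     # 1 day, 0:29:50.778000
```

On that assumption, the two values are about a day apart, and the later one falls on 2017-08-01 UTC. Note that 'hdfs dfs -ls' prints times at minute resolution, while 'hdfs dfs -stat %Y' on the keytab path should print the modification time in epoch milliseconds, which can be compared directly against both values.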

I haven't been able to find any informative error messages in the logs I have access to (this particular Hadoop cluster deletes the logs of failed containers before YARN saves them, and I don't have permission to fix this).

Has anyone encountered this before, or does anyone have ideas about what it might really be (since it doesn't seem like the keytab could be the problem)?

thanks,
-david