Posted to dev@crail.apache.org by David Crespi <da...@storedgesystems.com> on 2020/01/12 18:34:23 UTC

Can't find CRAIL_HOME on spark-submit using yarn

Hi All,
I think an old problem has resurfaced.
I’ve pulled the latest crail (v1.2-rc2-1-g8a739dd) and the latest crail-spark, which I believe
includes the patch that fixed this issue:

(https://github.com/zrlio/crail-spark-io/commit/6880cf691237baffb3d2bf71984ccea6cdd5776c)

But the error below now occurs.  Nothing I do seems to get CRAIL_HOME correctly set.
If I change it to some random value, the resulting error does show that (wrong) random
path, which is what I would expect, so the variable is being picked up somewhere.
CRAIL_HOME is set in all my environments, and it is also passed to YARN via the
command-line option:

--conf "spark.yarn.appMasterEnv.CRAIL_HOME=/crail"

I do not get this error when I skip YARN and submit to a local Spark cluster.
Am I missing something, or did the crail-client miss an update?

Caused by: java.lang.IllegalArgumentException: CRAIL_HOME environment variable is not set or empty
        at org.apache.crail.conf.CrailConfiguration.createConfigurationFromFile(CrailConfiguration.java:48)
        at org.apache.spark.storage.CrailDispatcher.org$apache$spark$storage$CrailDispatcher$$init(CrailDispatcher.scala:119)
        at org.apache.spark.storage.CrailDispatcher$.get(CrailDispatcher.scala:662)
        at org.apache.spark.shuffle.crail.CrailShuffleWriter.<init>(CrailShuffleWriter.scala:43)
        at org.apache.spark.shuffle.crail.CrailShuffleManager.getWriter(CrailShuffleManager.scala:75)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:98)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
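
One way I can think of to verify whether the variable actually reaches the executor
containers: as far as I know, the YARN NodeManager writes each container's environment
into its launch_container.sh, so something like this run on a NodeManager host should
show it (the local-dirs path below is illustrative and depends on
yarn.nodemanager.local-dirs):

# Hypothetical check; adjust the path to your yarn.nodemanager.local-dirs.
grep CRAIL_HOME /hadoop/yarn/local/usercache/*/appcache/application_*/container_*/launch_container.sh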

Regards,

           David