
Posted to dev@spark.apache.org by Naveen Madhire <vm...@umail.iu.edu> on 2014/12/29 00:10:23 UTC

Spark Error - Failed to locate the winutils binary in the hadoop binary path

Hi All,

I am getting the error below while running a simple Spark application from
Eclipse.

I am using Eclipse, Maven, and Java.

I have Spark running locally on my Windows laptop. I copied the Spark files
from the Spark Summit 2014 training:
http://databricks.com/spark-training-resources#itas

I can run sample commands and small programs using the Spark shell, but I
get the error below when running from Eclipse.

14/12/28 18:01:59 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$26.apply(SparkContext.scala:696)
    at org.apache.spark.SparkContext$$anonfun$26.apply(SparkContext.scala:696)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:170)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:170)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:170)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:194)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
    at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
    at org.apache.spark.rdd.FilteredRDD.getPartitions(FilteredRDD.scala:29)


Please let me know if I am doing something wrong.


Thanks for your help.

-Naveen

Re: Spark Error - Failed to locate the winutils binary in the hadoop binary path

Posted by Naveen Madhire <vm...@umail.iu.edu>.

Hi All,

Sorry, I should have checked the JIRA issue tracker before sending the
email.

I found that this is an existing issue:

https://issues.apache.org/jira/browse/SPARK-2356

The solution is described here:
http://qnalist.com/questions/4994960/run-spark-unit-test-on-windows-7
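
For anyone who finds this thread later: the "null" in null\bin\winutils.exe
suggests that Hadoop could not determine its home directory. Hadoop's Shell
class reads it from the "hadoop.home.dir" Java system property or the
HADOOP_HOME environment variable. A commonly used workaround, which I am
assuming is in line with what the linked pages describe, is to place
winutils.exe under <some dir>\bin and set that property before the
SparkContext is created. Below is a minimal sketch of that idea. The class
name WinutilsSetup, the method configureHadoopHome, and the C:\hadoop
location are all hypothetical names chosen for illustration, not part of
Spark or Hadoop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WinutilsSetup {

    /**
     * Sets the "hadoop.home.dir" system property to hadoopHome, but only if
     * hadoopHome/bin/winutils.exe exists as a regular file.
     * Returns true when the property was set, false otherwise.
     */
    public static boolean configureHadoopHome(Path hadoopHome) {
        Path winutils = hadoopHome.resolve("bin").resolve("winutils.exe");
        if (!Files.isRegularFile(winutils)) {
            return false;
        }
        System.setProperty("hadoop.home.dir", hadoopHome.toAbsolutePath().toString());
        return true;
    }

    public static void main(String[] args) {
        // C:\hadoop is a hypothetical default; pass the real directory as args[0].
        Path hadoopHome = Paths.get(args.length > 0 ? args[0] : "C:\\hadoop");
        if (configureHadoopHome(hadoopHome)) {
            System.out.println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"));
            // Create the SparkConf / JavaSparkContext only after this point.
        } else {
            System.out.println("winutils.exe not found under " + hadoopHome.resolve("bin"));
        }
    }
}
```

The property has to be set before the first Hadoop class is loaded, because
the stack trace shows the lookup happening in the static initializer of
Shell. Setting HADOOP_HOME in the Eclipse run configuration should have the
same effect without any code change.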


Now it is working fine.

Thanks all.

