Posted to user@spark.apache.org by Ningjun Wang <ni...@gmail.com> on 2014/12/05 05:28:33 UTC

SparkContext.textFile() cannot load a file using a UNC path on Windows

I ran the following on Windows XP:

val conf = new SparkConf().setAppName("testproj1.ClassificationEngine").setMaster("local")
val sc = new SparkContext(conf)
sc.textFile(raw"\\10.209.128.150\TempShare\SvmPocData\reuters-two-categories.load").count()
// The line above throws the following exception:

Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file://10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:179)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
    at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1135)
    at org.apache.spark.rdd.RDD.count(RDD.scala:904)
    at testproj1.ClassificationEngine$.buildIndex(ClassificationEngine.scala:49)
    at testproj1.ClassificationEngine$.main(ClassificationEngine.scala:36)
    at testproj1.ClassificationEngine.main(ClassificationEngine.scala)

If I use a local path, either of the following works:
sc.textFile(raw"C:/temp/Share/SvmPocData/reuters-two-categories.load").count()

sc.textFile(raw"C:\temp\Share\SvmPocData\reuters-two-categories.load").count()


I also tried the other forms of the UNC path below and always got the same exception:
sc.textFile(raw"//10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

sc.textFile(raw"file://10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

sc.textFile(raw"file:///10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

sc.textFile(raw"file:////10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()
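
To help narrow this down, here is a minimal sketch that prints how the JVM's own java.net.URI class splits each of the forms I tried into scheme, authority (host), and path. The names UncUriCheck and describe are hypothetical and are not part of Spark or Hadoop. This only shows java.net.URI parsing, which I am assuming is relevant; Hadoop's Path class may interpret the same strings differently.

```scala
import java.net.URI

// Hypothetical diagnostic helper (not part of Spark or Hadoop).
// It only reports how java.net.URI splits a string; it does not
// touch the file system or the network share.
object UncUriCheck {

  // Returns the scheme, authority and path components as one line of text.
  def describe(s: String): String = {
    val u = new URI(s)
    s"scheme=${u.getScheme} authority=${u.getAuthority} path=${u.getPath}"
  }

  def main(args: Array[String]): Unit = {
    val rest = "10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load"
    // The forward-slash forms tried above; the backslash form is omitted
    // because a backslash is not a legal character in a java.net.URI.
    val forms = Seq("//" + rest, "file://" + rest, "file:///" + rest, "file:////" + rest)
    forms.foreach(f => println(f + " -> " + describe(f)))
  }
}
```

Running UncUriCheck.main(Array.empty) prints one line per form, which makes it easy to see which forms treat 10.209.128.150 as a host and which treat it as the first path segment.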

The UNC path is valid. In Windows Explorer I can type
"\\10.209.128.150\TempShare\SvmPocData\reuters-two-categories.load" and the
file opens in Notepad.

Please advise.