Posted to user@spark.apache.org by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2014/11/26 16:48:08 UTC

SparkContext.textFile() cannot load file using UNC path on Windows

I ran the following on Windows XP:

    val conf = new SparkConf().setAppName("testproj1.ClassificationEngine").setMaster("local")
    val sc = new SparkContext(conf)
    sc.textFile(raw"\\10.209.128.150\TempShare\SvmPocData\reuters-two-categories.load").count() // This line throws the following exception

Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file://10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load
       at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
       at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
       at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:179)
       at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
       at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
       at scala.Option.getOrElse(Option.scala:120)
       at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
       at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
       at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
       at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
       at scala.Option.getOrElse(Option.scala:120)
       at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1135)
       at org.apache.spark.rdd.RDD.count(RDD.scala:904)
       at testproj1.ClassificationEngine$.buildIndex(ClassificationEngine.scala:49)
       at testproj1.ClassificationEngine$.main(ClassificationEngine.scala:36)
       at testproj1.ClassificationEngine.main(ClassificationEngine.scala)

If I use a local path, either form works:

    sc.textFile(raw"C:/temp/Share/SvmPocData/reuters-two-categories.load").count()
    sc.textFile(raw"C:\temp\Share\SvmPocData\reuters-two-categories.load").count()

I also tried the other forms of the UNC path below and always got the same exception:

    sc.textFile(raw"//10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

    sc.textFile(raw"file://10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

    sc.textFile(raw"file:///10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

    sc.textFile(raw"file:////10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load").count()

The UNC path is valid: in Windows Explorer I can type "\\10.209.128.150\TempShare\SvmPocData\reuters-two-categories.load" and the file opens in Notepad.
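
For anyone trying to reproduce this, the same check can be made from the JVM without Spark. The following is only a minimal sketch: `UncPathCheck` and `split` are hypothetical names, not part of Spark or Hadoop. It asks `java.io.File` whether the UNC path exists, then prints how `java.net.URI` splits each `file:` form above into authority and path. It does not show what Hadoop's local file system does with those parts afterwards.

```scala
import java.io.File
import java.net.URI

// Hypothetical helper for diagnosis only; not part of Spark or Hadoop.
object UncPathCheck {
  // Returns (authority, path) as parsed by java.net.URI.
  // The authority is None when the URI has no host part.
  def split(uri: String): (Option[String], String) = {
    val u = new URI(uri)
    (Option(u.getAuthority), u.getPath)
  }

  def main(args: Array[String]): Unit = {
    val unc = raw"\\10.209.128.150\TempShare\SvmPocData\reuters-two-categories.load"
    // Plain JVM file access, no Hadoop classes involved.
    println(s"java.io.File.exists: ${new File(unc).exists()}")

    Seq(
      "file://10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load",
      "file:///10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load",
      "file:////10.209.128.150/TempShare/SvmPocData/reuters-two-categories.load"
    ).foreach { s =>
      val (authority, path) = split(s)
      println(s"$s -> authority=$authority path=$path")
    }
  }
}
```

With the two-slash form, `java.net.URI` treats 10.209.128.150 as the authority, not as part of the path. With the three-slash form there is no authority and the address becomes the first path segment.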

Please advise.

Regards,

Ningjun