Posted to user@spark.apache.org by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/23 20:40:02 UTC

Spark error: Not a valid DFS file name

Hi All, 

I got this weird error when I tried to run Spark in yarn-cluster mode. I have
33 files and I loop over them in bash, running Spark on one file at a time.
Most of them worked fine; only a few files failed.

Is the error below an HDFS error or a Spark error?

Exception in thread "Driver" java.lang.IllegalArgumentException: Pathname
/user/myid/-u/12:51/_temporary/0 from
hdfs://dev/user/myid/-u/12:51/_temporary/0 is not a valid DFS filename.

This is the file name I passed to Spark. Could the file name be causing the issue?

hdfs://dev/data/20151019/sipmktdata.ColorDataArchive.UTD.P4_M-P.v5.2015-09-18.txt.20150918

Thanks
Sri 



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Saprk-error-Not-a-valid-DFS-File-name-tp25186.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark error: Not a valid DFS file name

Posted by pratik khadloya <ti...@gmail.com>.
I faced a similar issue. The actual problem was not in the file name.
We run Spark on YARN, and the real error showed up in the logs, which I got by
running:
$ yarn logs -applicationId <app-id>

Scroll from the beginning of the output to find the actual error.
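
If the aggregated log is long, a filter can point at the first suspicious line. This is only a rough sketch, assuming a grep that supports -m (GNU and BSD grep both do). first_error is a hypothetical helper name, and the patterns it searches for are a guess at what the real error will look like:

```shell
# Hypothetical helper: print the first line that mentions an exception or
# error, prefixed with its line number, from whatever is piped in.
first_error() {
  grep -n -m 1 -E 'Exception|Error'
}

# Intended use (needs a YARN client on the PATH, so it is left commented out):
#   yarn logs -applicationId <app-id> | first_error

# Self-contained demonstration on a made-up two-line log.
printf '%s\n' 'INFO starting driver' \
  'Exception in thread "Driver" java.lang.IllegalArgumentException' | first_error
```

The line number it prints is a place to start reading from, not a diagnosis. The lines just before the first exception often matter as much as the exception itself.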

~Pratik


Re: Spark error: Not a valid DFS file name

Posted by pratik khadloya <ti...@gmail.com>.
Check what you have at SimpleMktDataFlow.scala:106
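
In the trace, SimpleMktDataFlow.scala:106 is the caller of RDD.saveAsTextFile, so the rejected path looks like the output directory passed to saveAsTextFile rather than the input file. HDFS does not accept ':' in a path name, and /user/myid/-u/12:51 contains one. Here is a rough pre-flight check, assuming the output path is built in the bash script that submits the job. is_valid_dfs_path is a hypothetical helper that only approximates HDFS's own validation:

```shell
# Hypothetical helper: approximates the rules HDFS applies to path names.
# It is not the HDFS implementation, only a cheap check before spark-submit.
is_valid_dfs_path() {
  # HDFS path names must be absolute.
  case "$1" in
    /*) ;;
    *) return 1 ;;
  esac
  # A ':' anywhere in the path is rejected (for example a time such as 12:51).
  case "$1" in
    *:*) return 1 ;;
  esac
  # Reject '.' and '..' components.
  case "$1/" in
    */./*|*/../*) return 1 ;;
  esac
  return 0
}

# The path from the exception fails the check because of the colon.
if is_valid_dfs_path "/user/myid/-u/12:51"; then
  echo "valid"
else
  echo "invalid"
fi
```

If the output directory is derived from a timestamp, a format without colons, such as the output of date -u +%Y%m%d_%H%M%S, avoids the problem. Whether that is what happened here is a guess; the thread does not show how the output path was built.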

~Pratik


Re: Spark error: Not a valid DFS file name

Posted by "Kali.tummala@gmail.com" <Ka...@gmail.com>.
Full error:
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:195)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:104)
	at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:831)
	at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:827)
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:820)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1817)
	at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
	at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
	at org.apache.spark.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:64)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1046)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:941)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:850)
	at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1164)
	at com.citi.ocean.spark.SimpleMktDataFlow$.main(SimpleMktDataFlow.scala:106)
	at com.citi.ocean.spark.SimpleMktDataFlow.main(SimpleMktDataFlow.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)



