Posted to user@spark.apache.org by lmk <la...@gmail.com> on 2014/07/24 12:54:42 UTC

save to HDFS

Hi,
I have a Scala application that I have launched on a Spark cluster. I
have the following statement, which tries to save to a folder on the master:
saveAsHadoopFile[TextOutputFormat[NullWritable,
Text]]("hdfs://masteripaddress:9000/root/test-app/test1/")
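For context, here is a minimal sketch of how such a save fits together end to end; the SparkContext setup and the example RDD below are illustrative placeholders, not the actual application code:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapred.TextOutputFormat

object SaveToHdfsExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("save-to-hdfs-example")
    val sc = new SparkContext(conf)

    // An illustrative RDD of (NullWritable, Text) pairs, the key/value
    // types required by TextOutputFormat[NullWritable, Text].
    val lines = sc.parallelize(Seq("alpha", "beta", "gamma"))
    val pairs = lines.map(l => (NullWritable.get(), new Text(l)))

    // Writes one part-NNNNN file per partition under the target directory.
    pairs.saveAsHadoopFile[TextOutputFormat[NullWritable, Text]](
      "hdfs://masteripaddress:9000/root/test-app/test1/")

    sc.stop()
  }
}
```

Note that `saveAsHadoopFile` is only available on RDDs of key/value pairs (via `PairRDDFunctions`), which is why the plain lines are mapped to pairs first.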

The application executes successfully, and the log also says that the save
is complete. But I am not able to find the saved file anywhere. Is there a
way I can access this file?

Please advise.

Regards,
lmk



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-to-HDFS-tp10578.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: save to HDFS

Posted by lmk <la...@gmail.com>.
Thanks Akhil.
I was able to view the files. I had been trying to list them with the
regular ls command, and since that showed nothing, I was concerned.
Thanks for pointing me in the right direction.

Regards,
lmk



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-to-HDFS-tp10578p10583.html

Re: save to HDFS

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
This piece of code

saveAsHadoopFile[TextOutputFormat[NullWritable,Text]]("hdfs://masteripaddress:9000/root/test-app/test1/")

saves the RDD to HDFS, and yes, you can see the files using the hadoop
command (hadoop fs -ls /root/test-app/test1 - and yes, you need to log in
to the cluster first). If you are not able to execute the command (e.g.
hadoop command not found), use the full path instead: $HADOOP_HOME/bin/hadoop
fs -ls /root/test-app/test1
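The listing itself would look roughly like the following; the part-file count depends on the number of partitions in the saved RDD, and the paths simply follow the example in this thread (these commands need a working Hadoop installation on the cluster):

```shell
# List the output directory (after logging in to the cluster).
# On a successful save, Hadoop's output committer writes one part file
# per partition plus an empty _SUCCESS marker, e.g.:
$HADOOP_HOME/bin/hadoop fs -ls hdfs://masteripaddress:9000/root/test-app/test1/
#   .../test1/_SUCCESS
#   .../test1/part-00000
#   .../test1/part-00001

# Print the contents of one part file directly:
$HADOOP_HOME/bin/hadoop fs -cat hdfs://masteripaddress:9000/root/test-app/test1/part-00000
```

This also explains why a regular ls on the local filesystem shows nothing: the files live in HDFS, not in /root/test-app/test1 on the master's local disk.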



Thanks
Best Regards


On Thu, Jul 24, 2014 at 4:34 PM, lmk <la...@gmail.com>
wrote:

> Hi Akhil,
> I am sure that the RDD I saved is not empty; I have tested it using take.
> But is there no way to see the saved files directly, as I would in a
> normal file system? Can't I view this folder, since I am already logged
> into the cluster?
> And should I run hadoop fs -ls
> hdfs://masteripaddress:9000/root/test-app/test1/
> after I log in to the cluster?
>
> Thanks,
> lmk
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/save-to-HDFS-tp10578p10581.html
>

Re: save to HDFS

Posted by lmk <la...@gmail.com>.
Hi Akhil,
I am sure that the RDD I saved is not empty; I have tested it using take.
But is there no way to see the saved files directly, as I would in a
normal file system? Can't I view this folder, since I am already logged
into the cluster?
And should I run hadoop fs -ls
hdfs://masteripaddress:9000/root/test-app/test1/
after I log in to the cluster?

Thanks,
lmk



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-to-HDFS-tp10578p10581.html

Re: save to HDFS

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Are you sure the RDD you were saving isn't empty?

Do you see a _SUCCESS file in this location:
hdfs://masteripaddress:9000/root/test-app/test1/ ?
(Run hadoop fs -ls hdfs://masteripaddress:9000/root/test-app/test1/)


Thanks
Best Regards


On Thu, Jul 24, 2014 at 4:24 PM, lmk <la...@gmail.com>
wrote:

> Hi,
> I have a Scala application that I have launched on a Spark cluster. I
> have the following statement, which tries to save to a folder on the master:
> saveAsHadoopFile[TextOutputFormat[NullWritable,
> Text]]("hdfs://masteripaddress:9000/root/test-app/test1/")
>
> The application executes successfully, and the log also says that the save
> is complete. But I am not able to find the saved file anywhere. Is there a
> way I can access this file?
>
> Please advise.
>
> Regards,
> lmk
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/save-to-HDFS-tp10578.html
>