
Posted to user@spark.apache.org by Shay Seng <sh...@urbanengines.com> on 2014/09/24 19:41:35 UTC

RDD save as Seq File

Hi,
Why does RDD.saveAsObjectFile() to S3 leave a bunch of empty *_$folder$
files around? Is it possible for the save to clean them up?

Thanks

Re: RDD save as Seq File

Posted by Sean Owen <so...@cloudera.com>.

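That comes from Hadoop's S3 support rather than from Spark. Hadoop's
FileSystem semantics need directories, and S3 doesn't have a proper notion
of directories, so the native S3 filesystem writes empty *_$folder$ files
as directory markers.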

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/s3native/NativeS3FileSystem.html

AFAIK you should leave them in place.
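
To make the directory-marker idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Hadoop's or Spark's actual code: FlatStore, mkdirs, write and is_directory are hypothetical names made up for illustration, and the only detail taken from the thread is the _$folder$ suffix. The assumption is that a flat key/value store (like an S3 bucket) can only "remember" a directory if some object stands in for it, so an empty marker object is written for each parent path.

```python
# Toy model of directory markers on a flat object store.
# Assumption: all names here are hypothetical; this is not Hadoop's implementation.

FOLDER_SUFFIX = "_$folder$"


class FlatStore:
    """Stands in for a bucket: a flat mapping from key to bytes, no real directories."""

    def __init__(self):
        self.objects = {}

    def mkdirs(self, path):
        # Write one empty marker object per path component.
        parts = path.strip("/").split("/")
        for i in range(1, len(parts) + 1):
            marker = "/".join(parts[:i]) + FOLDER_SUFFIX
            self.objects.setdefault(marker, b"")

    def write(self, path, data):
        # Ensure the parent "directories" exist as markers, then store the object.
        key = path.strip("/")
        parent = key.rpartition("/")[0]
        if parent:
            self.mkdirs(parent)
        self.objects[key] = data

    def is_directory(self, path):
        # In this toy model a path is a directory only if its marker exists.
        return path.strip("/") + FOLDER_SUFFIX in self.objects


store = FlatStore()
store.write("out/rdd/part-00000", b"records")

for key in sorted(store.objects):
    print(key, len(store.objects[key]))
```

In this sketch, deleting the marker objects would make is_directory("out/rdd") return False even though the part file is still there, which is one way to see why cleaning them up can confuse code that expects directory semantics.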

