Posted to user@spark.apache.org by Koert Kuipers <ko...@tresata.com> on 2014/03/11 17:06:27 UTC

RDD.saveAs...

I find the current design for writing RDDs to disk (or a database, etc.)
kind of ugly. It will lead to a proliferation of saveAs methods. A better
abstraction would be nice, perhaps a Sink trait to write to.
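
To make the Sink idea concrete, here is a rough sketch of what such an abstraction might look like. Everything in it is hypothetical: Sink, InMemoryLineSink, saveTo and the Partitions alias are illustration names, not part of Spark. A partitioned dataset is modelled as a plain Seq of Seqs so the snippet runs without Spark; with a real RDD, each partition would presumably be handed to the sink through something like foreachPartition.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical trait, not a Spark API: a destination that knows how to
// persist one partition of records.
trait Sink[T] {
  def writePartition(index: Int, records: Iterator[T]): Unit
}

// Toy sink that collects formatted lines in memory, standing in for a
// file, database, or other storage writer.
class InMemoryLineSink[T](format: T => String) extends Sink[T] {
  val lines: ArrayBuffer[String] = ArrayBuffer.empty[String]

  def writePartition(index: Int, records: Iterator[T]): Unit =
    records.foreach(r => lines += s"part-$index: ${format(r)}")
}

object SinkSketch {
  // Assumption for illustration only: a partitioned dataset as nested Seqs.
  type Partitions[T] = Seq[Seq[T]]

  // One generic entry point instead of a separate saveAsX method per
  // storage system.
  implicit class SaveOps[T](partitions: Partitions[T]) {
    def saveTo(sink: Sink[T]): Unit =
      partitions.zipWithIndex.foreach { case (part, i) =>
        sink.writePartition(i, part.iterator)
      }
  }

  def main(args: Array[String]): Unit = {
    val data: Partitions[Int] = Seq(Seq(1, 2), Seq(3))
    val sink = new InMemoryLineSink[Int](n => n.toString)
    data.saveTo(sink)
    sink.lines.foreach(println)
  }
}
```

The point of the design is that a new storage system only needs a new Sink implementation, which could live in its own module with its own dependencies, while the dataset API keeps a single save method.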

Re: RDD.saveAs...

Posted by Matei Zaharia <ma...@gmail.com>.
I agree that we can’t keep adding these to the core API, partly because it will get unwieldy to maintain and partly just because each storage system will bring in lots of dependencies. We can simply have helper classes in different modules for each storage system. There’s some discussion on this at https://spark-project.atlassian.net/browse/SPARK-1127.

Matei

On Mar 11, 2014, at 9:06 AM, Koert Kuipers <ko...@tresata.com> wrote:

> I find the current design to write RDDs to disk (or a database, etc) kind of ugly. It will lead to a proliferation of saveAs methods. A better abstraction would be nice (perhaps a Sink trait to write to)