Posted to user@spark.apache.org by Hu...@Dell.com on 2014/01/09 23:54:36 UTC

Does saveAsTextFile in Java use buffered I/O streams?

Can someone provide details on the Spark Java implementation of the saveAsTextFile API: does it use buffered I/O streams, and if so, at what point does it flush its buffers?

I remember from the Spark Summit presentations that the current Spark release still uses buffered I/O streams, and that an upcoming option will support unbuffered I/O streams when writing data to local files or HDFS storage.


Thanks,
Hussam

Re: Does saveAsTextFile in Java use buffered I/O streams?

Posted by Matei Zaharia <ma...@gmail.com>.

It just uses the Hadoop FileSystem API; I don’t think there’s any extra buffering. That API itself may do buffering in the HDFS case, though newer versions of HDFS fix that.

Matei
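
For readers unsure what "buffering" and "flush" mean in practice here, the following is a minimal sketch using only java.io. It is not Spark's or Hadoop's code, and it says nothing about how either one buffers internally. The class name BufferedWriteSketch, the method sinkSizes, and the in-memory sink are all hypothetical stand-ins chosen for illustration. It shows when bytes written through a BufferedOutputStream actually reach the underlying stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: generic java.io buffering, not Spark or Hadoop internals.
public class BufferedWriteSketch {

    /**
     * Writes one line through a buffered stream and returns how many bytes
     * had reached the underlying sink after write(), after flush(), and after close().
     */
    public static int[] sinkSizes(String line, int bufferSize) throws IOException {
        // In-memory sink standing in for a local file or remote output stream (assumption).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = new BufferedOutputStream(sink, bufferSize);
        byte[] bytes = (line + "\n").getBytes(StandardCharsets.UTF_8);

        out.write(bytes);            // may only land in the buffer
        int afterWrite = sink.size();

        out.flush();                 // pushes buffered bytes to the sink
        int afterFlush = sink.size();

        out.close();                 // close() also flushes anything still buffered
        int afterClose = sink.size();

        return new int[] {afterWrite, afterFlush, afterClose};
    }

    public static void main(String[] args) throws IOException {
        int[] large = sinkSizes("hello", 8192); // buffer larger than the record
        int[] small = sinkSizes("hello", 4);    // buffer smaller than the record
        System.out.println("8192-byte buffer: " + large[0] + ", " + large[1] + ", " + large[2]);
        System.out.println("4-byte buffer:    " + small[0] + ", " + small[1] + ", " + small[2]);
    }
}
```

With a buffer larger than the record, nothing reaches the sink until flush() or close() is called. With a buffer smaller than the record, BufferedOutputStream passes the bytes straight through on write(). Whether and when a given Hadoop FileSystem implementation does something comparable depends on that implementation and its version.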
