Posted to user@spark.apache.org by gaganbm <ga...@gmail.com> on 2014/03/21 11:42:33 UTC
Persist streams to text files
Hi,
I am trying to persist DStreams to text files. When I use the built-in
API 'saveAsTextFiles' as:
stream.saveAsTextFiles(resultDirectory)
this creates a subdirectory for each batch, and within each
subdirectory it creates a bunch of text files (one per partition of the RDD, I assume).
I am wondering if I can have a single text file for each batch. Is there any
API for that? Or else, a single output file for the entire stream?
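(For reference, one commonly suggested workaround is to coalesce each batch's RDD down to a single partition before saving, so that each batch directory contains exactly one part file. This is only a sketch; 'resultDirectory' is assumed to be a path string, and the output is still a directory per batch, not a bare file:)

```scala
// Sketch: write each batch as a single part file by coalescing to one partition.
// Note: this funnels the whole batch through one task, which can be slow for
// large batches.
stream.foreachRDD { (rdd, time) =>
  rdd.coalesce(1)
     .saveAsTextFile(s"$resultDirectory/batch-${time.milliseconds}")
}
```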
I tried to manually write each RDD in the stream to a text file:
stream.foreachRDD(rdd => {
  rdd.foreach(element => {
    fileWriter.write(element)
  })
})
where 'fileWriter' simply makes use of a Java BufferedWriter to write
strings to a file. However, this fails with an exception:
DStreamCheckpointData.writeObject used
java.io.BufferedWriter
java.io.NotSerializableException: java.io.BufferedWriter
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
.....
Any help on how to proceed with this?
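(For context on the exception: the closure passed to 'rdd.foreach' is serialized and shipped to the executors, and it captures 'fileWriter' from the driver; BufferedWriter is not serializable, hence the NotSerializableException. A common pattern is to create the writer inside the closure, on the executor, per partition. This is only a sketch; the output path is hypothetical, and each executor writes to its own local file system:)

```scala
// Sketch: avoid capturing a driver-side writer by constructing one
// inside the partition-level closure, where it runs on the executor.
stream.foreachRDD { rdd =>
  rdd.foreachPartition { elements =>
    // Hypothetical local path; one file per partition per batch.
    val writer = new java.io.PrintWriter(
      new java.io.FileWriter(s"/tmp/stream-out-${java.util.UUID.randomUUID}.txt"))
    try {
      elements.foreach(writer.println)
    } finally {
      writer.close()
    }
  }
}
```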
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Persist-streams-to-text-files-tp2986.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.