Posted to user@spark.apache.org by mathewvinoj <vi...@hotmail.com> on 2015/07/15 07:03:07 UTC
spark cache issue while doing saveAsTextFile and saveAsParquetFile
Hi there,
I am using mapPartitions to do some processing and caching the result, as
shown below.
I save the output in both formats (Parquet and text file), and the
computation is rerun for each save. Even though I call cache(), it is not
working as expected.
The code snippet is below. Any help is really appreciated.
val record = sql(sqlString)
val outputRecords = record.repartition(1).mapPartitions { rows =>
  val finalList1 = ListBuffer[Row]()
  while (rows.hasNext) {
    .
    .
    finalList1.add(xyz)
  }
  finalList1.iterator
}.cache()
val l = applySchema(outputRecords, schemaName).cache()
l.saveAsTextFile(filename + ".txt")
l.saveAsParquetFile(filename + ".parquet")
Expected result: when saveAsTextFile runs, the computation should happen
once and the result should be cached. When saveAsParquetFile runs
afterwards, it should read the result from the cache instead of
recomputing it.
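One way to check whether the partition function really runs twice is to
count its invocations. The following is only a minimal sketch, not the actual
job: it assumes local mode (so the driver and the tasks share one JVM), uses
a small parallelized range instead of the SQL result, and replaces the two
saves with two count() actions. The names Calls and CacheCheck are
hypothetical.

```scala
import java.util.concurrent.atomic.AtomicInteger
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical JVM-wide counter. It is only meaningful in local mode,
// where the driver and the tasks run in the same JVM.
object Calls {
  val count = new AtomicInteger(0)
}

object CacheCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cache-check").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for the real processing: one partition, as in repartition(1).
      val processed = sc.parallelize(1 to 100, 4).repartition(1).mapPartitions { rows =>
        Calls.count.incrementAndGet() // incremented once per partition evaluation
        rows.map(_ * 2)
      }.cache()

      processed.count() // first action: computes the partition and fills the cache
      processed.count() // second action: should be served from the cache

      println(s"partition function ran ${Calls.count.get} time(s)")
    } finally {
      sc.stop()
    }
  }
}
```

If the cache is effective, the counter should stay at 1 after both actions.
A higher value would suggest that the cached data is not being reused, for
example because the actions run against an uncached lineage.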
thanks