Posted to user@spark.apache.org by holdingonrobin <ro...@gmail.com> on 2014/06/23 23:26:47 UTC

Re: how to make saveAsTextFile NOT split output into multiple file?

I used the standard Java IO classes, together with Hadoop's FileSystem API,
to write the file directly to the cluster. It is fairly basic, though:

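    import java.io.{BufferedOutputStream, DataOutputStream}
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.deploy.SparkHadoopUtil

    val sc = getSparkContext  // placeholder for however you obtain your SparkContext
    // sc.hadoopConfiguration is the public alternative for getting this configuration
    val hadoopConf = SparkHadoopUtil.get.newConfiguration
    
    val hdfsPath = "hdfs://your/path"
    
    val fs = FileSystem.get(hadoopConf)
    val path = new Path(hdfsPath)
    // fs.create overwrites an existing file at this path by default
    val os = new DataOutputStream(new BufferedOutputStream(fs.create(path)))
    val data = List(List(0,2,3),List(1,4,5),List(2,6,9))

    try {
      // this works: writes the third row, "2, 6, 9", to a single file
      os.writeBytes(data(2).mkString(", "))
    } finally {
      os.close()  // flush the buffer and release the stream even on failure
    }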
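
If you want every row in the file rather than just one, the same idea can be wrapped in a small helper. This is only a rough sketch: `SingleFileWriter` and `writeRows` are names I made up for illustration, and an in-memory stream stands in for `fs.create(path)` so it runs without a cluster. It also assumes the data is small enough to sit on the driver (for example after a `collect()`).

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream, DataOutputStream, OutputStream}
import java.nio.charset.StandardCharsets

object SingleFileWriter {
  // Hypothetical helper (not from the original post): writes every row as one
  // comma-separated line to a single stream and closes the stream when done.
  def writeRows(out: OutputStream, rows: Seq[Seq[Int]]): Unit = {
    val os = new DataOutputStream(new BufferedOutputStream(out))
    try {
      rows.foreach { row =>
        os.write((row.mkString(", ") + "\n").getBytes(StandardCharsets.UTF_8))
      }
    } finally {
      os.close() // flushes the buffer and closes the underlying stream
    }
  }

  def main(args: Array[String]): Unit = {
    val data = List(List(0, 2, 3), List(1, 4, 5), List(2, 6, 9))
    // Assumption: on a cluster this would be fs.create(path) instead.
    val sink = new ByteArrayOutputStream()
    writeRows(sink, data)
    print(sink.toString("UTF-8"))
  }
}
```

Passing the stream in as a parameter keeps the formatting logic separate from where the bytes end up, so the same helper should work for a local file or an HDFS output stream.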

Hope you find it helpful.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/how-to-make-saveAsTextFile-NOT-split-output-into-multiple-file-tp8129p8143.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.