You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by "zrc@zjdex.com" <zr...@zjdex.com> on 2018/08/23 02:30:43 UTC
About the question of Spark Structured Streaming window output
Hi :
I have some questions about spark structured streaming window output in spark 2.3.1. I write the application code as following:
case class DataType(time:Timestamp, value:Long) {}
val spark = SparkSession
.builder
.appName("StructuredNetworkWordCount")
.master("local[1]")
.getOrCreate()
import spark.implicits._
// Create DataFrame representing the stream of input lines from connection to localhost:9999
val lines = spark.readStream
.format("socket")
.option("host", "localhost")
.option("port", 9999)
.load()
val words = lines.as[String].map(l => {
var tmp = l.split(",")
DataType(Timestamp.valueOf(tmp(0)), tmp(1).toLong)
}).as[DataType]
val windowedCounts = words
.withWatermark("time", "1 minutes")
.groupBy(window($"time", "5 minutes"))
.agg(sum("value") as "sumvalue")
.select("window.start", "window.end","sumvalue")
val query = windowedCounts.writeStream
.outputMode("update")
.format("console")
.trigger(Trigger.ProcessingTime("10 seconds"))
.start()
query.awaitTermination()
the input data format is :
2018-08-20 12:01:00,1
2018-08-20 12:02:01,1
My questions are:
1、when I set the append output model, I send inputdata, but there is no result to output. How to use append model in window aggreate case ?
2、when I set the update output model, I send inputdata, the result is output every batch .But I want output the result only once when window is end. How can I do?
Thanks in advance!
zrc@zjdex.com