Posted to user@spark.apache.org by Khaled Zaouk <kh...@gmail.com> on 2018/05/02 08:51:16 UTC

[Spark Streaming]: Does DStream workload run over Spark SQL engine?

Hi,

I have a question regarding the execution engine of Spark Streaming
(DStream API): Do Spark Streaming jobs run on the Spark SQL engine?

For example, if I change a configuration parameter related to Spark SQL
(like spark.sql.streaming.minBatchesToRetain or
spark.sql.objectHashAggregate.sortBased.fallbackThreshold), does this
make any difference when I run a Spark Streaming job (using the DStream API)?

Thank you!

Khaled

Re: [Spark Streaming]: Does DStream workload run over Spark SQL engine?

Posted by Saisai Shao <sa...@gmail.com>.
No, DStream is built on RDDs, so it will not leverage any Spark SQL
related features. I think you should use Structured Streaming instead, which
is based on Spark SQL.

Khaled Zaouk <kh...@gmail.com> wrote on Wed, May 2, 2018 at 4:51 PM:

> Hi,
>
> I have a question regarding the execution engine of Spark Streaming
> (DStream API): Do Spark Streaming jobs run on the Spark SQL engine?
>
> For example, if I change a configuration parameter related to Spark SQL
> (like spark.sql.streaming.minBatchesToRetain or
> spark.sql.objectHashAggregate.sortBased.fallbackThreshold), does this
> make any difference when I run a Spark Streaming job (using the DStream API)?
>
> Thank you!
>
> Khaled
>