Posted to dev@spark.apache.org by Archit Thakur <ar...@gmail.com> on 2015/09/29 21:06:26 UTC

Dynamic DAG use-case for spark streaming.

Hi,

 We are using Spark Streaming as our processing engine, and as part of the
output we want to push data to a UI. There would be multiple users
accessing the system, each with their own filters. Based on the filters
and other inputs we want to either run a SQL query on the DStream or apply
custom processing logic. This requires the system to read the
filters/query and generate the execution graph at runtime. I can't see any
support in Spark Streaming for generating the execution graph on the fly.
I think I could broadcast the query to the executors and read the broadcasted
query at runtime, but that would also limit me to one user at a time.

Do we not expect Spark Streaming to take queries/filters from the outside
world? Does output in Spark Streaming only mean writing to an external
store which could then be queried?

Thanks,
Archit Thakur.

Re: Dynamic DAG use-case for spark streaming.

Posted by Tathagata Das <td...@databricks.com>.
A very basic form of support in DStream is DStream.transform(), which
takes an arbitrary RDD => RDD function. That function can choose to do
different computation over time. That may be of help to you.
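As a rough sketch of the idea (not a definitive implementation): the function
passed to transform() runs on the driver for every batch, so it can read
driver-side state that the UI layer updates and build a different RDD lineage
each interval. The names here (currentFilter, applyDynamicFilter) are
hypothetical, and a live StreamingContext with an input DStream is assumed.

```scala
import java.util.concurrent.atomic.AtomicReference
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical driver-side holder for the latest user filter,
// updated from outside the streaming job (e.g. by the UI layer).
val currentFilter = new AtomicReference[String => Boolean](_ => true)

def applyDynamicFilter(lines: DStream[String]): DStream[String] =
  // transform is evaluated on the driver once per batch interval,
  // so each batch can pick up whatever filter is current at that moment.
  lines.transform { (rdd: RDD[String]) =>
    val f = currentFilter.get() // snapshot the filter for this batch
    rdd.filter(f)               // f is serialized and shipped with the job
  }
```

Because the closure is captured fresh per batch, this avoids baking one query
into a single broadcast; supporting several concurrent users would still need
something like a map of user -> filter consulted inside the transform.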

On Tue, Sep 29, 2015 at 12:06 PM, Archit Thakur <ar...@gmail.com>
wrote:

> Hi,
>
>  We are using Spark Streaming as our processing engine, and as part of the
> output we want to push data to a UI. There would be multiple users
> accessing the system, each with their own filters. Based on the filters
> and other inputs we want to either run a SQL query on the DStream or apply
> custom processing logic. This requires the system to read the
> filters/query and generate the execution graph at runtime. I can't see any
> support in Spark Streaming for generating the execution graph on the fly.
> I think I could broadcast the query to the executors and read the broadcasted
> query at runtime, but that would also limit me to one user at a time.
>
> Do we not expect Spark Streaming to take queries/filters from the outside
> world? Does output in Spark Streaming only mean writing to an external
> store which could then be queried?
>
> Thanks,
> Archit Thakur.
>
