Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2016/11/17 09:28:30 UTC
Another Interesting Question on SPARK SQL
Which parts in the diagram above are executed by DataSource connectors and
which parts are executed by Tungsten? Or, to put it another way, in which
phase in the diagram above does Tungsten leverage the DataSource
connectors (such as, say, the Cassandra connector)?
My understanding so far is that connectors come in during the physical
planning phase, but I am not sure whether the connectors take a logical
plan as input.
Thanks,
kant
Re: Another Interesting Question on SPARK SQL
Posted by Herman van Hövell tot Westerflier <hv...@databricks.com>.
The diagram you have included is a depiction of the steps Catalyst (the
Spark optimizer) takes to create an executable plan. Tungsten mainly comes
into play during code generation and the actual execution.
A data source is represented by a LogicalRelation during analysis and
optimization. The Spark planner takes such a LogicalRelation and plans it
as either a RowDataSourceScanExec or a BatchedDataSourceScanExec, depending
on the data source. Both scan nodes support whole-stage code generation.
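To make the planning step above concrete, here is a toy model in plain Python. It is not Spark code: the classes only borrow Spark's node names, and plan_scan and the supports_batch flag are hypothetical names made up for this sketch. It assumes the choice between the two scan nodes comes down to whether the source can hand back columnar batches, and the capabilities given to the "cassandra" and "parquet" relations are assumptions for illustration only.

```python
from dataclasses import dataclass

# Toy stand-ins that borrow Spark's class names; none of this is Spark code.


@dataclass(frozen=True)
class LogicalRelation:
    """Logical-plan leaf that stands for a data source during analysis and optimization."""
    source_name: str
    supports_batch: bool  # assumption: True if the source can return columnar batches


@dataclass(frozen=True)
class RowDataSourceScanExec:
    """Physical scan node that produces one row at a time."""
    relation: LogicalRelation


@dataclass(frozen=True)
class BatchedDataSourceScanExec:
    """Physical scan node that produces batches of rows."""
    relation: LogicalRelation


def plan_scan(relation):
    """Hypothetical planner rule: map a logical relation to a physical scan node."""
    if relation.supports_batch:
        return BatchedDataSourceScanExec(relation)
    return RowDataSourceScanExec(relation)


# Assumed capabilities, chosen only to exercise both branches.
cassandra = LogicalRelation("cassandra", supports_batch=False)
parquet = LogicalRelation("parquet", supports_batch=True)

print(type(plan_scan(cassandra)).__name__)  # RowDataSourceScanExec
print(type(plan_scan(parquet)).__name__)    # BatchedDataSourceScanExec
```

The point of the sketch is only the direction of the hand-off: the relation exists in the logical plan first, and the planner turns it into a scan node afterwards.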
HTH