Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2016/11/17 09:28:30 UTC

Another Interesting Question on SPARK SQL

Which parts in the diagram above are executed by DataSource connectors and
which parts are executed by Tungsten? Or, to put it another way, in which
phase in the diagram above does Tungsten leverage the DataSource
connectors (such as the Cassandra connector)?

My understanding so far is that connectors come in during the physical
planning phase, but I am not sure whether the connectors take a logical plan
as input.

Thanks,
kant

Re: Another Interesting Question on SPARK SQL

Posted by Herman van Hövell tot Westerflier <hv...@databricks.com>.
The diagram you have included is a depiction of the steps Catalyst (the
Spark optimizer) takes to create an executable plan. Tungsten mainly comes
into play during code generation and the actual execution.

A datasource is represented by a LogicalRelation during analysis and
optimization. The Spark planner takes such a LogicalRelation and plans it
as either a RowDataSourceScanExec or a BatchedDataSourceScanExec, depending
on the datasource. Both scan nodes support whole-stage code generation.
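
To make the planning step concrete, here is a toy sketch in plain Python.
It is not Spark code: the classes only borrow the names used above, and the
supports_batch flag and the source names are hypothetical stand-ins for
whatever criterion the real planner applies to a datasource.

```python
from dataclasses import dataclass

# Toy model only. These classes mirror the names mentioned above but are
# NOT Spark's classes; `supports_batch` is a hypothetical flag.


@dataclass(frozen=True)
class LogicalRelation:
    source: str           # hypothetical datasource name
    supports_batch: bool  # assumed: can the source return column batches?


@dataclass(frozen=True)
class RowDataSourceScanExec:
    relation: LogicalRelation


@dataclass(frozen=True)
class BatchedDataSourceScanExec:
    relation: LogicalRelation


def plan_scan(relation):
    """Physical planning: map a logical relation to one of two scan nodes."""
    if relation.supports_batch:
        return BatchedDataSourceScanExec(relation)
    return RowDataSourceScanExec(relation)


row_scan = plan_scan(LogicalRelation("row_source", supports_batch=False))
batch_scan = plan_scan(LogicalRelation("columnar_source", supports_batch=True))
print(type(row_scan).__name__)
print(type(batch_scan).__name__)
```

The point of the sketch is only that the datasource stays a logical node
until the planner turns it into a physical scan. On a real DataFrame, calling
explain with the extended flag set (df.explain(true) in Scala) prints the
parsed, analyzed, optimized and physical plans, so you can see where the scan
node shows up for your own connector.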

HTH

