Posted to dev@drill.apache.org by Magnus Pierre <mp...@maprtech.com> on 2015/06/08 11:52:47 UTC

Drill query optimization

Unsorted questions regarding Drill Optimization

I am a newbie with little exposure to SQL optimization, so please bear with my limited understanding.

------------------------
I am trying to push down all relational expressions related to a storage plugin from Drill to JDBC.


I need to both optimize the plan in such a way that the filters, projections, joins and so on are pushed down into the source, and then rewrite the plan in Drill to remove the steps that will now take place in the DB.


My current understanding is that by the time I get the opportunity to optimize the plan in my extension of StoragePluginOptimizerRule, the Drill planner and optimizer have already produced the complete physical plan, in steps for each storage plugin, and have probably already done some optimization to push down operations.

Question 1:

What is exposed to a particular storage plugin? Is it only the part of the plan that relates to that storage plugin?

Question 2:

I’ve been studying the Calcite JDBC adapter, which seems to do everything I need, including generating the statement to run in the target database. (I am not yet sure about the actual optimization of the plan to remove steps that are planned to execute in Drill, but that is a later problem.)

My problem is really that there seem to be quite a few different layers of optimizers and plans in this area of the code.

Or more exactly, I don’t know what is what, or how they relate to each other across Drill and Calcite.

From what I understand, I need to convert from Drill relations to relations that Calcite understands. Some pointers on how Drill uses Calcite and the Volcano optimizer, and on which converters are available, would be very useful.

 

    // Operands matched by the rule: rel(0) is the filter, rel(1) is the scan.
    ScanPrel scan = (ScanPrel) call.rel(1);
    FilterPrel filter = (FilterPrel) call.rel(0);
    RexNode condition = filter.getCondition();
    RelOptCluster cluster = scan.getCluster();
    RelTraitSet trait = filter.getTraitSet();

    // Build a Calcite JdbcFilter on top of the Drill filter's input.
    JdbcFilter filter2 = new org.apache.calcite.adapter.jdbc.JdbcRules.JdbcFilter(cluster, trait, filter.getInput(), condition);

filter2 seems to be fine: I call isValid and it returns true.

But then calling:

    // Derive the SQL dialect from the target database's metadata.
    DatabaseMetaData meta = grpScan.getClient().getConnection().getMetaData();
    SqlDialect dialect = SqlDialect.create(meta);
    RelDataTypeFactory tf = cluster.getTypeFactory();
    JdbcImplementor impl = new JdbcImplementor(dialect, (JavaTypeFactory) tf);

    // This is the call that throws.
    Result impResult = filter2.implement(impl);

Throws an exception:

java.lang.ClassCastException: org.apache.calcite.plan.volcano.RelSubset cannot be cast to org.apache.calcite.adapter.jdbc.JdbcRel

I first tried passing the FilterPrel itself as the input to the filter2 constructor, but then I get:

java.lang.ClassCastException: org.apache.drill.exec.planner.physical.FilterPrel cannot be cast to org.apache.calcite.adapter.jdbc.JdbcRel

My guess is that I need to pass in a converted filter that is a true JdbcRel. Any ideas on how to do this?

Or perhaps I am totally lost. If so, please point this out. :)

Regards,

Magnus

Re: Drill query optimization

Posted by Aman Sinha <as...@maprtech.com>.
Magnus,
A couple of examples you might want to look at are the users/implementors
of GroupScan.canPushDownProjects() and
GroupScan.supportsPartitionFilterPushdown().

These are exposed as part of storage plugins such that optimizer rules can
determine the capabilities of the underlying plugin.
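
The capability check described above can be sketched in isolation. This is
only a toy model: ToyGroupScan, ToyColumnarScan, ToyOpaqueScan and
pushProject are hypothetical stand-ins invented for illustration, and only
the method name canPushDownProjects is borrowed from Drill's GroupScan. It
assumes a rule asks the scan whether it supports projection push-down and
rewrites the scan only when the answer is yes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, not a Drill class.
final class CapabilitySketch {

    /** Stand-in for the capability side of a group scan. */
    interface ToyGroupScan {
        boolean canPushDownProjects(List<String> columns);
        ToyGroupScan withColumns(List<String> columns);
        List<String> columns();
    }

    /** A scan whose source can read just a subset of columns. */
    static final class ToyColumnarScan implements ToyGroupScan {
        private final List<String> columns;

        ToyColumnarScan(List<String> columns) {
            this.columns = columns;
        }

        @Override public boolean canPushDownProjects(List<String> cols) {
            return true;
        }

        @Override public ToyGroupScan withColumns(List<String> cols) {
            return new ToyColumnarScan(cols);
        }

        @Override public List<String> columns() {
            return columns;
        }
    }

    /** A scan whose source always returns whole records. */
    static final class ToyOpaqueScan implements ToyGroupScan {
        @Override public boolean canPushDownProjects(List<String> cols) {
            return false;
        }

        @Override public ToyGroupScan withColumns(List<String> cols) {
            throw new UnsupportedOperationException("projection push-down not supported");
        }

        @Override public List<String> columns() {
            return Collections.singletonList("*");
        }
    }

    /** What a push-down rule would do: ask the scan first, rewrite only if it agrees. */
    static ToyGroupScan pushProject(ToyGroupScan scan, List<String> wanted) {
        return scan.canPushDownProjects(wanted) ? scan.withColumns(wanted) : scan;
    }

    public static void main(String[] args) {
        ToyGroupScan columnar = new ToyColumnarScan(Arrays.asList("id", "name", "price"));
        System.out.println(pushProject(columnar, Arrays.asList("id")).columns());
        System.out.println(pushProject(new ToyOpaqueScan(), Arrays.asList("id")).columns());
    }
}
```

The point of the design is that the rule itself stays generic; the plugin
decides what it can absorb by how it answers the capability question.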

Query optimization in Calcite+Drill is primarily driven by 3 things: Rules,
Costing, and Planners such as VolcanoPlanner that explore the plan search
space. (Note that there are other types of planners in Calcite, e.g.
HepPlanner, which evaluates heuristic rules such as decorrelation.)

Calcite generates a logical plan after parsing and validating the SQL
query. The Calcite logical plan is converted to a Drill logical plan
(using Drill logical rules under org.apache.drill.exec.planner.logical) and
then into a Drill physical plan (using Drill physical rules under
o.a.d.e.p.physical) that can be executed in Drill. Note that all 3 types
of rules (Calcite logical rules, Drill logical rules, and Drill physical
rules) are added to a single consolidated 'DrillRuleSet', which is supplied
to Calcite's 'Planner' via a FrameworkConfig instance (see
DrillSqlWorker). There is no pre-defined order in which these rules are
applied. The only way to control what gets applied when is through the
trigger criteria defined in the rule's constructor and the matches()
method.
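
To make the last point concrete, here is a minimal sketch of rules whose
firing depends only on their trigger pattern and matches(), not on their
position in the rule set. Everything in it is a hypothetical toy (Node,
Rule, PushFilterIntoScan, MergeFilters, optimize); it is not Calcite's
RelOptRule API, and the simple rewrite loop only inspects the plan root, so
it is far cruder than VolcanoPlanner. The "jdbc:" prefix is likewise an
assumption made up for the example.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes, not Calcite's RelOptRule or planner.
final class RuleTriggerSketch {

    /** A plan node with a kind, some detail text, and at most one input. */
    static final class Node {
        final String kind;
        final String detail;
        final Node input;

        Node(String kind, String detail, Node input) {
            this.kind = kind;
            this.detail = detail;
            this.input = input;
        }

        @Override public String toString() {
            String self = kind + "(" + detail + ")";
            return input == null ? self : self + " <- " + input;
        }
    }

    /** A rule declares its trigger pattern in the constructor and may refine it in matches(). */
    abstract static class Rule {
        final String parentKind;
        final String childKind;

        Rule(String parentKind, String childKind) {
            this.parentKind = parentKind;
            this.childKind = childKind;
        }

        boolean matches(Node parent) {
            return true;
        }

        abstract Node onMatch(Node parent);

        final boolean triggers(Node n) {
            return n.kind.equals(parentKind)
                && n.input != null
                && n.input.kind.equals(childKind)
                && matches(n);
        }
    }

    /** Folds a filter into the scan below it, but only for sources assumed to accept filters. */
    static final class PushFilterIntoScan extends Rule {
        PushFilterIntoScan() {
            super("Filter", "Scan");
        }

        @Override boolean matches(Node filter) {
            return filter.input.detail.startsWith("jdbc:");
        }

        @Override Node onMatch(Node filter) {
            Node scan = filter.input;
            return new Node("Scan", scan.detail + " WHERE " + filter.detail, scan.input);
        }
    }

    /** Combines two stacked filters into one. */
    static final class MergeFilters extends Rule {
        MergeFilters() {
            super("Filter", "Filter");
        }

        @Override Node onMatch(Node top) {
            return new Node("Filter", top.detail + " AND " + top.input.detail, top.input.input);
        }
    }

    /** Fires any rule that triggers on the plan root until no rule triggers. */
    static Node optimize(Node root, List<Rule> rules) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                if (rule.triggers(root)) {
                    root = rule.onMatch(root);
                    changed = true;
                }
            }
        }
        return root;
    }

    static Node samplePlan(String source) {
        return new Node("Filter", "a > 1", new Node("Filter", "b < 5", new Node("Scan", source, null)));
    }

    public static void main(String[] args) {
        List<Rule> oneOrder = Arrays.<Rule>asList(new PushFilterIntoScan(), new MergeFilters());
        List<Rule> otherOrder = Arrays.<Rule>asList(new MergeFilters(), new PushFilterIntoScan());
        System.out.println(optimize(samplePlan("jdbc:orders"), oneOrder));
        System.out.println(optimize(samplePlan("jdbc:orders"), otherOrder));
        System.out.println(optimize(samplePlan("parquet:orders"), oneOrder));
    }
}
```

Both rule orders produce the same plan for the jdbc source, and for the
other source the push-down rule never fires because its matches() check
rejects it.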

In addition, Drill supplies a 'cost factory' to Calcite's planner. An
example costing function is DrillJoinRelBase.computeSelfCost(). If you are
not creating a new type of logical or physical plan node, you won't need to
implement this function, but since you are doing filter push-down you may
need to examine whether the existing cost functions for the plan nodes are
doing what you want.
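
Why the cost function matters for push-down can be shown with a small toy
model. PlanNode, totalCost and cheapest are hypothetical names, and the
cost numbers are invented for illustration; this is not Drill's cost
factory or computeSelfCost(). The sketch assumes only that a cost-based
planner keeps the alternative with the lowest cumulative cost, so a pushed
filter is chosen only if the rewritten scan reports a lower cost.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes and made-up cost numbers, not Drill's cost model.
final class CostSketch {

    /** A plan node that reports its own cost; the total adds the cost of its input. */
    static final class PlanNode {
        final String name;
        final double selfCost;
        final PlanNode input;

        PlanNode(String name, double selfCost, PlanNode input) {
            this.name = name;
            this.selfCost = selfCost;
            this.input = input;
        }

        double totalCost() {
            return selfCost + (input == null ? 0.0 : input.totalCost());
        }
    }

    /** Returns the alternative with the lowest cumulative cost. */
    static PlanNode cheapest(List<PlanNode> alternatives) {
        PlanNode best = alternatives.get(0);
        for (PlanNode candidate : alternatives) {
            if (candidate.totalCost() < best.totalCost()) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Alternative 1: scan everything, then filter in the engine.
        PlanNode filterInEngine = new PlanNode("Filter", 1000.0, new PlanNode("Scan", 1000.0, null));

        // Alternative 2: the scan absorbs the filter and reports a reduced cost.
        PlanNode pushedCheap = new PlanNode("ScanWithFilter", 100.0, null);

        // Same rewrite, but with a cost function that does not reflect the saving.
        PlanNode pushedOverpriced = new PlanNode("ScanWithFilter", 5000.0, null);

        System.out.println(cheapest(Arrays.asList(filterInEngine, pushedCheap)).name);
        System.out.println(cheapest(Arrays.asList(filterInEngine, pushedOverpriced)).name);
    }
}
```

In the second comparison the pushed-down plan loses even though the
rewrite was produced, which is the situation to watch for when checking
the existing cost functions.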

Aman
