Posted to dev@lens.apache.org by Yash Sharma <ya...@gmail.com> on 2015/06/25 09:23:41 UTC
[DISCUSS] Query cost computation
Hi All,
I just need a little clarification on the query cost.
How do we compute the query cost currently? Do we calculate the cost of the
overall query?
Is this cost used only to stop users from running very expensive queries?
One commonly used approach is to build a DAG/tree of all the operations in
the query (in our case the Hive AST) and give each node/operator its own
cost.
With this we can calculate the cumulative cost of the query as the
summation of the individual operator costs. This would give very granular
control over the query cost.
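To make the idea concrete, here is a minimal sketch in Python of summing
per-operator costs over a plan tree. The Operator class, the operator names
and the cost numbers are all made up for illustration; none of them come from
Lens, Hive or Calcite.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    """One node of a query plan tree (hypothetical, not a Lens or Hive class)."""
    name: str
    cost: float  # cost of this operator alone, in arbitrary units
    children: List["Operator"] = field(default_factory=list)


def cumulative_cost(op: Operator) -> float:
    """Add this operator's own cost to the cost of everything below it."""
    return op.cost + sum(cumulative_cost(child) for child in op.children)


# Illustrative plan: a join over a filtered fact scan and a dimension scan.
plan = Operator("JOIN", 50.0, [
    Operator("FILTER", 5.0, [Operator("SCAN fact", 100.0)]),
    Operator("SCAN dim", 10.0),
])

total = cumulative_cost(plan)  # 50 + (5 + 100) + 10
```

Note that this plain recursion treats the plan as a tree. In a real DAG a
shared sub-plan would be counted once per parent, so the walk would need to
remember which nodes it has already visited.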
This approach can also help us later with query optimization, where certain
operators can be removed or rearranged. Drill/Hive/Phoenix use a similar
approach via Calcite, though the implementation styles vary. Kylin is also
supposedly following a similar approach.
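As a rough illustration of how operator costs can drive rearrangement, the
Python sketch below costs two equivalent plans, one filtering after a join and
one filtering before it, and keeps the cheaper one. The cost model (one unit
per row touched, a hash-style join that emits one row per fact row) and the
row counts are assumptions picked for the example. This is not how Calcite or
Hive actually estimate cost.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    rows: int  # estimated number of output rows
    cost: int  # cumulative cost so far, counted as rows touched


def scan(table_rows: int) -> Estimate:
    # Reading a table touches every row once.
    return Estimate(table_rows, table_rows)


def filter_rows(inp: Estimate, keep_one_in: int) -> Estimate:
    # Reads every input row and keeps roughly 1 in keep_one_in of them.
    return Estimate(inp.rows // keep_one_in, inp.cost + inp.rows)


def join(fact: Estimate, dim: Estimate) -> Estimate:
    # Assumed hash-style join: touches every row on both sides and
    # emits one output row per fact row.
    return Estimate(fact.rows, fact.cost + dim.cost + fact.rows + dim.rows)


# Two plans that produce the same result with the operators rearranged.
filter_after_join = filter_rows(join(scan(1_000_000), scan(1_000)), 100)
filter_before_join = join(filter_rows(scan(1_000_000), 100), scan(1_000))

cheaper = min(filter_after_join, filter_before_join, key=lambda e: e.cost)
```

Under these assumptions the plan that filters first comes out cheaper, because
the join then sees far fewer fact rows.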
Should we explore this possibility?
P.S. I am asking this question without assuming any technical
infeasibility in, or coupling to, the current design. Just an open thought.