Posted to dev@lens.apache.org by "Himanshu Gahlaut (JIRA)" <ji...@apache.org> on 2015/06/24 20:54:05 UTC

[jira] [Comment Edited] (LENS-630) Using Duration and Fact Weight Based Query Cost Calculator for Hive Driver

    [ https://issues.apache.org/jira/browse/LENS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599808#comment-14599808 ] 

Himanshu Gahlaut edited comment on LENS-630 at 6/24/15 6:53 PM:
----------------------------------------------------------------

Thinking out loud:

I think a stable interface for QueryCost is missing. It would help if Lens code that depends on query cost, such as the code that calculates cumulative query cost, were programmed against a QueryCost interface.

An interface like the one below would help keep that code loosely coupled:

{code}

public interface QueryCost {

    float getQueryCost();

}

{code}
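
As a rough illustration of the loose coupling described above, code that accumulates cost could depend only on the interface. This is a sketch under assumptions: `CumulativeQueryCost` and its `of` method are hypothetical names, not existing Lens classes, and plain summation is an assumed aggregation rule.

```java
import java.util.List;

// Proposed interface from the comment above.
interface QueryCost {
    float getQueryCost();
}

// Hypothetical helper, not an existing Lens class.
final class CumulativeQueryCost {
    private CumulativeQueryCost() { }

    // Assumption: the cumulative cost is the plain sum of the individual costs.
    static float of(List<? extends QueryCost> costs) {
        float total = 0f;
        for (QueryCost cost : costs) {
            total += cost.getQueryCost();
        }
        return total;
    }
}
```

Because the helper only calls getQueryCost(), it would keep working regardless of which implementation produced each cost.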

Different implementations of this interface could be:

(1) EstimatedExecTimeBasedQueryCost: a machine-learning-based cost calculator might return EstimatedExecTimeBasedQueryCost, which would hold estimatedExecTimeMillis. Its getQueryCost would convert estimatedExecTimeMillis into a float cost and return it.

{code}
public class EstimatedExecTimeBasedQueryCost implements QueryCost {
    private long estimatedExecTimeMillis;

    public float getQueryCost() {
        // Logic to convert execution time to float query cost
    }
}
{code}

(2) QueryWeightBasedQueryCost: DurationAndFactWeightBasedQueryCostCalculator might return QueryWeightBasedQueryCost, which would hold queryWeight. Its getQueryCost would convert queryWeight into a float cost and return it.

{code}
public class QueryWeightBasedQueryCost implements QueryCost {
    private float queryWeight;

    public float getQueryCost() {
        // Logic to convert queryWeight to float query cost
    }
}
{code}



(3) QueryExecTimeAndWeightBasedQueryCost: This could be a combination of (1) and (2). Its getQueryCost would use both the estimated execution time and the query weight to compute a float cost.

{code}
public class QueryExecTimeAndWeightBasedQueryCost implements QueryCost {

    private EstimatedExecTimeBasedQueryCost execTimeBasedQueryCost;
    private QueryWeightBasedQueryCost queryWeightBasedQueryCost;

    public float getQueryCost() {
        // Logic to compute float query cost
    }
}
{code}
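
One way the combination in (3) might work is a weighted blend of the two component costs. The sketch below is only an assumption: this comment does not say how the two values are combined, and the `timeShare` parameter is hypothetical. To stay self-contained, it composes two QueryCost instances rather than the concrete classes from (1) and (2).

```java
// Proposed interface from the comment above.
interface QueryCost {
    float getQueryCost();
}

// Sketch of option (3). The blending rule is an assumption for illustration.
final class QueryExecTimeAndWeightBasedQueryCost implements QueryCost {
    private final QueryCost execTimeBasedQueryCost;
    private final QueryCost queryWeightBasedQueryCost;
    // Hypothetical: fraction of the result taken from the time-based cost, in [0, 1].
    private final float timeShare;

    QueryExecTimeAndWeightBasedQueryCost(QueryCost execTimeBasedQueryCost,
                                         QueryCost queryWeightBasedQueryCost,
                                         float timeShare) {
        // Written this way so that NaN is rejected as well.
        if (!(timeShare >= 0f && timeShare <= 1f)) {
            throw new IllegalArgumentException("timeShare must be in [0, 1]");
        }
        this.execTimeBasedQueryCost = execTimeBasedQueryCost;
        this.queryWeightBasedQueryCost = queryWeightBasedQueryCost;
        this.timeShare = timeShare;
    }

    @Override
    public float getQueryCost() {
        // Linear blend of the two component costs.
        return timeShare * execTimeBasedQueryCost.getQueryCost()
            + (1f - timeShare) * queryWeightBasedQueryCost.getQueryCost();
    }
}
```

A linear blend is just the simplest choice. Any rule that returns a single float would satisfy the interface equally well.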

Lens query cost calculation might return QueryWeightBasedQueryCost today and start returning EstimatedExecTimeBasedQueryCost tomorrow. With a QueryCost interface, that change would not require changes to the rest of the code that depends on query cost.
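
To make that concrete, here is a minimal sketch in which the caller stays the same while the implementation behind QueryCost changes. The conversion rules (one cost unit per second, weight used as is) and the `CostGate` class are assumptions for illustration, not the actual Lens logic.

```java
// Proposed interface from the comment above.
interface QueryCost {
    float getQueryCost();
}

final class EstimatedExecTimeBasedQueryCost implements QueryCost {
    private final long estimatedExecTimeMillis;

    EstimatedExecTimeBasedQueryCost(long estimatedExecTimeMillis) {
        this.estimatedExecTimeMillis = estimatedExecTimeMillis;
    }

    @Override
    public float getQueryCost() {
        // Assumption: one cost unit per second of estimated execution time.
        return estimatedExecTimeMillis / 1000f;
    }
}

final class QueryWeightBasedQueryCost implements QueryCost {
    private final float queryWeight;

    QueryWeightBasedQueryCost(float queryWeight) {
        this.queryWeight = queryWeight;
    }

    @Override
    public float getQueryCost() {
        // Assumption: the weight is used directly as the cost.
        return queryWeight;
    }
}

// Hypothetical caller that depends only on the interface.
final class CostGate {
    private CostGate() { }

    static boolean exceeds(QueryCost cost, float threshold) {
        return cost.getQueryCost() > threshold;
    }
}
```

Here CostGate.exceeds accepts either implementation, so switching the calculator's return type would leave this caller untouched.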


was (Author: himanshu.gahlaut):
Thinking out loud:

I think a stable interface for QueryCost is missing. There might be an interface QueryCost as below:

{code}

public interface QueryCost {

    float getQueryCost();

}

{code}

Different implementations of this interface could be:

(1) EstimatedExecTimeBasedQueryCost: a machine-learning-based cost calculator might return EstimatedExecTimeBasedQueryCost, which would hold estimatedExecTimeMillis. Its getQueryCost would convert estimatedExecTimeMillis into a float cost and return it.

{code}
public class EstimatedExecTimeBasedQueryCost implements QueryCost {
    private long estimatedExecTimeMillis;

    public float getQueryCost() {
       // Logic to convert execution time to float query cost
    }
}
{code}

(2) QueryWeightBasedQueryCost: DurationAndFactWeightBasedQueryCostCalculator might return QueryWeightBasedQueryCost, which would hold queryWeight. Its getQueryCost would convert queryWeight into a float cost and return it.

{code}
public class QueryWeightBasedQueryCost implements QueryCost {
    private float queryWeight;

    public float getQueryCost() {
        // Logic to convert queryWeight to float query cost
    }
}
{code}



(3) QueryExecTimeAndWeightBasedQueryCost: This could be a combination of (1) and (2). Its getQueryCost would use both the estimated execution time and the query weight to compute a float cost.

{code}
public class QueryExecTimeAndWeightBasedQueryCost implements QueryCost {

    private EstimatedExecTimeBasedQueryCost execTimeBasedQueryCost;
    private QueryWeightBasedQueryCost queryWeightBasedQueryCost;

    public float getQueryCost() {
        // Logic to compute float query cost
    }
}
{code}


> Using Duration and Fact Weight Based Query Cost Calculator for Hive Driver
> --------------------------------------------------------------------------
>
>                 Key: LENS-630
>                 URL: https://issues.apache.org/jira/browse/LENS-630
>             Project: Apache Lens
>          Issue Type: Improvement
>            Reporter: Himanshu Gahlaut
>            Assignee: Rajat Khandelwal
>
> Along with this, we can add a new field in QueryCost to return the query cost calculated by the implementation. normalizedQueryCost could be one name for that field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)