Posted to dev@hawq.apache.org by Shujie Zhang <sh...@pivotal.io> on 2018/02/09 08:35:40 UTC

a vectorized execution design document

Hi,

A vectorized execution design document has been uploaded to issue HAWQ-1450:
https://issues.apache.org/jira/browse/HAWQ-1450

The document contains many ideas about how to implement a vectorized
executor. We welcome any comments on the content and suggestions for
improvement. Thanks.

Zhang Shujie
2018-02-09

Re: a vectorized execution design document

Posted by Shujie Zhang <sh...@pivotal.io>.
Hi,


Over the past few days, while researching how to implement the vectorized
executor for HAWQ, I considered a solution like the one you suggest. We
could implement a single vectorized data type like this:

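typedef struct vector
{
    int len;         /* number of values in the vector */
    Oid baseType;    /* OID of the element type */
    Datum values[];  /* the values themselves */
} vector;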

I found two problems with it:

1. When we implement an operator for this type, only one function can be
registered per operator, so that function needs a big switch ... case ...
over the operand types. Compared to implementing a separate vtype for each
base type, this is not flexible.
For example, this is the function for the + operator:

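/* Pseudocode: argument passing is schematic. */
Datum vtype_vtype_pl(struct vector *v1, struct vector *v2)
{
    switch (v1->baseType)
    {
        case INT2OID:
            switch (v2->baseType)
            {
                case INT2OID:
                    return vint2vint2pl(v1, v2);
                case INT4OID:
                    return vint2vint4pl(v1, v2);
                /* …… */
            }
            break;
        /* …… */
    }
    /* …… */
}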

2. The other problem arises when we check whether an expression can be
vectorized. We have to check that every Var and function in it has a
vectorized version, and with a generic type that is difficult.
For example, if the operator function is int2int2pl and we want to know
whether it has a vectorized version, we can find that a function
vtype_vtype_pl exists, but we cannot tell whether the vint2vint2pl case is
implemented inside vtype_vtype_pl. We could maintain a map for this, but
that does not seem like a good solution.
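
Both problems can be illustrated with a small standalone sketch. It is not
HAWQ code: BaseType, Vec, vec_new, vec_add_as and has_vectorized_version
are hypothetical names, int64_t stands in for Datum, and "int2int4pl" is
only a name formed by analogy with int2int2pl. The single + function has to
switch on both operand types, and the separate list used for the
"is there a vectorized version?" check has to be kept in sync with that
switch by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for base type OIDs; not real catalog values. */
typedef enum { BT_INT2, BT_INT4 } BaseType;

/* Generic vector type, mirroring the struct in the message. */
typedef struct Vec
{
    int      len;
    BaseType baseType;
    int64_t  values[];          /* stand-in for Datum */
} Vec;

Vec *vec_new(BaseType type, int len)
{
    Vec *v;

    assert(len >= 0);
    v = malloc(sizeof(Vec) + (size_t) len * sizeof(int64_t));
    if (v != NULL)
    {
        v->len = len;
        v->baseType = type;
    }
    return v;
}

/* Shared loop for the kernels; overflow checks are omitted for brevity. */
Vec *vec_add_as(const Vec *a, const Vec *b, BaseType result)
{
    Vec *r;

    if (a->len != b->len)
        return NULL;
    r = vec_new(result, a->len);
    if (r == NULL)
        return NULL;
    for (int i = 0; i < a->len; i++)
        r->values[i] = a->values[i] + b->values[i];
    return r;
}

Vec *vint2vint2pl(const Vec *a, const Vec *b) { return vec_add_as(a, b, BT_INT2); }
Vec *vint2vint4pl(const Vec *a, const Vec *b) { return vec_add_as(a, b, BT_INT4); }

/* Problem 1: the single "+" function must switch on both operand types. */
Vec *vtype_vtype_pl(const Vec *a, const Vec *b)
{
    switch (a->baseType)
    {
        case BT_INT2:
            switch (b->baseType)
            {
                case BT_INT2: return vint2vint2pl(a, b);
                case BT_INT4: return vint2vint4pl(a, b);
            }
            break;
        case BT_INT4:
            break;              /* int4 kernels are not part of this sketch */
    }
    return NULL;                /* unsupported type pair */
}

/* Problem 2: a hand-maintained list of scalar functions that are covered. */
static const char *const vectorized_scalar_funcs[] = { "int2int2pl", "int2int4pl" };

int has_vectorized_version(const char *scalar_func)
{
    size_t n = sizeof(vectorized_scalar_funcs) / sizeof(vectorized_scalar_funcs[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(vectorized_scalar_funcs[i], scalar_func) == 0)
            return 1;
    return 0;
}
```

Adding an int4 kernel here means touching both the switch and the list,
which is the maintenance burden described above.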

I also do not think that creating a vtype for each type is ideal. It adds
complexity for users, and it may create a HUGE amount of metadata in the
system tables and degrade performance, so this should remain an area of
continual improvement.

Thank you, Kuien.

Zhang Shujie






On Thu, Mar 15, 2018 at 3:10 PM, 刘奎恩(局外) <ku...@alibaba-inc.com> wrote:

> My two cents on VType: could we introduce a general vector type, for
> example TuplesView, as the base data unit/structure for vectorized
> operators (holding a set of tuples, e.g., 1024 tuples, with the size set
> by a GUC value)? It is similar to a Set of Record but used by executors.
> Then we may not need to create a VType for each Type.
>
>
> -------------——
> Kuien Liu/奎恩
>
> ------------------------------------------------------------------
> From: 刘奎恩(局外) <ku...@alibaba-inc.com>
> Sent: Thursday, March 1, 2018 15:19
> To: dev <de...@hawq.incubator.apache.org>; Shujie Zhang <sh...@pivotal.io>
> Subject: Re: a vectorized execution design document
>
> Thanks to Shujie for the helpful reply. Yes, it is transparent to the upper
> logic, which follows the Volcano model to evaluate cost and generate the
> plan. When the vectorization work is (mostly) finished, we may look into a
> vectorization-aware QO that considers batch-at-a-time or
> operator-at-a-time execution rather than tuple-at-a-time.
>
>
> -------------——
> Kuien Liu/奎恩
>
> ------------------------------------------------------------------
> From: Shujie Zhang <sh...@pivotal.io>
> Sent: Tuesday, February 27, 2018 09:41
> To: dev <de...@hawq.incubator.apache.org>; 刘奎恩(局外) <kuien.lke@alibaba-inc.com>
> Subject: Re: a vectorized execution design document
>
> Hi,
>
> We check whether each plan node can be vectorized after the Plan has been
> generated. At that phase only the cheapest Plan has been selected, so we
> have no chance to change it.
>
> If we wanted to generate the vectorized Plan in the optimizer, we would
> have to generate the vectorized Path and compute its cost, then compare
> the costs of both Paths and choose the cheaper one. The trouble is that
> both the built-in optimizer and ORCA would have to be refactored, which is
> complex work :). Another problem is that the optimizer's solution space
> would become larger because of the new Path type, so the planning time
> would have to be kept under control.
>
> In this design, we change the Plan after it has been generated, which is
> transparent to the upper modules, so the optimizer can still be changed to
> fit the current vectorized Plan in the future.
>
> Thanks,
> Zhang Shujie
>
> On Mon, Feb 26, 2018 at 3:01 PM, 刘奎恩(局外) <ku...@alibaba-inc.com>
> wrote:
> Nice doc, clear design. It is a good start! I saw that an example on
> aggregation is illustrated in the doc; we may implement more operators
> with this design, for example SORT and JOIN.
> One question: we implement vectorization under the plan tree, so the
> optimizer cannot see the change and still estimates the overall cost as
> 'total_cost = startup_cost + cpu_per_tuple * tuples + seq_page_cost *
> pages'. In my opinion, the second part (CPU costs) changes a lot, so it
> should be a staged design. Is there any further plan on it?
> -------------——
> Kuien Liu/奎恩
> ------------------------------------------------------------------
> From: Shujie Zhang <sh...@pivotal.io>
> Sent: Friday, February 9, 2018 16:35
> To: dev <dev@hawq.incubator.apache.org>
> Subject: a vectorized execution design document
> Hi,
>
> A vectorized execution design document has been uploaded to issue
> HAWQ-1450:
> https://issues.apache.org/jira/browse/HAWQ-1450
>
> The document contains many ideas about how to implement a vectorized
> executor. We welcome any comments on the content and suggestions for
> improvement. Thanks.
>
> Zhang Shujie
> 2018-02-09
>
>
>

Re: a vectorized execution design document

Posted by "刘奎恩(局外)" <ku...@alibaba-inc.com>.
My two cents on VType: could we introduce a general vector type, for example TuplesView, as the base data unit/structure for vectorized operators (holding a set of tuples, e.g., 1024 tuples, with the size set by a GUC value)? It is similar to a Set of Record but used by executors. Then we may not need to create a VType for each Type.
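
A minimal sketch of what such a batch structure might look like, assuming a
column-wise layout. Everything here is hypothetical: the field names, the
helper functions, and the plain global tuples_view_batch_size that stands
in for the GUC; int64_t stands in for Datum.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the GUC; 1024 is the example size from the message. */
int tuples_view_batch_size = 1024;

/* A batch of tuples stored column by column. */
typedef struct TuplesView
{
    int       ncols;
    int       capacity;     /* maximum number of tuples in this batch */
    int       ntuples;      /* tuples currently stored */
    int64_t **columns;      /* columns[c][r]; int64_t stands in for Datum */
} TuplesView;

void tuples_view_free(TuplesView *tv)
{
    if (tv == NULL)
        return;
    if (tv->columns != NULL)
    {
        for (int c = 0; c < tv->ncols; c++)
            free(tv->columns[c]);
        free(tv->columns);
    }
    free(tv);
}

/* Create an empty batch whose capacity is taken from the GUC stand-in. */
TuplesView *tuples_view_create(int ncols)
{
    TuplesView *tv;

    assert(ncols > 0 && tuples_view_batch_size > 0);
    tv = malloc(sizeof(TuplesView));
    if (tv == NULL)
        return NULL;
    tv->ncols = ncols;
    tv->capacity = tuples_view_batch_size;
    tv->ntuples = 0;
    tv->columns = malloc((size_t) ncols * sizeof(int64_t *));
    if (tv->columns == NULL)
    {
        free(tv);
        return NULL;
    }
    for (int c = 0; c < ncols; c++)
        tv->columns[c] = NULL;
    for (int c = 0; c < ncols; c++)
    {
        tv->columns[c] = malloc((size_t) tv->capacity * sizeof(int64_t));
        if (tv->columns[c] == NULL)
        {
            tuples_view_free(tv);
            return NULL;
        }
    }
    return tv;
}

/* Append one row; returns 1 on success, 0 if the batch is already full. */
int tuples_view_append(TuplesView *tv, const int64_t *row)
{
    if (tv->ntuples >= tv->capacity)
        return 0;
    for (int c = 0; c < tv->ncols; c++)
        tv->columns[c][tv->ntuples] = row[c];
    tv->ntuples++;
    return 1;
}

/* A vectorized operator works on a whole column of the batch in one loop. */
int64_t tuples_view_sum(const TuplesView *tv, int col)
{
    int64_t sum = 0;

    assert(col >= 0 && col < tv->ncols);
    for (int r = 0; r < tv->ntuples; r++)
        sum += tv->columns[col][r];
    return sum;
}
```

With one structure of this kind shared by all operators, the element type
would be tracked per column rather than through a separate VType per Type.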

-------------——
Kuien Liu/奎恩
Re: a vectorized execution design document

Posted by "刘奎恩(局外)" <ku...@alibaba-inc.com>.
Thanks to Shujie for the helpful reply. Yes, it is transparent to the upper logic, which follows the Volcano model to evaluate cost and generate the plan. When the vectorization work is (mostly) finished, we may look into a vectorization-aware QO that considers batch-at-a-time or operator-at-a-time execution rather than tuple-at-a-time.

-------------——
Kuien Liu/奎恩
Re: a vectorized execution design document

Posted by Shujie Zhang <sh...@pivotal.io>.
Hi,

We check whether each plan node can be vectorized after the Plan has been
generated. At that phase only the cheapest Plan has been selected, so we
have no chance to change it.

If we wanted to generate the vectorized Plan in the optimizer, we would
have to generate the vectorized Path and compute its cost, then compare
the costs of both Paths and choose the cheaper one. The trouble is that
both the built-in optimizer and ORCA would have to be refactored, which is
complex work :). Another problem is that the optimizer's solution space
would become larger because of the new Path type, so the planning time
would have to be kept under control.

In this design, we change the Plan after it has been generated, which is
transparent to the upper modules, so the optimizer can still be changed to
fit the current vectorized Plan in the future.
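
The post-planning rewrite described here could look roughly like the sketch
below. It is an assumption-laden illustration, not the design document's
algorithm: PlanNode, NodeKind, node_kind_supported and vectorize_plan are
hypothetical names, and the rule that a node is vectorized only when all of
its children are vectorized is my own simplification.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_SEQSCAN, NODE_AGG, NODE_SORT } NodeKind;

/* Simplified plan node; the real Plan struct has many more fields. */
typedef struct PlanNode
{
    NodeKind         kind;
    int              exprs_vectorizable; /* 1 if every Var/function has a vectorized version */
    int              vectorized;         /* set by the rewrite pass */
    struct PlanNode *left;
    struct PlanNode *right;
} PlanNode;

/* Assumption for this sketch: only scans and aggregates have vectorized versions. */
int node_kind_supported(NodeKind kind)
{
    return kind == NODE_SEQSCAN || kind == NODE_AGG;
}

/*
 * Walk the finished plan bottom-up and mark nodes that can run vectorized.
 * The shape of the plan is never changed, so the optimizer is unaffected.
 * Returns the number of nodes that were marked.
 */
int vectorize_plan(PlanNode *node)
{
    int count;
    int children_ok;

    if (node == NULL)
        return 0;
    assert(node->left != node && node->right != node);  /* a tree has no self-loops */

    count = vectorize_plan(node->left) + vectorize_plan(node->right);
    children_ok = (node->left == NULL || node->left->vectorized) &&
                  (node->right == NULL || node->right->vectorized);
    node->vectorized = node_kind_supported(node->kind) &&
                       node->exprs_vectorizable && children_ok;
    return count + node->vectorized;
}
```

For a Sort over an Agg over a SeqScan, this pass marks the scan and the
aggregate and leaves the sort as a row-based node.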

Thanks,
Zhang Shujie


Re: a vectorized execution design document

Posted by "刘奎恩(局外)" <ku...@alibaba-inc.com>.
Nice doc, clear design. It is a good start! I saw that an example on aggregation is illustrated in the doc; we may implement more operators with this design, for example SORT and JOIN.
One question: we implement vectorization under the plan tree, so the optimizer cannot see the change and still estimates the overall cost as
'total_cost = startup_cost + cpu_per_tuple * tuples + seq_page_cost * pages'.
In my opinion, the second part (CPU costs) changes a lot, so it should be a staged design. Is there any further plan on it?
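
To make the concern concrete, here is a small sketch of the formula above
next to a variant in which the CPU term is scaled. The struct, the function
names and the cpu_factor parameter are hypothetical, and any factor passed
in is an arbitrary illustration, not a measurement of a vectorized executor.

```c
#include <assert.h>

/* Cost parameters named after the terms in the formula above. */
typedef struct CostParams
{
    double startup_cost;
    double cpu_per_tuple;
    double seq_page_cost;
} CostParams;

/* total_cost = startup_cost + cpu_per_tuple * tuples + seq_page_cost * pages */
double scan_total_cost(const CostParams *p, double tuples, double pages)
{
    assert(tuples >= 0.0 && pages >= 0.0);
    return p->startup_cost + p->cpu_per_tuple * tuples + p->seq_page_cost * pages;
}

/*
 * Hypothetical vectorization-aware variant: only the CPU term is scaled by
 * cpu_factor in (0, 1]; the startup and I/O terms stay the same.
 */
double scan_total_cost_vectorized(const CostParams *p, double tuples,
                                  double pages, double cpu_factor)
{
    assert(tuples >= 0.0 && pages >= 0.0);
    assert(cpu_factor > 0.0 && cpu_factor <= 1.0);
    return p->startup_cost + p->cpu_per_tuple * cpu_factor * tuples +
           p->seq_page_cost * pages;
}
```

An optimizer that keeps using the first function for a plan that actually
runs vectorized would overestimate the CPU part, which is why the cost model
may need its own stage of work.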
-------------——
Kuien Liu/奎恩