Posted to dev@hawq.apache.org by Leon Zhang <le...@gmail.com> on 2015/11/27 08:43:45 UTC

Performance issue about HAWQ 2.0 beta

Hi, HAWQ Developers:

   As my previous email hinted, I ran the TPC-DS tests on our development
environment. Compared with the previous version, 1.3.x, we see a performance
improvement on most of the queries.

   But the problem is a performance regression for *some* queries. For
example, the running time of query64 increased from 10754.688 ms
to 68884.731 ms. I am not sure whether any changes were made that could
increase the running time?

   To discuss this issue in detail, I would like to use query10. Its
running time increased from 1795.746 ms to 744919.251 ms. I have also
attached the SQL for this query and its query plan.

   Thanks
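
For scale, the two regressions quoted above can be expressed as slowdown
factors. This is only a minimal Python sketch built from the timings in this
message; the function and variable names are illustrative and not part of any
HAWQ tooling.

```python
# Timings in milliseconds quoted in the message above:
# (HAWQ 1.3.x run, HAWQ 2.0 beta run).
timings_ms = {
    "query64": (10754.688, 68884.731),
    "query10": (1795.746, 744919.251),
}


def slowdown(old_ms: float, new_ms: float) -> float:
    """Return how many times slower the new run is than the old one."""
    return new_ms / old_ms


factors = {name: slowdown(old, new) for name, (old, new) in timings_ms.items()}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x slower")
```

So query64 is roughly 6 times slower, while query10 is more than 400 times
slower, which is why query10 is the more useful case to examine.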

Re: Performance issue about HAWQ 2.0 beta

Posted by Jiali Yao <jy...@pivotal.io>.
Hi Leon,

I do not see the schema, but I do see that the plans for HAWQ 1.3 and
HAWQ 2.0 are different.
In general, the performance difference may come from several places:

1. Hash vs. random distribution. (In HAWQ 1.3 the default is hash, while in
HAWQ 2.0 it is random.) To check this, please look at the definitions of the
related tables.
2. Default optimizer. I do not see which planner you used. If you use the
open source version, it should be the planner, while the enterprise version
and HAWQ 1.3 use ORCA. You can tell by whether the GUC "optimizer" is on or
off.
3. Segment configuration. When comparing performance, we need comparable
segment configurations for HAWQ 2.0 and HAWQ 1.X. Your setup has 5 servers
with one segment per node in HAWQ 1.X. In addition, the number of virtual
segments (vsegs) per node depends on your hardware. On a typical physical
server with, say, 64 GB of memory and an 8-core CPU, we suggest 8. If you
use VMs, you can set a lower value.
4. Default segment number setting. I see in your previous email that you set
default_segment_num to 160. If your cluster has only 5 nodes, that value is
not right. Normally we suggest cluster size * 8.

So for your case, let us first establish the same configuration for 1.3 and
HAWQ 2.0 and then compare.
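
As a rough illustration of point 4, the "cluster size * 8" rule of thumb can
be checked against the value reported earlier. This is a minimal Python sketch;
the function name is hypothetical, and the only inputs are the numbers from
this thread (5 nodes, 8 per node, 160 configured).

```python
def suggested_segment_num(nodes: int, vsegs_per_node: int = 8) -> int:
    """Rule of thumb from the message above: cluster size * 8."""
    return nodes * vsegs_per_node


nodes = 5          # servers in the cluster under test
configured = 160   # default_segment_num reported in the earlier email

suggested = suggested_segment_num(nodes)
ratio = configured / suggested

print(f"suggested: {suggested}, configured: {configured} ({ratio:.0f}x the suggestion)")
```

With these inputs the rule gives 40, so the configured value of 160 is four
times what the rule would suggest for a 5-node cluster.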
Thanks

Jiali




On Mon, Nov 30, 2015 at 3:22 PM, Leon Zhang <le...@gmail.com> wrote:

> Hi, Martin Visser
>
>    Thanks for your quick reply.  I attached the "explain analyze" output in
> my last email of this thread.
>
>   Also, because hawq-2.0 introduces the "virtual segment" and we configured
> 8 virtual segments for each node, we see different segment numbers.

Re: Performance issue about HAWQ 2.0 beta

Posted by Leon Zhang <le...@gmail.com>.
Hi, Martin Visser

   Thanks for your quick reply.  I attached the "explain analyze" output in
my last email of this thread.

  Also, because hawq-2.0 introduces the "virtual segment" and we configured
8 virtual segments for each node, we see different segment numbers.

On Fri, Nov 27, 2015 at 4:58 PM, Martin Visser <mv...@pivotal.io> wrote:

> Hi Leon,
>
> Looking at the 2.0 plan, you're perhaps missing stats on some of the tables,
> for example:
> -> Parquet table Scan on catalog_sales  (cost=0.00..23885.35 rows=1 width=197)
> -> Parquet table Scan on web_sales  (cost=0.00..11982.30 rows=1 width=197)
>
> Can you check, or run explain analyze?  Also, the number of segments differs
> between the two plans: 5 segs in 1.3 and 40 segs in 2.0.

Re: Performance issue about HAWQ 2.0 beta

Posted by Martin Visser <mv...@pivotal.io>.
Hi Leon,

Looking at the 2.0 plan, you're perhaps missing stats on some of the tables,
for example:
-> Parquet table Scan on catalog_sales  (cost=0.00..23885.35 rows=1 width=197)
-> Parquet table Scan on web_sales  (cost=0.00..11982.30 rows=1 width=197)

Can you check, or run explain analyze?  Also, the number of segments differs
between the two plans: 5 segs in 1.3 and 40 segs in 2.0.
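
A row estimate of 1 on a large fact table is the kind of thing that can be
spotted mechanically. The following is a minimal Python sketch, not an
existing HAWQ tool: it scans plan text for scan nodes whose estimated row
count is at or below a threshold. The regular expression and the function
name are assumptions written against the two plan lines quoted above.

```python
import re

# The two plan lines quoted above, each joined onto a single line.
plan_lines = [
    "-> Parquet table Scan on catalog_sales  (cost=0.00..23885.35 rows=1 width=197)",
    "-> Parquet table Scan on web_sales  (cost=0.00..11982.30 rows=1 width=197)",
]

# Matches "Scan on <table>  (cost=a.b..c.d rows=N width=W)".
SCAN_RE = re.compile(
    r"Scan on (?P<table>\w+)\s+"
    r"\(cost=\d+\.\d+\.\.\d+\.\d+ rows=(?P<rows>\d+) width=\d+\)"
)


def suspicious_scans(lines, max_rows=1):
    """Return tables whose scan nodes estimate at most max_rows rows."""
    tables = []
    for line in lines:
        match = SCAN_RE.search(line)
        if match and int(match.group("rows")) <= max_rows:
            tables.append(match.group("table"))
    return tables


flagged = suspicious_scans(plan_lines)
print(flagged)
```

Tables flagged this way are candidates for running ANALYZE before the query
is timed again, so that the planner has real row counts to work with.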


Re: Performance issue about HAWQ 2.0 beta

Posted by Leon Zhang <le...@gmail.com>.
Hi, Jiali Yao,

   Thanks for your reply.

   Here is the detailed information:
   1. The segment configuration:
# select * from gp_segment_configuration ;
 registration_order | role | status | port  | hostname |  address
--------------------+------+--------+-------+----------+------------
                  0 | m    | u      | 25432 | dserver1 | dserver1
                  1 | p    | u      | 40404 | dserver5 | 10.10.0.15
                  2 | p    | u      | 40404 | dserver3 | 10.10.0.13
                  3 | p    | u      | 40404 | dserver1 | 10.10.0.11
                  4 | p    | u      | 40404 | dserver4 | 10.10.0.14
                  5 | p    | u      | 40404 | dserver2 | 10.10.0.12
(6 rows)

    2. The "explain analyze" output for the query: see the attachment.

    3. No, this query was tested *without YARN*.
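
To connect the table in item 1 with the segment counts discussed in this
thread, here is a minimal Python sketch that parses the psql output and counts
the segment rows. It assumes that role "m" marks the master, role "p" marks a
segment, and status "u" means up; those readings are assumptions about the
catalog and are not stated in this thread. The parsing helper is hypothetical.

```python
# Output of "select * from gp_segment_configuration ;" as pasted above.
output = """\
 registration_order | role | status | port  | hostname |  address
--------------------+------+--------+-------+----------+------------
                  0 | m    | u      | 25432 | dserver1 | dserver1
                  1 | p    | u      | 40404 | dserver5 | 10.10.0.15
                  2 | p    | u      | 40404 | dserver3 | 10.10.0.13
                  3 | p    | u      | 40404 | dserver1 | 10.10.0.11
                  4 | p    | u      | 40404 | dserver4 | 10.10.0.14
                  5 | p    | u      | 40404 | dserver2 | 10.10.0.12
"""


def parse_rows(text):
    """Parse psql's aligned table output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[2:]:  # skip the header and the dashed separator
        cells = [cell.strip() for cell in line.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


rows = parse_rows(output)

# Assumption: role "p" with status "u" is a segment that is up.
segments_up = sum(1 for r in rows if r["role"] == "p" and r["status"] == "u")
vsegs_per_node = 8  # the per-node virtual segment setting mentioned in this thread

print(f"segments up: {segments_up}, virtual segments: {segments_up * vsegs_per_node}")
```

Under those assumptions, 5 segments with 8 virtual segments each gives 40,
which matches the segment count seen in the 2.0 plan.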

Thanks


On Fri, Nov 27, 2015 at 4:59 PM, Jiali Yao <jy...@pivotal.io> wrote:

> Hi Leon
>
> Thanks for providing it. The result is not what we expected. In our
> performance tests, we found the performance to be comparable with 1.3.
> Could you please provide some more information:
> 1. Get the segment configuration information from 1.3 and 2.0:
> select * from gp_segment_configuration ;
> 2. Could you please run "explain analyze" to get more statistics?
> 3. I want to confirm with you: the results were run in YARN mode, right?
> Also, I saw that your previous email indicated there were some errors in
> YARN; were these queries also from that test round?
>
> Thanks
>
> Jiali

Re: Performance issue about HAWQ 2.0 beta

Posted by Jiali Yao <jy...@pivotal.io>.
Hi Leon

Thanks for providing it. The result is not what we expected. In our
performance tests, we found the performance to be comparable with 1.3.
Could you please provide some more information:
1. Get the segment configuration information from 1.3 and 2.0:
select * from gp_segment_configuration ;
2. Could you please run "explain analyze" to get more statistics?
3. I want to confirm with you: the results were run in YARN mode, right?
Also, I saw that your previous email indicated there were some errors in
YARN; were these queries also from that test round?

Thanks

Jiali
