Posted to dev@hawq.apache.org by 来熊 <yi...@163.com> on 2016/09/27 05:32:03 UTC

hawq-site

Hi all:
I am using PXF and HCatalog to query Hive. Tables t1 and t2 are in Hive, and t1 is a large table. HAWQ is running in YARN mode.


[hive@master ~]$ hive
hive> select count(1) from t1;
OK
680852926
Time taken: 0.721 seconds, Fetched: 1 row(s)
hive> exit;


When I query t1 in HAWQ, it is very slow:


[gpadmin@master ~]$ 
[gpadmin@master ~]$ psql -U gpadmin -d gpadmin 
gpadmin=# set pxf_service_address to 'master:51200';
SET
Time: 0.410 ms
gpadmin=# select count(*) from hcatalog.default.t2;
 count 
-------
  1000
(1 row)


Time: 910.853 ms


gpadmin=# explain select count(*) from hcatalog.default.t1;
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..431.00 rows=1 width=8)
   ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..431.00 rows=1 width=8)
         ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
               ->  External Scan on t1  (cost=0.00..431.00 rows=1 width=1)
 Optimizer status: PQO version 1.627
(5 rows)


Time: 1388.073 ms
gpadmin=# 
gpadmin=# select count(1) from hcatalog.default.t1;


I waited a long time and could not get a result.
Log messages:


2016-09-27 09:46:25.816366 CST,"gpadmin","gpadmin",p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5. Registered in HAWQ resource manager (By OID)",,,,,,"select count(*) from hcatalog.default.t1;",0,,"rmcomm_QD2RM.c",609,
2016-09-27 09:46:25.816508 CST,,,p760393,th-1935386496,,,,0,con4,,seg-10000,,,,,"LOG","00000","ConnID 5. Expect query resource (256 MB, 0.022727 CORE) x 1 ( MIN 1 ) resource after adjusting based on queue NVSEG limits.",,,,,,,0,,"resqueuemanager.c",1913,
2016-09-27 09:46:25.816603 CST,,,p760393,th-1935386496,,,,0,con4,,seg-10000,,,,,"LOG","00000","Latency of getting resource allocated is 138us",,,,,,,0,,"resqueuemanager.c",4375,
2016-09-27 09:46:25.816743 CST,"gpadmin","gpadmin",p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5. Acquired resource from resource manager, (256 MB, 0.022727 CORE) x 1.",,,,,,"select count(*) from hcatalog.default.t1;",0,,"rmcomm_QD2RM.c",868,
2016-09-27 09:46:25.816868 CST,"gpadmin","gpadmin",p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","data locality ratio: 0.000; virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (0.000 B/0 B/0 B); segment size with penalty(avg/min/max): (0.000 B/0 B/0 B); continuity(avg/min/max): (0.000/0.000/0.000).",,,,,,"select count(*) from hcatalog.default.t1;",0,,"cdbdatalocality.c",3396,




I don't know why HAWQ gets so few resources.
Are there any parameters I can set to make querying Hive through PXF and HCatalog as fast as querying Hive directly?




Re: hawq-site

Posted by Kyle Dunn <kd...@pivotal.io>.
You might also try tuning this parameter when using HAWQ with YARN:

hawq_rm_min_resource_perseg - sets the minimum number of containers YARN
will give out for a query (per segment host).

For example, with an 8 node configuration setting this to 4 (up from the
default of 2) results in 32 containers (vs 16 by default).
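A minimal sketch of how this could be applied, assuming the standard hawq
config utility is on your path (verify the value against your cluster size
and YARN queue capacity before using):

```shell
# Sketch: raise the minimum YARN containers per segment host from the
# default of 2 to 4, then restart so the resource manager reloads it.
# (hawq config writes the GUC into hawq-site.xml.)
hawq config -c hawq_rm_min_resource_perseg -v 4
hawq restart cluster -a
```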

YARN queues and their associated parameters will also be relevant here.

-Kyle


On Tue, Sep 27, 2016 at 1:16 PM Kavinder Dhaliwal <kd...@pivotal.io>
wrote:

> Hi,
>
> Could you please describe a bit more about the nature of your tables in
> Hive. What file formats are the underlying files? What is the total size of
> the files?
>
> Additionally, what Java heap size have you specified for PXF? And can you
> share the DDL for creating the table?
>
> For more debugging information I would suggest you tail both
> /var/log/pxf/catalina.out and /var/log/pxf/pxf-service.log and then
> run select count(1) from hcatalog.default.t1; It will be obvious from
> catalina.out if there is insufficient memory allocated to pxf.
>
> This will give more insight into where your problem is. Currently there is
> a known issue with PXF, large Hive files (mainly compressed formats), and
> an insufficient Java Heap size. After trying the above, perhaps you can try
> increasing your heap size via the "-Xmx<SIZE>" jvm property and running
> again.
>
> Thanks,
> Kavinder
>
> On Mon, Sep 26, 2016 at 10:32 PM, 来熊 <yi...@163.com> wrote:
>
> > [...]
>
-- 
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: kdunn@pivotal.io

Re: hawq-site

Posted by Kavinder Dhaliwal <kd...@pivotal.io>.
Hi,

Could you please describe a bit more about the nature of your tables in
Hive. What file formats are the underlying files? What is the total size of
the files?

Additionally, what Java heap size have you specified for PXF? And can you
share the DDL for creating the table?

For more debugging information, I would suggest you tail both
/var/log/pxf/catalina.out and /var/log/pxf/pxf-service.log, and then run
select count(1) from hcatalog.default.t1; again. It will be obvious from
catalina.out if insufficient memory is allocated to PXF.
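For example, something along these lines (log paths as given above; in
practice you would run the tail in a second terminal rather than in the
background):

```shell
# Watch the PXF service logs for errors while re-running the slow query.
# A java.lang.OutOfMemoryError in catalina.out is the usual symptom of
# an undersized PXF heap.
tail -f /var/log/pxf/catalina.out /var/log/pxf/pxf-service.log &
TAIL_PID=$!
psql -U gpadmin -d gpadmin -c "select count(1) from hcatalog.default.t1;"
kill $TAIL_PID
```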

This will give more insight into where your problem is. There is currently
a known issue with PXF involving large Hive files (mainly compressed
formats) and an insufficient Java heap size. After trying the above, you
can try increasing your heap size via the "-Xmx<SIZE>" JVM property and
running again.
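A hedged sketch of raising the heap; the file path, variable name, and
restart command below are assumptions that vary by PXF install, so check
where your distribution actually sets -Xmx (often a pxf-env.sh or Tomcat
setenv.sh) before using this:

```shell
# Hypothetical example: give the PXF JVM a 2 GB heap, then restart the
# service. Adjust the path and variable to match your install.
echo 'export PXF_JVM_OPTS="-Xms1g -Xmx2g"' >> /etc/pxf/conf/pxf-env.sh
service pxf-service restart
```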

Thanks,
Kavinder

On Mon, Sep 26, 2016 at 10:32 PM, 来熊 <yi...@163.com> wrote:

> [...]
