You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hawq.apache.org by tao tony <to...@outlook.com> on 2017/05/18 06:17:26 UTC
Re: why these 2 queries had different explain
it seems a bug for querying on an external table because I found
ucloud_pay_tenanid ulist was a external table which using gpfdist.
CREATE EXTERNAL TABLE ucloud_pay_tenanid (
customercode character varying(255),
customername character varying(255),
prefixflag character varying(255),
customertype character varying(255),
comments character varying(255)
) LOCATION (
'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
ENCODING 'UTF8';
then I create an internal table using:
create table ucloudtest as select * from ucloud_pay_tenanid;
run the explain,it lookks like the right query plan:
hdb=# explain select cp.tenantid,
hdb-# (select ulist.customertype from ucloudtest ulist where
ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
width=42)
-> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100 width=42)
SubPlan 1
-> Result (cost=2.18..2.19 rows=1 width=7)
Filter: ulist.customercode::text = $0::text
-> Materialize (cost=2.18..2.19 rows=1 width=7)
-> Broadcast Motion 1:1 (slice1; segments: 1)
(cost=0.00..2.17 rows=1 width=7)
-> Append-only Scan on ucloudtest ulist
(cost=0.00..2.17 rows=1 width=7)
Settings: default_hash_table_bucket_number=18; optimizer=off
Optimizer status: legacy query optimizer
(10 rows)
run the query,get the correct result:
hdb=# select cp.tenantid,
(select ulist.customertype from ucloudtest ulist where
ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
tenantid | ?column? | orderuuid
----------+----------+----------------------------------
sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
On 05/18/2017 11:11 AM, tao tony wrote:
> hi guys,
>
> The different explains as below make me confused these days,could you
> please help me to explain why ulist.customertype is null.
>
> explain :
>
> hdb=# explain select cp.tenantid,
> (select ulist.customertype from ucloud_pay_tenanid ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------
> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
> rows=100 width=42)
> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
> width=42)
> SubPlan 1
> -> External Scan on ucloud_pay_tenanid ulist
> (cost=0.00..13500.00 rows=56 width=516)
> Filter: customercode::text = $0::text
> Settings: default_hash_table_bucket_number=18; optimizer=off
> Optimizer status: legacy query optimizer
> (7 rows)
>
>
> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2 t;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------
> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
> width=6)
> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 width=6)
> SubPlan 1
> -> Result (cost=1.01..1.02 rows=1 width=4)
> Filter: $0::text = s.e::text
> -> Materialize (cost=1.01..1.02 rows=1 width=4)
> -> Broadcast Motion 1:1 (slice1; segments: 1)
> (cost=0.00..1.01 rows=1 width=4)
> -> Append-only Scan on test1 s
> (cost=0.00..1.01 rows=1 width=4)
> Settings: default_hash_table_bucket_number=18; optimizer=off
> Optimizer status: legacy query optimizer
> (10 rows)
>
> test data:
>
> query1:
>
> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> tenantid | ?column? | orderuuid
> ----------+----------+----------------------------------
> sxve7r6c | | 901cf777a3144907899226de86ab8cfe
> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
> pejg93wh | | da2d037765bb4262a82764715cb4453d
> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
>
> query 2:
>
> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> a | ?column?
> ---+----------
> 1 | 1
> 2 | 2
> 3 | 3
> 4 | 4
> 5 | 5
> 6 | 7
> 7 | 8
> 8 | 9
> 9 | 10
> (9 rows)
>
> another query equal to query1,customertype is not null:
>
> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> cp,ucloud_pay_tenanid ulist
> hdb-# where cp.tenantid = ulist.customercode;
> tenantid | customertype | orderuuid
> ----------+--------------+----------------------------------
> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
>
Re: why these 2 queries had different explain
Posted by 陶征霖 <zt...@apache.org>.
Hi tony,
Yes, confirmed it's a bug. Filed the bug in apache hawq jira
https://issues.apache.org/jira/browse/HAWQ-1470. Will fix it later.
Thanks,
Zhenglin
2017-05-22 10:31 GMT+08:00 tao tony <to...@outlook.com>:
> OK.this bug appears when using gpfdist external table in select list.I
> create some test data to reproduce this bug in hawq docker.
>
> 1.create a external table
>
> CREATE EXTERNAL TABLE testext (
> a int,
> b character varying(255)
> ) LOCATION (
> 'gpfdist://172.19.0.2:8087/test.csv'
> ) FORMAT 'text' (delimiter E',' null E'' escape E'OFF');
>
> test.csv file contains:
>
> cat gpdata/test.csv
> 1,abc
> 2,bce
> 3,ced
>
> 2.create a internal table:
>
> create table test1(c int);
>
> insert into test1 values(1);
>
> insert into test1 values(2);
>
> insert into test1 values(3);
>
> insert into test1 values(4);
>
> 3.run query,could not get testext.b values:
>
> select c,(select s.b from testext s where t.c=s.a) from test1 t;
> c | ?column?
> ---+----------
> 1 |
> 2 |
> 3 |
> 4 |
> (4 rows)
> 4.create an internal table from testext,and run the query again,ti
> returns the correct result
>
> create table test as select * from testext;
>
> select c,(select s.b from test s where t.c=s.a) from test1 t;
> c | ?column?
> ---+----------
> 1 | abc
> 2 | bce
> 3 | ced
> 4 |
> (4 rows)
>
> you could compare the 2 explains,in step 3 ,testext was not broadcasted.
>
>
>
> On 05/22/2017 09:25 AM, 陶征霖 wrote:
> > Hi tony,
> >
> > Could you please provide the simple reproduce steps so that we can easily
> > debug in our own env.?
> >
> > Thanks,
> > Zhenglin
> >
> > 2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
> >
> >> it seems a bug for querying on an external table because I found
> >> ucloud_pay_tenanid ulist was a external table which using gpfdist.
> >>
> >> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
> >> customercode character varying(255),
> >> customername character varying(255),
> >> prefixflag character varying(255),
> >> customertype character varying(255),
> >> comments character varying(255)
> >> ) LOCATION (
> >> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
> >> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
> >> ENCODING 'UTF8';
> >>
> >> then I create an internal table using:
> >>
> >> create table ucloudtest as select * from ucloud_pay_tenanid;
> >>
> >> run the explain,it lookks like the right query plan:
> >>
> >> hdb=# explain select cp.tenantid,
> >> hdb-# (select ulist.customertype from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >> QUERY PLAN
> >> ------------------------------------------------------------
> >> ---------------------------------------------
> >> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
> >> width=42)
> >> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
> >> width=42)
> >> SubPlan 1
> >> -> Result (cost=2.18..2.19 rows=1 width=7)
> >> Filter: ulist.customercode::text = $0::text
> >> -> Materialize (cost=2.18..2.19 rows=1 width=7)
> >> -> Broadcast Motion 1:1 (slice1; segments: 1)
> >> (cost=0.00..2.17 rows=1 width=7)
> >> -> Append-only Scan on ucloudtest ulist
> >> (cost=0.00..2.17 rows=1 width=7)
> >> Settings: default_hash_table_bucket_number=18; optimizer=off
> >> Optimizer status: legacy query optimizer
> >> (10 rows)
> >>
> >> run the query,get the correct result:
> >>
> >> hdb=# select cp.tenantid,
> >> (select ulist.customertype from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >> tenantid | ?column? | orderuuid
> >> ----------+----------+----------------------------------
> >> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
> >> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
> >> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
> >> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
> >> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> >> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
> >> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
> >> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
> >> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
> >>
> >>
> >> On 05/18/2017 11:11 AM, tao tony wrote:
> >>> hi guys,
> >>>
> >>> The different explains as below make me confused these days,could you
> >>> please help me to explain why ulist.customertype is null.
> >>>
> >>> explain :
> >>>
> >>> hdb=# explain select cp.tenantid,
> >>> (select ulist.customertype from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>> QUERY PLAN
> >>> ------------------------------------------------------------
> >> --------------------------------------
> >>> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
> >>> rows=100 width=42)
> >>> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
> >>> width=42)
> >>> SubPlan 1
> >>> -> External Scan on ucloud_pay_tenanid ulist
> >>> (cost=0.00..13500.00 rows=56 width=516)
> >>> Filter: customercode::text = $0::text
> >>> Settings: default_hash_table_bucket_number=18; optimizer=off
> >>> Optimizer status: legacy query optimizer
> >>> (7 rows)
> >>>
> >>>
> >>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
> >> t;
> >>> QUERY PLAN
> >>> ------------------------------------------------------------
> >> ---------------------------------------------
> >>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
> >>> width=6)
> >>> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9
> width=6)
> >>> SubPlan 1
> >>> -> Result (cost=1.01..1.02 rows=1 width=4)
> >>> Filter: $0::text = s.e::text
> >>> -> Materialize (cost=1.01..1.02 rows=1 width=4)
> >>> -> Broadcast Motion 1:1 (slice1; segments:
> 1)
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>> -> Append-only Scan on test1 s
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>> Settings: default_hash_table_bucket_number=18; optimizer=off
> >>> Optimizer status: legacy query optimizer
> >>> (10 rows)
> >>>
> >>> test data:
> >>>
> >>> query1:
> >>>
> >>> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>> tenantid | ?column? | orderuuid
> >>> ----------+----------+----------------------------------
> >>> sxve7r6c | | 901cf777a3144907899226de86ab8cfe
> >>> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> >>> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
> >>> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
> >>> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
> >>> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
> >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >>> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
> >>> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
> >>> pejg93wh | | da2d037765bb4262a82764715cb4453d
> >>> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
> >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >>>
> >>> query 2:
> >>>
> >>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> >>> a | ?column?
> >>> ---+----------
> >>> 1 | 1
> >>> 2 | 2
> >>> 3 | 3
> >>> 4 | 4
> >>> 5 | 5
> >>> 6 | 7
> >>> 7 | 8
> >>> 8 | 9
> >>> 9 | 10
> >>> (9 rows)
> >>>
> >>> another query equal to query1,customertype is not null:
> >>>
> >>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> >>> cp,ucloud_pay_tenanid ulist
> >>> hdb-# where cp.tenantid = ulist.customercode;
> >>> tenantid | customertype | orderuuid
> >>> ----------+--------------+----------------------------------
> >>> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
> >>> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
> >>> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
> >>> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
> >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> >>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >>> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
> >>> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
> >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >>>
> >>
>
>
Re: why these 2 queries had different explain
Posted by 陶征霖 <zt...@apache.org>.
Hi tony,
Yes, confirmed it's a bug. Filed the bug in apache hawq jira
https://issues.apache.org/jira/browse/HAWQ-1470. Will fix it later.
Thanks,
Zhenglin
2017-05-22 10:31 GMT+08:00 tao tony <to...@outlook.com>:
> OK.this bug appears when using gpfdist external table in select list.I
> create some test data to reproduce this bug in hawq docker.
>
> 1.create a external table
>
> CREATE EXTERNAL TABLE testext (
> a int,
> b character varying(255)
> ) LOCATION (
> 'gpfdist://172.19.0.2:8087/test.csv'
> ) FORMAT 'text' (delimiter E',' null E'' escape E'OFF');
>
> test.csv file contains:
>
> cat gpdata/test.csv
> 1,abc
> 2,bce
> 3,ced
>
> 2.create a internal table:
>
> create table test1(c int);
>
> insert into test1 values(1);
>
> insert into test1 values(2);
>
> insert into test1 values(3);
>
> insert into test1 values(4);
>
> 3.run query,could not get testext.b values:
>
> select c,(select s.b from testext s where t.c=s.a) from test1 t;
> c | ?column?
> ---+----------
> 1 |
> 2 |
> 3 |
> 4 |
> (4 rows)
> 4.create an internal table from testext,and run the query again,ti
> returns the correct result
>
> create table test as select * from testext;
>
> select c,(select s.b from test s where t.c=s.a) from test1 t;
> c | ?column?
> ---+----------
> 1 | abc
> 2 | bce
> 3 | ced
> 4 |
> (4 rows)
>
> you could compare the 2 explains,in step 3 ,testext was not broadcasted.
>
>
>
> On 05/22/2017 09:25 AM, 陶征霖 wrote:
> > Hi tony,
> >
> > Could you please provide the simple reproduce steps so that we can easily
> > debug in our own env.?
> >
> > Thanks,
> > Zhenglin
> >
> > 2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
> >
> >> it seems a bug for querying on an external table because I found
> >> ucloud_pay_tenanid ulist was a external table which using gpfdist.
> >>
> >> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
> >> customercode character varying(255),
> >> customername character varying(255),
> >> prefixflag character varying(255),
> >> customertype character varying(255),
> >> comments character varying(255)
> >> ) LOCATION (
> >> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
> >> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
> >> ENCODING 'UTF8';
> >>
> >> then I create an internal table using:
> >>
> >> create table ucloudtest as select * from ucloud_pay_tenanid;
> >>
> >> run the explain,it lookks like the right query plan:
> >>
> >> hdb=# explain select cp.tenantid,
> >> hdb-# (select ulist.customertype from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >> QUERY PLAN
> >> ------------------------------------------------------------
> >> ---------------------------------------------
> >> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
> >> width=42)
> >> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
> >> width=42)
> >> SubPlan 1
> >> -> Result (cost=2.18..2.19 rows=1 width=7)
> >> Filter: ulist.customercode::text = $0::text
> >> -> Materialize (cost=2.18..2.19 rows=1 width=7)
> >> -> Broadcast Motion 1:1 (slice1; segments: 1)
> >> (cost=0.00..2.17 rows=1 width=7)
> >> -> Append-only Scan on ucloudtest ulist
> >> (cost=0.00..2.17 rows=1 width=7)
> >> Settings: default_hash_table_bucket_number=18; optimizer=off
> >> Optimizer status: legacy query optimizer
> >> (10 rows)
> >>
> >> run the query,get the correct result:
> >>
> >> hdb=# select cp.tenantid,
> >> (select ulist.customertype from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >> tenantid | ?column? | orderuuid
> >> ----------+----------+----------------------------------
> >> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
> >> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
> >> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
> >> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
> >> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> >> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
> >> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
> >> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
> >> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
> >>
> >>
> >> On 05/18/2017 11:11 AM, tao tony wrote:
> >>> hi guys,
> >>>
> >>> The different explains as below make me confused these days,could you
> >>> please help me to explain why ulist.customertype is null.
> >>>
> >>> explain :
> >>>
> >>> hdb=# explain select cp.tenantid,
> >>> (select ulist.customertype from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>> QUERY PLAN
> >>> ------------------------------------------------------------
> >> --------------------------------------
> >>> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
> >>> rows=100 width=42)
> >>> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
> >>> width=42)
> >>> SubPlan 1
> >>> -> External Scan on ucloud_pay_tenanid ulist
> >>> (cost=0.00..13500.00 rows=56 width=516)
> >>> Filter: customercode::text = $0::text
> >>> Settings: default_hash_table_bucket_number=18; optimizer=off
> >>> Optimizer status: legacy query optimizer
> >>> (7 rows)
> >>>
> >>>
> >>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
> >> t;
> >>> QUERY PLAN
> >>> ------------------------------------------------------------
> >> ---------------------------------------------
> >>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
> >>> width=6)
> >>> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9
> width=6)
> >>> SubPlan 1
> >>> -> Result (cost=1.01..1.02 rows=1 width=4)
> >>> Filter: $0::text = s.e::text
> >>> -> Materialize (cost=1.01..1.02 rows=1 width=4)
> >>> -> Broadcast Motion 1:1 (slice1; segments:
> 1)
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>> -> Append-only Scan on test1 s
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>> Settings: default_hash_table_bucket_number=18; optimizer=off
> >>> Optimizer status: legacy query optimizer
> >>> (10 rows)
> >>>
> >>> test data:
> >>>
> >>> query1:
> >>>
> >>> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>> tenantid | ?column? | orderuuid
> >>> ----------+----------+----------------------------------
> >>> sxve7r6c | | 901cf777a3144907899226de86ab8cfe
> >>> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> >>> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
> >>> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
> >>> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
> >>> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
> >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >>> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
> >>> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
> >>> pejg93wh | | da2d037765bb4262a82764715cb4453d
> >>> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
> >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >>>
> >>> query 2:
> >>>
> >>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> >>> a | ?column?
> >>> ---+----------
> >>> 1 | 1
> >>> 2 | 2
> >>> 3 | 3
> >>> 4 | 4
> >>> 5 | 5
> >>> 6 | 7
> >>> 7 | 8
> >>> 8 | 9
> >>> 9 | 10
> >>> (9 rows)
> >>>
> >>> another query equal to query1,customertype is not null:
> >>>
> >>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> >>> cp,ucloud_pay_tenanid ulist
> >>> hdb-# where cp.tenantid = ulist.customercode;
> >>> tenantid | customertype | orderuuid
> >>> ----------+--------------+----------------------------------
> >>> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
> >>> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
> >>> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
> >>> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
> >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> >>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >>> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
> >>> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
> >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >>>
> >>
>
>
Re: why these 2 queries had different explain
Posted by tao tony <to...@outlook.com>.
OK.this bug appears when using gpfdist external table in select list.I
create some test data to reproduce this bug in hawq docker.
1.create a external table
CREATE EXTERNAL TABLE testext (
a int,
b character varying(255)
) LOCATION (
'gpfdist://172.19.0.2:8087/test.csv'
) FORMAT 'text' (delimiter E',' null E'' escape E'OFF');
test.csv file contains:
cat gpdata/test.csv
1,abc
2,bce
3,ced
2.create a internal table:
create table test1(c int);
insert into test1 values(1);
insert into test1 values(2);
insert into test1 values(3);
insert into test1 values(4);
3.run query,could not get testext.b values:
select c,(select s.b from testext s where t.c=s.a) from test1 t;
c | ?column?
---+----------
1 |
2 |
3 |
4 |
(4 rows)
4.create an internal table from testext,and run the query again,ti
returns the correct result
create table test as select * from testext;
select c,(select s.b from test s where t.c=s.a) from test1 t;
c | ?column?
---+----------
1 | abc
2 | bce
3 | ced
4 |
(4 rows)
you could compare the 2 explains,in step 3 ,testext was not broadcasted.
On 05/22/2017 09:25 AM, 陶征霖 wrote:
> Hi tony,
>
> Could you please provide the simple reproduce steps so that we can easily
> debug in our own env.?
>
> Thanks,
> Zhenglin
>
> 2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
>
>> it seems a bug for querying on an external table because I found
>> ucloud_pay_tenanid ulist was a external table which using gpfdist.
>>
>> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
>> customercode character varying(255),
>> customername character varying(255),
>> prefixflag character varying(255),
>> customertype character varying(255),
>> comments character varying(255)
>> ) LOCATION (
>> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
>> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
>> ENCODING 'UTF8';
>>
>> then I create an internal table using:
>>
>> create table ucloudtest as select * from ucloud_pay_tenanid;
>>
>> run the explain,it lookks like the right query plan:
>>
>> hdb=# explain select cp.tenantid,
>> hdb-# (select ulist.customertype from ucloudtest ulist where
>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>> QUERY PLAN
>> ------------------------------------------------------------
>> ---------------------------------------------
>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
>> width=42)
>> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
>> width=42)
>> SubPlan 1
>> -> Result (cost=2.18..2.19 rows=1 width=7)
>> Filter: ulist.customercode::text = $0::text
>> -> Materialize (cost=2.18..2.19 rows=1 width=7)
>> -> Broadcast Motion 1:1 (slice1; segments: 1)
>> (cost=0.00..2.17 rows=1 width=7)
>> -> Append-only Scan on ucloudtest ulist
>> (cost=0.00..2.17 rows=1 width=7)
>> Settings: default_hash_table_bucket_number=18; optimizer=off
>> Optimizer status: legacy query optimizer
>> (10 rows)
>>
>> run the query,get the correct result:
>>
>> hdb=# select cp.tenantid,
>> (select ulist.customertype from ucloudtest ulist where
>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>> tenantid | ?column? | orderuuid
>> ----------+----------+----------------------------------
>> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
>> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
>> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
>> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
>> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
>> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
>> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
>> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
>>
>>
>> On 05/18/2017 11:11 AM, tao tony wrote:
>>> hi guys,
>>>
>>> The different explains as below make me confused these days,could you
>>> please help me to explain why ulist.customertype is null.
>>>
>>> explain :
>>>
>>> hdb=# explain select cp.tenantid,
>>> (select ulist.customertype from ucloud_pay_tenanid ulist where
>>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>>> QUERY PLAN
>>> ------------------------------------------------------------
>> --------------------------------------
>>> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
>>> rows=100 width=42)
>>> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
>>> width=42)
>>> SubPlan 1
>>> -> External Scan on ucloud_pay_tenanid ulist
>>> (cost=0.00..13500.00 rows=56 width=516)
>>> Filter: customercode::text = $0::text
>>> Settings: default_hash_table_bucket_number=18; optimizer=off
>>> Optimizer status: legacy query optimizer
>>> (7 rows)
>>>
>>>
>>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
>> t;
>>> QUERY PLAN
>>> ------------------------------------------------------------
>> ---------------------------------------------
>>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
>>> width=6)
>>> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 width=6)
>>> SubPlan 1
>>> -> Result (cost=1.01..1.02 rows=1 width=4)
>>> Filter: $0::text = s.e::text
>>> -> Materialize (cost=1.01..1.02 rows=1 width=4)
>>> -> Broadcast Motion 1:1 (slice1; segments: 1)
>>> (cost=0.00..1.01 rows=1 width=4)
>>> -> Append-only Scan on test1 s
>>> (cost=0.00..1.01 rows=1 width=4)
>>> Settings: default_hash_table_bucket_number=18; optimizer=off
>>> Optimizer status: legacy query optimizer
>>> (10 rows)
>>>
>>> test data:
>>>
>>> query1:
>>>
>>> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
>>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>>> tenantid | ?column? | orderuuid
>>> ----------+----------+----------------------------------
>>> sxve7r6c | | 901cf777a3144907899226de86ab8cfe
>>> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
>>> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
>>> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
>>> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
>>> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
>>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
>>> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
>>> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
>>> pejg93wh | | da2d037765bb4262a82764715cb4453d
>>> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
>>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
>>>
>>> query 2:
>>>
>>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
>>> a | ?column?
>>> ---+----------
>>> 1 | 1
>>> 2 | 2
>>> 3 | 3
>>> 4 | 4
>>> 5 | 5
>>> 6 | 7
>>> 7 | 8
>>> 8 | 9
>>> 9 | 10
>>> (9 rows)
>>>
>>> another query equal to query1,customertype is not null:
>>>
>>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
>>> cp,ucloud_pay_tenanid ulist
>>> hdb-# where cp.tenantid = ulist.customercode;
>>> tenantid | customertype | orderuuid
>>> ----------+--------------+----------------------------------
>>> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
>>> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
>>> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
>>> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
>>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
>>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
>>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
>>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
>>> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
>>> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
>>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
>>>
>>
Re: why these 2 queries had different explain
Posted by tao tony <to...@outlook.com>.
OK.this bug appears when using gpfdist external table in select list.I
create some test data to reproduce this bug in hawq docker.
1.create a external table
CREATE EXTERNAL TABLE testext (
a int,
b character varying(255)
) LOCATION (
'gpfdist://172.19.0.2:8087/test.csv'
) FORMAT 'text' (delimiter E',' null E'' escape E'OFF');
test.csv file contains:
cat gpdata/test.csv
1,abc
2,bce
3,ced
2.create a internal table:
create table test1(c int);
insert into test1 values(1);
insert into test1 values(2);
insert into test1 values(3);
insert into test1 values(4);
3.run query,could not get testext.b values:
select c,(select s.b from testext s where t.c=s.a) from test1 t;
c | ?column?
---+----------
1 |
2 |
3 |
4 |
(4 rows)
4.create an internal table from testext,and run the query again,ti
returns the correct result
create table test as select * from testext;
select c,(select s.b from test s where t.c=s.a) from test1 t;
c | ?column?
---+----------
1 | abc
2 | bce
3 | ced
4 |
(4 rows)
you could compare the 2 explains,in step 3 ,testext was not broadcasted.
On 05/22/2017 09:25 AM, 陶征霖 wrote:
> Hi tony,
>
> Could you please provide the simple reproduce steps so that we can easily
> debug in our own env.?
>
> Thanks,
> Zhenglin
>
> 2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
>
>> it seems a bug for querying on an external table because I found
>> ucloud_pay_tenanid ulist was a external table which using gpfdist.
>>
>> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
>> customercode character varying(255),
>> customername character varying(255),
>> prefixflag character varying(255),
>> customertype character varying(255),
>> comments character varying(255)
>> ) LOCATION (
>> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
>> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
>> ENCODING 'UTF8';
>>
>> then I create an internal table using:
>>
>> create table ucloudtest as select * from ucloud_pay_tenanid;
>>
>> run the explain,it lookks like the right query plan:
>>
>> hdb=# explain select cp.tenantid,
>> hdb-# (select ulist.customertype from ucloudtest ulist where
>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>> QUERY PLAN
>> ------------------------------------------------------------
>> ---------------------------------------------
>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
>> width=42)
>> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
>> width=42)
>> SubPlan 1
>> -> Result (cost=2.18..2.19 rows=1 width=7)
>> Filter: ulist.customercode::text = $0::text
>> -> Materialize (cost=2.18..2.19 rows=1 width=7)
>> -> Broadcast Motion 1:1 (slice1; segments: 1)
>> (cost=0.00..2.17 rows=1 width=7)
>> -> Append-only Scan on ucloudtest ulist
>> (cost=0.00..2.17 rows=1 width=7)
>> Settings: default_hash_table_bucket_number=18; optimizer=off
>> Optimizer status: legacy query optimizer
>> (10 rows)
>>
>> run the query,get the correct result:
>>
>> hdb=# select cp.tenantid,
>> (select ulist.customertype from ucloudtest ulist where
>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>> tenantid | ?column? | orderuuid
>> ----------+----------+----------------------------------
>> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
>> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
>> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
>> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
>> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
>> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
>> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
>> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
>>
>>
>> On 05/18/2017 11:11 AM, tao tony wrote:
>>> hi guys,
>>>
>>> The different explains as below make me confused these days,could you
>>> please help me to explain why ulist.customertype is null.
>>>
>>> explain :
>>>
>>> hdb=# explain select cp.tenantid,
>>> (select ulist.customertype from ucloud_pay_tenanid ulist where
>>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>>> QUERY PLAN
>>> ------------------------------------------------------------
>> --------------------------------------
>>> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
>>> rows=100 width=42)
>>> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
>>> width=42)
>>> SubPlan 1
>>> -> External Scan on ucloud_pay_tenanid ulist
>>> (cost=0.00..13500.00 rows=56 width=516)
>>> Filter: customercode::text = $0::text
>>> Settings: default_hash_table_bucket_number=18; optimizer=off
>>> Optimizer status: legacy query optimizer
>>> (7 rows)
>>>
>>>
>>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
>> t;
>>> QUERY PLAN
>>> ------------------------------------------------------------
>> ---------------------------------------------
>>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
>>> width=6)
>>> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 width=6)
>>> SubPlan 1
>>> -> Result (cost=1.01..1.02 rows=1 width=4)
>>> Filter: $0::text = s.e::text
>>> -> Materialize (cost=1.01..1.02 rows=1 width=4)
>>> -> Broadcast Motion 1:1 (slice1; segments: 1)
>>> (cost=0.00..1.01 rows=1 width=4)
>>> -> Append-only Scan on test1 s
>>> (cost=0.00..1.01 rows=1 width=4)
>>> Settings: default_hash_table_bucket_number=18; optimizer=off
>>> Optimizer status: legacy query optimizer
>>> (10 rows)
>>>
>>> test data:
>>>
>>> query1:
>>>
>>> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
>>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
>>> tenantid | ?column? | orderuuid
>>> ----------+----------+----------------------------------
>>> sxve7r6c | | 901cf777a3144907899226de86ab8cfe
>>> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
>>> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
>>> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
>>> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
>>> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
>>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
>>> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
>>> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
>>> pejg93wh | | da2d037765bb4262a82764715cb4453d
>>> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
>>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
>>>
>>> query 2:
>>>
>>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
>>> a | ?column?
>>> ---+----------
>>> 1 | 1
>>> 2 | 2
>>> 3 | 3
>>> 4 | 4
>>> 5 | 5
>>> 6 | 7
>>> 7 | 8
>>> 8 | 9
>>> 9 | 10
>>> (9 rows)
>>>
>>> another query equal to query1,customertype is not null:
>>>
>>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
>>> cp,ucloud_pay_tenanid ulist
>>> hdb-# where cp.tenantid = ulist.customercode;
>>> tenantid | customertype | orderuuid
>>> ----------+--------------+----------------------------------
>>> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
>>> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
>>> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
>>> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
>>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
>>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
>>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
>>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
>>> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
>>> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
>>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
>>>
>>
Re: why these 2 queries had different explain
Posted by 陶征霖 <zt...@apache.org>.
Hi tony,
Could you please provide the simple reproduce steps so that we can easily
debug in our own env.?
Thanks,
Zhenglin
2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
> it seems a bug for querying on an external table because I found
> ucloud_pay_tenanid ulist was a external table which using gpfdist.
>
> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
> customercode character varying(255),
> customername character varying(255),
> prefixflag character varying(255),
> customertype character varying(255),
> comments character varying(255)
> ) LOCATION (
> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
> ENCODING 'UTF8';
>
> then I create an internal table using:
>
> create table ucloudtest as select * from ucloud_pay_tenanid;
>
> run the explain,it lookks like the right query plan:
>
> hdb=# explain select cp.tenantid,
> hdb-# (select ulist.customertype from ucloudtest ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> QUERY PLAN
> ------------------------------------------------------------
> ---------------------------------------------
> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
> width=42)
> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
> width=42)
> SubPlan 1
> -> Result (cost=2.18..2.19 rows=1 width=7)
> Filter: ulist.customercode::text = $0::text
> -> Materialize (cost=2.18..2.19 rows=1 width=7)
> -> Broadcast Motion 1:1 (slice1; segments: 1)
> (cost=0.00..2.17 rows=1 width=7)
> -> Append-only Scan on ucloudtest ulist
> (cost=0.00..2.17 rows=1 width=7)
> Settings: default_hash_table_bucket_number=18; optimizer=off
> Optimizer status: legacy query optimizer
> (10 rows)
>
> run the query,get the correct result:
>
> hdb=# select cp.tenantid,
> (select ulist.customertype from ucloudtest ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> tenantid | ?column? | orderuuid
> ----------+----------+----------------------------------
> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
>
>
> On 05/18/2017 11:11 AM, tao tony wrote:
> > hi guys,
> >
> > The different explains as below make me confused these days,could you
> > please help me to explain why ulist.customertype is null.
> >
> > explain :
> >
> > hdb=# explain select cp.tenantid,
> > (select ulist.customertype from ucloud_pay_tenanid ulist where
> > ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> > QUERY PLAN
> > ------------------------------------------------------------
> --------------------------------------
> > Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
> > rows=100 width=42)
> > -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
> > width=42)
> > SubPlan 1
> > -> External Scan on ucloud_pay_tenanid ulist
> > (cost=0.00..13500.00 rows=56 width=516)
> > Filter: customercode::text = $0::text
> > Settings: default_hash_table_bucket_number=18; optimizer=off
> > Optimizer status: legacy query optimizer
> > (7 rows)
> >
> >
> > hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
> t;
> > QUERY PLAN
> > ------------------------------------------------------------
> ---------------------------------------------
> > Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
> > width=6)
> > -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 width=6)
> > SubPlan 1
> > -> Result (cost=1.01..1.02 rows=1 width=4)
> > Filter: $0::text = s.e::text
> > -> Materialize (cost=1.01..1.02 rows=1 width=4)
> > -> Broadcast Motion 1:1 (slice1; segments: 1)
> > (cost=0.00..1.01 rows=1 width=4)
> > -> Append-only Scan on test1 s
> > (cost=0.00..1.01 rows=1 width=4)
> > Settings: default_hash_table_bucket_number=18; optimizer=off
> > Optimizer status: legacy query optimizer
> > (10 rows)
> >
> > test data:
> >
> > query1:
> >
> > hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
> > ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> > tenantid | ?column? | orderuuid
> > ----------+----------+----------------------------------
> > sxve7r6c | | 901cf777a3144907899226de86ab8cfe
> > k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> > 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
> > 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
> > 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
> > 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
> > 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> > 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
> > 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
> > pejg93wh | | da2d037765bb4262a82764715cb4453d
> > sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
> > 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >
> > query 2:
> >
> > hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> > a | ?column?
> > ---+----------
> > 1 | 1
> > 2 | 2
> > 3 | 3
> > 4 | 4
> > 5 | 5
> > 6 | 7
> > 7 | 8
> > 8 | 9
> > 9 | 10
> > (9 rows)
> >
> > another query equal to query1,customertype is not null:
> >
> > hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> > cp,ucloud_pay_tenanid ulist
> > hdb-# where cp.tenantid = ulist.customercode;
> > tenantid | customertype | orderuuid
> > ----------+--------------+----------------------------------
> > 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
> > 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
> > 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
> > 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
> > 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> > 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> > 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> > u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> > u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
> > u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
> > u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >
>
>
Re: why these 2 queries had different explain
Posted by 陶征霖 <zt...@apache.org>.
Hi tony,
Could you please provide the simple reproduce steps so that we can easily
debug in our own env.?
Thanks,
Zhenglin
2017-05-18 14:17 GMT+08:00 tao tony <to...@outlook.com>:
> it seems a bug for querying on an external table because I found
> ucloud_pay_tenanid ulist was a external table which using gpfdist.
>
> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
> customercode character varying(255),
> customername character varying(255),
> prefixflag character varying(255),
> customertype character varying(255),
> comments character varying(255)
> ) LOCATION (
> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
> ENCODING 'UTF8';
>
> then I create an internal table using:
>
> create table ucloudtest as select * from ucloud_pay_tenanid;
>
> run the explain,it lookks like the right query plan:
>
> hdb=# explain select cp.tenantid,
> hdb-# (select ulist.customertype from ucloudtest ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> QUERY PLAN
> ------------------------------------------------------------
> ---------------------------------------------
> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100
> width=42)
> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100
> width=42)
> SubPlan 1
> -> Result (cost=2.18..2.19 rows=1 width=7)
> Filter: ulist.customercode::text = $0::text
> -> Materialize (cost=2.18..2.19 rows=1 width=7)
> -> Broadcast Motion 1:1 (slice1; segments: 1)
> (cost=0.00..2.17 rows=1 width=7)
> -> Append-only Scan on ucloudtest ulist
> (cost=0.00..2.17 rows=1 width=7)
> Settings: default_hash_table_bucket_number=18; optimizer=off
> Optimizer status: legacy query optimizer
> (10 rows)
>
> run the query,get the correct result:
>
> hdb=# select cp.tenantid,
> (select ulist.customertype from ucloudtest ulist where
> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> tenantid | ?column? | orderuuid
> ----------+----------+----------------------------------
> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49
> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368
> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751
> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51
> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7
> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4
> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846
> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5
>
>
> On 05/18/2017 11:11 AM, tao tony wrote:
> > hi guys,
> >
> > The different explains as below make me confused these days,could you
> > please help me to explain why ulist.customertype is null.
> >
> > explain :
> >
> > hdb=# explain select cp.tenantid,
> > (select ulist.customertype from ucloud_pay_tenanid ulist where
> > ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> > QUERY PLAN
> > ------------------------------------------------------------
> --------------------------------------
> > Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00
> > rows=100 width=42)
> > -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6
> > width=42)
> > SubPlan 1
> > -> External Scan on ucloud_pay_tenanid ulist
> > (cost=0.00..13500.00 rows=56 width=516)
> > Filter: customercode::text = $0::text
> > Settings: default_hash_table_bucket_number=18; optimizer=off
> > Optimizer status: legacy query optimizer
> > (7 rows)
> >
> >
> > hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
> t;
> > QUERY PLAN
> > ------------------------------------------------------------
> ---------------------------------------------
> > Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9
> > width=6)
> > -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 width=6)
> > SubPlan 1
> > -> Result (cost=1.01..1.02 rows=1 width=4)
> > Filter: $0::text = s.e::text
> > -> Materialize (cost=1.01..1.02 rows=1 width=4)
> > -> Broadcast Motion 1:1 (slice1; segments: 1)
> > (cost=0.00..1.01 rows=1 width=4)
> > -> Append-only Scan on test1 s
> > (cost=0.00..1.01 rows=1 width=4)
> > Settings: default_hash_table_bucket_number=18; optimizer=off
> > Optimizer status: legacy query optimizer
> > (10 rows)
> >
> > test data:
> >
> > query1:
> >
> > hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where
> > ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> > tenantid | ?column? | orderuuid
> > ----------+----------+----------------------------------
> > sxve7r6c | | 901cf777a3144907899226de86ab8cfe
> > k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> > 3e7rph46 | | f5e777a5189e409da607182d059ec0d5
> > 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94
> > 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca
> > 3e7rph46 | | 0a62832067ea4db68641cce35385aae7
> > 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> > 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56
> > 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48
> > pejg93wh | | da2d037765bb4262a82764715cb4453d
> > sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35
> > 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c
> >
> > query 2:
> >
> > hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> > a | ?column?
> > ---+----------
> > 1 | 1
> > 2 | 2
> > 3 | 3
> > 4 | 4
> > 5 | 5
> > 6 | 7
> > 7 | 8
> > 8 | 9
> > 9 | 10
> > (9 rows)
> >
> > another query equal to query1,customertype is not null:
> >
> > hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> > cp,ucloud_pay_tenanid ulist
> > hdb-# where cp.tenantid = ulist.customercode;
> > tenantid | customertype | orderuuid
> > ----------+--------------+----------------------------------
> > 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d
> > 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c
> > 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887
> > 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5
> > 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> > 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8
> > 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b
> > u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> > u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f
> > u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292
> > u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b
> >
>
>