You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Abhishek Ravi (JIRA)" <ji...@apache.org> on 2018/11/13 23:56:00 UTC
[jira] [Updated] (DRILL-6849) Runtime filter queries with nested
broadcast returns wrong results
[ https://issues.apache.org/jira/browse/DRILL-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhishek Ravi updated DRILL-6849:
---------------------------------
Description:
Running few queries on TPC-H SF100 data with latest changes in [PR #1504|[https://github.com/apache/drill/pull/1504].]
I noticed that for a couple of queries, there are vast discrepancies in the results returned by queries with *Runtime Filter enabled* and without.
Below are a couple of failing queries.
h3. Query 1
{code:sql}
select
p.p_mfgr,
p.p_type,
count(*) as num_parts
from
supplier s,
part p,
partsupp ps,
nation n
where
n.n_nationkey = 15
and n.n_nationkey = s.s_nationkey
and s.s_suppkey = ps.ps_suppkey
and ps.ps_partkey = p.p_partkey
and p.p_type like '%STEEL%'
group by
p.p_mfgr,
p.p_type
order by
p.p_mfgr,
p.p_type;
{code}
h4. Expected Result (without Runtime filter)
{noformat}
+-----------------+---------------------------+------------+
| p_mfgr | p_type | num_parts |
+-----------------+---------------------------+------------+
| Manufacturer#1 | ECONOMY ANODIZED STEEL | 4448 |
| Manufacturer#1 | ECONOMY BRUSHED STEEL | 4308 |
| Manufacturer#1 | ECONOMY BURNISHED STEEL | 4268 |
| Manufacturer#1 | ECONOMY PLATED STEEL | 4266 |
| Manufacturer#1 | ECONOMY POLISHED STEEL | 4302 |
| Manufacturer#1 | LARGE ANODIZED STEEL | 4389 |
| Manufacturer#1 | LARGE BRUSHED STEEL | 4254 |
| Manufacturer#1 | LARGE BURNISHED STEEL | 4276 |
| Manufacturer#1 | LARGE PLATED STEEL | 4351 |
| Manufacturer#1 | LARGE POLISHED STEEL | 4281 |
| Manufacturer#1 | MEDIUM ANODIZED STEEL | 4230 |
| Manufacturer#1 | MEDIUM BRUSHED STEEL | 4247 |
| Manufacturer#1 | MEDIUM BURNISHED STEEL | 4286 |
| Manufacturer#1 | MEDIUM PLATED STEEL | 4269 |
| Manufacturer#1 | MEDIUM POLISHED STEEL | 4274 |
| Manufacturer#1 | PROMO ANODIZED STEEL | 4283 |
| Manufacturer#1 | PROMO BRUSHED STEEL | 4221 |
| Manufacturer#1 | PROMO BURNISHED STEEL | 4315 |
| Manufacturer#1 | PROMO PLATED STEEL | 4361 |
| Manufacturer#1 | PROMO POLISHED STEEL | 4213 |
| Manufacturer#1 | SMALL ANODIZED STEEL | 4354 |
| Manufacturer#1 | SMALL BRUSHED STEEL | 4159 |
| Manufacturer#1 | SMALL BURNISHED STEEL | 4222 |
...
...
150 rows
{noformat}
h4. Actual Result
{noformat}
+-----------------+---------------------------+------------+
| p_mfgr | p_type | num_parts |
+-----------------+---------------------------+------------+
| Manufacturer#1 | ECONOMY ANODIZED STEEL | 63 |
| Manufacturer#1 | ECONOMY BRUSHED STEEL | 64 |
| Manufacturer#1 | ECONOMY BURNISHED STEEL | 58 |
| Manufacturer#1 | ECONOMY PLATED STEEL | 73 |
| Manufacturer#1 | ECONOMY POLISHED STEEL | 59 |
| Manufacturer#1 | LARGE ANODIZED STEEL | 60 |
| Manufacturer#1 | LARGE BRUSHED STEEL | 62 |
| Manufacturer#1 | LARGE BURNISHED STEEL | 47 |
| Manufacturer#1 | LARGE PLATED STEEL | 51 |
| Manufacturer#1 | LARGE POLISHED STEEL | 60 |
| Manufacturer#1 | MEDIUM ANODIZED STEEL | 54 |
| Manufacturer#1 | MEDIUM BRUSHED STEEL | 60 |
| Manufacturer#1 | MEDIUM BURNISHED STEEL | 53 |
| Manufacturer#1 | MEDIUM PLATED STEEL | 73 |
| Manufacturer#1 | MEDIUM POLISHED STEEL | 65 |
| Manufacturer#1 | PROMO ANODIZED STEEL | 71 |
| Manufacturer#1 | PROMO BRUSHED STEEL | 67 |
| Manufacturer#1 | PROMO BURNISHED STEEL | 64 |
| Manufacturer#1 | PROMO PLATED STEEL | 47 |
| Manufacturer#1 | PROMO POLISHED STEEL | 51 |
| Manufacturer#1 | SMALL ANODIZED STEEL | 48 |
| Manufacturer#1 | SMALL BRUSHED STEEL | 71 |
| Manufacturer#1 | SMALL BURNISHED STEEL | 33 |
...
...
150 rows
{noformat}
h3. Query 2 (TPC-H 7 query)
{code:sql}
select
supp_nation,
cust_nation,
l_year,
sum(volume) as revenue
from
(
select
n1.n_name as supp_nation,
n2.n_name as cust_nation,
extract(year from l.l_shipdate) as l_year,
l.l_extendedprice * (1 - l.l_discount) as volume
from
supplier s,
lineitem l,
orders o,
customer c,
nation n1,
nation n2
where
s.s_suppkey = l.l_suppkey
and o.o_orderkey = l.l_orderkey
and c.c_custkey = o.o_custkey
and s.s_nationkey = n1.n_nationkey
and c.c_nationkey = n2.n_nationkey
and (
(n1.n_name = 'EGYPT' and n2.n_name = 'UNITED STATES')
or (n1.n_name = 'UNITED STATES' and n2.n_name = 'EGYPT')
)
and l.l_shipdate between date '1995-01-01' and date '1996-12-31'
) as shipping
group by
supp_nation,
cust_nation,
l_year
order by
supp_nation,
cust_nation,
l_year;
{code}
h4. Expected Result
{noformat}
EGYPT UNITED STATES 1996 5.2735809932832E9
UNITED STATES EGYPT 1996 5.320488357330402E9
EGYPT UNITED STATES 1995 5.282512709079098E9
UNITED STATES EGYPT 1995 5.3213732978949E9
{noformat}
h4. Actual Result
{noformat}
EGYPT UNITED STATES 1996 1.6282961704599997E7
UNITED STATES EGYPT 1996 1.39809230059E7
EGYPT UNITED STATES 1995 1.3606895522500003E7
UNITED STATES EGYPT 1995 1.5241044473299999E7
{noformat}
was:
Running few queries on TPC-H SF100 data with latest changes in [PR #1504|[https://github.com/apache/drill/pull/1504].]
I noticed that for a couple of queries, there are vast discrepancies in the results returned by queries with *Runtime Filter enabled* and without.
Below are a couple of failing queries.
h3. h3. Query 1
{code:sql}
select
p.p_mfgr,
p.p_type,
count(*) as num_parts
from
supplier s,
part p,
partsupp ps,
nation n
where
n.n_nationkey = 15
and n.n_nationkey = s.s_nationkey
and s.s_suppkey = ps.ps_suppkey
and ps.ps_partkey = p.p_partkey
and p.p_type like '%STEEL%'
group by
p.p_mfgr,
p.p_type
order by
p.p_mfgr,
p.p_type;
{code}
h4. h4. Expected Result (without Runtime filter)
{noformat}
+-----------------+---------------------------+------------+
| p_mfgr | p_type | num_parts |
+-----------------+---------------------------+------------+
| Manufacturer#1 | ECONOMY ANODIZED STEEL | 4448 |
| Manufacturer#1 | ECONOMY BRUSHED STEEL | 4308 |
| Manufacturer#1 | ECONOMY BURNISHED STEEL | 4268 |
| Manufacturer#1 | ECONOMY PLATED STEEL | 4266 |
| Manufacturer#1 | ECONOMY POLISHED STEEL | 4302 |
| Manufacturer#1 | LARGE ANODIZED STEEL | 4389 |
| Manufacturer#1 | LARGE BRUSHED STEEL | 4254 |
| Manufacturer#1 | LARGE BURNISHED STEEL | 4276 |
| Manufacturer#1 | LARGE PLATED STEEL | 4351 |
| Manufacturer#1 | LARGE POLISHED STEEL | 4281 |
| Manufacturer#1 | MEDIUM ANODIZED STEEL | 4230 |
| Manufacturer#1 | MEDIUM BRUSHED STEEL | 4247 |
| Manufacturer#1 | MEDIUM BURNISHED STEEL | 4286 |
| Manufacturer#1 | MEDIUM PLATED STEEL | 4269 |
| Manufacturer#1 | MEDIUM POLISHED STEEL | 4274 |
| Manufacturer#1 | PROMO ANODIZED STEEL | 4283 |
| Manufacturer#1 | PROMO BRUSHED STEEL | 4221 |
| Manufacturer#1 | PROMO BURNISHED STEEL | 4315 |
| Manufacturer#1 | PROMO PLATED STEEL | 4361 |
| Manufacturer#1 | PROMO POLISHED STEEL | 4213 |
| Manufacturer#1 | SMALL ANODIZED STEEL | 4354 |
| Manufacturer#1 | SMALL BRUSHED STEEL | 4159 |
| Manufacturer#1 | SMALL BURNISHED STEEL | 4222 |
...
...
150 rows
{noformat}
h4. h4. Actual Result
{noformat}
+-----------------+---------------------------+------------+
| p_mfgr | p_type | num_parts |
+-----------------+---------------------------+------------+
| Manufacturer#1 | ECONOMY ANODIZED STEEL | 63 |
| Manufacturer#1 | ECONOMY BRUSHED STEEL | 64 |
| Manufacturer#1 | ECONOMY BURNISHED STEEL | 58 |
| Manufacturer#1 | ECONOMY PLATED STEEL | 73 |
| Manufacturer#1 | ECONOMY POLISHED STEEL | 59 |
| Manufacturer#1 | LARGE ANODIZED STEEL | 60 |
| Manufacturer#1 | LARGE BRUSHED STEEL | 62 |
| Manufacturer#1 | LARGE BURNISHED STEEL | 47 |
| Manufacturer#1 | LARGE PLATED STEEL | 51 |
| Manufacturer#1 | LARGE POLISHED STEEL | 60 |
| Manufacturer#1 | MEDIUM ANODIZED STEEL | 54 |
| Manufacturer#1 | MEDIUM BRUSHED STEEL | 60 |
| Manufacturer#1 | MEDIUM BURNISHED STEEL | 53 |
| Manufacturer#1 | MEDIUM PLATED STEEL | 73 |
| Manufacturer#1 | MEDIUM POLISHED STEEL | 65 |
| Manufacturer#1 | PROMO ANODIZED STEEL | 71 |
| Manufacturer#1 | PROMO BRUSHED STEEL | 67 |
| Manufacturer#1 | PROMO BURNISHED STEEL | 64 |
| Manufacturer#1 | PROMO PLATED STEEL | 47 |
| Manufacturer#1 | PROMO POLISHED STEEL | 51 |
| Manufacturer#1 | SMALL ANODIZED STEEL | 48 |
| Manufacturer#1 | SMALL BRUSHED STEEL | 71 |
| Manufacturer#1 | SMALL BURNISHED STEEL | 33 |
...
...
150 rows
{noformat}
h3. h3. Query 2 (TPC-H 7 query)
{code:sql}
select
supp_nation,
cust_nation,
l_year,
sum(volume) as revenue
from
(
select
n1.n_name as supp_nation,
n2.n_name as cust_nation,
extract(year from l.l_shipdate) as l_year,
l.l_extendedprice * (1 - l.l_discount) as volume
from
supplier s,
lineitem l,
orders o,
customer c,
nation n1,
nation n2
where
s.s_suppkey = l.l_suppkey
and o.o_orderkey = l.l_orderkey
and c.c_custkey = o.o_custkey
and s.s_nationkey = n1.n_nationkey
and c.c_nationkey = n2.n_nationkey
and (
(n1.n_name = 'EGYPT' and n2.n_name = 'UNITED STATES')
or (n1.n_name = 'UNITED STATES' and n2.n_name = 'EGYPT')
)
and l.l_shipdate between date '1995-01-01' and date '1996-12-31'
) as shipping
group by
supp_nation,
cust_nation,
l_year
order by
supp_nation,
cust_nation,
l_year;
{code}
h4. h4. Expected Result
{noformat}
EGYPT UNITED STATES 1996 5.2735809932832E9
UNITED STATES EGYPT 1996 5.320488357330402E9
EGYPT UNITED STATES 1995 5.282512709079098E9
UNITED STATES EGYPT 1995 5.3213732978949E9
{noformat}
h4. h4. Actual Result
{noformat}
EGYPT UNITED STATES 1996 1.6282961704599997E7
UNITED STATES EGYPT 1996 1.39809230059E7
EGYPT UNITED STATES 1995 1.3606895522500003E7
UNITED STATES EGYPT 1995 1.5241044473299999E7
{noformat}
> Runtime filter queries with nested broadcast returns wrong results
> ------------------------------------------------------------------
>
> Key: DRILL-6849
> URL: https://issues.apache.org/jira/browse/DRILL-6849
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.15.0
> Reporter: Abhishek Ravi
> Assignee: weijie.tong
> Priority: Major
> Fix For: 1.15.0
>
>
> Running few queries on TPC-H SF100 data with latest changes in [PR #1504|[https://github.com/apache/drill/pull/1504].]
> I noticed that for a couple of queries, there are vast discrepancies in the results returned by queries with *Runtime Filter enabled* and without.
> Below are a couple of failing queries.
> h3. Query 1
> {code:sql}
> select
> p.p_mfgr,
> p.p_type,
> count(*) as num_parts
> from
> supplier s,
> part p,
> partsupp ps,
> nation n
> where
> n.n_nationkey = 15
> and n.n_nationkey = s.s_nationkey
> and s.s_suppkey = ps.ps_suppkey
> and ps.ps_partkey = p.p_partkey
> and p.p_type like '%STEEL%'
> group by
> p.p_mfgr,
> p.p_type
> order by
> p.p_mfgr,
> p.p_type;
> {code}
>
> h4. Expected Result (without Runtime filter)
> {noformat}
> +-----------------+---------------------------+------------+
> | p_mfgr | p_type | num_parts |
> +-----------------+---------------------------+------------+
> | Manufacturer#1 | ECONOMY ANODIZED STEEL | 4448 |
> | Manufacturer#1 | ECONOMY BRUSHED STEEL | 4308 |
> | Manufacturer#1 | ECONOMY BURNISHED STEEL | 4268 |
> | Manufacturer#1 | ECONOMY PLATED STEEL | 4266 |
> | Manufacturer#1 | ECONOMY POLISHED STEEL | 4302 |
> | Manufacturer#1 | LARGE ANODIZED STEEL | 4389 |
> | Manufacturer#1 | LARGE BRUSHED STEEL | 4254 |
> | Manufacturer#1 | LARGE BURNISHED STEEL | 4276 |
> | Manufacturer#1 | LARGE PLATED STEEL | 4351 |
> | Manufacturer#1 | LARGE POLISHED STEEL | 4281 |
> | Manufacturer#1 | MEDIUM ANODIZED STEEL | 4230 |
> | Manufacturer#1 | MEDIUM BRUSHED STEEL | 4247 |
> | Manufacturer#1 | MEDIUM BURNISHED STEEL | 4286 |
> | Manufacturer#1 | MEDIUM PLATED STEEL | 4269 |
> | Manufacturer#1 | MEDIUM POLISHED STEEL | 4274 |
> | Manufacturer#1 | PROMO ANODIZED STEEL | 4283 |
> | Manufacturer#1 | PROMO BRUSHED STEEL | 4221 |
> | Manufacturer#1 | PROMO BURNISHED STEEL | 4315 |
> | Manufacturer#1 | PROMO PLATED STEEL | 4361 |
> | Manufacturer#1 | PROMO POLISHED STEEL | 4213 |
> | Manufacturer#1 | SMALL ANODIZED STEEL | 4354 |
> | Manufacturer#1 | SMALL BRUSHED STEEL | 4159 |
> | Manufacturer#1 | SMALL BURNISHED STEEL | 4222 |
> ...
> ...
> 150 rows
> {noformat}
>
> h4. Actual Result
> {noformat}
> +-----------------+---------------------------+------------+
> | p_mfgr | p_type | num_parts |
> +-----------------+---------------------------+------------+
> | Manufacturer#1 | ECONOMY ANODIZED STEEL | 63 |
> | Manufacturer#1 | ECONOMY BRUSHED STEEL | 64 |
> | Manufacturer#1 | ECONOMY BURNISHED STEEL | 58 |
> | Manufacturer#1 | ECONOMY PLATED STEEL | 73 |
> | Manufacturer#1 | ECONOMY POLISHED STEEL | 59 |
> | Manufacturer#1 | LARGE ANODIZED STEEL | 60 |
> | Manufacturer#1 | LARGE BRUSHED STEEL | 62 |
> | Manufacturer#1 | LARGE BURNISHED STEEL | 47 |
> | Manufacturer#1 | LARGE PLATED STEEL | 51 |
> | Manufacturer#1 | LARGE POLISHED STEEL | 60 |
> | Manufacturer#1 | MEDIUM ANODIZED STEEL | 54 |
> | Manufacturer#1 | MEDIUM BRUSHED STEEL | 60 |
> | Manufacturer#1 | MEDIUM BURNISHED STEEL | 53 |
> | Manufacturer#1 | MEDIUM PLATED STEEL | 73 |
> | Manufacturer#1 | MEDIUM POLISHED STEEL | 65 |
> | Manufacturer#1 | PROMO ANODIZED STEEL | 71 |
> | Manufacturer#1 | PROMO BRUSHED STEEL | 67 |
> | Manufacturer#1 | PROMO BURNISHED STEEL | 64 |
> | Manufacturer#1 | PROMO PLATED STEEL | 47 |
> | Manufacturer#1 | PROMO POLISHED STEEL | 51 |
> | Manufacturer#1 | SMALL ANODIZED STEEL | 48 |
> | Manufacturer#1 | SMALL BRUSHED STEEL | 71 |
> | Manufacturer#1 | SMALL BURNISHED STEEL | 33 |
> ...
> ...
> 150 rows
> {noformat}
>
> h3. Query 2 (TPC-H 7 query)
> {code:sql}
> select
> supp_nation,
> cust_nation,
> l_year,
> sum(volume) as revenue
> from
> (
> select
> n1.n_name as supp_nation,
> n2.n_name as cust_nation,
> extract(year from l.l_shipdate) as l_year,
> l.l_extendedprice * (1 - l.l_discount) as volume
> from
> supplier s,
> lineitem l,
> orders o,
> customer c,
> nation n1,
> nation n2
> where
> s.s_suppkey = l.l_suppkey
> and o.o_orderkey = l.l_orderkey
> and c.c_custkey = o.o_custkey
> and s.s_nationkey = n1.n_nationkey
> and c.c_nationkey = n2.n_nationkey
> and (
> (n1.n_name = 'EGYPT' and n2.n_name = 'UNITED STATES')
> or (n1.n_name = 'UNITED STATES' and n2.n_name = 'EGYPT')
> )
> and l.l_shipdate between date '1995-01-01' and date '1996-12-31'
> ) as shipping
> group by
> supp_nation,
> cust_nation,
> l_year
> order by
> supp_nation,
> cust_nation,
> l_year;
> {code}
> h4. Expected Result
> {noformat}
> EGYPT UNITED STATES 1996 5.2735809932832E9
> UNITED STATES EGYPT 1996 5.320488357330402E9
> EGYPT UNITED STATES 1995 5.282512709079098E9
> UNITED STATES EGYPT 1995 5.3213732978949E9
> {noformat}
> h4. Actual Result
> {noformat}
> EGYPT UNITED STATES 1996 1.6282961704599997E7
> UNITED STATES EGYPT 1996 1.39809230059E7
> EGYPT UNITED STATES 1995 1.3606895522500003E7
> UNITED STATES EGYPT 1995 1.5241044473299999E7
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)