Posted to issues@drill.apache.org by "Deneche A. Hakim (JIRA)" <ji...@apache.org> on 2016/02/26 22:40:18 UTC

[jira] [Created] (DRILL-4449) Wrong results when using metadata cache with specific set of queries

Deneche A. Hakim created DRILL-4449:
---------------------------------------

             Summary: Wrong results when using metadata cache with specific set of queries
                 Key: DRILL-4449
                 URL: https://issues.apache.org/jira/browse/DRILL-4449
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.5.0
            Reporter: Deneche A. Hakim
            Priority: Critical
             Fix For: 1.6.0


We are still working on a reproduction, but the problem appears with a query similar to this one:
{noformat}
with q1 as (
select a.field
from `table` a
where <some condition that causes the table to be pruned>
group by a.field
having ...
)
, q2 as (
select a.field
from `table` a
where <some other pruning condition>
group by a.field
)
select * from (
select count(*) as cnt from q1
union all
select count(*) as cnt from q2
);
{noformat}

The table is partitioned, and both subqueries force parquet pruning on the table. Because we share the parquet metadata object in ParquetGroupScan, the second query ends up being "over pruned" and we get wrong results.
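
To illustrate the kind of failure described above, here is a minimal sketch in plain Java. It is not Drill code: TableMetadata, pruneInPlace and pruneOnCopy are hypothetical names invented for this example, and the file names are made up. It only shows how two pruning passes over one shared, mutable metadata object can leave the second pass with fewer files than its own filter selects. Pruning a private copy is shown as one possible way to avoid the interference, not as the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration only; none of these are Drill classes.
public class SharedMetadataSketch {

    /** Mutable list of files, playing the role of the cached parquet metadata. */
    static final class TableMetadata {
        final List<String> files;

        TableMetadata(List<String> files) {
            this.files = new ArrayList<>(files);
        }

        TableMetadata copy() {
            return new TableMetadata(files);
        }
    }

    /** Problematic variant: prunes the shared object in place. */
    static int pruneInPlace(TableMetadata shared, Predicate<String> keep) {
        shared.files.removeIf(keep.negate());
        return shared.files.size();
    }

    /** Alternative: prunes a private copy and leaves the shared object untouched. */
    static int pruneOnCopy(TableMetadata shared, Predicate<String> keep) {
        TableMetadata own = shared.copy();
        own.files.removeIf(keep.negate());
        return own.files.size();
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
                "year=2014/a.parquet", "year=2015/b.parquet", "year=2016/c.parquet");
        Predicate<String> q1 = f -> f.startsWith("year=2014");   // selects 1 file
        Predicate<String> q2 = f -> !f.startsWith("year=2016");  // selects 2 files

        TableMetadata shared = new TableMetadata(all);
        System.out.println(pruneInPlace(shared, q1)); // 1
        System.out.println(pruneInPlace(shared, q2)); // 1: over-pruned, 2 expected

        TableMetadata fresh = new TableMetadata(all);
        System.out.println(pruneOnCopy(fresh, q1));   // 1
        System.out.println(pruneOnCopy(fresh, q2));   // 2
    }
}
```

In the in-place variant the second filter only sees what the first filter left behind, which matches the "over pruned" symptom in the report.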

The plan doesn't show the problem.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)