Posted to dev@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2018/01/14 06:09:00 UTC

[jira] [Created] (DRILL-6087) Aggregates that use ObjectHolder will fail when Hash Agg spills

Paul Rogers created DRILL-6087:
----------------------------------

             Summary: Aggregates that use ObjectHolder will fail when Hash Agg spills
                 Key: DRILL-6087
                 URL: https://issues.apache.org/jira/browse/DRILL-6087
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.12.0
            Reporter: Paul Rogers


Drill has this thing called an “ObjectVector”, which is a vector that holds onto Java objects. We use it for things like the system tables.

The ObjectVector has something called an ObjectHolder. For various reasons (see [this Wiki writeup|https://github.com/paul-rogers/drill/wiki/Aggregate-UDFs]), some Drill aggregates use this holder to keep working state that needs more than a few numbers.

As it turns out, all the Decimal AVG functions use the ObjectHolder to hold the intermediate values. (Also true of Decimal Max, Min and Sum. Also true of Max and Min for VarBytes. Just do a code search for uses of ObjectHolder.)

In the old pre-spill days, things worked fine. But, with Hash Agg spilling, we need to write intermediate values out to disk, then read them back.

But, the object vector never implemented the methods needed for spilling! Instead, it will throw an UnsupportedOperationException.
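The failure mode can be illustrated with a minimal sketch. Every name here (SpillSketch, SketchVector, SketchObjectVector, runAggregate, maxInMemory) is hypothetical and is not Drill's actual code. The sketch only assumes what is described above: a vector that holds Java objects has no serialize step, so the first attempt to spill throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins. These are NOT Drill's actual classes.
final class SpillSketch {

    /** Minimal vector contract: store values, and serialize them for spilling. */
    interface SketchVector {
        void add(Object value);
        byte[] serialize();
    }

    /** Object-backed vector: it holds Java references and never implemented serialization. */
    static class SketchObjectVector implements SketchVector {
        private final List<Object> values = new ArrayList<>();

        @Override
        public void add(Object value) {
            values.add(value);
        }

        @Override
        public byte[] serialize() {
            throw new UnsupportedOperationException("object vector cannot be spilled");
        }
    }

    /** Toy aggregator: tries to spill (serialize) once it holds more than maxInMemory values. */
    static String runAggregate(SketchVector vector, int rows, int maxInMemory) {
        try {
            for (int i = 0; i < rows; i++) {
                vector.add(Integer.valueOf(i));
                if (i + 1 > maxInMemory) {
                    vector.serialize(); // the spill path, never reached for small inputs
                }
            }
            return "ok";
        } catch (UnsupportedOperationException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Small input: no spill is attempted, so the aggregate completes.
        System.out.println(runAggregate(new SketchObjectVector(), 10, 100));
        // Large input: the first spill attempt hits the unimplemented method.
        System.out.println(runAggregate(new SketchObjectVector(), 1000, 100));
    }
}
```

In this toy model the same vector works as long as the spill path is never taken, which is why the problem went unnoticed before spilling existed.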

What does this mean?

If you run a query that uses any of the aggregate functions above, is planned with the Hash Agg, and has enough data to cause spilling, your query will fail. Run the same query with the Streaming Agg and it will work. Reduce the data enough to avoid spilling and the query will also work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)