Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2015/01/04 22:28:39 UTC

[jira] [Updated] (DRILL-1507) Potential hash insert issue

     [ https://issues.apache.org/jira/browse/DRILL-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-1507:
----------------------------------
      Component/s:     (was: Functions - Drill)
                   Execution - Operators
    Fix Version/s: 0.8.0
         Assignee: Aman Sinha
         Priority: Minor  (was: Major)

> Potential hash insert issue 
> ----------------------------
>
>                 Key: DRILL-1507
>                 URL: https://issues.apache.org/jira/browse/DRILL-1507
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Operators
>    Affects Versions: 0.6.0
>            Reporter: Chun Chang
>            Assignee: Aman Sinha
>            Priority: Minor
>             Fix For: 0.8.0
>
>
> #Thu Oct 02 17:49:48 PDT 2014
> git.commit.id.abbrev=29dde76
> Running the following "case, group by, and order by" query against the json file type, I saw the hash insert error below logged repeatedly. The query eventually finishes after a little over 30 minutes, and the data returned is correct. The same query running against the parquet file finishes in about a minute. Here is the query:
> /root/drillATS/incubator-drill/testing/framework/resources/aggregate1/json/testcases/aggregate26.q :
> select cast(case when ss_sold_date_sk is null then 0 else ss_sold_date_sk end as int) as soldd,
>        cast(case when ss_sold_time_sk is null then 0 else ss_sold_time_sk end as bigint) as soldt,
>        cast(case when ss_item_sk is null then 0.0 else ss_item_sk end as float) as itemsk,
>        cast(case when ss_customer_sk is null then 0.0 else ss_customer_sk end as decimal(18,9)) as custsk,
>        cast(case when ss_cdemo_sk is null then 0 else ss_cdemo_sk end as varchar(20)) as cdemo,
>        ss_hdemo_sk as hdemo, ss_addr_sk as addrsk, ss_store_sk as storesk,
>        ss_promo_sk as promo, ss_ticket_number as tickn,
>        sum(ss_quantity) as quantities
> from store_sales
> group by cast(case when ss_sold_date_sk is null then 0 else ss_sold_date_sk end as int),
>          cast(case when ss_sold_time_sk is null then 0 else ss_sold_time_sk end as bigint),
>          cast(case when ss_item_sk is null then 0.0 else ss_item_sk end as float),
>          cast(case when ss_customer_sk is null then 0.0 else ss_customer_sk end as decimal(18,9)),
>          cast(case when ss_cdemo_sk is null then 0 else ss_cdemo_sk end as varchar(20)),
>          ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number
> order by cast(case when ss_sold_date_sk is null then 0 else ss_sold_date_sk end as int),
>          cast(case when ss_sold_time_sk is null then 0 else ss_sold_time_sk end as bigint),
>          cast(case when ss_item_sk is null then 0.0 else ss_item_sk end as float),
>          cast(case when ss_customer_sk is null then 0.0 else ss_customer_sk end as decimal(18,9)),
>          cast(case when ss_cdemo_sk is null then 0 else ss_cdemo_sk end as varchar(20)),
>          ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number
> limit 100
> Here is the error I saw:
> 11:46:46.836 [e88d1c5f-01f4-4e9a-a24f-a5601be809cf:frag:0:0 - Producer Thread] DEBUG o.apache.drill.exec.memory.Accountor - Fragment:0:0 Reserved 32768 bytes. Total Allocated: 778240
> 11:46:46.848 [e88d1c5f-01f4-4e9a-a24f-a5601be809cf:frag:0:0] DEBUG o.a.d.e.p.impl.common.HashTable - Put into hash table failed .. Retrying with new batch holder...
> .....
> 11:48:49.936 [e88d1c5f-01f4-4e9a-a24f-a5601be809cf:frag:0:0 - Producer Thread] DEBUG o.apache.drill.exec.memory.Accountor - Fragment:0:0 Reserved 32768 bytes. Total Allocated: 778240
> 11:48:49.947 [e88d1c5f-01f4-4e9a-a24f-a5601be809cf:frag:0:0] DEBUG o.a.d.e.p.impl.common.HashTable - Put into hash table failed .. Retrying with new batch holder...
> The data is TPC-DS, converted to json using Drill's json writer. Since the query eventually completes and passes data verification, the json writer is probably converting the parquet data to json correctly.
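> For illustration, the retry behavior in the log can be sketched roughly as below. This is a minimal sketch only; the class and method names (BatchedHashTable, BatchHolder, BATCH_CAPACITY) are made up for the example and are not Drill's actual HashTable internals:
>
> import java.util.ArrayList;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
>
> // Rough sketch: a hash table whose entries live in fixed-size "batch
> // holders". When a put fails because the current holder is full, a new
> // holder is allocated and the put is retried, which appears to be what
> // the repeated DEBUG message above describes.
> public class BatchedHashTable<K, V> {
>     static final int BATCH_CAPACITY = 4096; // assumed holder size
>
>     static class BatchHolder<K, V> {
>         final Map<K, V> entries = new HashMap<>();
>
>         boolean put(K key, V value) {
>             // Reject the insert once the holder is at capacity, unless
>             // the key is already present and is only being updated.
>             if (entries.size() >= BATCH_CAPACITY && !entries.containsKey(key)) {
>                 return false;
>             }
>             entries.put(key, value);
>             return true;
>         }
>     }
>
>     private final List<BatchHolder<K, V>> holders = new ArrayList<>();
>     private BatchHolder<K, V> current = new BatchHolder<>();
>
>     public BatchedHashTable() {
>         holders.add(current);
>     }
>
>     public void put(K key, V value) {
>         while (!current.put(key, value)) {
>             // Corresponds to: "Put into hash table failed .. Retrying
>             // with new batch holder..."
>             current = new BatchHolder<>();
>             holders.add(current);
>         }
>     }
> }
>
> With ten wide group-by keys, holders would fill quickly; if each retry also allocates fresh buffers, frequent retries might account for the long runtime on the json data.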



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)