Posted to issues@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/02/10 18:57:00 UTC

[jira] [Resolved] (IMPALA-8346) Testcase builder for query planner introduced a data load regression on remote clusters

     [ https://issues.apache.org/jira/browse/IMPALA-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-8346.
-----------------------------------
    Fix Version/s: Impala 3.3.0
       Resolution: Fixed

> Testcase builder for query planner introduced a data load regression on remote clusters
> ---------------------------------------------------------------------------------------
>
>                 Key: IMPALA-8346
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8346
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 3.2.0, Impala 3.3.0
>            Reporter: David Knupp
>            Assignee: Bharath Vissapragada
>            Priority: Blocker
>             Fix For: Impala 3.3.0
>
>
> The patch for IMPALA-5872 added a new script to our data load process. The script was tested against the single-node mini-cluster, but it fails against actual (remote) clusters: COPY TESTCASE cannot write its output because the impala user lacks write permission on the target HDFS directory.
> {noformat}
> Starting Impala Shell without Kerberos authentication
> Opened TCP connection to remote-coordinator-node.mycompany.com:21000
> Connected to remote-coordinator-node.mycompany.com:21000
> Server version: impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 2e55383eb86de20499e2f9327cd5bcbda6788e50)
> Query: use `tpcds`
> Query: use `tpcds`
> Query: COPY TESTCASE TO '/test-warehouse/tpcds-testcase-data' -- start query 1 in stream 0 using template query11.tpl
> with year_total as (
>  select c_customer_id customer_id
>        ,c_first_name customer_first_name
>        ,c_last_name customer_last_name
>        ,c_preferred_cust_flag customer_preferred_cust_flag
>        ,c_birth_country customer_birth_country
>        ,c_login customer_login
>        ,c_email_address customer_email_address
>        ,d_year dyear
>        ,sum(ss_ext_list_price-ss_ext_discount_amt) year_total
>        ,'s' sale_type
>  from customer
>      ,store_sales
>      ,date_dim
>  where c_customer_sk = ss_customer_sk
>    and ss_sold_date_sk = d_date_sk
>  group by c_customer_id
>          ,c_first_name
>          ,c_last_name
>          ,c_preferred_cust_flag
>          ,c_birth_country
>          ,c_login
>          ,c_email_address
>          ,d_year
>  union all
>  select c_customer_id customer_id
>        ,c_first_name customer_first_name
>        ,c_last_name customer_last_name
>        ,c_preferred_cust_flag customer_preferred_cust_flag
>        ,c_birth_country customer_birth_country
>        ,c_login customer_login
>        ,c_email_address customer_email_address
>        ,d_year dyear
>        ,sum(ws_ext_list_price-ws_ext_discount_amt) year_total
>        ,'w' sale_type
>  from customer
>      ,web_sales
>      ,date_dim
>  where c_customer_sk = ws_bill_customer_sk
>    and ws_sold_date_sk = d_date_sk
>  group by c_customer_id
>          ,c_first_name
>          ,c_last_name
>          ,c_preferred_cust_flag
>          ,c_birth_country
>          ,c_login
>          ,c_email_address
>          ,d_year
>          )
>   select
>                   t_s_secyear.customer_id
>                  ,t_s_secyear.customer_first_name
>                  ,t_s_secyear.customer_last_name
>                  ,t_s_secyear.customer_email_address
>  from year_total t_s_firstyear
>      ,year_total t_s_secyear
>      ,year_total t_w_firstyear
>      ,year_total t_w_secyear
>  where t_s_secyear.customer_id = t_s_firstyear.customer_id
>          and t_s_firstyear.customer_id = t_w_secyear.customer_id
>          and t_s_firstyear.customer_id = t_w_firstyear.customer_id
>          and t_s_firstyear.sale_type = 's'
>          and t_w_firstyear.sale_type = 'w'
>          and t_s_secyear.sale_type = 's'
>          and t_w_secyear.sale_type = 'w'
>          and t_s_firstyear.dyear = 2001
>          and t_s_secyear.dyear = 2001+1
>          and t_w_firstyear.dyear = 2001
>          and t_w_secyear.dyear = 2001+1
>          and t_s_firstyear.year_total > 0
>          and t_w_firstyear.year_total > 0
>          and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end
>              > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end
>  order by t_s_secyear.customer_id
>          ,t_s_secyear.customer_first_name
>          ,t_s_secyear.customer_last_name
>          ,t_s_secyear.customer_email_address
> limit 100
> Query submitted at: 2019-03-23 23:40:12 (Coordinator: http://remote-coordinator-node.mycompany.com:25000)
> ERROR: ImpalaRuntimeException: Error writing test case output to file: hdfs://namenode.mycompany.com:8020/test-warehouse/tpcds-testcase-data/impala-testcase-data-6430bc87-5337-4e65-b6aa-d059088f3a4b
> CAUSED BY: AccessControlException: Permission denied: user=impala, access=WRITE, inode="/test-warehouse/tpcds-testcase-data":hdfs:hdfs:drwxr-xr-x
> [...]
> Could not execute command: COPY TESTCASE TO '/test-warehouse/tpcds-testcase-data' -- start query 1 in stream 0 using template query11.tpl
> {noformat}
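
The AccessControlException in the log above can be read field by field. The following is a minimal Python sketch that parses that one line and checks which permission class applies to the requesting user. The function name explain_denial, the regular expression, and the user_groups parameter are illustrative assumptions and are not part of Impala or Hadoop. The check is a simplified model of POSIX-style mode bits only; it ignores HDFS ACLs, the superuser, and any other server-side checks. Whether the impala user belongs to the hdfs group is not stated in the log, so the sketch assumes it does not by default.

```python
import re

# Error line copied from the log in the issue description.
ERROR = ('AccessControlException: Permission denied: user=impala, access=WRITE, '
         'inode="/test-warehouse/tpcds-testcase-data":hdfs:hdfs:drwxr-xr-x')

# Assumed layout of the message: user, requested access, then
# inode="path":owner:group:mode. This pattern is illustrative only.
PATTERN = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwxtT-]{9})'
)

FLAGS = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(line, user_groups=()):
    """Return the parsed fields plus the permission class that applies.

    Simplified model: owner bits if the user owns the inode, group bits if
    the user is in the inode's group, otherwise the "other" bits.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognizable AccessControlException line")
    fields = match.groupdict()
    perms = fields["mode"][1:]  # drop the leading file-type character
    if fields["user"] == fields["owner"]:
        cls, bits = "owner", perms[0:3]
    elif fields["group"] in user_groups:
        cls, bits = "group", perms[3:6]
    else:
        cls, bits = "other", perms[6:9]
    # Requested access may be compound (for example READ_WRITE) or ALL.
    if fields["access"] == "ALL":
        needed = "rwx"
    else:
        needed = "".join(FLAGS.get(part, "") for part in fields["access"].split("_"))
    allowed = all(flag in bits for flag in needed)
    return {**fields, "class": cls, "bits": bits, "allowed": allowed}


if __name__ == "__main__":
    for key, value in explain_denial(ERROR).items():
        print(f"{key}: {value}")
```

Under these assumptions the directory is owned by hdfs:hdfs with mode drwxr-xr-x, so only the owner has the write bit. The group and other classes both carry r-x, which is why a WRITE request from user impala is denied whether or not that user is in the hdfs group.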


