You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/01 18:17:32 UTC

[GitHub] [arrow-datafusion] andygrove opened a new issue, #4065: Stack overflow planning complex query

andygrove opened a new issue, #4065:
URL: https://github.com/apache/arrow-datafusion/issues/4065

   **Describe the bug**
   There is a regression since 13.0.0 affecting TPC-DS query 4. I now get a stack overflow during query planning and optimization,
   
   **To Reproduce**
   
   ```sql
   with year_total as (
    select c_customer_id customer_id
          ,c_first_name customer_first_name
          ,c_last_name customer_last_name
          ,c_preferred_cust_flag customer_preferred_cust_flag
          ,c_birth_country customer_birth_country
          ,c_login customer_login
          ,c_email_address customer_email_address
          ,d_year dyear
          ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
          ,'s' sale_type
    from customer
        ,store_sales
        ,date_dim
    where c_customer_sk = ss_customer_sk
      and ss_sold_date_sk = d_date_sk
    group by c_customer_id
            ,c_first_name
            ,c_last_name
            ,c_preferred_cust_flag
            ,c_birth_country
            ,c_login
            ,c_email_address
            ,d_year
    union all
    select c_customer_id customer_id
          ,c_first_name customer_first_name
          ,c_last_name customer_last_name
          ,c_preferred_cust_flag customer_preferred_cust_flag
          ,c_birth_country customer_birth_country
          ,c_login customer_login
          ,c_email_address customer_email_address
          ,d_year dyear
          ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total
          ,'c' sale_type
    from customer
        ,catalog_sales
        ,date_dim
    where c_customer_sk = cs_bill_customer_sk
      and cs_sold_date_sk = d_date_sk
    group by c_customer_id
            ,c_first_name
            ,c_last_name
            ,c_preferred_cust_flag
            ,c_birth_country
            ,c_login
            ,c_email_address
            ,d_year
   union all
    select c_customer_id customer_id
          ,c_first_name customer_first_name
          ,c_last_name customer_last_name
          ,c_preferred_cust_flag customer_preferred_cust_flag
          ,c_birth_country customer_birth_country
          ,c_login customer_login
          ,c_email_address customer_email_address
          ,d_year dyear
          ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total
          ,'w' sale_type
    from customer
        ,web_sales
        ,date_dim
    where c_customer_sk = ws_bill_customer_sk
      and ws_sold_date_sk = d_date_sk
    group by c_customer_id
            ,c_first_name
            ,c_last_name
            ,c_preferred_cust_flag
            ,c_birth_country
            ,c_login
            ,c_email_address
            ,d_year
            )
     select  
                     t_s_secyear.customer_id
                    ,t_s_secyear.customer_first_name
                    ,t_s_secyear.customer_last_name
                    ,t_s_secyear.customer_email_address
    from year_total t_s_firstyear
        ,year_total t_s_secyear
        ,year_total t_c_firstyear
        ,year_total t_c_secyear
        ,year_total t_w_firstyear
        ,year_total t_w_secyear
    where t_s_secyear.customer_id = t_s_firstyear.customer_id
      and t_s_firstyear.customer_id = t_c_secyear.customer_id
      and t_s_firstyear.customer_id = t_c_firstyear.customer_id
      and t_s_firstyear.customer_id = t_w_firstyear.customer_id
      and t_s_firstyear.customer_id = t_w_secyear.customer_id
      and t_s_firstyear.sale_type = 's'
      and t_c_firstyear.sale_type = 'c'
      and t_w_firstyear.sale_type = 'w'
      and t_s_secyear.sale_type = 's'
      and t_c_secyear.sale_type = 'c'
      and t_w_secyear.sale_type = 'w'
      and t_s_firstyear.dyear =  2001
      and t_s_secyear.dyear = 2001+1
      and t_c_firstyear.dyear =  2001
      and t_c_secyear.dyear =  2001+1
      and t_w_firstyear.dyear = 2001
      and t_w_secyear.dyear = 2001+1
      and t_s_firstyear.year_total > 0
      and t_c_firstyear.year_total > 0
      and t_w_firstyear.year_total > 0
      and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
              > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end
      and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
              > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end
    order by t_s_secyear.customer_id
            ,t_s_secyear.customer_first_name
            ,t_s_secyear.customer_last_name
            ,t_s_secyear.customer_email_address
   limit 100;
   ```
   
   **Expected behavior**
   A clear and concise description of what you expected to happen.
   
   **Additional context**
   Related to https://github.com/apache/arrow-datafusion/issues/1444 and https://github.com/apache/arrow-datafusion/issues/1434
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #4065: Stack overflow planning complex query

Posted by GitBox <gi...@apache.org>.
alamb closed issue #4065: Stack overflow planning complex query
URL: https://github.com/apache/arrow-datafusion/issues/4065


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #4065: Stack overflow planning complex query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #4065:
URL: https://github.com/apache/arrow-datafusion/issues/4065#issuecomment-1367270089

   Note I fixed a seemingly similar stack overflow in the past via https://github.com/apache/arrow-datafusion/pull/1047 maybe we can apply the same sort of workaround here (which was basically to make more smaller functions)
   
   In this prior case release mode was just fine, but debug mode was not
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #4065: Stack overflow planning complex query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #4065:
URL: https://github.com/apache/arrow-datafusion/issues/4065#issuecomment-1368225049

   Proposed fix here https://github.com/apache/arrow-datafusion/pull/4779#


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #4065: Stack overflow planning complex query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #4065:
URL: https://github.com/apache/arrow-datafusion/issues/4065#issuecomment-1300505060

   Also related https://github.com/apache/arrow-datafusion/issues/4066


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org