You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/30 03:27:00 UTC

[jira] [Commented] (ASTERIXDB-3043) ArrayIndexOutOfBoundsException due to multiple datasources in join branch

    [ https://issues.apache.org/jira/browse/ASTERIXDB-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17681856#comment-17681856 ] 

ASF subversion and git services commented on ASTERIXDB-3043:
------------------------------------------------------------

Commit 1f8236ca93bde4d80559359b14887feae90a1db5 in asterixdb's branch refs/heads/master from Ali Alsuliman
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=1f8236ca93 ]

[ASTERIXDB-3043][COMP] Fix handling of multiple datasources in join branch

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
When analyzing and collecting information from a join branch,
make sure the correct datasource is used instead of using
the first datasource encountered in the branch path.

- Collect and map the datasources variables to their types/source indicators.
- Remove unused isUnnestOverVarAllowed since it's always passed false.
- In case the field name is nested (i.e. there are nested assigns that eventually lead to the datasource), use the optimization context to compute the intermediate nested assign variable.

Change-Id: I1ce48e0bb1198855b6d6d5efad49a19861a3ef98
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/16686
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Ali Alsuliman <al...@gmail.com>
Reviewed-by: Glenn Galvizo <gg...@uci.edu>
Reviewed-by: Wail Alkowaileet <wa...@gmail.com>
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17349
Reviewed-by: Michael Blow <mb...@apache.org>
Tested-by: Michael Blow <mb...@apache.org>


> ArrayIndexOutOfBoundsException due to multiple datasources in join branch
> -------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-3043
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3043
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler
>    Affects Versions: 0.9.7
>            Reporter: Ali Alsuliman
>            Assignee: Ali Alsuliman
>            Priority: Major
>             Fix For: 0.9.7
>
>
> The below query fails at compile time with ArrayIndexOutOfBoundsException:
> {noformat}
> USE tpcds;
> SELECT count (*) AS cnt
> FROM customer c, customer_demographics cd2, customer_address ca
> WHERE
>  c.c_current_cdemo_sk  /*+ indexnl */ = cd2.cd_demo_sk
>  AND c.c_current_addr_sk  /*+ indexnl */ = ca.ca_address_sk
>  AND c.c_birth_month in [4,5];{noformat}
> The cause of the failure is not properly handling multiple datasources in a join branch when analyzing and collecting the information from the join branches.
> DDLs:
> {code:java}
> DROP DATAVERSE tpcds IF EXISTS;
> CREATE DATAVERSE tpcds;
> USE tpcds;
> CREATE TYPE tpcds.catalog_sales_type AS
>  CLOSED {
>   cs_sold_date_sk:           bigint?,
>   cs_sold_time_sk:           bigint?,
>   cs_ship_date_sk:           bigint?,
>   cs_bill_customer_sk:       bigint?,
>   cs_bill_cdemo_sk:          bigint?,
>   cs_bill_hdemo_sk:          bigint?,
>   cs_bill_addr_sk:           bigint?,
>   cs_ship_customer_sk:       bigint?,
>   cs_ship_cdemo_sk:          bigint?,
>   cs_ship_hdemo_sk:          bigint?,
>   cs_ship_addr_sk:           bigint?,
>   cs_call_center_sk:         bigint?,
>   cs_catalog_page_sk:        bigint?,
>   cs_ship_mode_sk:           bigint?,
>   cs_warehouse_sk:           bigint?,
>   cs_item_sk:                bigint,
>   cs_promo_sk:               bigint?,
>   cs_order_number:           bigint,
>   cs_quantity:               bigint?,
>   cs_wholesale_cost:         double?,
>   cs_list_price:             double?,
>   cs_sales_price:            double?,
>   cs_ext_discount_amt:       double?,
>   cs_ext_sales_price:        double?,
>   cs_ext_wholesale_cost:     double?,
>   cs_ext_list_price:         double?,
>   cs_ext_tax:                double?,
>   cs_coupon_amt:             double?,
>   cs_ext_ship_cost:          double?,
>   cs_net_paid:               double?,
>   cs_net_paid_inc_tax:       double?,
>   cs_net_paid_inc_ship:      double?,
>   cs_net_paid_inc_ship_tax:  double?,
>   cs_net_profit:             double?
> };
> CREATE TYPE tpcds.customer_demographics_type AS
>  CLOSED {
>   cd_demo_sk : int64,
>   cd_gender : string?,
>   cd_marital_status : string?,
>   cd_education_status : string?,
>   cd_purchase_estimate : int64?,
>   cd_credit_rating : string?,
>   cd_dep_count : int64?,
>   cd_dep_employed_count : int64?,
>   cd_dep_college_count : int64?
> };
> CREATE TYPE tpcds.customer_type AS
>  CLOSED {
>   c_customer_sk : int64,
>   c_customer_id : string,
>   c_current_cdemo_sk : int64?,
>   c_current_hdemo_sk : int64?,
>   c_current_addr_sk : int64?,
>   c_first_shipto_date_sk : int64?,
>   c_first_sales_date_sk : int64?,
>   c_salutation : string?,
>   c_first_name : string?,
>   c_last_name : string?,
>   c_preferred_cust_flag : string?,
>   c_birth_day : int64?,
>   c_birth_month : int64?,
>   c_birth_year : int64?,
>   c_birth_country : string?,
>   c_login : string?,
>   c_email_address : string?,
>   c_last_review_date : string?
> };
> CREATE TYPE tpcds.customer_address_type AS
>  CLOSED {
>   ca_address_sk : bigint,
>   ca_address_id : string,
>   ca_street_number : string?,
>   ca_street_name : string?,
>   ca_street_type : string?,
>   ca_suite_number : string?,
>   ca_city : string?,
>   ca_county : string?,
>   ca_state : string?,
>   ca_zip : string?,
>   ca_country : string?,
>   ca_gmt_offset : double?,
>   ca_location_type : string?
>  };
> CREATE TYPE tpcds.item_type AS
>  CLOSED {
>   i_item_sk : bigint,
>   i_item_id : string,
>   i_rec_start_date : string?,
>   i_rec_end_date : string?,
>   i_item_desc : string?,
>   i_current_price : double?,
>   i_wholesale_cost : double?,
>   i_brand_id : bigint?,
>   i_brand : string?,
>   i_class_id : bigint?,
>   i_class : string?,
>   i_category_id : bigint?,
>   i_category : string?,
>   i_manufact_id : bigint?,
>   i_manufact : string?,
>   i_size : string?,
>   i_formulation : string?,
>   i_color : string?,
>   i_units : string?,
>   i_container : string?,
>   i_manager_id : bigint?,
>   i_product_name : string?
> };
> CREATE TYPE tpcds.date_dim_type AS
>  CLOSED {
>   d_date_sk : bigint,
>   d_date_id : string,
>   d_date : string?,
>   d_month_seq : bigint?,
>   d_week_seq : bigint?,
>   d_quarter_seq : bigint?,
>   d_year : bigint? ,
>   d_dow : bigint? ,
>   d_moy : bigint?,
>   d_dom : bigint?,
>   d_qoy : bigint?,
>   d_fy_year : bigint?,
>   d_fy_quarter_seq : bigint?,
>   d_fy_week_seq : bigint?,
>   d_day_name : string?,
>   d_quarter_name : string?,
>   d_holiday : string?,
>   d_weekend : string?,
>   d_following_holiday : string?,
>   d_first_dom : bigint?,
>   d_last_dom : bigint?,
>   d_same_day_ly : bigint?,
>   d_same_day_lq : bigint?,
>   d_current_day : string?,
>   d_current_week : string?,
>   d_current_month : string?,
>   d_current_quarter : string?,
>   d_current_year : string?
> };
> CREATE DATASET catalog_sales(catalog_sales_type) PRIMARY KEY cs_item_sk, cs_order_number;
> CREATE DATASET customer_demographics(customer_demographics_type) PRIMARY KEY cd_demo_sk;
> CREATE DATASET customer(customer_type) PRIMARY KEY c_customer_sk;
> CREATE DATASET customer_address(customer_address_type) PRIMARY KEY ca_address_sk;
> CREATE DATASET item(item_type) PRIMARY KEY i_item_sk;
> CREATE DATASET date_dim(date_dim_type) PRIMARY KEY d_date_sk; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)