You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Feng Zhu (Jira)" <ji...@apache.org> on 2022/01/17 10:50:00 UTC

[jira] [Created] (SPARK-37932) Analyzer can fail when join left side and right side are the same view

Feng Zhu created SPARK-37932:
--------------------------------

             Summary: Analyzer can fail when join left side and right side are the same view
                 Key: SPARK-37932
                 URL: https://issues.apache.org/jira/browse/SPARK-37932
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Feng Zhu


with the View P_LINEORDER in SSB 

the sql is :

use ssb;
SELECT l1.LO_CUSTKEY
    FROM SSB.P_LINEORDER l1
        INNER JOIN (
            SELECT LO_CUSTKEY
            FROM SSB.P_LINEORDER
            GROUP BY LO_CUSTKEY
            HAVING COUNT(DISTINCT LO_SUPPKEY) > 1
        ) l2
        ON l1.LO_CUSTKEY = l2.LO_CUSTKEY
    GROUP BY l1.LO_SUPPKEY, l1.LO_CUSTKEY;

 

and the exception is :

[HIVE Exception] Resolved attribute(s) LO_SUPPKEY#337 missing from lo_orderkey#386L,lo_linenumber#387L,lo_custkey#388,lo_partkey#389,lo_suppkey#390,lo_orderdate#391,lo_orderpriotity#392,lo_shippriotity#393,lo_quantity#394L,lo_extendedprice#395L,lo_ordtotalprice#396L,lo_discount#397L,lo_revenue#398L,lo_supplycost#399L,lo_tax#400L,lo_commitdate#401,lo_shipmode#402,v_revenue#403L in operator !Aggregate [LO_CUSTKEY#388], [LO_CUSTKEY#388, count(distinct LO_SUPPKEY#337) AS count(distinct LO_SUPPKEY#337)#404L]. Attribute(s) with the same name appear in the operation: LO_SUPPKEY. Please check if the right attribute(s) are used.;
GlobalLimit 500
+- LocalLimit 500
   +- Aggregate [LO_SUPPKEY#337, LO_CUSTKEY#335], [LO_CUSTKEY#335]
      +- Join Inner, (LO_CUSTKEY#335 = LO_CUSTKEY#388)
         :- SubqueryAlias L1
         :  +- SubqueryAlias spark_catalog.ssb.p_lineorder
         :     +- View (`ssb`.`p_lineorder`, [lo_orderkey#333L,lo_linenumber#334L,lo_custkey#335,lo_partkey#336,lo_suppkey#337,lo_orderdate#338,lo_orderpriotity#339,lo_shippriotity#340,lo_quantity#341L,lo_extendedprice#342L,lo_ordtotalprice#343L,lo_discount#344L,lo_revenue#345L,lo_supplycost#346L,lo_tax#347L,lo_commitdate#348,lo_shipmode#349,v_revenue#350L])
         :        +- Project [cast(lo_orderkey#351L as bigint) AS lo_orderkey#333L, cast(lo_linenumber#352L as bigint) AS lo_linenumber#334L, cast(lo_custkey#353 as int) AS lo_custkey#335, cast(lo_partkey#354 as int) AS lo_partkey#336, cast(lo_suppkey#355 as int) AS lo_suppkey#337, cast(lo_orderdate#356 as date) AS lo_orderdate#338, cast(lo_orderpriotity#357 as string) AS lo_orderpriotity#339, cast(lo_shippriotity#358 as int) AS lo_shippriotity#340, cast(lo_quantity#359L as bigint) AS lo_quantity#341L, cast(lo_extendedprice#360L as bigint) AS lo_extendedprice#342L, cast(lo_ordtotalprice#361L as bigint) AS lo_ordtotalprice#343L, cast(lo_discount#362L as bigint) AS lo_discount#344L, cast(lo_revenue#363L as bigint) AS lo_revenue#345L, cast(lo_supplycost#364L as bigint) AS lo_supplycost#346L, cast(lo_tax#365L as bigint) AS lo_tax#347L, cast(lo_commitdate#366 as date) AS lo_commitdate#348, cast(lo_shipmode#367 as string) AS lo_shipmode#349, cast(V_REVENUE#332L as bigint) AS v_revenue#350L]
         :           +- Project [lo_orderkey#351L, lo_linenumber#352L, lo_custkey#353, lo_partkey#354, lo_suppkey#355, lo_orderdate#356, lo_orderpriotity#357, lo_shippriotity#358, lo_quantity#359L, lo_extendedprice#360L, lo_ordtotalprice#361L, lo_discount#362L, lo_revenue#363L, lo_supplycost#364L, lo_tax#365L, lo_commitdate#366, lo_shipmode#367, (lo_extendedprice#360L * lo_discount#362L) AS V_REVENUE#332L]
         :              +- SubqueryAlias spark_catalog.ssb.lineorder
         :                 +- HiveTableRelation [`ssb`.`lineorder`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [lo_orderkey#351L, lo_linenumber#352L, lo_custkey#353, lo_partkey#354, lo_suppkey#355, lo_orderda..., Partition Cols: []]
         +- SubqueryAlias L2
            +- Project [LO_CUSTKEY#388]
               +- Filter (count(distinct LO_SUPPKEY#337)#404L > cast(1 as bigint))
                  +- !Aggregate [LO_CUSTKEY#388], [LO_CUSTKEY#388, count(distinct LO_SUPPKEY#337) AS count(distinct LO_SUPPKEY#337)#404L]
                     +- SubqueryAlias spark_catalog.ssb.p_lineorder
                        +- View (`ssb`.`p_lineorder`, [lo_orderkey#386L,lo_linenumber#387L,lo_custkey#388,lo_partkey#389,lo_suppkey#390,lo_orderdate#391,lo_orderpriotity#392,lo_shippriotity#393,lo_quantity#394L,lo_extendedprice#395L,lo_ordtotalprice#396L,lo_discount#397L,lo_revenue#398L,lo_supplycost#399L,lo_tax#400L,lo_commitdate#401,lo_shipmode#402,v_revenue#403L])
                           +- Project [cast(lo_orderkey#368L as bigint) AS lo_orderkey#386L, cast(lo_linenumber#369L as bigint) AS lo_linenumber#387L, cast(lo_custkey#370 as int) AS lo_custkey#388, cast(lo_partkey#371 as int) AS lo_partkey#389, cast(lo_suppkey#372 as int) AS lo_suppkey#390, cast(lo_orderdate#373 as date) AS lo_orderdate#391, cast(lo_orderpriotity#374 as string) AS lo_orderpriotity#392, cast(lo_shippriotity#375 as int) AS lo_shippriotity#393, cast(lo_quantity#376L as bigint) AS lo_quantity#394L, cast(lo_extendedprice#377L as bigint) AS lo_extendedprice#395L, cast(lo_ordtotalprice#378L as bigint) AS lo_ordtotalprice#396L, cast(lo_discount#379L as bigint) AS lo_discount#397L, cast(lo_revenue#380L as bigint) AS lo_revenue#398L, cast(lo_supplycost#381L as bigint) AS lo_supplycost#399L, cast(lo_tax#382L as bigint) AS lo_tax#400L, cast(lo_commitdate#383 as date) AS lo_commitdate#401, cast(lo_shipmode#384 as string) AS lo_shipmode#402, cast(V_REVENUE#332L as bigint) AS v_revenue#403L]
                              +- Project [lo_orderkey#368L, lo_linenumber#369L, lo_custkey#370, lo_partkey#371, lo_suppkey#372, lo_orderdate#373, lo_orderpriotity#374, lo_shippriotity#375, lo_quantity#376L, lo_extendedprice#377L, lo_ordtotalprice#378L, lo_discount#379L, lo_revenue#380L, lo_supplycost#381L, lo_tax#382L, lo_commitdate#383, lo_shipmode#384, (lo_extendedprice#377L * lo_discount#379L) AS V_REVENUE#332L]
                                 +- SubqueryAlias spark_catalog.ssb.lineorder
                                    +- HiveTableRelation [`ssb`.`lineorder`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [lo_orderkey#368L, lo_linenumber#369L, lo_custkey#370, lo_partkey#371, lo_suppkey#372, lo_orderda..., Partition Cols: []]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org