Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:25:23 UTC

[jira] [Updated] (SPARK-15005) Usage of Temp Table twice in Hive query fails with bad error

     [ https://issues.apache.org/jira/browse/SPARK-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-15005:
---------------------------------
    Labels: bulk-closed  (was: )

> Usage of Temp Table twice in Hive query fails with bad error
> ------------------------------------------------------------
>
>                 Key: SPARK-15005
>                 URL: https://issues.apache.org/jira/browse/SPARK-15005
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: dciborow
>            Priority: Minor
>              Labels: bulk-closed
>
> When converting an ETL process from Hive to Spark, adjustments might be made to the query. One adjustment is that the Hive query might read from the same table more than once and join the results together. When Spark tries to process such a query it produces a very poor error message that does not help the user determine what has gone wrong. It should be simple to detect this and properly report it to the user.
> Sample query that contains the error (edited for this post, so it might not run):
> SELECT
>     enc.id,
>     enc.name,
>     enc1.sum_impressions
> FROM
>     (
>         SELECT
>             *
>         FROM
>             table1
>     ) enc
> JOIN
>     (
>         SELECT
>             id,
>             SUM(impressions) AS sum_impressions
>         FROM
>             table1 enc
>         GROUP BY
>             enc.id
>     ) enc1
> ON
>     (enc.id = enc1.id)
> Error message (edited to remove a bunch of field names, but I tried to leave everything I could):
> 16/04/28 15:47:09 INFO ParseDriver: Parse Completed
> org.apache.spark.sql.AnalysisException: resolved attribute(s) [_id#3372,], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum(unique_audience#3380) windowspecdefinition(id#3372,ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS _we0#530,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum(total_impressions#3382) windowspecdefinition(id#3372,,ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS _we1#531], [id#3372,];
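For readers hitting this class of "resolved attribute(s)" AnalysisException, a sketch of a workaround commonly used on Spark 1.x (not part of the original report): alias each scan of the temp table separately at the DataFrame level, so the two sides of the self-join do not share attribute IDs. This assumes a registered temp table `table1` with `id`, `name`, and `impressions` columns; all names here are illustrative.

```scala
import org.apache.spark.sql.functions.{col, sum}

// Reading the same temp table twice yields two plans whose columns share
// attribute IDs, which can confuse the analyzer during the join.
// Giving each side its own alias disambiguates the column references.
val base = sqlContext.table("table1").as("enc")
val sums = sqlContext.table("table1")
  .groupBy("id")
  .agg(sum("impressions").as("sum_impressions"))
  .as("enc1")

val joined = base
  .join(sums, col("enc.id") === col("enc1.id"))
  .select(col("enc.id"), col("enc.name"), col("enc1.sum_impressions"))
```

Whether this sidesteps the reported failure on 1.6.0 would need to be verified; it is the usual first step when a self-join raises duplicate-attribute errors.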



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org