Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/07/14 05:18:04 UTC

[jira] [Commented] (SPARK-6851) Wrong answers for self joins of converted parquet relations

    [ https://issues.apache.org/jira/browse/SPARK-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625755#comment-14625755 ] 

Apache Spark commented on SPARK-6851:
-------------------------------------

User 'adrian-wang' has created a pull request for this issue:
https://github.com/apache/spark/pull/7387

> Wrong answers for self joins of converted parquet relations
> -----------------------------------------------------------
>
>                 Key: SPARK-6851
>                 URL: https://issues.apache.org/jira/browse/SPARK-6851
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>            Priority: Blocker
>             Fix For: 1.3.1, 1.4.0
>
>
> From the user list (/cc [~chinnitv]). When the same relation exists twice in a query plan, our new caching logic replaces both instances with identical replacements, so both sides of a self join end up referencing the same attribute IDs. The bug can be seen in the following transformation:
> {code}
> === Applying Rule org.apache.spark.sql.hive.HiveMetastoreCatalog$ParquetConversions ===
> !Project [state#59,month#60]                                           'Project [state#105,month#106]
> ! Join Inner, Some(((state#69 = state#59) && (month#70 = month#60)))    'Join Inner, Some(((state#105 = state#105) && (month#106 = month#106)))
> !  MetastoreRelation default, orders, None                               Subquery orders
> !  Subquery ao                                                            Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
> !   Distinct                                                             Subquery ao
> !    Project [state#69,month#70]                                          Distinct
> !     Join Inner, Some((id#81 = id#71))                                    Project [state#105,month#106]
> !      MetastoreRelation default, orders, None                              Join Inner, Some((id#115 = id#97))
> !      MetastoreRelation default, orderupdates, None                         Subquery orders
> !                                                                             Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
> !                                                                            Subquery orderupdates
> !                                                                             Relation[id#115,category#116,make#117,type#118,price#119,pdate#120,customer#121,city#122,state#123,month#124] org.apache.spark.sql.parquet.ParquetRelation2
> {code} 
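
The collision in the plan above (the join condition becomes state#105 = state#105) can be reproduced with a toy model. The sketch below is not Spark code: `Attribute`, `CachingCatalog`, and the `fresh_ids` switch are hypothetical names invented for illustration, and handing out fresh IDs per lookup is only one conceivable remedy, not necessarily what the linked pull request does.

```python
import itertools
from dataclasses import dataclass

# Global counter standing in for the analyzer's expression-ID generator (assumption).
_ids = itertools.count(1)


@dataclass(frozen=True)
class Attribute:
    """A column reference such as state#105: a name plus a unique ID."""
    name: str
    expr_id: int

    def __str__(self):
        return f"{self.name}#{self.expr_id}"


def make_relation(columns):
    """Build a relation's output attributes, each with a newly allocated ID."""
    return tuple(Attribute(c, next(_ids)) for c in columns)


class CachingCatalog:
    """Hypothetical catalog that caches the converted relation per table name."""

    def __init__(self, fresh_ids):
        self._cache = {}
        self._fresh_ids = fresh_ids

    def lookup(self, table, columns):
        if table not in self._cache:
            self._cache[table] = make_relation(columns)
        cached = self._cache[table]
        if self._fresh_ids:
            # Same underlying data, but new attribute IDs for this occurrence.
            return make_relation([a.name for a in cached])
        # Buggy path: every occurrence shares the cached attribute IDs.
        return cached


def self_join_condition(catalog):
    """Pair up the columns of two occurrences of the same table."""
    left = catalog.lookup("orders", ["state", "month"])
    right = catalog.lookup("orders", ["state", "month"])
    return list(zip(left, right))


def is_trivial(condition):
    """True when every equality compares an attribute with itself."""
    return all(l == r for l, r in condition)


buggy = self_join_condition(CachingCatalog(fresh_ids=False))
fixed = self_join_condition(CachingCatalog(fresh_ids=True))

for label, cond in (("shared IDs", buggy), ("fresh IDs", fixed)):
    print(label, " && ".join(f"({l} = {r})" for l, r in cond))
```

With shared IDs every equality compares an attribute with itself, so the join predicate no longer distinguishes the two sides; with fresh IDs each occurrence stays distinguishable.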



