Posted to issues@spark.apache.org by "Damien Carol (JIRA)" <ji...@apache.org> on 2015/01/13 17:13:34 UTC

[jira] [Commented] (SPARK-4794) Wrong parse of GROUP BY query

    [ https://issues.apache.org/jira/browse/SPARK-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275464#comment-14275464 ] 

Damien Carol commented on SPARK-4794:
-------------------------------------

[~marmbrus] Sorry for the late answer.

For the record, I have been testing this query on every commit (trunk branch on git) without success since I created this ticket.

Here are the details (EXPLAIN <query>):
{noformat}
explain [...]
{noformat}

{noformat}
== Physical Plan ==
Project [Annee#3676,Mois#3677,Jour#3678,Heure#3679,Societe#3680,Magasin#3681,CF Presentee#3682,CompteCarteFidelite#3683,NbCompteCarteFidelite#3684,DetentionCF#3685,NbCarteFidelite#3686,PlageDUCB#3687,NbCheque#3688L,CACheque#3689,NbImpaye#3690,NbEnsemble#3691L,NbCompte#3692,ResteDuImpaye#3693]
 !Sort [annee#3695 ASC,mois#3696 ASC,jour#3697 ASC,heure#3698 ASC,nom_societe#3699 ASC,id_magasin#3700 ASC,CarteFidelitePresentee#3702 ASC,CompteCarteFidelite#3705 ASC,NbCompteCarteFidelite#3706 ASC,DetentionCF#3703 ASC,NbCarteFidelite#3704 ASC,Id_CF_Dim_DUCB#3707 ASC], true
  !Exchange (RangePartitioning [annee#3695 ASC,mois#3696 ASC,jour#3697 ASC,heure#3698 ASC,nom_societe#3699 ASC,id_magasin#3700 ASC,CarteFidelitePresentee#3702 ASC,CompteCarteFidelite#3705 ASC,NbCompteCarteFidelite#3706 ASC,DetentionCF#3703 ASC,NbCarteFidelite#3704 ASC,Id_CF_Dim_DUCB#3707 ASC], 200)
   !OutputFaker [Annee#3676,Mois#3677,Jour#3678,Heure#3679,Societe#3680,Magasin#3681,CF Presentee#3682,CompteCarteFidelite#3683,NbCompteCarteFidelite#3684,DetentionCF#3685,NbCarteFidelite#3686,PlageDUCB#3687,NbCheque#3688L,CACheque#3689,NbImpaye#3690,NbEnsemble#3691L,NbCompte#3692,ResteDuImpaye#3693,Mois#3677,Annee#3676,Jour#3678,id_magasin#3700,DetentionCF#3685,CompteCarteFidelite#3683,nom_societe#3699,NbCarteFidelite#3686,NbCompteCarteFidelite#3684,CarteFidelitePresentee#3702,Id_CF_Dim_DUCB#3707,Heure#3679]
    Project [annee#3715 AS Annee#3676,mois#3716 AS Mois#3677,jour#3717 AS Jour#3678,heure#3718 AS Heure#3679,nom_societe#3719 AS Societe#3680,id_magasin#3720 AS Magasin#3681,CarteFidelitePresentee#3722 AS CF Presentee#3682,CompteCarteFidelite#3725 AS CompteCarteFidelite#3683,NbCompteCarteFidelite#3726 AS NbCompteCarteFidelite#3684,DetentionCF#3723 AS DetentionCF#3685,NbCarteFidelite#3724 AS NbCarteFidelite#3686,Id_CF_Dim_DUCB#3727 AS PlageDUCB#3687,NbCheque#3729L AS NbCheque#3688L,CACheque#3730 AS CACheque#3689,NbImpaye#3731 AS NbImpaye#3690,Id_Ensemble#3732L AS NbEnsemble#3691L,ZIBZIN#3734 AS NbCompte#3692,ResteDuImpaye#3733 AS ResteDuImpaye#3693,mois#3716,annee#3715,jour#3717,id_magasin#3720,DetentionCF#3723,CompteCarteFidelite#3725,nom_societe#3719,NbCarteFidelite#3724,NbCompteCarteFidelite#3726,CarteFidelitePresentee#3722,Id_CF_Dim_DUCB#3727,heure#3718]
     Filter ((((annee#3715 = 2014) && (mois#3716 = 1)) && (jour#3717 = 25)) && (id_magasin#3720 = 649))
      ParquetTableScan [Id_CF_Dim_DUCB#3727,ResteDuImpaye#3733,NbCarteFidelite#3724,heure#3718,mois#3716,CompteCarteFidelite#3725,annee#3715,CarteFidelitePresentee#3722,CACheque#3730,NbImpaye#3731,ZIBZIN#3734,NbCompteCarteFidelite#3726,DetentionCF#3723,id_magasin#3720,nom_societe#3719,Id_Ensemble#3732L,jour#3717,NbCheque#3729L], (ParquetRelation hdfs://nc-h07/user/hive/warehouse/testsimon3.db/cf_encaissement_fact_pq, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml), org.apache.spark.sql.hive.HiveContext@7db3bcc, []), []
{noformat}

Complete stack trace:
{noformat}


15/01/13 17:10:32 INFO SparkExecuteStatementOperation: Running query ' CACHE TABLE A_1421165432909 AS select
    `cf_encaissement_fact_pq`.`annee` as `Annee`,
    `cf_encaissement_fact_pq`.`mois` as `Mois`,
    `cf_encaissement_fact_pq`.`jour` as `Jour`,
    `cf_encaissement_fact_pq`.`heure` as `Heure`,
    `cf_encaissement_fact_pq`.`nom_societe` as `Societe`,
    `cf_encaissement_fact_pq`.`id_magasin` as `Magasin`,
    `cf_encaissement_fact_pq`.`CarteFidelitePresentee` as `CF Presentee`,
    `cf_encaissement_fact_pq`.`CompteCarteFidelite` as `CompteCarteFidelite`,
    `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` as `NbCompteCarteFidelite`,
    `cf_encaissement_fact_pq`.`DetentionCF` as `DetentionCF`,
    `cf_encaissement_fact_pq`.`NbCarteFidelite` as `NbCarteFidelite`,
    `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` as `PlageDUCB`,
    `cf_encaissement_fact_pq`.`NbCheque` as `NbCheque`,
    `cf_encaissement_fact_pq`.`CACheque` as `CACheque`,
    `cf_encaissement_fact_pq`.`NbImpaye` as `NbImpaye`,
    `cf_encaissement_fact_pq`.`Id_Ensemble` as `NbEnsemble`,
    `cf_encaissement_fact_pq`.`ZIBZIN` as `NbCompte`,
    `cf_encaissement_fact_pq`.`ResteDuImpaye` as `ResteDuImpaye`
from
    `testsimon3`.`cf_encaissement_fact_pq` as `cf_encaissement_fact_pq`
where
    `cf_encaissement_fact_pq`.`annee` = 2014
and
    `cf_encaissement_fact_pq`.`mois` = 1
and
    `cf_encaissement_fact_pq`.`jour` = 25
and
    `cf_encaissement_fact_pq`.`id_magasin` = 649
order by
    `cf_encaissement_fact_pq`.`annee` ASC,
    `cf_encaissement_fact_pq`.`mois` ASC,
    `cf_encaissement_fact_pq`.`jour` ASC,
    `cf_encaissement_fact_pq`.`heure` ASC,
    `cf_encaissement_fact_pq`.`nom_societe` ASC,
    `cf_encaissement_fact_pq`.`id_magasin` ASC,
    `cf_encaissement_fact_pq`.`CarteFidelitePresentee` ASC,
    `cf_encaissement_fact_pq`.`CompteCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`DetentionCF` ASC,
    `cf_encaissement_fact_pq`.`NbCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` ASC'
15/01/13 17:10:32 INFO ParseDriver: Parsing command: select
    `cf_encaissement_fact_pq`.`annee` as `Annee`,
    `cf_encaissement_fact_pq`.`mois` as `Mois`,
    `cf_encaissement_fact_pq`.`jour` as `Jour`,
    `cf_encaissement_fact_pq`.`heure` as `Heure`,
    `cf_encaissement_fact_pq`.`nom_societe` as `Societe`,
    `cf_encaissement_fact_pq`.`id_magasin` as `Magasin`,
    `cf_encaissement_fact_pq`.`CarteFidelitePresentee` as `CF Presentee`,
    `cf_encaissement_fact_pq`.`CompteCarteFidelite` as `CompteCarteFidelite`,
    `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` as `NbCompteCarteFidelite`,
    `cf_encaissement_fact_pq`.`DetentionCF` as `DetentionCF`,
    `cf_encaissement_fact_pq`.`NbCarteFidelite` as `NbCarteFidelite`,
    `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` as `PlageDUCB`,
    `cf_encaissement_fact_pq`.`NbCheque` as `NbCheque`,
    `cf_encaissement_fact_pq`.`CACheque` as `CACheque`,
    `cf_encaissement_fact_pq`.`NbImpaye` as `NbImpaye`,
    `cf_encaissement_fact_pq`.`Id_Ensemble` as `NbEnsemble`,
    `cf_encaissement_fact_pq`.`ZIBZIN` as `NbCompte`,
    `cf_encaissement_fact_pq`.`ResteDuImpaye` as `ResteDuImpaye`
from
    `testsimon3`.`cf_encaissement_fact_pq` as `cf_encaissement_fact_pq`
where
    `cf_encaissement_fact_pq`.`annee` = 2014
and
    `cf_encaissement_fact_pq`.`mois` = 1
and
    `cf_encaissement_fact_pq`.`jour` = 25
and
    `cf_encaissement_fact_pq`.`id_magasin` = 649
order by
    `cf_encaissement_fact_pq`.`annee` ASC,
    `cf_encaissement_fact_pq`.`mois` ASC,
    `cf_encaissement_fact_pq`.`jour` ASC,
    `cf_encaissement_fact_pq`.`heure` ASC,
    `cf_encaissement_fact_pq`.`nom_societe` ASC,
    `cf_encaissement_fact_pq`.`id_magasin` ASC,
    `cf_encaissement_fact_pq`.`CarteFidelitePresentee` ASC,
    `cf_encaissement_fact_pq`.`CompteCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`DetentionCF` ASC,
    `cf_encaissement_fact_pq`.`NbCarteFidelite` ASC,
    `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` ASC
15/01/13 17:10:32 INFO ParseDriver: Parse Completed
15/01/13 17:10:33 INFO BlockManager: Removing broadcast 125
15/01/13 17:10:33 INFO BlockManager: Removing block broadcast_125
15/01/13 17:10:33 INFO MemoryStore: Block broadcast_125 of size 225150 dropped from memory (free 273219717)
15/01/13 17:10:33 INFO BlockManager: Removing block broadcast_125_piece0
15/01/13 17:10:33 INFO MemoryStore: Block broadcast_125_piece0 of size 24696 dropped from memory (free 273244413)
15/01/13 17:10:33 INFO BlockManagerInfo: Removed broadcast_125_piece0 on nc-h07:57913 in memory (size: 24.1 KB, free: 264.8 MB)
15/01/13 17:10:33 INFO BlockManagerMaster: Updated info of block broadcast_125_piece0
15/01/13 17:10:33 INFO ContextCleaner: Cleaned broadcast 125
15/01/13 17:10:33 INFO ParquetTypesConverter: Falling back to schema conversion from Parquet types; result: ArrayBuffer(annee#3921, mois#3922, jour#3923, heure#3924, nom_societe#3925, id_magasin#3926, nom_magasin#3927, cartefidelitepresentee#3928, detentioncf#3929, nbcartefidelite#3930, comptecartefidelite#3931, nbcomptecartefidelite#3932, id_cf_dim_ducb#3933, plageducb#3934, nbcheque#3935L, cacheque#3936, nbimpaye#3937, id_ensemble#3938L, resteduimpaye#3939, zibzin#3940)
15/01/13 17:10:33 INFO MemoryStore: ensureFreeSpace(225150) called with curMem=5058143, maxMem=278302556
15/01/13 17:10:33 INFO MemoryStore: Block broadcast_126 stored as values in memory (estimated size 219.9 KB, free 260.4 MB)
15/01/13 17:10:33 INFO MemoryStore: ensureFreeSpace(24701) called with curMem=5283293, maxMem=278302556
15/01/13 17:10:33 INFO MemoryStore: Block broadcast_126_piece0 stored as bytes in memory (estimated size 24.1 KB, free 260.3 MB)
15/01/13 17:10:33 INFO BlockManagerInfo: Added broadcast_126_piece0 in memory on nc-h07:57913 (size: 24.1 KB, free: 264.8 MB)
15/01/13 17:10:33 INFO BlockManagerMaster: Updated info of block broadcast_126_piece0
15/01/13 17:10:33 INFO SparkContext: Created broadcast 126 from NewHadoopRDD at ParquetTableOperations.scala:120
15/01/13 17:10:33 ERROR SparkExecuteStatementOperation: Error executing query:
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: sort, tree:
!Sort [annee#3901 ASC,mois#3902 ASC,jour#3903 ASC,heure#3904 ASC,nom_societe#3905 ASC,id_magasin#3906 ASC,CarteFidelitePresentee#3908 ASC,CompteCarteFidelite#3911 ASC,NbCompteCarteFidelite#3912 ASC,DetentionCF#3909 ASC,NbCarteFidelite#3910 ASC,Id_CF_Dim_DUCB#3913 ASC], true
 !Exchange (RangePartitioning [annee#3901 ASC,mois#3902 ASC,jour#3903 ASC,heure#3904 ASC,nom_societe#3905 ASC,id_magasin#3906 ASC,CarteFidelitePresentee#3908 ASC,CompteCarteFidelite#3911 ASC,NbCompteCarteFidelite#3912 ASC,DetentionCF#3909 ASC,NbCarteFidelite#3910 ASC,Id_CF_Dim_DUCB#3913 ASC], 200)
  !OutputFaker [Annee#3883,Mois#3884,Jour#3885,Heure#3886,Societe#3887,Magasin#3888,CF Presentee#3889,CompteCarteFidelite#3890,NbCompteCarteFidelite#3891,DetentionCF#3892,NbCarteFidelite#3893,PlageDUCB#3894,NbCheque#3895L,CACheque#3896,NbImpaye#3897,NbEnsemble#3898L,NbCompte#3899,ResteDuImpaye#3900,id_magasin#3906,Mois#3884,Id_CF_Dim_DUCB#3913,NbCompteCarteFidelite#3891,Annee#3883,CompteCarteFidelite#3890,Heure#3886,NbCarteFidelite#3893,nom_societe#3905,Jour#3885,CarteFidelitePresentee#3908,DetentionCF#3892]
   Project [annee#3921 AS Annee#3883,mois#3922 AS Mois#3884,jour#3923 AS Jour#3885,heure#3924 AS Heure#3886,nom_societe#3925 AS Societe#3887,id_magasin#3926 AS Magasin#3888,CarteFidelitePresentee#3928 AS CF Presentee#3889,CompteCarteFidelite#3931 AS CompteCarteFidelite#3890,NbCompteCarteFidelite#3932 AS NbCompteCarteFidelite#3891,DetentionCF#3929 AS DetentionCF#3892,NbCarteFidelite#3930 AS NbCarteFidelite#3893,Id_CF_Dim_DUCB#3933 AS PlageDUCB#3894,NbCheque#3935L AS NbCheque#3895L,CACheque#3936 AS CACheque#3896,NbImpaye#3937 AS NbImpaye#3897,Id_Ensemble#3938L AS NbEnsemble#3898L,ZIBZIN#3940 AS NbCompte#3899,ResteDuImpaye#3939 AS ResteDuImpaye#3900,id_magasin#3926,mois#3922,Id_CF_Dim_DUCB#3933,NbCompteCarteFidelite#3932,annee#3921,CompteCarteFidelite#3931,heure#3924,NbCarteFidelite#3930,nom_societe#3925,jour#3923,CarteFidelitePresentee#3928,DetentionCF#3929]
    Filter ((((annee#3921 = 2014) && (mois#3922 = 1)) && (jour#3923 = 25)) && (id_magasin#3926 = 649))
     ParquetTableScan [DetentionCF#3929,heure#3924,ZIBZIN#3940,NbCheque#3935L,id_magasin#3926,ResteDuImpaye#3939,nom_societe#3925,NbCompteCarteFidelite#3932,CarteFidelitePresentee#3928,NbCarteFidelite#3930,CACheque#3936,CompteCarteFidelite#3931,mois#3922,jour#3923,Id_CF_Dim_DUCB#3933,NbImpaye#3937,annee#3921,Id_Ensemble#3938L], (ParquetRelation hdfs://nc-h07/user/hive/warehouse/testsimon3.db/cf_encaissement_fact_pq, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml), org.apache.spark.sql.hive.HiveContext@7db3bcc, []), []

        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:47)
        at org.apache.spark.sql.execution.Sort.execute(basicOperators.scala:206)
        at org.apache.spark.sql.execution.Project.execute(basicOperators.scala:43)
        at org.apache.spark.sql.columnar.InMemoryRelation.buildBuffers(InMemoryColumnarTableScan.scala:111)
        at org.apache.spark.sql.columnar.InMemoryRelation.<init>(InMemoryColumnarTableScan.scala:100)
        at org.apache.spark.sql.columnar.InMemoryRelation$.apply(InMemoryColumnarTableScan.scala:41)
        at org.apache.spark.sql.CacheManager$$anonfun$cacheQuery$1.apply(CacheManager.scala:93)
        at org.apache.spark.sql.CacheManager$class.writeLock(CacheManager.scala:67)
        at org.apache.spark.sql.CacheManager$class.cacheQuery(CacheManager.scala:85)
        at org.apache.spark.sql.SQLContext.cacheQuery(SQLContext.scala:50)
        at org.apache.spark.sql.CacheManager$class.cacheTable(CacheManager.scala:49)
        at org.apache.spark.sql.SQLContext.cacheTable(SQLContext.scala:50)
        at org.apache.spark.sql.execution.CacheTableCommand.run(commands.scala:143)
        at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:53)
        at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:53)
        at org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:61)
        at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:424)
        at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:424)
        at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
        at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
        at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:93)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:160)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:212)
        at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
        at com.sun.proxy.$Proxy14.executeStatement(Unknown Source)
        at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:220)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
!Exchange (RangePartitioning [annee#3901 ASC,mois#3902 ASC,jour#3903 ASC,heure#3904 ASC,nom_societe#3905 ASC,id_magasin#3906 ASC,CarteFidelitePresentee#3908 ASC,CompteCarteFidelite#3911 ASC,NbCompteCarteFidelite#3912 ASC,DetentionCF#3909 ASC,NbCarteFidelite#3910 ASC,Id_CF_Dim_DUCB#3913 ASC], 200)
 !OutputFaker [Annee#3883,Mois#3884,Jour#3885,Heure#3886,Societe#3887,Magasin#3888,CF Presentee#3889,CompteCarteFidelite#3890,NbCompteCarteFidelite#3891,DetentionCF#3892,NbCarteFidelite#3893,PlageDUCB#3894,NbCheque#3895L,CACheque#3896,NbImpaye#3897,NbEnsemble#3898L,NbCompte#3899,ResteDuImpaye#3900,id_magasin#3906,Mois#3884,Id_CF_Dim_DUCB#3913,NbCompteCarteFidelite#3891,Annee#3883,CompteCarteFidelite#3890,Heure#3886,NbCarteFidelite#3893,nom_societe#3905,Jour#3885,CarteFidelitePresentee#3908,DetentionCF#3892]
  Project [annee#3921 AS Annee#3883,mois#3922 AS Mois#3884,jour#3923 AS Jour#3885,heure#3924 AS Heure#3886,nom_societe#3925 AS Societe#3887,id_magasin#3926 AS Magasin#3888,CarteFidelitePresentee#3928 AS CF Presentee#3889,CompteCarteFidelite#3931 AS CompteCarteFidelite#3890,NbCompteCarteFidelite#3932 AS NbCompteCarteFidelite#3891,DetentionCF#3929 AS DetentionCF#3892,NbCarteFidelite#3930 AS NbCarteFidelite#3893,Id_CF_Dim_DUCB#3933 AS PlageDUCB#3894,NbCheque#3935L AS NbCheque#3895L,CACheque#3936 AS CACheque#3896,NbImpaye#3937 AS NbImpaye#3897,Id_Ensemble#3938L AS NbEnsemble#3898L,ZIBZIN#3940 AS NbCompte#3899,ResteDuImpaye#3939 AS ResteDuImpaye#3900,id_magasin#3926,mois#3922,Id_CF_Dim_DUCB#3933,NbCompteCarteFidelite#3932,annee#3921,CompteCarteFidelite#3931,heure#3924,NbCarteFidelite#3930,nom_societe#3925,jour#3923,CarteFidelitePresentee#3928,DetentionCF#3929]
   Filter ((((annee#3921 = 2014) && (mois#3922 = 1)) && (jour#3923 = 25)) && (id_magasin#3926 = 649))
    ParquetTableScan [DetentionCF#3929,heure#3924,ZIBZIN#3940,NbCheque#3935L,id_magasin#3926,ResteDuImpaye#3939,nom_societe#3925,NbCompteCarteFidelite#3932,CarteFidelitePresentee#3928,NbCarteFidelite#3930,CACheque#3936,CompteCarteFidelite#3931,mois#3922,jour#3923,Id_CF_Dim_DUCB#3933,NbImpaye#3937,annee#3921,Id_Ensemble#3938L], (ParquetRelation hdfs://nc-h07/user/hive/warehouse/testsimon3.db/cf_encaissement_fact_pq, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml), org.apache.spark.sql.hive.HiveContext@7db3bcc, []), []

        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:47)
        at org.apache.spark.sql.execution.Exchange.execute(Exchange.scala:47)
        at org.apache.spark.sql.execution.Sort$$anonfun$execute$4.apply(basicOperators.scala:207)
        at org.apache.spark.sql.execution.Sort$$anonfun$execute$4.apply(basicOperators.scala:207)
        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:46)
        ... 46 more
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute, tree: annee#3901
        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:47)
        at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:47)
        at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:46)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
        at scala.collection.AbstractIterator.to(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
        at org.apache.spark.sql.catalyst.expressions.BindReferences$.bindReference(BoundAttribute.scala:46)
        at org.apache.spark.sql.catalyst.expressions.RowOrdering$$anonfun$$init$$1.apply(Row.scala:244)
        at org.apache.spark.sql.catalyst.expressions.RowOrdering$$anonfun$$init$$1.apply(Row.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at org.apache.spark.sql.catalyst.expressions.RowOrdering.<init>(Row.scala:244)
        at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1.apply(Exchange.scala:86)
        at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1.apply(Exchange.scala:48)
        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:46)
        ... 50 more
Caused by: java.lang.RuntimeException: Couldn't find annee#3901 in [Annee#3883,Mois#3884,Jour#3885,Heure#3886,Societe#3887,Magasin#3888,CF Presentee#3889,CompteCarteFidelite#3890,NbCompteCarteFidelite#3891,DetentionCF#3892,NbCarteFidelite#3893,PlageDUCB#3894,NbCheque#3895L,CACheque#3896,NbImpaye#3897,NbEnsemble#3898L,NbCompte#3899,ResteDuImpaye#3900,id_magasin#3906,Mois#3884,Id_CF_Dim_DUCB#3913,NbCompteCarteFidelite#3891,Annee#3883,CompteCarteFidelite#3890,Heure#3886,NbCarteFidelite#3893,nom_societe#3905,Jour#3885,CarteFidelitePresentee#3908,DetentionCF#3892]
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$anonfun$applyOrElse$1.apply(BoundAttribute.scala:53)
        at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$anonfun$applyOrElse$1.apply(BoundAttribute.scala:47)
        at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:46)
        ... 82 more
15/01/13 17:10:33 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: sort, tree:
!Sort [annee#3901 ASC,mois#3902 ASC,jour#3903 ASC,heure#3904 ASC,nom_societe#3905 ASC,id_magasin#3906 ASC,CarteFidelitePresentee#3908 ASC,CompteCarteFidelite#3911 ASC,NbCompteCarteFidelite#3912 ASC,DetentionCF#3909 ASC,NbCarteFidelite#3910 ASC,Id_CF_Dim_DUCB#3913 ASC], true
 !Exchange (RangePartitioning [annee#3901 ASC,mois#3902 ASC,jour#3903 ASC,heure#3904 ASC,nom_societe#3905 ASC,id_magasin#3906 ASC,CarteFidelitePresentee#3908 ASC,CompteCarteFidelite#3911 ASC,NbCompteCarteFidelite#3912 ASC,DetentionCF#3909 ASC,NbCarteFidelite#3910 ASC,Id_CF_Dim_DUCB#3913 ASC], 200)
  !OutputFaker [Annee#3883,Mois#3884,Jour#3885,Heure#3886,Societe#3887,Magasin#3888,CF Presentee#3889,CompteCarteFidelite#3890,NbCompteCarteFidelite#3891,DetentionCF#3892,NbCarteFidelite#3893,PlageDUCB#3894,NbCheque#3895L,CACheque#3896,NbImpaye#3897,NbEnsemble#3898L,NbCompte#3899,ResteDuImpaye#3900,id_magasin#3906,Mois#3884,Id_CF_Dim_DUCB#3913,NbCompteCarteFidelite#3891,Annee#3883,CompteCarteFidelite#3890,Heure#3886,NbCarteFidelite#3893,nom_societe#3905,Jour#3885,CarteFidelitePresentee#3908,DetentionCF#3892]
   Project [annee#3921 AS Annee#3883,mois#3922 AS Mois#3884,jour#3923 AS Jour#3885,heure#3924 AS Heure#3886,nom_societe#3925 AS Societe#3887,id_magasin#3926 AS Magasin#3888,CarteFidelitePresentee#3928 AS CF Presentee#3889,CompteCarteFidelite#3931 AS CompteCarteFidelite#3890,NbCompteCarteFidelite#3932 AS NbCompteCarteFidelite#3891,DetentionCF#3929 AS DetentionCF#3892,NbCarteFidelite#3930 AS NbCarteFidelite#3893,Id_CF_Dim_DUCB#3933 AS PlageDUCB#3894,NbCheque#3935L AS NbCheque#3895L,CACheque#3936 AS CACheque#3896,NbImpaye#3937 AS NbImpaye#3897,Id_Ensemble#3938L AS NbEnsemble#3898L,ZIBZIN#3940 AS NbCompte#3899,ResteDuImpaye#3939 AS ResteDuImpaye#3900,id_magasin#3926,mois#3922,Id_CF_Dim_DUCB#3933,NbCompteCarteFidelite#3932,annee#3921,CompteCarteFidelite#3931,heure#3924,NbCarteFidelite#3930,nom_societe#3925,jour#3923,CarteFidelitePresentee#3928,DetentionCF#3929]
    Filter ((((annee#3921 = 2014) && (mois#3922 = 1)) && (jour#3923 = 25)) && (id_magasin#3926 = 649))
     ParquetTableScan [DetentionCF#3929,heure#3924,ZIBZIN#3940,NbCheque#3935L,id_magasin#3926,ResteDuImpaye#3939,nom_societe#3925,NbCompteCarteFidelite#3932,CarteFidelitePresentee#3928,NbCarteFidelite#3930,CACheque#3936,CompteCarteFidelite#3931,mois#3922,jour#3923,Id_CF_Dim_DUCB#3933,NbImpaye#3937,annee#3921,Id_Ensemble#3938L], (ParquetRelation hdfs://nc-h07/user/hive/warehouse/testsimon3.db/cf_encaissement_fact_pq, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml), org.apache.spark.sql.hive.HiveContext@7db3bcc, []), []

        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:189)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:212)
        at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
        at com.sun.proxy.$Proxy14.executeStatement(Unknown Source)
        at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:220)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

{noformat}
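To spare readers the full query: the trigger appears to be a qualified column reference (`table`.`column`) in ORDER BY, which the planner binds against renamed output attributes and then cannot find (hence the "Couldn't find annee#3901 in [Annee#3883,...]" error above). A hypothetical minimal pair, with placeholder table and column names rather than my real schema:

{code:sql}
-- Fails (in my setup) with a TreeNodeException / "Couldn't find ..." binding error:
select `t`.`a` as `A`
from `db`.`t` as `t`
order by `t`.`a` ASC;

-- Works: same query with the table qualifier dropped in ORDER BY:
select `t`.`a` as `A`
from `db`.`t` as `t`
order by `a` ASC;
{code}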

> Wrong parse of GROUP BY query
> -----------------------------
>
>                 Key: SPARK-4794
>                 URL: https://issues.apache.org/jira/browse/SPARK-4794
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Damien Carol
>
> Spark is not able to parse this query:
> {code:sql}
> select
>     `cf_encaissement_fact_pq`.`annee` as `Annee`,
>     `cf_encaissement_fact_pq`.`mois` as `Mois`,
>     `cf_encaissement_fact_pq`.`jour` as `Jour`,
>     `cf_encaissement_fact_pq`.`heure` as `Heure`,
>     `cf_encaissement_fact_pq`.`nom_societe` as `Societe`,
>     `cf_encaissement_fact_pq`.`id_magasin` as `Magasin`,
>     `cf_encaissement_fact_pq`.`CarteFidelitePresentee` as `CF_Presentee`,
>     `cf_encaissement_fact_pq`.`CompteCarteFidelite` as `CompteCarteFidelite`,
>     `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` as `NbCompteCarteFidelite`,
>     `cf_encaissement_fact_pq`.`DetentionCF` as `DetentionCF`,
>     `cf_encaissement_fact_pq`.`NbCarteFidelite` as `NbCarteFidelite`,
>     `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` as `Plage_DUCB`,
>     `cf_encaissement_fact_pq`.`NbCheque` as `NbCheque`,
>     `cf_encaissement_fact_pq`.`CACheque` as `CACheque`,
>     `cf_encaissement_fact_pq`.`NbImpaye` as `NbImpaye`,
>     `cf_encaissement_fact_pq`.`Id_Ensemble` as `NbEnsemble`,
>     `cf_encaissement_fact_pq`.`ZIBZIN` as `NbCompte`,
>     `cf_encaissement_fact_pq`.`ResteDuImpaye` as `ResteDuImpaye`
> from
>     `testsimon3`.`cf_encaissement_fact_pq` as `cf_encaissement_fact_pq`
> where
>     `cf_encaissement_fact_pq`.`annee` = 2013
> and
>     `cf_encaissement_fact_pq`.`mois` = 7
> and
>     `cf_encaissement_fact_pq`.`jour` = 12
> order by
>     `cf_encaissement_fact_pq`.`annee` ASC,
>     `cf_encaissement_fact_pq`.`mois` ASC,
>     `cf_encaissement_fact_pq`.`jour` ASC,
>     `cf_encaissement_fact_pq`.`heure` ASC,
>     `cf_encaissement_fact_pq`.`nom_societe` ASC,
>     `cf_encaissement_fact_pq`.`id_magasin` ASC,
>     `cf_encaissement_fact_pq`.`CarteFidelitePresentee` ASC,
>     `cf_encaissement_fact_pq`.`CompteCarteFidelite` ASC,
>     `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` ASC,
>     `cf_encaissement_fact_pq`.`DetentionCF` ASC,
>     `cf_encaissement_fact_pq`.`NbCarteFidelite` ASC,
>     `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` ASC
> {code}
> If I remove the table name from the ORDER BY clause, Spark can handle it.
> {code:sql}
> select
>     `cf_encaissement_fact_pq`.`annee` as `Annee`,
>     `cf_encaissement_fact_pq`.`mois` as `Mois`,
>     `cf_encaissement_fact_pq`.`jour` as `Jour`,
>     `cf_encaissement_fact_pq`.`heure` as `Heure`,
>     `cf_encaissement_fact_pq`.`nom_societe` as `Societe`,
>     `cf_encaissement_fact_pq`.`id_magasin` as `Magasin`,
>     `cf_encaissement_fact_pq`.`CarteFidelitePresentee` as `CFPresentee`,
>     `cf_encaissement_fact_pq`.`CompteCarteFidelite` as `CompteCarteFidelite`,
>     `cf_encaissement_fact_pq`.`NbCompteCarteFidelite` as `NbCompteCarteFidelite`,
>     `cf_encaissement_fact_pq`.`DetentionCF` as `DetentionCF`,
>     `cf_encaissement_fact_pq`.`NbCarteFidelite` as `NbCarteFidelite`,
>     `cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` as `PlageDUCB`,
>     `cf_encaissement_fact_pq`.`NbCheque` as `NbCheque`,
>     `cf_encaissement_fact_pq`.`CACheque` as `CACheque`,
>     `cf_encaissement_fact_pq`.`NbImpaye` as `NbImpaye`,
>     `cf_encaissement_fact_pq`.`Id_Ensemble` as `NbEnsemble`,
>     `cf_encaissement_fact_pq`.`ZIBZIN` as `NbCompte`,
>     `cf_encaissement_fact_pq`.`ResteDuImpaye` as `ResteDuImpaye`
> from
>     `testsimon3`.`cf_encaissement_fact_pq` as `cf_encaissement_fact_pq`
> where
>     `cf_encaissement_fact_pq`.`annee` = 2013
> and
>     `cf_encaissement_fact_pq`.`mois` = 7
> and
>     `cf_encaissement_fact_pq`.`jour` = 12
> order by
>     `annee` ASC,
>     `mois` ASC,
>     `jour` ASC,
>     `heure` ASC,
>     `nom_societe` ASC,
>     `id_magasin` ASC,
>     `CarteFidelitePresentee` ASC,
>     `CompteCarteFidelite` ASC,
>     `NbCompteCarteFidelite` ASC,
>     `DetentionCF` ASC,
>     `NbCarteFidelite` ASC,
>     `Id_CF_Dim_DUCB` ASC
> {code}
> I'm using Spark master with the Thrift server (Hive 0.12).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
