You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Alexandre Fouché (JIRA)" <ji...@apache.org> on 2013/01/20 11:40:13 UTC
[jira] [Created] (HIVE-3919) Partition query refused when PPD is disabled (due to HIVE-3699)

Alexandre Fouché created HIVE-3919:
--------------------------------------

             Summary: Partition query refused when PPD is disabled (due to HIVE-3699)
                 Key: HIVE-3919
                 URL: https://issues.apache.org/jira/browse/HIVE-3919
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
         Environment: Cloudera 4.1 on Amazon Linux (rebranded Centos 6): hive-0.9.0+150-1.cdh4.1.1.p0.4.el6.noarch

            Reporter: Alexandre Fouché


The query below was working perfectly when hive.optimize.ppd=true. But i had to disable this property because my results are impacted by unresolved HIVE-3699.
But now that hive.optimize.ppd=false, the query from table1 partitionned by field 'day' is refused, even if the query has the partition predicate.

>From HIVE-830 it says that the partition pruner depends on it, so that may explain why my queries are refused when Predictive Push Down is disabled.

The query is accepted and run if i set hive.mapred.mode=nonstrict. But from Hive-server2 logs and input path lists and also processing, all dataset regardless of partitions is deserialized !

    set hive.optimize.ppd = true;
    set hive.mapred.mode=strict;
    FROM table1
    INSERT OVERWRITE TABLE general1
    SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND (day>=20121015 and day<=20130103) GROUP BY qs_cs_s_kid, day, qs_cs_s_aid;
    -> Parsed fine, but wrong results when i add multiple 'insert overwrite'

    set hive.optimize.ppd = false;
    set hive.mapred.mode=strict;
    FROM table1
    INSERT OVERWRITE TABLE general1
    SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND (day>=20121015 and day<=20130103) GROUP BY qs_cs_s_kid, day, qs_cs_s_aid;
    Error: Error while processing statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "table1" Table "table1" (state=42000,code=10041)
    Error: Error while processing statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "table1" Table "table1" (state=42000,code=10041)

The table 'table1' was created like this, it is partitionned by 'day':

    CREATE TABLE table1
    PARTITIONED BY (day BIGINT)
    ROW FORMAT
    SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS
    INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    TBLPROPERTIES ('avro.schema.literal'='{
      "type": "record",
      "name": "Log",
      "fields" : [
        {"name": "time_stamp", "type": ["null", "int"], "default": null},
        (...)
      ]
    }')
    ;


    set hive.optimize.ppd = false;
    set hive.mapred.mode=strict;
    FROM table1
    INSERT OVERWRITE TABLE general1
    SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid;
    -> FAILS TOO

    set hive.optimize.ppd = false;
    INSERT OVERWRITE TABLE general1
    SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) FROM table1 WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid;
    -> FAILS TOO

    set hive.optimize.ppd = false;
    set hive.mapred.mode=nonstrict;
    FROM table1
    INSERT OVERWRITE TABLE general1
    SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid;
    -> WORKS, but from Hive-server2 logs and input path lists and also processing, all dataset regardless of partitions is deserialized !


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira