
Posted to dev@hive.apache.org by "Esteban Gutierrez (JIRA)" <ji...@apache.org> on 2011/04/22 05:51:05 UTC

[jira] [Created] (HIVE-2124) Specified functions in the partitioning predicates should not generate a M/R job.

Specified functions in the partitioning predicates should not generate a M/R job.
---------------------------------------------------------------------------------

                 Key: HIVE-2124
                 URL: https://issues.apache.org/jira/browse/HIVE-2124
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
    Affects Versions: 0.7.0, 0.6.0, 0.5.0
            Reporter: Esteban Gutierrez
            Priority: Minor


In some situations, letting the user specify which functions should be evaluated only once would simplify query syntax and avoid launching M/R jobs.

Example:

# myhql.time=`date "+%s"` -> constant
# counting rows from the last 30 days generates a M/R job using all the partitions
$ hive -hiveconf myhql.time=`date "+%s"` -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(\${hiveconf:myhql.time}-2592000,'yyyy-MM-dd');"

Suggested feature:

# will scan only the right partitions
$ hive -hiveconf hive.partition.evaluateonce=unix_timestamp -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(unix_timestamp()-2592000,'yyyy-MM-dd');"
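
Until something like the proposed setting exists, one possible workaround is to compute the cutoff date entirely in the shell and pass it to Hive as a plain string literal, so that the partition predicate contains no function call at all. The following is only a sketch under assumptions: the variable name myhql.cutoff is made up for illustration, GNU date is assumed for the -d "@<seconds>" form, and whether Hive then prunes partitions depends on the Hive version and has not been verified here.

```shell
#!/bin/sh
# Sketch only: myhql.cutoff is a hypothetical variable name; GNU date is assumed.

# 30 days in seconds, the same constant (2592000) used in the queries above.
window_seconds=$((30 * 24 * 60 * 60))

# Current epoch time, then the calendar date 30 days earlier (local time zone).
now=$(date "+%s")
cutoff=$(date -d "@$((now - window_seconds))" "+%Y-%m-%d")

# \$ keeps the shell from expanding the variable; Hive substitutes it instead.
query="SELECT COUNT(*) FROM mybigtable WHERE mypartition >= '\${hiveconf:myhql.cutoff}';"

if command -v hive >/dev/null 2>&1; then
    # Run the query with the precomputed date passed in as a configuration variable.
    hive -hiveconf "myhql.cutoff=$cutoff" -e "$query"
else
    # No Hive client on this machine: show what would have been run.
    echo "cutoff=$cutoff"
    echo "$query"
fi
```

The difference from the first example is that the date arithmetic happens before Hive sees the query, so the predicate compares the partition column against a literal such as '2011-03-23' rather than against the result of from_unixtime().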



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-2124) Specified functions in the partitioning predicates should not generate a M/R job.

Posted by "Esteban Gutierrez (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Esteban Gutierrez updated HIVE-2124:
------------------------------------

    Description: 
In some situations, letting the user specify which functions should be evaluated only once would simplify query syntax and avoid launching M/R jobs.

Example:

# myhql.time=`date "+%s"` -> constant
# counting rows from the last 30 days generates a M/R job using all the partitions
$ hive -hiveconf myhql.time=`date "+%s"` -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(\${hiveconf:myhql.time}-2592000,'yyyy-MM-dd');"

Suggested feature:

# will scan only the right partitions
$ hive -hiveconf hive.partition.evaluateonce=unix_timestamp -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(unix_timestamp()-2592000,'yyyy-MM-dd');"




> Specified functions in the partitioning predicates should not generate a M/R job.
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-2124
>                 URL: https://issues.apache.org/jira/browse/HIVE-2124
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.5.0, 0.6.0, 0.7.0
>            Reporter: Esteban Gutierrez
>            Priority: Minor
>              Labels: features, new
