Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2010/08/02 20:47:18 UTC

[jira] Commented: (HIVE-1506) Optimize number of mr jobs produced by group by sort by

    [ https://issues.apache.org/jira/browse/HIVE-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894645#action_12894645 ] 

He Yongqiang commented on HIVE-1506:
------------------------------------

This is not a common case, because the sort by columns are a prefix of the group by columns. The sort by clause can simply be removed from the query, and the output will be the same.
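
For illustration, here is the first query from the description with the sort by clause dropped. This is only a sketch: it assumes the closing parenthesis on count(value) that the description omits, and it assumes the group by stage already emits rows ordered by its keys within each reducer, which is what makes the sort by on the prefix column key redundant.

```sql
-- Original form (with the count(value) parenthesis assumed closed):
--   select key, INPUT__FILE__NAME, count(value)
--   from src
--   group by key, INPUT__FILE__NAME
--   sort by key;

-- Same query without sort by; "key" is a prefix of the group by columns,
-- so the per-reducer ordering should be unchanged under the assumption above.
select key, INPUT__FILE__NAME, count(value)
from src
group by key, INPUT__FILE__NAME;
```

Comparing the plans with EXPLAIN on both forms would be one way to check how many jobs each one produces.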

> Optimize number of mr jobs produced by group by sort by
> -------------------------------------------------------
>
>                 Key: HIVE-1506
>                 URL: https://issues.apache.org/jira/browse/HIVE-1506
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>
> Right now,
> select key, INPUT__FILE__NAME, count(value) from src group by key, INPUT__FILE__NAME sort by key
> requires 2 jobs
> and
> select key, INPUT__FILE__NAME, count(value) from src group by key, INPUT__FILE__NAME sort by key limit 3
> requires 3 jobs.
> Both can be done with just one job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.