Posted to user@hive.apache.org by Jeff Zhang <zj...@gmail.com> on 2015/03/02 11:17:35 UTC
Where does Hive do sampling in ORDER BY?
A total ORDER BY usually involves two steps (a sampling job and a
repartition job), but Hive runs only one MR job for ORDER BY. So I am
wondering: when and where does Hive do the sampling? On the client side?
--
Best Regards
Jeff Zhang
Re: Where does Hive do sampling in ORDER BY?
Posted by Xuefu Zhang <xz...@cloudera.com>.
There is no sampling for ORDER BY in Hive. Hive uses a single reducer for
ORDER BY (if you're talking about the MR execution engine), so one MR job
is enough to produce a total order.
Hive on Spark is different in this respect, though.
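To make the contrast concrete, here is a minimal Python sketch of the two
strategies discussed in this thread. It is not Hive code: every function
name below is hypothetical, and the in-memory lists only stand in for
mapper output and reducer input. It assumes the keys are directly
comparable and that the sample is drawn uniformly at random.

```python
import bisect
import random


def single_reducer_sort(rows):
    """One reducer receives every row and sorts them all.

    This mirrors the single-reducer ORDER BY described above: no sampling
    is needed because there is only one partition.
    """
    return sorted(rows)


def choose_split_points(rows, num_partitions, sample_size, rng):
    """Sampling step: derive num_partitions - 1 boundaries from a sample."""
    sample = sorted(rng.sample(rows, min(sample_size, len(rows))))
    if not sample:
        return []
    # Take evenly spaced elements of the sorted sample as range boundaries.
    return [sample[(i * len(sample)) // num_partitions]
            for i in range(1, num_partitions)]


def sampled_range_sort(rows, num_partitions=4, sample_size=100, seed=0):
    """Two-step total order: sample, then range-partition and sort."""
    rng = random.Random(seed)  # fixed seed only to keep the sketch repeatable
    splits = choose_split_points(rows, num_partitions, sample_size, rng)

    # Repartition step: route each row to the range its key falls into.
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[bisect.bisect_right(splits, row)].append(row)

    # Each "reducer" sorts its own range; because the ranges are ordered,
    # concatenating the outputs gives a globally sorted result.
    result = []
    for part in partitions:
        result.extend(sorted(part))
    return result
```

Both functions return the same globally sorted list. The difference is
that the second one spreads the sorting work across several partitions, at
the cost of the extra sampling pass the original question asks about.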
Thanks,
Xuefu