Posted to user@hive.apache.org by LLBian <li...@126.com> on 2016/01/19 05:45:44 UTC

Re:Re: what is the difference between "hive.compute.splits.in.am=true" and "hive.compute.splits.in.am=false"


Thank you so much for your quick response. Yes, the option is used only for Hive on Tez. I want to understand where it comes from and the principle behind it.
Maybe this resource “http://www.slideshare.net/Hadoop_Summit/w-235phall1pandey/29” is very useful, but I cannot access it from our country (maybe for political reasons). Could you please point me to other explanations?

Thank you & Best Regards

---LLBian

At 2016-01-19 11:44:02, "Gopal Vijayaraghavan" <go...@apache.org> wrote:
>
>>what is the difference between "hive.compute.splits.in.am=true" and
>>"hive.compute.splits.in.am=false"?
>>which value is better?
>
>First up, those options are specific to Tez.
>
>The old MapReduce model was to always compute splits before asking for
>resources to run. And this uses the gateway host (where the CLI runs) to
>do that.
>
>That model runs sequentially and overloads a single gateway machine during
>heavy concurrency, particularly when used via ODBC (HiveServer2 mode).
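>
>As a minimal sketch, assuming a Hive session running on Tez, the option can
>be toggled per session as below. The comments paraphrase the HiveConf
>description of the setting ("whether to generate the splits locally or in
>the AM"); they are not a statement about which value suits a given cluster.
>
>```sql
>-- Assumption: the session runs on Tez; the option is ignored otherwise.
>set hive.execution.engine=tez;
>
>-- true: split generation happens in the Tez ApplicationMaster (AM),
>-- on the cluster, rather than on the gateway / HiveServer2 host.
>set hive.compute.splits.in.am=true;
>
>-- false: splits are generated locally, on the host that submits the
>-- query, before the job asks for resources (the MapReduce-style model).
>-- set hive.compute.splits.in.am=false;
>```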
>
>Here's an old slide explaining how computing splits in the Tez AM instead
>speeds up queries.
>
>http://www.slideshare.net/Hadoop_Summit/w-235phall1pandey/29
>
>
>This dynamic & pipelined model lays down the foundation for optimizations
>like Tez's dynamic partition pruning.
>
>Cheers,
>Gopal
>
>