Posted to user@hive.apache.org by George Liaw <ge...@gmail.com> on 2016/12/02 06:32:28 UTC

Hive shell not using manually set tez container size

Not really sure if this is an issue on the Hive or Tez side, but when we
open a Hive shell and set tez.task.resource.memory.mb to a value different
from the one in tez-site.xml, the query we run doesn't seem to pick up the
setting and uses the value from the config file instead. However, if we let
the Tez session time out within the shell and then set the Tez task memory
to some value, the query run in the new session uses the manual setting as
expected. Setting the option on the command line when launching the Hive
shell also works as expected.

Has anyone else run into this, or does anyone know the root cause of this
behavior?
Using Hive 2.0.1 and Tez 0.8.4.

-- 
George A. Liaw

(408) 318-7920
george.a.liaw@gmail.com
LinkedIn <http://www.linkedin.com/in/georgeliaw/>

Re: Hive shell not using manually set tez container size

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> even that setting is not being applied after the hive shell is started and a query is executed. 

Are you increasing it or decreasing it? 

Tez will reuse existing larger containers, instead of releasing them - reducing the parameter has almost no effect without a session restart.

Also, the requested size is rounded up by YARN internally to yarn.scheduler.minimum-allocation-mb, so the actual allocation will always be a multiple of yarn.scheduler.minimum-allocation-mb.

This often causes confusion: when the YARN config says 1536 while the Hive config says 2048, the actual allocation is 3072, as it is for every requested value from 1537 to 3072.
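
[Editor's sketch] The rounding described above can be illustrated with a few lines of Python. This is only an arithmetic model of "round up to the next multiple of the minimum allocation"; the function name yarn_allocation_mb is hypothetical and is not part of YARN, Tez, or Hive, and some schedulers may round by a different increment.

```python
def yarn_allocation_mb(requested_mb: int, min_allocation_mb: int) -> int:
    """Round a memory request up to the next multiple of the minimum allocation.

    Hypothetical helper: models the behavior described in the thread,
    not the actual YARN scheduler code.
    """
    if requested_mb <= 0 or min_allocation_mb <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division using integers only, then scale back up.
    multiples = -(-requested_mb // min_allocation_mb)
    return multiples * min_allocation_mb


if __name__ == "__main__":
    # With a 1536 MB minimum, requests of 1537, 2048 and 3072 MB
    # all land on the same 3072 MB allocation.
    for requested in (1536, 1537, 2048, 3072):
        print(requested, "->", yarn_allocation_mb(requested, 1536))
```

Under this model, changing hive.tez.container.size between two values in the same bucket (for example 2048 and 2560 with a 1536 minimum) would not change the container size actually granted.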

One point to note: if you configure hive.tez.java.opts without the -Xmx parameter, tez-0.8+ will fill it in with 80% of the container size actually allocated by YARN, instead of using a fixed value.
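
[Editor's sketch] A rough model of the heap default mentioned above, assuming the 80% figure from this reply. The function names and the simple substring check for -Xmx are illustrative assumptions, not Tez's actual implementation.

```python
def needs_default_xmx(java_opts: str) -> bool:
    """True when the JVM options carry no explicit -Xmx setting (simplified check)."""
    return "-Xmx" not in java_opts


def default_heap_mb(allocated_container_mb: int, heap_fraction: float = 0.8) -> int:
    """Heap size derived from the container YARN actually granted.

    heap_fraction=0.8 mirrors the 80% mentioned in the thread; treat it as
    an assumption rather than a guaranteed default.
    """
    return int(allocated_container_mb * heap_fraction)


if __name__ == "__main__":
    opts = "-XX:+UseG1GC"  # hypothetical hive.tez.java.opts value without -Xmx
    if needs_default_xmx(opts):
        # The heap is derived from the allocated size (3072), not the requested 2048.
        print("heap MB:", default_heap_mb(3072))
```

The point of the model is that the heap follows the allocated container, so the rounding by YARN also affects the JVM heap when -Xmx is left out.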

> Btw, what's the difference between hive.tez.container.size and tez.task.resource.memory.mb? Should they always be the same?

The hive parameter (hive.tez.container.size) is the one encoded into the DAG, and the tez parameter (tez.task.resource.memory.mb) is the fall-back if the hive parameter is -1.
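
[Editor's sketch] The precedence described above, written out as a small Python function. It only restates this reply's description; the function name is hypothetical, and treating any non-positive value as "unset" (rather than exactly -1) is an assumption.

```python
def effective_container_mb(hive_tez_container_size: int,
                           tez_task_resource_memory_mb: int) -> int:
    """Pick the container size per the precedence described in the thread.

    The Hive setting wins when it is set; a non-positive value such as -1
    is treated as unset, in which case the Tez setting is used.
    """
    if hive_tez_container_size > 0:
        return hive_tez_container_size
    return tez_task_resource_memory_mb


if __name__ == "__main__":
    print(effective_container_mb(-1, 1024))    # Hive unset: Tez value applies
    print(effective_container_mb(2048, 1024))  # Hive set: Tez value is ignored
```

Under this reading the two settings do not need to match, but a change to tez.task.resource.memory.mb would only matter while hive.tez.container.size is left at -1.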

Cheers,
Gopal



Re: Hive shell not using manually set tez container size

Posted by Premal Shah <pr...@gmail.com>.
Gopal,
even that setting is not being applied after the hive shell is started and
a query is executed.

Btw, what's the difference between hive.tez.container.size and
tez.task.resource.memory.mb? Should they always be the same?

On Thu, Dec 1, 2016 at 11:56 PM, Gopal Vijayaraghavan <go...@apache.org>
wrote:

>
> > set tez.task.resource.memory.mb to a different value than listed in
> tez-site.xml, the query that's run doesn't seem to pick up the setting and
> instead uses the one in the config file.
>
> Why not use the setting Hive uses in the submitted vertex?
>
> set hive.tez.container.size=?
>
> Cheers,
> Gopal
>
>
>


-- 
Regards,
Premal Shah.

Re: Hive shell not using manually set tez container size

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> set tez.task.resource.memory.mb to a different value than listed in tez-site.xml, the query that's run doesn't seem to pick up the setting and instead uses the one in the config file. 

Why not use the setting Hive uses in the submitted vertex?

set hive.tez.container.size=?

Cheers,
Gopal