Posted to user@tez.apache.org by Hitesh Shah <hi...@apache.org> on 2016/11/02 00:58:57 UTC

Re: Hive+Tez staging dir and scratch dir

Hello Dharmesh,

The Tez staging dir is where scratch data is kept for the lifetime of the Tez session, i.e. data that can be deleted once the application completes.
  
Staging data includes the following:
  - recovery logs used by the Tez AM for checkpointing state
  - configs and/or DAG plan payloads that are sent across to the AM via the staging dir

This staging directory location is configurable and can be overridden by the upper-layer application. In the case of Hive, Hive uses its scratch dir as the Tez staging dir for the lifetime of the Hive session.
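
As a rough illustration of that override, here is a minimal Python sketch. It is not Tez or Hive code: the function effective_tez_staging_dir and all paths are hypothetical and exist only to model the behaviour described above, where the application's choice (for Hive, the value of hive.exec.scratchdir) takes precedence over tez.staging-dir.

```python
# Hypothetical model of the override described above; not Tez or Hive code.
def effective_tez_staging_dir(tez_conf, app_override=None):
    """Return the staging dir a Tez session would use in this model.

    tez_conf: dict of config properties (only 'tez.staging-dir' is read).
    app_override: path chosen by the upper-layer application, or None.
    """
    if app_override is not None:
        # The upper-layer application (e.g. Hive) overrides the Tez setting.
        return app_override
    return tez_conf.get("tez.staging-dir")


# Placeholder paths, assumed for illustration only.
tez_conf = {"tez.staging-dir": "/tmp/example/tez-staging"}
hive_conf = {"hive.exec.scratchdir": "/tmp/example/hive-scratch"}

# Plain Tez application: the configured tez.staging-dir is used.
standalone = effective_tez_staging_dir(tez_conf)

# Hive on Tez: Hive passes its scratch dir as the staging dir.
under_hive = effective_tez_staging_dir(
    tez_conf, app_override=hive_conf["hive.exec.scratchdir"]
)

print(standalone)
print(under_hive)
```

In this model, setting tez.staging-dir by hand has no effect once Hive supplies its scratch dir, which is why the scratch dir is the setting to look at when running Hive on Tez.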

For how Hive itself uses its staging dir and scratch dir, I suggest asking on the user@hive mailing list.

thanks
— Hitesh 


> On Oct 31, 2016, at 2:41 PM, Dharmesh Kakadia <dh...@gmail.com> wrote:
> 
> Hi,
> 
> I am trying to understand the meaning of, and the relation between, the following configurations when running Hive on Tez.
> 
> hive.exec.stagingdir
> tez.staging-dir
> hive.exec.scratchdir	
> 
> Thanks,
> Dharmesh


Re: Hive+Tez staging dir and scratch dir

Posted by Dharmesh Kakadia <dh...@gmail.com>.
Thanks, Hitesh, for the clarification. I am running out of disk space while
converting a large table from text to ORC format. Does ORC conversion use
local disk space while the query is running?

Thanks,
Dharmesh
