Posted to user@hbase.apache.org by anil gupta <an...@gmail.com> on 2012/12/18 00:19:41 UTC

Role of hbase.tmp.dir in HBase

Hi All,

I am trying to figure out the exact role of "hbase.tmp.dir" in HBase, but I
could not find any detailed reference on the HBase wiki or in the mailing list
archives. Can anybody tell me what hbase.tmp.dir is used for? Is
it a comma-separated value that can take multiple directories? Any
reference document on it would be highly appreciated.

-- 
Thanks & Regards,
Anil Gupta

Re: Role of hbase.tmp.dir in HBase

Posted by Nick Dimiduk <nd...@gmail.com>.
This directory is used by the RegionServers during compactions to store
intermediate data. See:

$ git grep 'hbase.tmp.dir'
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java:
 private final static String CONF_TMP_DIR = "hbase.tmp.dir";
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:
   final Path logdir = new Path(c.get("hbase.tmp.dir"));
...

Both of these uses indicate that it is a single path. I imagine you'd get
exceptions if you attempted to feed it a comma-delimited list of paths.
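
A rough sketch of the single-path point above, in plain Java with no HBase or
Hadoop dependency. The names TmpDirCheck and requireSinglePath are hypothetical
and invented for illustration; nothing like this is confirmed to exist in
HBase. It only shows a fail-fast check that deployment tooling could run before
the value reaches code such as new Path(c.get("hbase.tmp.dir")), which treats
the whole string as one path.

```java
// Hypothetical pre-flight check; NOT part of HBase or Hadoop.
final class TmpDirCheck {
    private TmpDirCheck() {}

    /**
     * Returns the trimmed value if it looks like a single directory.
     * Throws IllegalArgumentException if the value is empty or contains
     * a comma, i.e. looks like a list of directories.
     */
    static String requireSinglePath(String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("hbase.tmp.dir is empty");
        }
        if (value.contains(",")) {
            throw new IllegalArgumentException(
                "hbase.tmp.dir must be a single directory, got: " + value);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // A single directory passes through unchanged.
        System.out.println(requireSinglePath("/data/1/hbase-tmp"));
        // A comma-delimited list is rejected up front.
        try {
            requireSinglePath("/data/1/hbase-tmp,/data/2/hbase-tmp");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A comma is technically legal in a directory name on most filesystems, so this
check is deliberately stricter than the filesystem. Whether an unchecked list
would surface as an exception or as an oddly named directory is not something
this sketch tests.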

-n

Re: Role of hbase.tmp.dir in HBase

Posted by Harsh J <ha...@cloudera.com>.
You're correct - I spoke with only user-data in mind.

On Tue, Dec 18, 2012 at 8:52 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> IIRC ZK's data will still go there if HBase manages it, even in
> distributed instances.
>
> J-D
>
> On Mon, Dec 17, 2012 at 7:12 PM, Harsh J <ha...@cloudera.com> wrote:
>> A distributed mode of HBase does not make use of the hbase.tmp.dir in
>> any way. It simply leverages the DataNode's ability to scale over
>> multiple disks and leaves the dirty work to it.
>>
>> Makes sense to be parallelized for "beefier" standalone instances, but
>> I wonder who uses those and how it may even be done as HBase
>> expects/uses a flat directory structure presently.



-- 
Harsh J

Re: Role of hbase.tmp.dir in HBase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
IIRC ZK's data will still go there if HBase manages it, even in
distributed instances.

J-D

On Mon, Dec 17, 2012 at 7:12 PM, Harsh J <ha...@cloudera.com> wrote:
> A distributed mode of HBase does not make use of the hbase.tmp.dir in
> any way. It simply leverages the DataNode's ability to scale over
> multiple disks and leaves the dirty work to it.
>
> Makes sense to be parallelized for "beefier" standalone instances, but
> I wonder who uses those and how it may even be done as HBase
> expects/uses a flat directory structure presently.

Re: Role of hbase.tmp.dir in HBase

Posted by Harsh J <ha...@cloudera.com>.
A distributed mode of HBase does not make use of the hbase.tmp.dir in
any way. It simply leverages the DataNode's ability to scale over
multiple disks and leaves the dirty work to it.

Makes sense to be parallelized for "beefier" standalone instances, but
I wonder who uses those and how it may even be done as HBase
expects/uses a flat directory structure presently.


-- 
Harsh J

Re: Role of hbase.tmp.dir in HBase

Posted by Nick Dimiduk <nd...@gmail.com>.
On Mon, Dec 17, 2012 at 5:20 PM, anil gupta <an...@gmail.com> wrote:

> @Nick: I am using HBase 0.92.1, CompactionTool.java is part of HBase 0.96
> as per https://issues.apache.org/jira/browse/HBASE-7253.
>

Fair enough; I grepped against trunk.

> I have 10 disks on my slave node that will primarily be used for serving
> HBase queries(very less MR ). So, i was trying to distribute my disk I/O
> load evenly among the disk. Will it be fine if i just dedicate 1 disk for
> hadoop.tmp.dir or 1 disk is also a overkill for hbase.tmp.dir.
>

The dedicated IO could help to alleviate compaction pain -- the question
is, will you experience compaction pain? Does your workload include
frequent mutations (Puts, Deletes)? If the answer is 'no' (as your
description above implies), you'll likely not benefit very much from the
dedicated platter; better use of the spindle will likely be for the
DataNode. You can probably co-locate tmp.dir with a low-intensity resource.
Then again, if you have 10 drives, why not?

Re: Role of hbase.tmp.dir in HBase

Posted by anil gupta <an...@gmail.com>.
Hi Stack and Nick,
Thanks for the reply.

@Stack:
I went through the following link before posting my query:
http://hbase.apache.org/book.html#hbase.tmp.dir

hbase.tmp.dir

Temporary directory on the local filesystem. Change this setting to point
to a location more permanent than '/tmp' (The '/tmp' directory is often
cleared on machine restart).

Default: /tmp/hbase-${user.name}
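
To make the default above concrete: Hadoop's Configuration class expands
${...} references against other properties and Java system properties when a
value is read. The sketch below only imitates the system-property part in
plain Java; TmpDirDefault and expand are hypothetical names, not HBase or
Hadoop APIs, and the real expansion logic may differ in detail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical imitation of ${...} expansion; NOT Hadoop's implementation.
final class TmpDirDefault {
    // Matches ${some.property.name}
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    private TmpDirDefault() {}

    /** Replaces each ${name} with the Java system property of that name. */
    static String expand(String raw) {
        Matcher m = VAR.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = System.getProperty(m.group(1));
            // Unknown variables are left untouched in this sketch.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints /tmp/hbase-<login name of the user running the JVM>
        System.out.println(expand("/tmp/hbase-${user.name}"));
    }
}
```

The practical consequence is that each OS user running HBase gets a separate
directory under /tmp unless hbase.tmp.dir is set explicitly.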

IMHO, the description doesn't explain the significance/role of this
property. Yes, it would be nice if we could add some more details about this
property.

@Nick: I am using HBase 0.92.1, CompactionTool.java is part of HBase 0.96
as per https://issues.apache.org/jira/browse/HBASE-7253.

I have 10 disks on my slave node, which will primarily be used for serving
HBase queries (very little MR). So, I was trying to distribute my disk I/O
load evenly across the disks. Would it be fine to dedicate just 1 disk to
hadoop.tmp.dir, or is even 1 disk overkill for hbase.tmp.dir?

Thanks,

Anil Gupta


On Mon, Dec 17, 2012 at 3:46 PM, Stack <st...@duboce.net> wrote:

> In the refguide we repeat the content of hbase-default.xml:
> http://hbase.apache.org/book.html#hbase.tmp.dir
>
> What Nick said, plus it's used to keep all data when running standalone HBase.
>
> Should we amend the doc to say that it does not accept a comma-delimited list?
> St.Ack



-- 
Thanks & Regards,
Anil Gupta

Re: Role of hbase.tmp.dir in HBase

Posted by Stack <st...@duboce.net>.
In the refguide we repeat the content of hbase-default.xml:
http://hbase.apache.org/book.html#hbase.tmp.dir

What Nick said, plus it's used to keep all data when running standalone HBase.

Should we amend the doc to say that it does not accept a comma-delimited list?
St.Ack