Posted to user@hbase.apache.org by Mark Kerzner <ke...@shmsoft.com> on 2011/03/07 15:22:09 UTC

How to investigate the performance of replication?

Hi,

I have set up replication, and it is working. Now I am interested in the
performance implications of it. What is the best way to approach this?
Should I use the "verifyrep" tool mentioned at the end of the setup document here:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements
and just try to measure, or are there better, theoretical ways to approach this?

Thank you,
Mark
-- 
View this message in context: http://old.nabble.com/How-to-investigate-the-performance-of-replication--tp31087962p31087962.html
Sent from the HBase User mailing list archive at Nabble.com.


RE: LZO Compression

Posted by Peter Haidinyak <ph...@local.com>.
Thanks, I'll give it a try again.

-Pete

-----Original Message-----
From: Ryan Rawson [mailto:ryanobjc@gmail.com] 
Sent: Monday, March 07, 2011 3:43 PM
To: user@hbase.apache.org
Cc: Peter Haidinyak
Subject: Re: LZO Compression

That is correct.  The latest is here iirc:

https://github.com/toddlipcon/hadoop-lzo

It's pretty trivial to build and deploy.  Make sure you have the base
lzo libs on all your machines (that bit me once).

-ryan

On Mon, Mar 7, 2011 at 3:38 PM, Peter Haidinyak <ph...@local.com> wrote:
> I've been told that I should build the jars and native libraries and then deploy these to the cluster.
>
> -Pete
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Monday, March 07, 2011 3:23 PM
> To: user@hbase.apache.org
> Cc: Peter Haidinyak
> Subject: Re: LZO Compression
>
> Just copy in the hadoop-gpl-compression*.jar and the
> native/Linux*/libgplcompression* to every node, and you can use the 'LZO'
> compression type in your tables without doing anything else.
>
> -ryan
>
> On Mon, Mar 7, 2011 at 3:18 PM, Peter Haidinyak <ph...@local.com> wrote:
>> Hi,
>>        When you are setting up LZO compression for HBase, do you need to modify the Hadoop configuration to use LZO, or can HBase compress independently of Hadoop?
>>
>> Thanks
>>
>> -Pete
>>
>

Re: LZO Compression

Posted by Ryan Rawson <ry...@gmail.com>.
That is correct.  The latest is here iirc:

https://github.com/toddlipcon/hadoop-lzo

It's pretty trivial to build and deploy.  Make sure you have the base
lzo libs on all your machines (that bit me once).
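
A rough sketch of that build-and-deploy step, assuming ant, a JDK, and the lzo
development headers are already installed on the build box; the ant targets and
output paths may differ between versions of the repo.

  # Build Todd Lipcon's hadoop-lzo fork; needs e.g. liblzo2-dev / lzo-devel.
  git clone https://github.com/toddlipcon/hadoop-lzo.git
  cd hadoop-lzo
  ant compile-native tar
  # The jar lands under build/ and the native libs under build/native/<platform>/lib;
  # those are what get copied out to the cluster, and every node also needs the
  # base lzo runtime library (e.g. liblzo2) installed.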

-ryan

On Mon, Mar 7, 2011 at 3:38 PM, Peter Haidinyak <ph...@local.com> wrote:
> I've been told that I should build the jars and native libraries and then deploy these to the cluster.
>
> -Pete
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Monday, March 07, 2011 3:23 PM
> To: user@hbase.apache.org
> Cc: Peter Haidinyak
> Subject: Re: LZO Compression
>
> Just copy in the hadoop-gpl-compression*.jar and the
> native/Linux*/libgplcompression* to every node, and you can use the 'LZO'
> compression type in your tables without doing anything else.
>
> -ryan
>
> On Mon, Mar 7, 2011 at 3:18 PM, Peter Haidinyak <ph...@local.com> wrote:
>> Hi,
>>        When you are setting up LZO compression for HBase, do you need to modify the Hadoop configuration to use LZO, or can HBase compress independently of Hadoop?
>>
>> Thanks
>>
>> -Pete
>>
>

RE: LZO Compression

Posted by Peter Haidinyak <ph...@local.com>.
I've been told that I should build the jars and native libraries and then deploy these to the cluster. 

-Pete

-----Original Message-----
From: Ryan Rawson [mailto:ryanobjc@gmail.com] 
Sent: Monday, March 07, 2011 3:23 PM
To: user@hbase.apache.org
Cc: Peter Haidinyak
Subject: Re: LZO Compression

Just copy in the hadoop-gpl-compression*.jar and the
native/Linux*/libgplcompression* to every node, and you can use the 'LZO'
compression type in your tables without doing anything else.

-ryan

On Mon, Mar 7, 2011 at 3:18 PM, Peter Haidinyak <ph...@local.com> wrote:
> Hi,
>        When you are setting up LZO compression for HBase, do you need to modify the Hadoop configuration to use LZO, or can HBase compress independently of Hadoop?
>
> Thanks
>
> -Pete
>

Re: LZO Compression

Posted by Ryan Rawson <ry...@gmail.com>.
Just copy in the hadoop-gpl-compression*.jar and the
native/Linux*/libgplcompression* to every node, and you can use the 'LZO'
compression type in your tables without doing anything else.
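
As a concrete illustration of that copy step plus enabling LZO on a table: the
install paths, hosts file, and table/family names below are made up for the
example, not taken from this thread.

  # Sketch only: push the LZO artifacts to every region server, then request LZO
  # per column family. Paths, the hosts file, and table/family names are
  # illustrative; match them to your own layout.
  for host in $(cat regionservers.txt); do
    scp hadoop-gpl-compression-*.jar "$host":/usr/lib/hbase/lib/
    scp native/Linux-amd64-64/libgplcompression* \
        "$host":/usr/lib/hbase/lib/native/Linux-amd64-64/
  done

  # After restarting the region servers, LZO can be requested per column family:
  echo "create 'mytable', {NAME => 'cf', COMPRESSION => 'LZO'}" | hbase shell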

-ryan

On Mon, Mar 7, 2011 at 3:18 PM, Peter Haidinyak <ph...@local.com> wrote:
> Hi,
>        When you are setting up LZO compression for HBase, do you need to modify the Hadoop configuration to use LZO, or can HBase compress independently of Hadoop?
>
> Thanks
>
> -Pete
>

LZO Compression

Posted by Peter Haidinyak <ph...@local.com>.
Hi,
	When you are setting up LZO compression for HBase, do you need to modify the Hadoop configuration to use LZO, or can HBase compress independently of Hadoop?

Thanks

-Pete

Re: How to investigate the performance of replication?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
> However, as far as performance goes, your answer seems to imply "build it
> and make sure that it works, then measure the performance in
> your environment, and try to improve it if it is a problem." Am I
> understanding you correctly?

I haven't answered anything yet since I was trying to understand your
question, and the title of your email first misled me into thinking
you wanted to measure the performance of replication, not the side
effects of it :)

But my answer would have been what you just wrote (are you reading my
mind??). Regarding the improvements, you can tell replication to
buffer less data with replication.source.size.capacity which defaults
to 64MB.
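
For reference, a minimal hbase-site.xml sketch of that tuning on the source
cluster's region servers; the 16 MB value below is only an illustration, not a
recommendation from this thread.

  <!-- Sketch only: shrink the replication source buffer from the 64 MB
       default (67108864 bytes) to 16 MB. -->
  <property>
    <name>replication.source.size.capacity</name>
    <value>16777216</value>
  </property>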

Which reminds me... currently there's no way to tell replication to
slow down; it will just try to read/push as fast as it can. It shouldn't
be too hard to add some sort of throttling.

J-D

> Thank you,
> Mark
>
> On Mon, Mar 7, 2011 at 12:06 PM, Jean-Daniel Cryans <jd...@apache.org>
> wrote:
>>
>> Mark,
>>
>> By "performance implications" you mean the side effects it has on
>> things like throughput and latency on the master cluster? You're
>> wondering how much of a hit that cluster will take once replication is
>> enabled?
>>
>> verifyrep only does what its name says: it verifies that replication
>> was done correctly between two tables for a time range, and it's not
>> really fancier than that :)  Here we have a cron job that checks a
>> table at random every 30 minutes over a 1-hour window to make sure
>> things work correctly.
>>
>> J-D

Re: How to investigate the performance of replication?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Mark,

By "performance implications" you mean the side effects it has on
things like throughput and latency on the master cluster? You're
wondering how much of a hit that cluster will take once replication is
enabled?

verifyrep only does what its name says: it verifies that replication
was done correctly between two tables for a time range, and it's not
really fancier than that :)  Here we have a cron job that checks a
table at random every 30 minutes over a 1-hour window to make sure
things work correctly.
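
A rough sketch of scripting such a periodic check: the peer id "1" and the
table name are placeholders, and the VerifyReplication class name and flags
follow the replication package docs of this era, so adjust them for your
version.

  #!/bin/bash
  # Verify a 1-hour window of replicated edits for one table on the master
  # (source) cluster. Peer id "1" and "my_table" are placeholders.
  STOP=$(( $(date +%s) * 1000 ))      # now, in milliseconds
  START=$(( STOP - 3600 * 1000 ))     # one hour earlier
  hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication \
    --starttime=$START --stoptime=$STOP 1 my_table
  # Compare the job's GOODROWS/BADROWS counters to see whether the window matched.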

J-D

On Mon, Mar 7, 2011 at 6:22 AM, Mark Kerzner <ke...@shmsoft.com> wrote:
>
> Hi,
>
> I have set up replication, and it is working. Now I am interested in the
> performance implications of it. What is the best way to approach this?
> Should I use the "verifyrep" tool mentioned at the end of the setup document here:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements
> and just try to measure, or are there better, theoretical ways to approach this?
>
> Thank you,
> Mark
> --
> View this message in context: http://old.nabble.com/How-to-investigate-the-performance-of-replication--tp31087962p31087962.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>