Posted to user@hbase.apache.org by Farrokh Shahriari <mo...@gmail.com> on 2013/01/08 06:05:03 UTC

Tune MapReduce over HBase to insert data

Hi there
I have a cluster with 12 nodes, each of which has 2 CPU cores. Now I want to
insert a large amount of data, about 2 GB in 80 sec (or 6 GB in 240 sec). I've
used MapReduce over HBase, but I can't reach that rate.
I'd be glad if you could tell me what I can do to get a better result, or which
parameters I should configure or tune to improve MapReduce/HBase performance.

Tnx

Re: Tune MapReduce over HBase to insert data

Posted by Ted Yu <yu...@gmail.com>.
Please take a look at http://hbase.apache.org/book.html#jvm

Section 12.2.3, “JVM Garbage Collection
Logs” <http://hbase.apache.org/book.html#trouble.log.gc> should be read
as well.
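
[Editor's note: for context, GC options of that era were set via hbase-env.sh. The flags below are standard HotSpot options commonly recommended for 0.92-era HBase, not taken from this thread, so treat them as a starting point to verify against your JVM and HBase version.]

```sh
# hbase-env.sh -- use CMS and capture GC logs for the analysis
# described in section 12.2.3 (paths/values illustrative)
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70"
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
```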

There is also a more recent effort to reduce GC activity: HBASE-7404, "Bucket
Cache: a solution for CMS, heap fragmentation and big cache on HBase".
It is close to being integrated into trunk.

You can expect 0.94 backport down the road.

Cheers

On Mon, Jan 7, 2013 at 10:04 PM, Farrokh Shahriari <
mohandes.zebeleh.67@gmail.com> wrote:

> Tnx Ted,
> How can I tune it ? can you tell me !!
> I have not yet decided for upgrading,does it give a better performance on
> MapReduce job for inserting ?
>
> On Tue, Jan 8, 2013 at 9:18 AM, Ted Yu <yu...@gmail.com> wrote:
>
> > JVM
>

Re: Tune MapReduce over HBase to insert data

Posted by Gerrit Jansen van Vuuren <ge...@gmail.com>.
Note: if you have a huge amount of data, bulk loading is much faster
than using Puts.

Regards,
 Gerrit

On Tue, Jan 8, 2013 at 7:04 AM, Farrokh Shahriari <
mohandes.zebeleh.67@gmail.com> wrote:

> Tnx Ted,
> How can I tune it ? can you tell me !!
> I have not yet decided for upgrading,does it give a better performance on
> MapReduce job for inserting ?
>
> On Tue, Jan 8, 2013 at 9:18 AM, Ted Yu <yu...@gmail.com> wrote:
>
>> JVM
>
>
>

Re: Tune MapReduce over HBase to insert data

Posted by Farrokh Shahriari <mo...@gmail.com>.
Tnx Ted,
How can I tune it? Can you tell me?
I have not yet decided on upgrading; does it give better performance on a
MapReduce job for inserting?

On Tue, Jan 8, 2013 at 9:18 AM, Ted Yu <yu...@gmail.com> wrote:

> JVM

Re: Tune MapReduce over HBase to insert data

Posted by Ted Yu <yu...@gmail.com>.
Have you tuned the JVM parameters of HBase?
If you have Ganglia, did you observe high variation in network latency across
the 6 nodes?

HBase 0.92.2 has been released. Do you plan to upgrade to 0.92.2 or 0.94.3?

Cheers

On Mon, Jan 7, 2013 at 9:38 PM, Farrokh Shahriari <
mohandes.zebeleh.67@gmail.com> wrote:

> Hi again,
> I'm using HBase 0.92.1-cdh4.0.0.
> I have two server machine with 48Gb RAM,12 physical core & 24 logical core
> that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM & 2
> VCPU.
> I've set some parameter that get better result like set WAL=off on put,but
> some parameters like Heap-size,Deferred log flush don't help me.
> Beside that I have another question,why each time I've run mapreduce,I've
> got different result time while all the config & hardware are same & not
> change ?
>
> Tnx you guys
>
> On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yu...@gmail.com> wrote:
>
> > Have you read through http://hbase.apache.org/book.html#performance ?
> >
> > What version of HBase are you using ?
> >
> > Cheers
> >
> > On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> > mohandes.zebeleh.67@gmail.com> wrote:
> >
> > > Hi there
> > > I have a cluster with 12 nodes that each of them has 2 core of CPU.
> Now,I
> > > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've
> > used
> > > Map-Reduce over hbase,but I can't achieve proper result .
> > > I'd be glad if you tell me what I can do to get better result or which
> > > parameters should I config or tune to improve Map-Reduce/Hbase
> > performance
> > > ?
> > >
> > > Tnx
> > >
> >
>

Re: Tune MapReduce over HBase to insert data

Posted by Farrokh Shahriari <mo...@gmail.com>.
Hi again,
I'm using HBase 0.92.1-cdh4.0.0.
I have two server machines with 48 GB RAM, 12 physical cores & 24 logical
cores each; together they host 12 nodes (6 nodes on each server). Each node
has 8 GB RAM & 2 vCPUs.
I've set some parameters that give a better result, like WAL=off on Put, but
some parameters like heap size and deferred log flush don't help me.
Besides that, I have another question: why do I get a different run time each
time I run the MapReduce job, while all the config & hardware are the same and
unchanged?

Tnx you guys
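
[Editor's note: for reference, "WAL=off on put" in the 0.92-era client API is set per Put. This is a sketch only, not runnable without an HBase cluster and its jars; the row/family/qualifier names are illustrative.]

```java
// Sketch: skip the write-ahead log for this Put. Data written this way is
// lost if the region server crashes before the memstore flushes.
Put p = new Put(Bytes.toBytes("row1"));
p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
p.setWriteToWAL(false);   // later versions: setDurability(Durability.SKIP_WAL)
table.put(p);
```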


On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yu...@gmail.com> wrote:

> Have you read through http://hbase.apache.org/book.html#performance ?
>
> What version of HBase are you using ?
>
> Cheers
>
> On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> mohandes.zebeleh.67@gmail.com> wrote:
>
> > Hi there
> > I have a cluster with 12 nodes that each of them has 2 core of CPU. Now,I
> > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've
> used
> > Map-Reduce over hbase,but I can't achieve proper result .
> > I'd be glad if you tell me what I can do to get better result or which
> > parameters should I config or tune to improve Map-Reduce/Hbase
> performance
> > ?
> >
> > Tnx
> >
>

Re: Tune MapReduce over HBase to insert data

Posted by Anoop John <an...@gmail.com>.
Hi,
             Can you think of using HFileOutputFormat? You use
TableOutputFormat now, so there are Put calls to HTable. With
HFileOutputFormat, the MR job writes the HFiles directly [no flushes, no
compactions]. Later you need to load the HFiles into the regions using
LoadIncrementalHFiles. This may help you.
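
[Editor's note: a rough sketch of that job setup, using 0.92-era class names such as HFileOutputFormat.configureIncrementalLoad and LoadIncrementalHFiles. The mapper class, table name, and output path are placeholders; this is not runnable as-is without an HBase cluster and its jars, so verify every call against your version.]

```java
// Sketch only: the map phase emits (ImmutableBytesWritable, KeyValue);
// configureIncrementalLoad wires in the reducer and a total-order
// partitioner matched to the table's current region boundaries.
Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "bulk-load-prepare");
job.setMapperClass(MyKeyValueMapper.class);            // hypothetical mapper
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);

HTable table = new HTable(conf, "mytable");            // table name illustrative
HFileOutputFormat.configureIncrementalLoad(job, table);
FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles"));

if (job.waitForCompletion(true)) {
  // Move the finished HFiles into the regions:
  new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/tmp/hfiles"), table);
}
```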

-Anoop-

On Sun, Jan 13, 2013 at 10:59 AM, Farrokh Shahriari <
mohandes.zebeleh.67@gmail.com> wrote:

> Thank you guys,let me change these configuration & test mapreduce again.
>
> On Tue, Jan 8, 2013 at 10:31 PM, Asaf Mesika <as...@gmail.com>
> wrote:
>
> > Start by testing HDFS throughput by doing s simple copyFromLocal using
> > Hadoop command line shell (bin/hadoop fs -copyFromLocal pathTo8GBFile
> > /tmp/dummyFile1). If you have 1000Mbit/sec network between the computers,
> > you should get around 75 MB/sec.
> >
> > On Tuesday, January 8, 2013, Bing Jiang wrote:
> >
> > > In our experience, it can enhance mapreduce insert by
> > > 1.add regionserver flush thread number
> > > 2.add memstore/jvm_heap
> > > 3.pre split table region before mapreduce
> > > 4.add large and small compaction thread number.
> > >
> > > please correct me if wrong, or any other better ideas.
> > > On Jan 8, 2013 4:02 PM, "lars hofhansl" <larsh@apache.org> wrote:
> > >
> > > > What type of disks and how many?
> > > > With the default replication factor your 2 (or 6) GB are actually
> > > > replicated 3 times.
> > > > 6GB/80s = 75MB/s, twice that if you do not disable the WAL, which a
> > > > reasonable machine should be able to absorb.
> > > > The fact that deferred log flush does not help you seems to indicate
> > that
> > > > you're over IO bound.
> > > >
> > > >
> > > > What's your memstore flush size? Potentially the data is written many
> > > > times during compactions.
> > > >
> > > >
> > > > In your case you dial down the HDFS replication, since you only have
> > two
> > > > physical machines anyway.
> > > > (Set it to 2. If you do not specify any failure zones, you might as
> > well
> > > > set it to 1... You will lose data if one of your server machines dies
> > > > anyway).
> > > >
> > > > It does not really make that much sense to deploy HBase and HDFS on
> > > > virtual nodes like this.
> > > > -- Lars
> > > >
> > > >
> > > >
> > > > ________________________________
> > > >  From: Farrokh Shahriari <mohandes.zebeleh.67@gmail.com>
> > > > To: user@hbase.apache.org
> > > > Sent: Monday, January 7, 2013 9:38 PM
> > > > Subject: Re: Tune MapReduce over HBase to insert data
> > > >
> > > > Hi again,
> > > > I'm using HBase 0.92.1-cdh4.0.0.
> > > > I have two server machine with 48Gb RAM,12 physical core & 24 logical
> > > core
> > > > that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM
> &
> > 2
> > > > VCPU.
> > > > I've set some parameter that get better result like set WAL=off on
> > > put,but
> > > > some parameters like Heap-size,Deferred log flush don't help me.
> > > > Beside that I have another question,why each time I've run
> > mapreduce,I've
> > > > got different result time while all the config & hardware are same &
> > not
> > > > change ?
> > > >
> > > > Tnx you guys
> > > >
> > > > On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Have you read through
> http://hbase.apache.org/book.html#performance?
> > > > >
> > > > > What version of HBase are you using ?
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> > > > > mohandes.zebeleh.67@gmail.com> wrote:
> > > > >
> > > > > > Hi there
> > > > > > I have a cluster with 12 nodes that each of them has 2 core of
> CPU.
> > > > Now,I
> > > > > > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ).
> > I've
> > > > > used
> > > > > > Map-Reduce over hbase,but I can't achieve proper result .
> > > > > > I'd be glad if you tell me what I can do to get better result or
> > > which
> > > > > > parameters should I config or tune to improve Map-Reduce/Hbase
> > > > > performance
> > > > > > ?
> > > > > >
> > > > > > Tnx
> > > > > >
> > > > >
> > >
> >
>

Re: Tune MapReduce over HBase to insert data

Posted by Farrokh Shahriari <mo...@gmail.com>.
Thank you guys, let me change these configurations & test MapReduce again.

On Tue, Jan 8, 2013 at 10:31 PM, Asaf Mesika <as...@gmail.com> wrote:

> Start by testing HDFS throughput by doing s simple copyFromLocal using
> Hadoop command line shell (bin/hadoop fs -copyFromLocal pathTo8GBFile
> /tmp/dummyFile1). If you have 1000Mbit/sec network between the computers,
> you should get around 75 MB/sec.
>
> On Tuesday, January 8, 2013, Bing Jiang wrote:
>
> > In our experience, it can enhance mapreduce insert by
> > 1.add regionserver flush thread number
> > 2.add memstore/jvm_heap
> > 3.pre split table region before mapreduce
> > 4.add large and small compaction thread number.
> >
> > please correct me if wrong, or any other better ideas.
> > On Jan 8, 2013 4:02 PM, "lars hofhansl" <larsh@apache.org> wrote:
> >
> > > What type of disks and how many?
> > > With the default replication factor your 2 (or 6) GB are actually
> > > replicated 3 times.
> > > 6GB/80s = 75MB/s, twice that if you do not disable the WAL, which a
> > > reasonable machine should be able to absorb.
> > > The fact that deferred log flush does not help you seems to indicate
> that
> > > you're over IO bound.
> > >
> > >
> > > What's your memstore flush size? Potentially the data is written many
> > > times during compactions.
> > >
> > >
> > > In your case you dial down the HDFS replication, since you only have
> two
> > > physical machines anyway.
> > > (Set it to 2. If you do not specify any failure zones, you might as
> well
> > > set it to 1... You will lose data if one of your server machines dies
> > > anyway).
> > >
> > > It does not really make that much sense to deploy HBase and HDFS on
> > > virtual nodes like this.
> > > -- Lars
> > >
> > >
> > >
> > > ________________________________
> > >  From: Farrokh Shahriari <mohandes.zebeleh.67@gmail.com>
> > > To: user@hbase.apache.org
> > > Sent: Monday, January 7, 2013 9:38 PM
> > > Subject: Re: Tune MapReduce over HBase to insert data
> > >
> > > Hi again,
> > > I'm using HBase 0.92.1-cdh4.0.0.
> > > I have two server machine with 48Gb RAM,12 physical core & 24 logical
> > core
> > > that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM &
> 2
> > > VCPU.
> > > I've set some parameter that get better result like set WAL=off on
> > put,but
> > > some parameters like Heap-size,Deferred log flush don't help me.
> > > Beside that I have another question,why each time I've run
> mapreduce,I've
> > > got different result time while all the config & hardware are same &
> not
> > > change ?
> > >
> > > Tnx you guys
> > >
> > > On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > Have you read through http://hbase.apache.org/book.html#performance?
> > > >
> > > > What version of HBase are you using ?
> > > >
> > > > Cheers
> > > >
> > > > On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> > > > mohandes.zebeleh.67@gmail.com> wrote:
> > > >
> > > > > Hi there
> > > > > I have a cluster with 12 nodes that each of them has 2 core of CPU.
> > > Now,I
> > > > > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ).
> I've
> > > > used
> > > > > Map-Reduce over hbase,but I can't achieve proper result .
> > > > > I'd be glad if you tell me what I can do to get better result or
> > which
> > > > > parameters should I config or tune to improve Map-Reduce/Hbase
> > > > performance
> > > > > ?
> > > > >
> > > > > Tnx
> > > > >
> > > >
> >
>

Re: Tune MapReduce over HBase to insert data

Posted by Asaf Mesika <as...@gmail.com>.
Start by testing HDFS throughput by doing a simple copyFromLocal using the
Hadoop command-line shell (bin/hadoop fs -copyFromLocal pathTo8GBFile
/tmp/dummyFile1). If you have a 1000 Mbit/sec network between the computers,
you should get around 75 MB/sec.
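
[Editor's note: the unit conversion behind that expectation, worked through. The 75 MB/sec figure is Asaf's; the observation that it is ~60% of line rate is this editor's arithmetic, not from the thread.]

```java
// A 1000 Mbit/s link vs the ~75 MB/s copyFromLocal figure from the thread.
public class LinkRate {
    public static void main(String[] args) {
        double lineRateMbPerSec = 1000.0 / 8;      // raw payload ceiling
        System.out.println(lineRateMbPerSec);      // 125.0

        // ~75 MB/s is ~60% of line rate, plausible once protocol and
        // disk overheads are counted in.
        System.out.println(75 / lineRateMbPerSec); // 0.6
    }
}
```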

On Tuesday, January 8, 2013, Bing Jiang wrote:

> In our experience, it can enhance mapreduce insert by
> 1.add regionserver flush thread number
> 2.add memstore/jvm_heap
> 3.pre split table region before mapreduce
> 4.add large and small compaction thread number.
>
> please correct me if wrong, or any other better ideas.
> On Jan 8, 2013 4:02 PM, "lars hofhansl" <larsh@apache.org> wrote:
>
> > What type of disks and how many?
> > With the default replication factor your 2 (or 6) GB are actually
> > replicated 3 times.
> > 6GB/80s = 75MB/s, twice that if you do not disable the WAL, which a
> > reasonable machine should be able to absorb.
> > The fact that deferred log flush does not help you seems to indicate that
> > you're over IO bound.
> >
> >
> > What's your memstore flush size? Potentially the data is written many
> > times during compactions.
> >
> >
> > In your case you dial down the HDFS replication, since you only have two
> > physical machines anyway.
> > (Set it to 2. If you do not specify any failure zones, you might as well
> > set it to 1... You will lose data if one of your server machines dies
> > anyway).
> >
> > It does not really make that much sense to deploy HBase and HDFS on
> > virtual nodes like this.
> > -- Lars
> >
> >
> >
> > ________________________________
> >  From: Farrokh Shahriari <mohandes.zebeleh.67@gmail.com>
> > To: user@hbase.apache.org
> > Sent: Monday, January 7, 2013 9:38 PM
> > Subject: Re: Tune MapReduce over HBase to insert data
> >
> > Hi again,
> > I'm using HBase 0.92.1-cdh4.0.0.
> > I have two server machine with 48Gb RAM,12 physical core & 24 logical
> core
> > that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM & 2
> > VCPU.
> > I've set some parameter that get better result like set WAL=off on
> put,but
> > some parameters like Heap-size,Deferred log flush don't help me.
> > Beside that I have another question,why each time I've run mapreduce,I've
> > got different result time while all the config & hardware are same & not
> > change ?
> >
> > Tnx you guys
> >
> > On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Have you read through http://hbase.apache.org/book.html#performance ?
> > >
> > > What version of HBase are you using ?
> > >
> > > Cheers
> > >
> > > On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> > > mohandes.zebeleh.67@gmail.com> wrote:
> > >
> > > > Hi there
> > > > I have a cluster with 12 nodes that each of them has 2 core of CPU.
> > Now,I
> > > > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've
> > > used
> > > > Map-Reduce over hbase,but I can't achieve proper result .
> > > > I'd be glad if you tell me what I can do to get better result or
> which
> > > > parameters should I config or tune to improve Map-Reduce/Hbase
> > > performance
> > > > ?
> > > >
> > > > Tnx
> > > >
> > >
>

Re: Tune MapReduce over HBase to insert data

Posted by Bing Jiang <ji...@gmail.com>.
In our experience, you can speed up a MapReduce insert by:
1. increasing the regionserver flush thread count
2. increasing the memstore / JVM heap
3. pre-splitting the table's regions before the MapReduce job
4. increasing the large and small compaction thread counts

Please correct me if I'm wrong, or share any better ideas.
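
[Editor's note: a sketch of where those knobs live. The property names and values below are this editor's best guess for the 0.92/0.94 era and are not from the thread; check them against your version's hbase-default.xml (for example, a configurable flusher thread count only landed in later releases).]

```xml
<!-- hbase-site.xml (illustrative values) -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value> <!-- 256 MB: fewer, larger flushes -->
</property>
<property>
  <name>hbase.regionserver.thread.compaction.large</name>
  <value>2</value>
</property>
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>3</value>
</property>
```

Pre-splitting (item 3) is done at table creation, e.g. in the HBase shell: `create 'mytable', 'cf', {SPLITS => ['a', 'm', 't']}` (table name and split points hypothetical; pick splits that match your row-key distribution).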
On Jan 8, 2013 4:02 PM, "lars hofhansl" <la...@apache.org> wrote:

> What type of disks and how many?
> With the default replication factor your 2 (or 6) GB are actually
> replicated 3 times.
> 6GB/80s = 75MB/s, twice that if you do not disable the WAL, which a
> reasonable machine should be able to absorb.
> The fact that deferred log flush does not help you seems to indicate that
> you're over IO bound.
>
>
> What's your memstore flush size? Potentially the data is written many
> times during compactions.
>
>
> In your case you dial down the HDFS replication, since you only have two
> physical machines anyway.
> (Set it to 2. If you do not specify any failure zones, you might as well
> set it to 1... You will lose data if one of your server machines dies
> anyway).
>
> It does not really make that much sense to deploy HBase and HDFS on
> virtual nodes like this.
> -- Lars
>
>
>
> ________________________________
>  From: Farrokh Shahriari <mo...@gmail.com>
> To: user@hbase.apache.org
> Sent: Monday, January 7, 2013 9:38 PM
> Subject: Re: Tune MapReduce over HBase to insert data
>
> Hi again,
> I'm using HBase 0.92.1-cdh4.0.0.
> I have two server machine with 48Gb RAM,12 physical core & 24 logical core
> that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM & 2
> VCPU.
> I've set some parameter that get better result like set WAL=off on put,but
> some parameters like Heap-size,Deferred log flush don't help me.
> Beside that I have another question,why each time I've run mapreduce,I've
> got different result time while all the config & hardware are same & not
> change ?
>
> Tnx you guys
>
> On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yu...@gmail.com> wrote:
>
> > Have you read through http://hbase.apache.org/book.html#performance ?
> >
> > What version of HBase are you using ?
> >
> > Cheers
> >
> > On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> > mohandes.zebeleh.67@gmail.com> wrote:
> >
> > > Hi there
> > > I have a cluster with 12 nodes that each of them has 2 core of CPU.
> Now,I
> > > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've
> > used
> > > Map-Reduce over hbase,but I can't achieve proper result .
> > > I'd be glad if you tell me what I can do to get better result or which
> > > parameters should I config or tune to improve Map-Reduce/Hbase
> > performance
> > > ?
> > >
> > > Tnx
> > >
> >

Re: Tune MapReduce over HBase to insert data

Posted by lars hofhansl <la...@apache.org>.
What type of disks, and how many?
With the default replication factor your 2 (or 6) GB are actually replicated 3 times.
6 GB / 80 s = ~75 MB/s, and twice that if you do not disable the WAL, which a reasonable machine should be able to absorb.
The fact that deferred log flush does not help you seems to indicate that you're I/O bound.
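
[Editor's note: Lars's arithmetic, worked through. The 2 GB / 80 s target and the replication factor of 3 are from the thread; everything else is unit conversion.]

```java
// Check of the write-amplification numbers: 2 GB in 80 s, HDFS replication 3.
public class ThroughputCheck {
    public static void main(String[] args) {
        double GB = 1024.0 * 1024 * 1024;
        double MB = 1024.0 * 1024;
        double payloadGb = 2, seconds = 80, replication = 3;

        // Bytes the cluster's disks actually absorb, and the aggregate rate:
        double bytesOnDisk = payloadGb * replication * GB;
        double aggregateMbPerSec = bytesOnDisk / seconds / MB;
        System.out.println(aggregateMbPerSec);     // 76.8, the "~75 MB/s" above

        // With the WAL enabled each cell is written twice (log + flush):
        System.out.println(aggregateMbPerSec * 2); // 153.6
    }
}
```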


What's your memstore flush size? Potentially the data is written many times during compactions.


In your case you could dial down the HDFS replication, since you only have two physical machines anyway.
(Set it to 2. If you do not specify any failure zones, you might as well set it to 1... you will lose data if one of your server machines dies anyway.)

It does not really make much sense to deploy HBase and HDFS on virtual nodes like this.
-- Lars



________________________________
 From: Farrokh Shahriari <mo...@gmail.com>
To: user@hbase.apache.org 
Sent: Monday, January 7, 2013 9:38 PM
Subject: Re: Tune MapReduce over HBase to insert data
 
Hi again,
I'm using HBase 0.92.1-cdh4.0.0.
I have two server machine with 48Gb RAM,12 physical core & 24 logical core
that contain 12 nodes(6 nodes on each server). Each node has 8Gb RAM & 2
VCPU.
I've set some parameter that get better result like set WAL=off on put,but
some parameters like Heap-size,Deferred log flush don't help me.
Beside that I have another question,why each time I've run mapreduce,I've
got different result time while all the config & hardware are same & not
change ?

Tnx you guys

On Tue, Jan 8, 2013 at 8:42 AM, Ted Yu <yu...@gmail.com> wrote:

> Have you read through http://hbase.apache.org/book.html#performance ?
>
> What version of HBase are you using ?
>
> Cheers
>
> On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
> mohandes.zebeleh.67@gmail.com> wrote:
>
> > Hi there
> > I have a cluster with 12 nodes that each of them has 2 core of CPU. Now,I
> > want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've
> used
> > Map-Reduce over hbase,but I can't achieve proper result .
> > I'd be glad if you tell me what I can do to get better result or which
> > parameters should I config or tune to improve Map-Reduce/Hbase
> performance
> > ?
> >
> > Tnx
> >
>

Re: Tune MapReduce over HBase to insert data

Posted by Ted Yu <yu...@gmail.com>.
Have you read through http://hbase.apache.org/book.html#performance ?

What version of HBase are you using ?

Cheers

On Mon, Jan 7, 2013 at 9:05 PM, Farrokh Shahriari <
mohandes.zebeleh.67@gmail.com> wrote:

> Hi there
> I have a cluster with 12 nodes that each of them has 2 core of CPU. Now,I
> want insert large data about 2Gb in 80 sec ( or 6Gb in 240sec ). I've used
> Map-Reduce over hbase,but I can't achieve proper result .
> I'd be glad if you tell me what I can do to get better result or which
> parameters should I config or tune to improve Map-Reduce/Hbase performance
> ?
>
> Tnx
>