Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2009/06/28 01:02:26 UTC
interesting informal test results
Test:
- Latest trunk.
- Config modified only with a store file split threshold of 1GB
- 4 node testbed:
1) namenode, datanode, hmaster, heritrix, jobtracker
2) datanode, regionserver, heritrix, tasktracker, mapper (2)
3) datanode, regionserver, heritrix, tasktracker, mapper (2)
4) datanode, regionserver, heritrix, tasktracker, mapper (2)
- 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec average new edits
- 2 mappers x 3 hosts processing new edits and writing back serialized/compressed Documents
- 3K average transactions/sec reported by master
- 'hadoop balancer -threshold 0.1'
- 1 hour test run
Result:
Passed with no incidents!
- Andy
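For a sense of scale, the rates quoted in the test description can be turned into rough totals for the one-hour run. This is only a back-of-the-envelope sketch: it assumes the ~5MB/sec and 3K transactions/sec averages held for the full hour, and the class and method names are made up for illustration.

```java
/** Rough totals for the one-hour run, assuming the quoted averages held throughout. */
final class TestRunTotals {
    /** Total volume in MB for a sustained rate over a duration. */
    static long totalMegabytes(long mbPerSec, long seconds) {
        return mbPerSec * seconds;
    }

    /** Total operations for a sustained per-second rate over a duration. */
    static long totalOps(long opsPerSec, long seconds) {
        return opsPerSec * seconds;
    }

    public static void main(String[] args) {
        long seconds = 60L * 60L;              // 1 hour test run
        long mb = totalMegabytes(5, seconds);  // ~5MB/sec of new edits
        long tx = totalOps(3_000, seconds);    // 3K transactions/sec reported by master
        System.out.println("new edits: ~" + mb + " MB");
        System.out.println("transactions: ~" + tx);
    }
}
```

Under those assumptions the run takes in roughly 18,000 MB of new edits and about 10.8 million transactions.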
Re: interesting informal test results
Posted by stack <st...@duboce.net>.
That's good news, Andrew (that's a great test too -- heavy upload with
concurrent reads/writes to the same table, with blocks being moved around
underneath it all).
St.Ack
On Sat, Jun 27, 2009 at 4:02 PM, Andrew Purtell <ap...@apache.org> wrote:
> Test:
>
> - Latest trunk.
>
> - Config modified only with a store file split threshold of 1GB
>
> - 4 node testbed:
> 1) namenode, datanode, hmaster, heritrix, jobtracker
> 2) datanode, regionserver, heritrix, tasktracker, mapper (2)
> 3) datanode, regionserver, heritrix, tasktracker, mapper (2)
> 4) datanode, regionserver, heritrix, tasktracker, mapper (2)
>
> - 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec
> average new edits
>
> - 2 mappers x 3 hosts processing new edits and writing back
> serialized/compressed Documents
>
> - 3K average transactions/sec reported by master
>
> - 'hadoop balancer -threshold 0.1'
>
> - 1 hour test run
>
> Result:
>
> Passed with no incidents!
>
> - Andy
>
>
>
Re: interesting informal test results
Posted by Jean-Daniel Cryans <jd...@apache.org>.
Sorry for being late in the conversation, I was away from the tubes
for some time...
So currently the batching is done by region, not by region server. When I
implemented the write buffer, I saw that we save a lot with the first
few RPCs we cut, but after that it stays pretty much stable (see the
graph in the jira about that). So grouping by region server would be a
bit more complicated IMO and it would save little, if anything
at all.
J-D
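The difference between the two grouping strategies can be sketched with a toy model. This is not the HBase client code: the Edit class, the region names and the region-to-server layout below are all hypothetical, and the sketch only counts one RPC per group without accounting for RPC size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Toy model: count RPCs when buffered edits are grouped by region vs. by region server. */
final class BatchGroupingSketch {
    /** A buffered edit; only the target region matters for grouping. */
    static final class Edit {
        final String region;
        final String row;

        Edit(String region, String row) {
            this.region = region;
            this.row = row;
        }
    }

    /** Groups buffered edits by the given key; each group stands for one RPC. */
    static Map<String, List<Edit>> groupBy(List<Edit> buffer, Function<Edit, String> key) {
        Map<String, List<Edit>> groups = new TreeMap<>();
        for (Edit e : buffer) {
            groups.computeIfAbsent(key.apply(e), k -> new ArrayList<>()).add(e);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical layout: four regions hosted on two region servers.
        Map<String, String> regionToServer =
            Map.of("r1", "rs-a", "r2", "rs-a", "r3", "rs-b", "r4", "rs-b");

        // Fill a write buffer with 1000 edits spread evenly over the regions.
        List<Edit> buffer = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            buffer.add(new Edit("r" + (i % 4 + 1), "row" + i));
        }

        int unbatched = buffer.size();
        int perRegion = groupBy(buffer, e -> e.region).size();
        int perServer = groupBy(buffer, e -> regionToServer.get(e.region)).size();

        System.out.println("RPCs, one per edit:       " + unbatched);
        System.out.println("RPCs, grouped by region:  " + perRegion);
        System.out.println("RPCs, grouped by server:  " + perServer);
    }
}
```

In this made-up layout, buffering by region already cuts 1000 RPCs down to 4, and grouping by server only removes 2 more. That is the intuition behind the point above, not a measurement; the real numbers are in the jira graph.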
On Sun, Jun 28, 2009 at 4:30 AM, Ryan Rawson<ry...@gmail.com> wrote:
> Regionserver...could it be any other way?
>
> Group commit, aka the commit buffer, is there to achieve maximal write performance.
>
> On Jun 28, 2009 12:54 AM, "Joey Echeverria"
> <jo...@gmail.com>
> wrote:
>
> When the client does group commits does it group by row key or region
> server?
>
> On Sun, Jun 28, 2009 at 12:08 AM, Ryan Rawson<ry...@gmail.com> wrote:
> > I imported 9b rows in 5 ...
>
Re: interesting informal test results
Posted by Ryan Rawson <ry...@gmail.com>.
Regionserver...could it be any other way?
Group commit, aka the commit buffer, is there to achieve maximal write performance.
On Jun 28, 2009 12:54 AM, "Joey Echeverria"
<jo...@gmail.com>
wrote:
When the client does group commits does it group by row key or region
server?
On Sun, Jun 28, 2009 at 12:08 AM, Ryan Rawson<ry...@gmail.com> wrote:
> I imported 9b rows in 5 ...
Re: interesting informal test results
Posted by Joey Echeverria <jo...@gmail.com>.
When the client does group commits does it group by row key or region server?
On Sun, Jun 28, 2009 at 12:08 AM, Ryan Rawson<ry...@gmail.com> wrote:
> I imported 9b rows in 5 days or so, a few minor crashes, average speed
> between 50-200 k ops/sec. The client needs some love to make it more
> efficient on grouping commits during bulk upload.
>
> On Jun 27, 2009 4:02 PM, "Andrew Purtell" <ap...@apache.org> wrote:
>
> Test:
>
> - Latest trunk.
>
> - Config modified only with a store file split threshold of 1GB
>
> - 4 node testbed:
> 1) namenode, datanode, hmaster, heritrix, jobtracker
> 2) datanode, regionserver, heritrix, tasktracker, mapper (2)
> 3) datanode, regionserver, heritrix, tasktracker, mapper (2)
> 4) datanode, regionserver, heritrix, tasktracker, mapper (2)
>
> - 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec
> average new edits
>
> - 2 mappers x 3 hosts processing new edits and writing back
> serialized/compressed Documents
>
> - 3K average transactions/sec reported by master
>
> - 'hadoop balancer -threshold 0.1'
>
> - 1 hour test run
>
> Result:
>
> Passed with no incidents!
>
> - Andy
>
Re: interesting informal test results
Posted by Ryan Rawson <ry...@gmail.com>.
I imported 9b rows in 5 days or so, with a few minor crashes, at an average
speed between 50k and 200k ops/sec. The client needs some love to make it
more efficient at grouping commits during bulk upload.
On Jun 27, 2009 4:02 PM, "Andrew Purtell" <ap...@apache.org> wrote:
Test:
- Latest trunk.
- Config modified only with a store file split threshold of 1GB
- 4 node testbed:
1) namenode, datanode, hmaster, heritrix, jobtracker
2) datanode, regionserver, heritrix, tasktracker, mapper (2)
3) datanode, regionserver, heritrix, tasktracker, mapper (2)
4) datanode, regionserver, heritrix, tasktracker, mapper (2)
- 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec
average new edits
- 2 mappers x 3 hosts processing new edits and writing back
serialized/compressed Documents
- 3K average transactions/sec reported by master
- 'hadoop balancer -threshold 0.1'
- 1 hour test run
Result:
Passed with no incidents!
- Andy
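As a rough cross-check of the bulk import figures earlier in this message, the overall row rate can be worked out from the totals. This sketch assumes "9b" means 9 billion rows and treats "5 days or so" as exactly 5 days of wall-clock time; the class and method names are illustrative only.

```java
/** Average row rate over a whole import, from total rows and elapsed days. */
final class ImportRate {
    static double rowsPerSecond(long rows, double days) {
        double seconds = days * 24 * 60 * 60;
        return rows / seconds;
    }

    public static void main(String[] args) {
        // Assumed inputs: 9 billion rows, exactly 5 days.
        double rate = rowsPerSecond(9_000_000_000L, 5.0);
        System.out.printf("~%.0f rows/sec averaged over the whole import%n", rate);
    }
}
```

That works out to roughly 20,800 rows/sec averaged over the whole period. It is a wall-clock average in rows, so it is not directly comparable to the ops/sec range quoted above.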