Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2007/12/24 23:23:43 UTC

[jira] Resolved: (HADOOP-2149) Pure name-node benchmarks.

     [ https://issues.apache.org/jira/browse/HADOOP-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko resolved HADOOP-2149.
-----------------------------------------

    Resolution: Fixed

I just committed this.

> Pure name-node benchmarks.
> --------------------------
>
>                 Key: HADOOP-2149
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2149
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.16.0
>
>         Attachments: NNThroughput.patch, NNThroughput.patch
>
>
> h3. Pure name-node benchmark.
> This patch starts a series of name-node benchmarks.
> The intention is to have a separate benchmark for every important name-node operation.
> The purpose of the benchmarks is
> # to measure the throughput for each name-node operation, and
> # to evaluate changes in the name-node performance (gain or degradation) when optimization
> or new functionality patches are introduced.
> The benchmarks measure name-node throughput (ops per second) and the average execution time.
> The benchmark does not involve any Hadoop components other than the name-node.
> The name-node server is real; the other components are simulated.
> There is no RPC overhead: each operation is executed by calling the respective name-node method directly.
> The benchmark is multi-threaded, that is, one can start multiple threads that compete for
> name-node resources by concurrently executing the same operation, each with different data.
> See javadoc for more details.
> The patch contains implementation for two name-node operations: file creates and block reports.
> Implementation of other operations will follow.
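> The threading scheme described above can be illustrated with a minimal sketch. This is not the code from the attached patch; the class and method names here are hypothetical, and the name-node call is replaced by a synchronized counter update to stand in for contention on name-node resources.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the benchmark loop: N threads invoke the same
// operation directly (no RPC), each with its own data, and the elapsed
// wall-clock time yields the throughput in ops/sec.
public class ThroughputSketch {
  static final Object lock = new Object();
  static long state = 0;

  // Stand-in for a direct name-node call such as namenode.create(...);
  // here it just updates shared state under a lock to model contention.
  static void doOperation(int threadId, long opId) {
    synchronized (lock) { state += threadId + opId; }
  }

  // Runs totalOps operations equally distributed between numThreads
  // threads and returns the elapsed time in milliseconds.
  public static long run(int numThreads, long totalOps) throws InterruptedException {
    final long opsPerThread = totalOps / numThreads;
    final CountDownLatch done = new CountDownLatch(numThreads);
    long start = System.currentTimeMillis();
    for (int t = 0; t < numThreads; t++) {
      final int id = t;
      new Thread(() -> {
        for (long op = 0; op < opsPerThread; op++) doOperation(id, op);
        done.countDown();
      }).start();
    }
    done.await();
    return System.currentTimeMillis() - start;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("elapsed msec: " + run(10, 10000));
  }
}
```

> The actual benchmark substitutes a real name-node method for doOperation(); see the patch and its javadoc for the real structure.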
> h3. File creation benchmark.
> I ran two series of the file create benchmark on the name-node with different numbers of threads.
> The first series is run on the regular name-node, which performs an edits log transaction on every create.
> The transaction includes a synch to the disk.
> In the second series the name-node is modified so that the synchs are turned off.
> Each run of the benchmark performs the same total of 10,000 creates, equally distributed among the
> running threads. I used a 4-core 2.8GHz machine.
> The following two tables summarize the results. Time is in milliseconds.
> || threads || time (msec)\\with synch || ops/sec\\with synch ||
> | 1 | 13074 | 764 |
> | 2 | 8883 | 1125 |
> | 4 | 7319 | 1366 |
> | 10 | 7094 | 1409 |
> | 20 | 6785 | 1473 |
> | 40 | 6776 | 1475 |
> | 100 | 6899 | 1449 |
> | 200 | 7131 | 1402 |
> | 400 | 7084 | 1411 |
> | 1000 | 7181 | 1392 |
> || threads || time (msec)\\no synch || ops/sec\\no synch ||
> | 1 | 4559 | 2193 |
> | 2 | 4979 | 2008 |
> | 4 | 5617 | 1780 |
> | 10 | 5679 | 1760 |
> | 20 | 5550 | 1801 |
> | 40 | 5804 | 1722 |
> | 100 | 5871 | 1703 |
> | 200 | 6037 | 1656 |
> | 400 | 5855 | 1707 |
> | 1000 | 6069 | 1647 |
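> The ops/sec column in both tables is derived from the fixed operation count and the measured elapsed time: throughput = 10,000 ops * 1000 / elapsed msec (integer-truncated). A quick check against the single-thread rows:

```java
// Derives the ops/sec column of the tables above from the fixed
// operation count (10,000 creates) and the measured elapsed time.
public class OpsPerSec {
  // throughput in whole operations per second
  static long opsPerSec(long totalOps, long elapsedMsec) {
    return totalOps * 1000 / elapsedMsec;
  }

  public static void main(String[] args) {
    // 1 thread, with synch: 13074 msec -> 764 ops/sec
    System.out.println(opsPerSec(10000, 13074));
    // 1 thread, no synch: 4559 msec -> 2193 ops/sec
    System.out.println(opsPerSec(10000, 4559));
  }
}
```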
> The results show:
> # (Table 1) The new synchronization mechanism that batches synch calls from different threads works well.
> With a single thread every synch causes a real IO, making it slow. The more threads are used, the more
> synchs are batched, resulting in better performance. The throughput grows up to a certain point and then
> stabilizes at about 1450 ops/sec.
> # (Table 2) Operations that do not require disk IO are constrained by memory locks.
> Without synchs the single-threaded execution is the fastest, because there are no waits.
> With more threads the threads start to interfere with each other and have to wait.
> Again the throughput stabilizes, at about 1700 ops/sec, and does not degrade further.
> # Our default of 10 handlers per name-node is not the best choice for either the IO-bound or the pure
> memory operations. We should increase the default to 20 handlers, and on big clusters 100 handlers
> or more can be used without loss of performance. In fact, with more handlers more operations can be handled
> simultaneously, which prevents the name-node from dropping calls that are close to timeout.
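> For reference, the handler pool size is controlled by the {{dfs.namenode.handler.count}} property (default 10). Assuming the standard property name of this era, raising it in hadoop-site.xml would look like:

```xml
<!-- Override the name-node server thread count (default is 10). -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>20</value>
</property>
```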
> h3. Block report benchmark.
> In this benchmark each thread pretends to be a data-node and calls blockReport() with the same blocks.
> All blocks are real, that is, they were previously allocated by the name-node and assigned to the data-nodes.
> Some reports can contain fake blocks, and some can have missing blocks.
> Each block report consists of 10,000 blocks. The total number of reports sent is 1000.
> The reports are divided equally among the data-nodes, so that each of them sends an equal number of reports.
> Here is the table with the results.
> || data-nodes || time (msec) || ops/sec ||
> | 1 | 42234 | 24 |
> | 2 | 9412 | 106 |
> | 4 | 11465 | 87 |
> | 10 | 15632 | 64 |
> | 20 | 17623 | 57 |
> | 40 | 19563 | 51 |
> | 100 | 24315 | 41 |
> | 200 | 29789 | 34 |
> | 400 | 23636 | 42 |
> | 600 | 39682 | 26 |
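> The workload behind this table (1000 reports of 10,000 blocks each, split evenly across the simulated data-nodes) can be sketched as follows. The names are hypothetical and blockReport() here is a counting stub, not the real name-node method.

```java
// Hypothetical sketch of the block-report workload: each simulated
// data-node sends its share of the 1000 reports, and every report
// carries the same list of 10,000 block IDs.
public class BlockReportSketch {
  static final int TOTAL_REPORTS = 1000;
  static final int BLOCKS_PER_REPORT = 10000;

  // Reports are divided equally among the data-nodes.
  static int reportsPerNode(int numDataNodes) {
    return TOTAL_REPORTS / numDataNodes;
  }

  // Stand-in for NameNode.blockReport(datanode, blocks); here it just
  // counts the blocks it was handed.
  static long blockReport(long[] blockIds) {
    return blockIds.length;
  }

  public static void main(String[] args) {
    long[] blocks = new long[BLOCKS_PER_REPORT]; // the same blocks every time
    for (int i = 0; i < blocks.length; i++) blocks[i] = i;
    int nodes = 4;
    long processed = 0;
    for (int n = 0; n < nodes; n++)
      for (int r = 0; r < reportsPerNode(nodes); r++)
        processed += blockReport(blocks);
    System.out.println(processed); // 1000 * 10000 blocks in total
  }
}
```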
> I have not had time to analyze this yet, so comments are welcome.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.