Posted to dev@hama.apache.org by "Suraj Menon (JIRA)" <ji...@apache.org> on 2013/01/03 14:06:13 UTC

[jira] [Commented] (HAMA-559) Add a spilling message queue

    [ https://issues.apache.org/jira/browse/HAMA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542905#comment-13542905 ] 

Suraj Menon commented on HAMA-559:
----------------------------------

I am trying to speed up reads with PrefetchCache, but that work is not final yet.
For writes, where most of the writes are a single byte, there is a small improvement over the previous benchmarks, as shown below:
{noformat}
 0% Scenario{vm=java, trial=0, benchmark=Spill, size=10, type=DISK_LIST} 3466.12 ns; σ=11.99 ns @ 3 trials
 5% Scenario{vm=java, trial=0, benchmark=Spill, size=100, type=DISK_LIST} 7038.17 ns; σ=163.45 ns @ 10 trials
10% Scenario{vm=java, trial=0, benchmark=Spill, size=1000, type=DISK_LIST} 47895.80 ns; σ=4017.85 ns @ 10 trials
14% Scenario{vm=java, trial=0, benchmark=Spill, size=10000, type=DISK_LIST} 357760.36 ns; σ=353653.20 ns @ 10 trials
19% Scenario{vm=java, trial=0, benchmark=Spill, size=100000, type=DISK_LIST} 3546827.11 ns; σ=164414.76 ns @ 10 trials
24% Scenario{vm=java, trial=0, benchmark=Spill, size=1000000, type=DISK_LIST} 40919933.97 ns; σ=3765935.50 ns @ 10 trials
29% Scenario{vm=java, trial=0, benchmark=Spill, size=10000000, type=DISK_LIST} 350974828.00 ns; σ=15513001.84 ns @ 10 trials
33% Scenario{vm=java, trial=0, benchmark=Spill, size=10, type=SPILLING_BUFFER} 3241.19 ns; σ=9.83 ns @ 3 trials
38% Scenario{vm=java, trial=0, benchmark=Spill, size=100, type=SPILLING_BUFFER} 4929.45 ns; σ=6.30 ns @ 3 trials
43% Scenario{vm=java, trial=0, benchmark=Spill, size=1000, type=SPILLING_BUFFER} 28712.59 ns; σ=473.22 ns @ 10 trials
48% Scenario{vm=java, trial=0, benchmark=Spill, size=10000, type=SPILLING_BUFFER} 187271.76 ns; σ=6520.41 ns @ 10 trials
52% Scenario{vm=java, trial=0, benchmark=Spill, size=100000, type=SPILLING_BUFFER} 1685095.74 ns; σ=10410.03 ns @ 3 trials
57% Scenario{vm=java, trial=0, benchmark=Spill, size=1000000, type=SPILLING_BUFFER} 16922940.78 ns; σ=237383.64 ns @ 10 trials
62% Scenario{vm=java, trial=0, benchmark=Spill, size=10000000, type=SPILLING_BUFFER} 169591804.10 ns; σ=6028667.57 ns @ 10 trials
67% Scenario{vm=java, trial=0, benchmark=Spill, size=10, type=DISK_BUFFER} 4195.59 ns; σ=40.50 ns @ 10 trials
71% Scenario{vm=java, trial=0, benchmark=Spill, size=100, type=DISK_BUFFER} 6697.32 ns; σ=43.13 ns @ 3 trials
76% Scenario{vm=java, trial=0, benchmark=Spill, size=1000, type=DISK_BUFFER} 37091.61 ns; σ=155.36 ns @ 3 trials
81% Scenario{vm=java, trial=0, benchmark=Spill, size=10000, type=DISK_BUFFER} 333739.82 ns; σ=3451.49 ns @ 10 trials
86% Scenario{vm=java, trial=0, benchmark=Spill, size=100000, type=DISK_BUFFER} 3209961.86 ns; σ=15172.65 ns @ 3 trials
90% Scenario{vm=java, trial=0, benchmark=Spill, size=1000000, type=DISK_BUFFER} 30691871.71 ns; σ=25893.53 ns @ 3 trials
95% Scenario{vm=java, trial=0, benchmark=Spill, size=10000000, type=DISK_BUFFER} 317232771.17 ns; σ=2960684.01 ns @ 4 trials

    size            type        us linear runtime
      10       DISK_LIST      3.47 =
      10 SPILLING_BUFFER      3.24 =
      10     DISK_BUFFER      4.20 =
     100       DISK_LIST      7.04 =
     100 SPILLING_BUFFER      4.93 =
     100     DISK_BUFFER      6.70 =
    1000       DISK_LIST     47.90 =
    1000 SPILLING_BUFFER     28.71 =
    1000     DISK_BUFFER     37.09 =
   10000       DISK_LIST    357.76 =
   10000 SPILLING_BUFFER    187.27 =
   10000     DISK_BUFFER    333.74 =
  100000       DISK_LIST   3546.83 =
  100000 SPILLING_BUFFER   1685.10 =
  100000     DISK_BUFFER   3209.96 =
 1000000       DISK_LIST  40919.93 ===
 1000000 SPILLING_BUFFER  16922.94 =
 1000000     DISK_BUFFER  30691.87 ==
10000000       DISK_LIST 350974.83 ==============================
10000000 SPILLING_BUFFER 169591.80 ==============
10000000     DISK_BUFFER 317232.77 ===========================

vm: java
trial: 0
benchmark: Spill

Note: benchmarks printed 48434475 characters to System.out and 0 characters to System.err. Use --debug to see this output.

{noformat}
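
To make the summary table easier to read, here is a minimal sketch that turns the size=10000000 rows into relative speedups. The three runtimes are copied from the table above; the class and method names (SpillSpeedup, speedup) are made up for illustration. Note that this compares the three queue types within this one run. It does not measure the improvement over the previous benchmarks, which are not reproduced in this comment.

```java
// Hypothetical helper for reading the benchmark table; not part of Hama or Caliper.
class SpillSpeedup {
  // Runtimes in microseconds for size=10000000, copied from the summary table.
  static final double DISK_LIST = 350974.83;
  static final double SPILLING_BUFFER = 169591.80;
  static final double DISK_BUFFER = 317232.77;

  // Ratio > 1 means the candidate took less time than the baseline.
  static double speedup(double baselineUs, double candidateUs) {
    return baselineUs / candidateUs;
  }

  public static void main(String[] args) {
    System.out.printf("SPILLING_BUFFER vs DISK_LIST:   %.2fx%n",
        speedup(DISK_LIST, SPILLING_BUFFER));
    System.out.printf("SPILLING_BUFFER vs DISK_BUFFER: %.2fx%n",
        speedup(DISK_BUFFER, SPILLING_BUFFER));
  }
}
```

At this size SPILLING_BUFFER comes out at roughly twice the speed of DISK_LIST and a bit under twice the speed of DISK_BUFFER.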
                
> Add a spilling message queue
> ----------------------------
>
>                 Key: HAMA-559
>                 URL: https://issues.apache.org/jira/browse/HAMA-559
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>            Assignee: Suraj Menon
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: HAMA-559.patch-v1, HAMA-559.patch-v2, spillbench_code.tar.gz, spilling_buffer_cpu_usage_text_write.png, SpillingBufferProfile-2012-10-27.snapshot, spilling_buffer_profile_cpu_graph_test_write.png, spilling_buffer_profile_cpugraph_writeUTF.png, spillingbuffer_profile_cpu_writeUTF.png, spilling_buffer_profile_LOCK.JPG, spilling_buffer_profile_timesplit_text_write.png, spilling_buffer_profile_writeUTF.png
>
>
> After HAMA-521 is done, we can add a spilling queue which just holds the messages in RAM that fit into the heap space. The rest can be flushed to disk.
> We may call this a HybridQueue or something like that.
> The benefit should be that we don't have to flush to disk as often, which makes it faster. However, we may incur more GC, so it may not always be faster overall.
> The requirements for this queue also include:
> - The message object once written to the queue (after returning from the write call) could be modified, but the changes should not be reflected in the messages stored in the queue.
> - For now let's implement a queue that does not support concurrent reading and writing. This feature is needed when we implement asynchronous communication.
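
The two requirements above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patches: SpillingQueueSketch and all of its members are hypothetical names, a StringBuilder stands in for a mutable Hama message, and the memory limit is a plain byte count rather than a measure of heap space. Each message is serialized to bytes inside add(), so later changes to the caller's object are not visible in the queue. Once the in-memory budget is exceeded, all further messages go to a temp file, which keeps FIFO order. Writing after the first read is rejected, since concurrent reading and writing is out of scope for now.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;

// Hypothetical sketch of a spilling queue; not Hama's actual implementation.
class SpillingQueueSketch implements AutoCloseable {
  private final int memoryLimitBytes;          // assumed budget for in-memory messages
  private final ArrayDeque<byte[]> inMemory = new ArrayDeque<>();
  private int bytesInMemory = 0;
  private int spilledCount = 0;                // messages on disk that are not yet read
  private boolean reading = false;
  private File spillFile;
  private DataOutputStream spillOut;
  private DataInputStream spillIn;

  SpillingQueueSketch(int memoryLimitBytes) {
    this.memoryLimitBytes = memoryLimitBytes;
  }

  void add(StringBuilder message) {
    if (reading) throw new IllegalStateException("no writes after reading has started");
    // Serialize now, so later changes to `message` do not affect the queue.
    byte[] copy = message.toString().getBytes(StandardCharsets.UTF_8);
    try {
      if (spillOut == null && bytesInMemory + copy.length <= memoryLimitBytes) {
        inMemory.add(copy);
        bytesInMemory += copy.length;
        return;
      }
      if (spillOut == null) {                  // first spill: open the temp file
        spillFile = File.createTempFile("spill", ".bin");
        spillFile.deleteOnExit();
        spillOut = new DataOutputStream(
            new BufferedOutputStream(new FileOutputStream(spillFile)));
      }
      spillOut.writeInt(copy.length);          // length-prefixed record
      spillOut.write(copy);
      spilledCount++;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the next message in insertion order, or null when the queue is empty.
  String poll() {
    try {
      if (!reading) {                          // switch from write phase to read phase
        reading = true;
        if (spillOut != null) {
          spillOut.close();
          spillIn = new DataInputStream(
              new BufferedInputStream(new FileInputStream(spillFile)));
        }
      }
      byte[] head = inMemory.poll();
      if (head == null && spilledCount > 0) {
        head = new byte[spillIn.readInt()];
        spillIn.readFully(head);
        spilledCount--;
      }
      return head == null ? null : new String(head, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  int spilledCount() {
    return spilledCount;
  }

  @Override
  public void close() {
    try {
      if (spillOut != null) spillOut.close();
      if (spillIn != null) spillIn.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      if (spillFile != null) spillFile.delete();
    }
  }
}
```

With a limit of 8 bytes, for example, adding "abcd", "efgh" and "ijkl" keeps the first two messages in memory and spills the third; appending to the first StringBuilder after add() returns does not change what poll() gives back.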
