Posted to issues@kudu.apache.org by "Binglin Chang (JIRA)" <ji...@apache.org> on 2016/04/20 07:23:25 UTC

[jira] [Commented] (KUDU-1235) Add Get API

    [ https://issues.apache.org/jira/browse/KUDU-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249270#comment-15249270 ] 

Binglin Chang commented on KUDU-1235:
-------------------------------------

New test result for the latest patch:

{noformat}
get_perf-itest result on i7-4770 CPU @ 3.40GHz (4 core 8 thread):

Input <scan|get> <seconds to run, 0(exit)> <num thread>: get 16 20
I0420 06:05:29.992522 28700 get_perf-itest.cc:154] Get total: 145032 QPS: 72514
I0420 06:05:31.992616 28700 get_perf-itest.cc:154] Get total: 285910 QPS: 70435.7
I0420 06:05:33.992712 28700 get_perf-itest.cc:154] Get total: 431481 QPS: 72782
I0420 06:05:35.992811 28700 get_perf-itest.cc:154] Get total: 571640 QPS: 70076.1
I0420 06:05:37.992911 28700 get_perf-itest.cc:154] Get total: 715463 QPS: 71907.9
I0420 06:05:39.993012 28700 get_perf-itest.cc:154] Get total: 859175 QPS: 71852.4
I0420 06:05:41.993110 28700 get_perf-itest.cc:154] Get total: 1002069 QPS: 71443.5
I0420 06:05:43.993211 28700 get_perf-itest.cc:154] Get total: 1137329 QPS: 67626.6

Input <scan|get> <seconds to run, 0(exit)> <num thread>: scan 16 20
I0420 06:05:51.713536 28700 get_perf-itest.cc:154] Scan total: 70844 QPS: 35421
I0420 06:05:53.713636 28700 get_perf-itest.cc:154] Scan total: 140419 QPS: 34785.8
I0420 06:05:55.713733 28700 get_perf-itest.cc:154] Scan total: 212848 QPS: 36212.8
I0420 06:05:57.713832 28700 get_perf-itest.cc:154] Scan total: 275949 QPS: 31548.9
I0420 06:05:59.713927 28700 get_perf-itest.cc:154] Scan total: 348015 QPS: 36031.3
I0420 06:06:01.714028 28700 get_perf-itest.cc:154] Scan total: 418685 QPS: 35333.2
I0420 06:06:03.714128 28700 get_perf-itest.cc:154] Scan total: 488378 QPS: 34844.8
I0420 06:06:05.714231 28700 get_perf-itest.cc:154] Scan total: 559647 QPS: 35632.7


result on Xeon(R) CPU E5-2620 0 @ 2.00GHz (12 core 24 thread)

Input <scan|get> <seconds to run, 0(exit)> <num thread>: get 16 20
I0420 11:52:07.616729 16915 get_perf-itest.cc:148] Get total: 73436 QPS: 36716.8
I0420 11:52:09.616859 16915 get_perf-itest.cc:148] Get total: 145494 QPS: 36026.7
I0420 11:52:11.616997 16915 get_perf-itest.cc:148] Get total: 222510 QPS: 38505.3
I0420 11:52:13.617130 16915 get_perf-itest.cc:148] Get total: 292826 QPS: 35155.7
I0420 11:52:15.617260 16915 get_perf-itest.cc:148] Get total: 370233 QPS: 38701
I0420 11:52:17.617399 16915 get_perf-itest.cc:148] Get total: 450170 QPS: 39965.9
I0420 11:52:19.617547 16915 get_perf-itest.cc:148] Get total: 524279 QPS: 37051.8
I0420 11:52:21.617691 16915 get_perf-itest.cc:148] Get total: 600118 QPS: 37916.8
CPU: TS: ~510% client: ~200%

Input <scan|get> <seconds to run, 0(exit)> <num thread>: scan 16 20
I0420 11:52:35.767921 16915 get_perf-itest.cc:148] Scan total: 62086 QPS: 31042
I0420 11:52:37.768052 16915 get_perf-itest.cc:148] Scan total: 117487 QPS: 27698.7
I0420 11:52:39.768182 16915 get_perf-itest.cc:148] Scan total: 180447 QPS: 31477.9
I0420 11:52:41.768314 16915 get_perf-itest.cc:148] Scan total: 241990 QPS: 30769.5
I0420 11:52:43.768456 16915 get_perf-itest.cc:148] Scan total: 304551 QPS: 31278.4
I0420 11:52:45.768621 16915 get_perf-itest.cc:148] Scan total: 355526 QPS: 25485.4
I0420 11:52:47.768771 16915 get_perf-itest.cc:148] Scan total: 410974 QPS: 27722
I0420 11:52:49.768911 16915 get_perf-itest.cc:148] Scan total: 474510 QPS: 31765.6
CPU: TS: ~720% client: ~360%
{noformat}
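
For reading the numbers above: the QPS column appears to be the increase in the running total divided by the wall-clock time since the previous report (about 2 seconds). This is inferred from the log values only, not from the test source. A minimal sketch, where IntervalQps is a hypothetical helper name and not something taken from get_perf-itest.cc:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from get_perf-itest.cc): per-interval QPS,
// assuming QPS = (current total - previous total) / elapsed seconds.
inline double IntervalQps(std::int64_t prev_total, std::int64_t cur_total,
                          double elapsed_seconds) {
  assert(elapsed_seconds > 0.0);      // avoid dividing by zero
  assert(cur_total >= prev_total);    // totals are cumulative counters
  return static_cast<double>(cur_total - prev_total) / elapsed_seconds;
}
```

For example, the first two i7 "Get" lines give totals 145032 and 285910, logged 2.000094 seconds apart (06:05:29.992522 and 06:05:31.992616), which is consistent with the reported 70435.7 under this assumption.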

Interestingly, my local machine (4 cores, 3.4 GHz) actually outperforms the remote server (12 cores, 2.0 GHz).
I originally suspected lock contention, so I changed the number of tablets from 1 to 10, but this didn't help at all.
Another thought: maybe we should pin tablets to CPU cores, e.g. each handler thread serves only a fixed collection of tablets (hashed by tablet ID).
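
To make the pinning idea concrete, here is a rough sketch of mapping a tablet ID to a fixed handler thread by hashing. It only illustrates the mapping and is not how the tablet server dispatches requests today. Both function names are hypothetical, and FNV-1a is just one stable hash picked for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 64-bit FNV-1a hash; chosen here only because it is simple and gives the
// same value on every run and platform (an assumption for this sketch).
inline std::uint64_t Fnv1a64(const std::string& s) {
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;                    // FNV prime
  }
  return h;
}

// Hypothetical helper (not a Kudu API): returns the index of the handler
// thread that would own the given tablet. Requires num_handlers > 0.
inline std::size_t HandlerIndexForTablet(const std::string& tablet_id,
                                         std::size_t num_handlers) {
  assert(num_handlers > 0);
  return static_cast<std::size_t>(Fnv1a64(tablet_id) % num_handlers);
}
```

With a mapping like this, every request for a given tablet would land on the same handler thread, which might improve cache locality. The trade-off is that a hot tablet can no longer be spread across threads, so this would need measuring before drawing conclusions.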





> Add Get API
> -----------
>
>                 Key: KUDU-1235
>                 URL: https://issues.apache.org/jira/browse/KUDU-1235
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>         Attachments: perf-get.svg, perf-scan-opt.svg, perf-scan.svg
>
>
> A Get API is more user friendly and efficient if users just want a primary key lookup.
> I set up a cluster and tested get/scan of a single row using ycsb; the initial test shows better performance for get.
> {noformat}
> kudu_workload:
> recordcount=1000000
> operationcount=1000000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> readallfields=false
> readproportion=1
> updateproportion=0
> scanproportion=0
> insertproportion=0
> requestdistribution=uniform
> use_get_api=false
> load:
> ./bin/ycsb load kudu -P workloads/kudu_workload -p sync_ops=false -p pre_split_num_tablets=1 -p table_name=ycsb_wiki_example -p masterQuorum='c3-kudu-tst-st01.bj:32600' -threads 100
> read test:
> ./bin/ycsb run kudu -P workloads/kudu_workload -p masterQuorum='c3-kudu-tst-st01.bj:32600' -threads 100
> {noformat}
> Get API:
> [OVERALL], RunTime(ms), 21304.0
> [OVERALL], Throughput(ops/sec), 46939.54187007135
> [CLEANUP], Operations, 100.0
> [CLEANUP], AverageLatency(us), 423.57
> [CLEANUP], MinLatency(us), 24.0
> [CLEANUP], MaxLatency(us), 19327.0
> [CLEANUP], 95thPercentileLatency(us), 52.0
> [CLEANUP], 99thPercentileLatency(us), 18815.0
> [READ], Operations, 1000000.0
> [READ], AverageLatency(us), 2065.185152
> [READ], MinLatency(us), 134.0
> [READ], MaxLatency(us), 92159.0
> [READ], 95thPercentileLatency(us), 2391.0
> [READ], 99thPercentileLatency(us), 6359.0
> [READ], Return=0, 1000000
> Scan API:
> [OVERALL], RunTime(ms), 38259.0
> [OVERALL], Throughput(ops/sec), 26137.6408165399
> [CLEANUP], Operations, 100.0
> [CLEANUP], AverageLatency(us), 47.32
> [CLEANUP], MinLatency(us), 16.0
> [CLEANUP], MaxLatency(us), 1837.0
> [CLEANUP], 95thPercentileLatency(us), 41.0
> [CLEANUP], 99thPercentileLatency(us), 158.0
> [READ], Operations, 1000000.0
> [READ], AverageLatency(us), 3595.825249
> [READ], MinLatency(us), 139.0
> [READ], MaxLatency(us), 3139583.0
> [READ], 95thPercentileLatency(us), 3775.0
> [READ], 99thPercentileLatency(us), 7659.0
> [READ], Return=0, 1000000


