Posted to dev@trafodion.apache.org by Eric Owhadi <er...@esgyn.com> on 2015/10/22 18:07:20 UTC
investigation on parallel scanner for trafodion...

Some interesting results from the parallel scanner investigation:

I implemented a parallel scanner based on HBase JIRA 9272. Since the HBase 1.0
scanner started to require multi-threading for replica support, I had to
backport an HBase 0.98 scanner for use in the parallel scanner. I made it
configurable via an hbase-site.xml option so I could test with it on and off.

The key question is: how does scanner-level parallelism stack up against
ESP-level parallelism? Here are test numbers I ran on a workstation with a
single region server, doing counts on a 10,000,000-row table split 10 ways
(so 10 regions). Columns are key int, int, varchar(1000) (plus one int for
the _SALT_ column).



Because I ran it on a shared workstation, I had to duplicate 2 tests to
account for variations due to how busy the workstation was at different
times of day.

Each number represents an average over 10 runs.



Parallel scan 2 threads / no ESP: 39.9s

Parallel scan 2 threads / 2 ESPs (so 4 threads total): 30.6s

Java test, parallel scan 2 threads: 30.5s

Java test, parallel scan 1 thread: 49.7s -> this is a silly test, but it
shows that the buffering in the parallel scanner is not good when you don't
actually do parallelism (1 thread)

Java test 1 thread, using HBase 1.0 client scanner: 37.5s (ran in the morning)

No ESP, HBase 1.0 client scanner: 43.6s (ran in the morning)

No ESP, HBase 1.0 client scanner: 47.3s -> see the difference with the
morning run above (in the morning, the workstation is not busy)

2 ESPs, HBase 1.0 client scanner: 30s
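To make the comparisons easier to read, here is a small sketch that turns the averages listed above into ratios. The class and method names (ScanTimings, speedup) are made up for illustration; only the timing values come from the measurements above.

```java
public class ScanTimings {

    /** Ratio of baseline time to candidate time; above 1 means the candidate is faster. */
    public static double speedup(double baselineSeconds, double candidateSeconds) {
        return baselineSeconds / candidateSeconds;
    }

    public static void main(String[] args) {
        // Averages over 10 runs, in seconds, copied from the list above.
        double parallelScan2ThreadsNoEsp = 39.9;
        double twoEspsHBase10Scanner = 30.0;
        double javaParallelScan2Threads = 30.5;
        double javaParallelScan1Thread = 49.7;

        // Scanner-level parallelism alone vs ESP-level parallelism alone.
        System.out.printf("2 ESPs vs parallel scan 2 threads/no ESP: %.2fx%n",
                speedup(parallelScan2ThreadsNoEsp, twoEspsHBase10Scanner));

        // Effect of the second thread in the standalone Java test.
        System.out.printf("Java test 2 threads vs 1 thread: %.2fx%n",
                speedup(javaParallelScan1Thread, javaParallelScan2Threads));
    }
}
```

Keep in mind that these ratios inherit the noise of a shared workstation, so small differences should not be over-interpreted.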



First, let me share that I had trouble when running these tests with a debug
build. In a debug build, the Trafodion code on the scan path is a huge
bottleneck, which throws off any attempt at scan parallelism. Running in
release mode was key to getting back to sanity…



The key learning here is that the work done by Trafodion once data is
received from the HBase scan:

-          Get the row

-          Create it, formatting it into tuple format

-          Apply the predicate

-          Return the row

has a high cost. By comparison, the Java test does direct HBase
access and just counts the results, doing nothing else…
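For readers who want to picture what the Java test amounts to, here is a minimal sketch of a count-only parallel scan. It is not the actual test code and it does not use the HBase client API: in-memory lists stand in for regions, and all names (ParallelCountSketch, makeRegions, countRows) are hypothetical. The point is only that each thread counts rows and does none of the four steps listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCountSketch {

    /** Builds in-memory stand-ins for regions: each region is a list of row keys. */
    public static List<List<Integer>> makeRegions(int regionCount, int rowsPerRegion) {
        List<List<Integer>> regions = new ArrayList<>();
        for (int r = 0; r < regionCount; r++) {
            List<Integer> rows = new ArrayList<>();
            for (int i = 0; i < rowsPerRegion; i++) {
                rows.add(r * rowsPerRegion + i);
            }
            regions.add(rows);
        }
        return regions;
    }

    /** Counts rows across all regions with a fixed pool of scanner threads. */
    public static long countRows(List<List<Integer>> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (final List<Integer> region : regions) {
                // One task per region; the pool size caps how many run at once.
                Callable<Long> task = () -> {
                    long n = 0;
                    for (Integer ignored : region) {
                        n++; // count only: no tuple formatting, no predicate
                    }
                    return n;
                };
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // region order does not matter for a count
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 10 regions scanned by 2 threads, mirroring the shape of the test setup.
        System.out.println(countRows(makeRegions(10, 1000), 2));
    }
}
```

In Trafodion without ESPs, everything after the scan (formatting, predicate, returning the row) still runs on a single consumer, which is why the per-row cost matters so much more there than in this kind of test.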



If you compare parallel scan 2 threads/no ESP vs 2 ESPs with the HBase 1.0
client scanner (39.9s vs 30s), it is clear that giving parallel threads to
the four Trafodion steps above shows up in the performance numbers.



So until we push down more predicates, giving us more cases where the volume
of returned columns is cut down, the parallel scanner will not show any
benefit over ESP parallelism, because we are not parallelizing the four steps
above with a parallel scanner, while we are with ESPs.



So I propose to put this PAPA work on hold, and retest it once we have
more predicate pushdown implemented. At that point, maybe not parallelizing
the four steps will be OK…



Another learning: comparing the Java test parallel scan 2 threads vs 2 ESPs
with the HBase 1.0 client scanner (30.5s vs 30s) indicates that the parallel
scanner should be able to get some more optimization, but that may be because
of the low parallelism level I picked.



Make sense?

Eric Owhadi