Posted to issues@hbase.apache.org by "Vincent Poon (JIRA)" <ji...@apache.org> on 2017/01/05 19:59:58 UTC
[jira] [Updated] (HBASE-15995) Separate replication WAL reading from shipping
[ https://issues.apache.org/jira/browse/HBASE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vincent Poon updated HBASE-15995:
---------------------------------
Attachment: HBASE-15995.master.v1.patch
This patch does two things:
1) Puts replication reading into a separate thread. This is done by ReplicationWALEntryBatcher, which reads entries and puts them onto a queue. ReplicationSourceWorkerThread is reduced to simply reading batches off the queue and shipping them.
2) Puts the actual WAL entry reading logic in a WALEntryStream class, which implements Iterator. Eventually, when we have a way to stream over the network, we can get rid of the batcher above and simplify to something like:
{code}
while (entryStream.hasNext()) {
  shipEntry(entryStream.next());
}
{code}
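For illustration, the reader/shipper split in (1) can be sketched as a plain producer/consumer over a bounded queue. This is only a minimal sketch and not the code in the patch: the class name ReadShipSketch, the END marker, and the use of String entries are hypothetical stand-ins, and "shipping" is reduced to collecting batches into a list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; none of these names come from the patch.
public class ReadShipSketch {
    // End-of-stream marker, compared by identity (assumption for this sketch).
    private static final List<String> END = new ArrayList<>();

    /**
     * Reads entries on a separate thread, groups them into batches, and
     * "ships" the batches from the calling thread. Returns the shipped
     * batches in order. queueCapacity must be at least 1.
     */
    public static List<List<String>> replicate(Iterator<String> entryStream,
                                               int batchSize,
                                               int queueCapacity)
            throws InterruptedException {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);

        // Reader thread: plays the role of the batcher.
        Thread reader = new Thread(() -> {
            try {
                List<String> batch = new ArrayList<>();
                while (entryStream.hasNext()) {
                    batch.add(entryStream.next());
                    if (batch.size() == batchSize) {
                        queue.put(batch); // blocks while the queue is full
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) {
                    queue.put(batch); // final partial batch
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "wal-reader");
        reader.start();

        // Shipper: takes batches off the queue and ships them.
        List<List<String>> shipped = new ArrayList<>();
        while (true) {
            List<String> batch = queue.take();
            if (batch == END) {
                break;
            }
            shipped.add(batch); // stand-in for the real network call
        }
        reader.join();
        return shipped;
    }
}
```

Because the queue is bounded, the reader blocks once it gets ahead of the shipper. Even with a capacity of 1, the reader can be building one batch while the shipper still holds the previous one.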
I tried to keep the rest of the logic the same as what currently exists. We could put ReplicationSource into another class ReplicationSourceV2 if so desired.
I believe all replication tests pass except TestGlobalThrottler. It fails because one thread is reading a batch while the other thread is shipping the previous batch, so even if the queue holds only 1 batch, twice the memory is in use. (If I modify the test to double the threshold, it passes.)
I've done performance testing by setting up a single standalone region server shipping to a remote cluster, and then running PerformanceEvaluation to generate 3 GB of data. The amount of time for replication to catch up:
ReplicationSourceV1 - 190s (source.size.capacity of 64 MB)
ReplicationSourceV2 - 160s (source.size.capacity of 32 MB, with queue size of 1 so that the max memory used should be 64 MB)
The gain is larger in situations where reading or filtering entries is more expensive (e.g. contention for disk/CPU). For example, I tried introducing a 100ms delay in a custom entry filter:
ReplicationSourceV1 - 366s
ReplicationSourceV2 - 236s
> Separate replication WAL reading from shipping
> ----------------------------------------------
>
> Key: HBASE-15995
> URL: https://issues.apache.org/jira/browse/HBASE-15995
> Project: HBase
> Issue Type: Sub-task
> Components: Replication
> Affects Versions: 2.0.0
> Reporter: Vincent Poon
> Assignee: Vincent Poon
> Fix For: 2.0.0
>
> Attachments: HBASE-15995.master.v1.patch
>
>
> Currently ReplicationSource reads edits from the WAL and ships them in the same thread.
> By breaking out the reading from the shipping, we can introduce greater parallelism and lay the foundation for further refactoring to a pipelined, streaming model.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)