Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2014/05/17 00:30:52 UTC

[jira] [Commented] (HBASE-11183) Timeline Consistent region replicas - Phase 2 design

    [ https://issues.apache.org/jira/browse/HBASE-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000418#comment-14000418 ] 

stack commented on HBASE-11183:
-------------------------------

When is phase 1 coming in?  How is it going?

bq. This means that the new data will only become available after the primary does a flush, then after some time the secondary does the periodic file list and sees the new file

What happens again if a file is compacted away and the replica tries to read it?

bq. In case the edit cannot be put to memstore, because of mem pressure, we will trigger a flush of a primary region, and throw RegionTooBusyException which should throttle the client.

Throw the exception to the replicating client only?
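To make the throttling in that quote concrete, here is a minimal standalone sketch (class and field names like MemstorePressureSketch and flushRequested are hypothetical, not the actual HBase code): an edit that would push the memstore over its limit requests a flush and throws, so the replicating client backs off.

```java
// Minimal sketch of memstore-pressure throttling; the names here are
// illustrative assumptions, not HBase's real classes or fields.
public class MemstorePressureSketch {
    static final long MEMSTORE_LIMIT = 128L * 1024 * 1024; // illustrative 128 MB cap

    long memstoreSize = 0;
    boolean flushRequested = false;

    static class RegionTooBusyException extends RuntimeException {
        RegionTooBusyException(String msg) { super(msg); }
    }

    /** Apply one replicated edit of the given byte size, or throttle the caller. */
    void applyEdit(long editSize) {
        if (memstoreSize + editSize > MEMSTORE_LIMIT) {
            flushRequested = true; // signal: primary region should flush
            throw new RegionTooBusyException("memstore over limit; flush requested");
        }
        memstoreSize += editSize;
    }
}
```

The open question above is which caller sees the exception; in this sketch it is simply whoever invoked applyEdit.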

bq. For this, we will reintroduce WALEdits for persisting flush actions in WAL.

Sounds good.

bq. Allow multiple memstore snapshots: This will allow us to keep the memstore snapshot from aborted flush as it is.

Anoop needs something like this for his in-memory compaction of memstores.

bq. Whenever a secondary sees a compaction entry, it will go ahead and apply the compaction.

Nice.
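As a concreteness check on that replay step, a toy sketch of a secondary applying a compaction entry (CompactionMarker is a hypothetical stand-in for the WALEdit payload the doc describes, not the real format):

```java
import java.util.*;

// Toy sketch of a secondary replaying a compaction entry from the WAL:
// drop the compacted-away input files, pick up the compaction output.
// CompactionMarker is an assumed stand-in for the real WALEdit payload.
public class CompactionReplaySketch {
    static class CompactionMarker {
        final List<String> inputs; // store files compacted away
        final String output;       // resulting compacted store file
        CompactionMarker(List<String> inputs, String output) {
            this.inputs = inputs;
            this.output = output;
        }
    }

    /** Return the secondary's new file set after applying the marker. */
    static Set<String> applyCompaction(Set<String> storeFiles, CompactionMarker marker) {
        Set<String> next = new HashSet<>(storeFiles);
        next.removeAll(marker.inputs);
        next.add(marker.output);
        return next;
    }
}
```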

-If a replica crashes, it loses all in-memory state.  When it opens again, it will pick up hfiles.  What to do about the gap between the hfiles and what was in memory?  Wait until we get a new flush message from the primary before we start serving?- <-- You answer my question later in the doc.

Are "Memstore snapshots on the secondary (no flushes for the secondary regions)" and "Allow Flushes for the secondary regions" mutually exclusive -- one or the other?

bq. In either of the approaches, new secondaries can be brought up to speed based on that data.

Only if the replica comes up post crash on the same node?

bq. The secondary serves reads from a merged view of primary's store files, secondary's memstore and secondary's spill file (which would be yet another storefile).

On first read, strikes me as complicated.
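For what that merged view amounts to, a toy sketch with sorted maps standing in for scanners over the memstore, spill file, and primary store files (real HBase merges scanners, not maps; the precedence order is the point, the names are illustrative):

```java
import java.util.*;

// Toy sketch of the "merged view" read path on a secondary: sources are
// ordered newest-first (memstore, spill file, primary store files) and a
// newer source shadows older ones for the same row key.
public class MergedViewSketch {
    static NavigableMap<String, String> mergedView(List<NavigableMap<String, String>> newestFirst) {
        NavigableMap<String, String> view = new TreeMap<>();
        // Apply oldest-to-newest so newer sources overwrite older entries.
        for (int i = newestFirst.size() - 1; i >= 0; i--) {
            view.putAll(newestFirst.get(i));
        }
        return view;
    }
}
```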

In X-DC replication, each regionserver hosts a client that writes the WAL edits to a RS on the remote cluster for it to distribute.  Here, we'd host the distributing HTable instance inside the src RS?

bq. Whenever a secondary starts serving, it will trigger a flush from the primary region ..

How will this work?

bq. In this design, there will be a replication queue per table per region replication count.

So, replication will update zk per region?  What sort of update rates in zk are you thinking?
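To picture what "a replication queue per table per region replication count" means, a trivial sketch of one queue key per (table, replicaId) pair; the znode path layout here is an assumption for illustration, not HBase's actual schema:

```java
// Illustrative only: one logical queue znode per (regionserver, table,
// replicaId). The path format is an assumed layout, not the real schema.
public class ReplicaQueueNaming {
    static String queueZnode(String regionServer, String table, int replicaId) {
        return "/hbase/replication/rs/" + regionServer
             + "/" + table + "_replica_" + replicaId;
    }
}
```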

bq. For replicating wal entries to secondaries, the client has to send the WALEdit/WALEntry objects. 

Is this what the client does currently when replicating?  Send WAL*s?  These are FAAAAAATTTTT objects.

On first read, the first option -- not writing spill files -- seems more straightforward, but will reread.

-The split using 'sync' may not be tenable given close takes data offline and the root of this work is minimizing offline time?  Might be worth a bit of measuring though.- <- You come to the same conclusion yourselves later in the doc ("Fools seldom....").

Nice write up lads.


> Timeline Consistent region replicas - Phase 2 design
> ----------------------------------------------------
>
>                 Key: HBASE-11183
>                 URL: https://issues.apache.org/jira/browse/HBASE-11183
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: PhaseIIDesignforHBASE-10070.pdf
>
>
> Now that Phase 1 of parent issue HBASE-10070 is mostly done, it is time to think about remaining items in Phase 2 as per the design doc https://issues.apache.org/jira/secure/attachment/12616659/HighAvailabilityDesignforreadsApachedoc.pdf
> Phase 2 will conclude the work and include at least the following major features: 
>  - Async WAL Replication 
>  - Replication Interface changes (HBASE-10504)
>  - Replication should guarantee seqId order (for region moves and RS failure)
>  - Flush / Compaction events should be written to WAL
>  - Flush / Memstore handling from secondary regions
>  - Region split / merge handling for replicated regions
> In this issue, we can discuss the proposed design, and keep it as a parent jira for Phase 2 work. We'll open subtasks against the HBASE-10070 jira for individual patches. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)