Posted to dev@phoenix.apache.org by "Jesse Yates (JIRA)" <ji...@apache.org> on 2014/07/25 01:19:38 UTC

[jira] [Updated] (PHOENIX-1107) Support mutable indexes over replication

     [ https://issues.apache.org/jira/browse/PHOENIX-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Yates updated PHOENIX-1107:
---------------------------------

    Attachment: phoenix-1107-3.0.v0

Attaching patch for 3.X line. I'll update to 4.X/5.X line once we agree on this approach.

I went the route where replication doesn't write any of the index edits that are stored with the primary table (those are only kept around for WAL replay in the case of failures). So the index gets written on the replication target either by the table on the target cluster or by replicating the index table itself.

There isn't any way of propagating the index writes through replication, and it really doesn't make sense anyway: it's a lot more overhead than replicating either just the primary writes (and using the target cluster to update the index) or both tables, as the index updates are larger than a regular KV.

At least in 0.94, there is no way around this except by adding a column family to which you never write any data (named _0--INDEX_DO_NOT_REPLICATE by default, configurable) and which is not enabled for replication.

Problem is this bit from [Replication|https://github.com/apache/hbase/blob/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java]:
{code}
  @Override
  public void visitLogEntryBeforeWrite(HTableDescriptor htd, HLogKey logKey,
                                       WALEdit logEdit) {
    byte[] family;
    for (KeyValue kv : logEdit.getKeyValues()) {
      family = kv.getFamily();
      int scope = htd.getFamily(family).getScope();
      if (scope != REPLICATION_SCOPE_LOCAL &&
          !logEdit.hasKeyInScope(family)) {
        logEdit.putIntoScope(family, scope);
      }
    }
  }
{code}

which expects the column family to exist in the primary table's descriptor (htd.getFamily(family) returns null otherwise, hence the NPE), but the index table CF probably doesn't match the primary table CF. And we don't want to mess around with the HTD, since that could have further-reaching consequences that I can't predict at the moment.
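
To make the failure mode concrete, here is a minimal standalone sketch in plain Java with no HBase classes. It only mimics the loop above: Family is a hypothetical stand-in for HColumnDescriptor, the table descriptor is modelled as a plain map, and "0" is used as an example data family. It also assumes the index edits report the placeholder family name, which is my reading of the approach rather than something taken from the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReplicationScopeSketch {
    // Same values as HConstants.REPLICATION_SCOPE_LOCAL / _GLOBAL in HBase.
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    // Default placeholder family name mentioned above (assumed to be what index edits report).
    public static final String PLACEHOLDER_CF = "_0--INDEX_DO_NOT_REPLICATE";

    /** Hypothetical stand-in for HColumnDescriptor; it only carries a scope. */
    static final class Family {
        private final int scope;
        Family(int scope) { this.scope = scope; }
        int getScope() { return scope; }
    }

    /** Mimics the loop in visitLogEntryBeforeWrite over the families of one edit. */
    static Map<String, Integer> collectScopes(Map<String, Family> tableFamilies,
                                              List<String> editFamilies) {
        Map<String, Integer> scopes = new TreeMap<>();
        for (String family : editFamilies) {
            // get() returns null for an undeclared family, so getScope() throws an NPE.
            int scope = tableFamilies.get(family).getScope();
            if (scope != REPLICATION_SCOPE_LOCAL && !scopes.containsKey(family)) {
                scopes.put(family, scope);
            }
        }
        return scopes;
    }

    /** Table declares only its data family; the index edit's family is unknown to it. */
    public static boolean failsWithoutPlaceholder() {
        Map<String, Family> families = new HashMap<>();
        families.put("0", new Family(REPLICATION_SCOPE_GLOBAL));
        try {
            collectScopes(families, Arrays.asList("0", PLACEHOLDER_CF));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Table also declares the empty placeholder family with local (non-replicated) scope. */
    public static Map<String, Integer> scopesWithPlaceholder() {
        Map<String, Family> families = new HashMap<>();
        families.put("0", new Family(REPLICATION_SCOPE_GLOBAL));
        families.put(PLACEHOLDER_CF, new Family(REPLICATION_SCOPE_LOCAL));
        return collectScopes(families, Arrays.asList("0", PLACEHOLDER_CF));
    }

    public static void main(String[] args) {
        System.out.println("NPE without placeholder CF: " + failsWithoutPlaceholder());
        System.out.println("Scopes with placeholder CF: " + scopesWithPlaceholder());
    }
}
```

With the placeholder family declared and left at local scope, the lookup succeeds and only the data family ends up in the replication scope map, so the index edits stay out of replication without touching the rest of the HTD.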

Missing parts that we need to support this are:
* adding the CF when adding indexes/creating the table (maybe better to just always create it?). I haven't figured out how that call hierarchy works yet and wanted to post this sooner (in case [~jamestaylor] has time to help out).
* upgrade path to add the necessary CF
* test with replication between two mini-hbase clusters

> Support mutable indexes over replication
> ----------------------------------------
>
>                 Key: PHOENIX-1107
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1107
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 3.1, 4.1
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>         Attachments: phoenix-1107-3.0.v0
>
>
> Mutable indexes don't support usage with replication. For starters, the replication WAL Listener checks the family of the edits, which can throw an NPE for the IndexedKeyValue


