Posted to issues@sentry.apache.org by "Sergio Peña (JIRA)" <ji...@apache.org> on 2017/07/27 21:00:07 UTC

[jira] [Commented] (SENTRY-1682) Investigate use of EXPORT for replication for initial HMS snapshot

    [ https://issues.apache.org/jira/browse/SENTRY-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103903#comment-16103903 ] 

Sergio Peña commented on SENTRY-1682:
-------------------------------------

I have finished this investigation.

There are two replication versions currently used:

v1 uses commands like:
{noformat}
- export table <table> for replication <repl-name>;
- export table <table> for metadata replication <repl-name>;
{noformat}

v2 uses commands like:
{noformat}
- REPL DUMP <dbname>[.<tablename>] [FROM <init-evid> [TO <end-evid>] [LIMIT <num-evids>] ];
- REPL LOAD [<dbname>[.<tablename>]] FROM <dirname>;
{noformat}

v1 replication does not handle new notifications that are added while the dump is running. Fixing this is one of the improvements that v2 replication is supposed to make. However, v2 replication is still in progress, and this specific part is not implemented yet.

Here's the link to the v2 code that will consolidate new notifications:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java#L211

This is the code whose implementation is still missing (the current master branch is Hive 3.0):
{noformat}
private Long bootStrapDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot) throws Exception {
...
    // Now we consolidate all the events that happenned during the objdump into the objdump
    while (evIter.hasNext()) {
      NotificationEvent ev = evIter.next();
      Path eventRoot = new Path(dumpRoot, String.valueOf(ev.getEventId()));
      // FIXME : implement consolidateEvent(..) similar to dumpEvent(ev,evRoot)
    }
...
{noformat}
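
For illustration only, here is a rough sketch of what "consolidating" events into a bootstrap dump could mean. It is not Hive or Sentry code: the class, the Event type, the two event kinds, and the consolidate(..) method are all hypothetical, and the snapshot is modeled as a plain in-memory map. It assumes the dump records the last notification ID it saw before starting, and that events with a higher ID are folded in afterwards.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model; none of these types are Hive or Sentry classes.
class SnapshotConsolidationSketch {

    enum EventType { CREATE_TABLE, DROP_TABLE }

    static final class Event {
        final long id;
        final EventType type;
        final String table;

        Event(long id, EventType type, String table) {
            this.id = id;
            this.type = type;
            this.table = table;
        }
    }

    /**
     * Folds every event with id > snapshotEventId into the snapshot
     * (table name -> id of the event that created it) and returns the id of
     * the last event applied, i.e. the point after which incremental
     * processing could resume.
     */
    static long consolidate(Map<String, Long> snapshot, long snapshotEventId, List<Event> events) {
        long lastApplied = snapshotEventId;
        for (Event ev : events) {
            if (ev.id <= snapshotEventId) {
                continue; // assumed to be reflected in the snapshot already
            }
            switch (ev.type) {
                case CREATE_TABLE:
                    snapshot.put(ev.table, ev.id);
                    break;
                case DROP_TABLE:
                    snapshot.remove(ev.table);
                    break;
            }
            lastApplied = Math.max(lastApplied, ev.id);
        }
        return lastApplied;
    }

    public static void main(String[] args) {
        // Snapshot taken when the last seen notification id was 10.
        Map<String, Long> snapshot = new TreeMap<>();
        snapshot.put("db1.t1", 7L);
        snapshot.put("db1.t2", 9L);

        // Events 11 and 12 arrived while the dump was running.
        List<Event> log = Arrays.asList(
            new Event(9, EventType.CREATE_TABLE, "db1.t2"),
            new Event(11, EventType.DROP_TABLE, "db1.t1"),
            new Event(12, EventType.CREATE_TABLE, "db1.t3"));

        long resumeFrom = consolidate(snapshot, 10, log);
        System.out.println(snapshot.keySet() + " resume after event " + resumeFrom);
    }
}
```

Both operations in the sketch are idempotent (put and remove), so it does not matter if the non-atomic dump had already picked up part of a later event before it was replayed.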

> Investigate use of EXPORT for replication for initial HMS snapshot
> ------------------------------------------------------------------
>
>                 Key: SENTRY-1682
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1682
>             Project: Sentry
>          Issue Type: Sub-task
>          Components: Sentry
>            Reporter: Alexander Kolbasov
>            Assignee: Sergio Peña
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> [~mohitsabharwal] mentioned that Hive supports 
> {code}
>  EXPORT ...  for replication('somerandomstring')
> {code}
> command.
> It dumps the last notification ID that EXPORT sees (in the export metadata JSON),
> i.e. if this works as documented, EXPORT tells you the last notification it saw before executing.
> This can be used for obtaining a consistent initial snapshot instead of the current conservative stop-the-world scheme.
> We should explore this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)