You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Sergey Shelukhin <se...@hortonworks.com> on 2016/08/23 01:41:13 UTC

Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/
-----------------------------------------------------------

Review request for hive and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs
-----

  llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
  llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
  llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
  llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 

Diff: https://reviews.apache.org/r/51312/diff/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147548
-----------------------------------------------------------



Will look at the updated patch tomorrow.

- Siddharth Seth


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147610
-----------------------------------------------------------


Ship it!




Think there's several issues which still need resolving. Follow up jiras.
+1, assuming it's been tested locally for regresions and the new functionality.

- Siddharth Seth


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/
-----------------------------------------------------------

(Updated Sept. 1, 2016, 6:35 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  llap-client/pom.xml 0243340 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
  llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
  llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
  llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
  llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 

Diff: https://reviews.apache.org/r/51312/diff/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147499
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java (line 82)
<https://reviews.apache.org/r/51312/#comment214676>

    hmm, this is bogus change


- Sergey Shelukhin


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/
-----------------------------------------------------------

(Updated Aug. 31, 2016, 10:25 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  llap-client/pom.xml 0243340 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
  llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
  llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
  llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
  llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 

Diff: https://reviews.apache.org/r/51312/diff/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On Aug. 30, 2016, 10:36 p.m., Siddharth Seth wrote:
> > Unrelated to this patch - Any idea why 'uniq' is static/ (private static final UUID uniq = UUID.randomUUID();)

it's daemon identity, so it's global per daemon


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
-----------------------------------------------------------


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.

> On Aug. 30, 2016, 10:36 p.m., Siddharth Seth wrote:
> > Unrelated to this patch - Any idea why 'uniq' is static/ (private static final UUID uniq = UUID.randomUUID();)
> 
> Sergey Shelukhin wrote:
>     it's daemon identity, so it's global per daemon

Why is the daemon identity a static? Do we expect a new instance of the class to be created within the same daemon.

The reasona I ask is that this gets in the way of running a MiniLlapCluster with multiple simulated daemons.

Anyway, created a jira for this.


- Siddharth


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
-----------------------------------------------------------


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
-----------------------------------------------------------



Unrelated to this patch - Any idea why 'uniq' is static/ (private static final UUID uniq = UUID.randomUUID();)

- Siddharth Seth


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java, line 98
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481614#file1481614line98>
> >
> >     This isn't part of the patch uploaded to jira?

this was in the previous jira patch


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 561
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line561>
> >
> >     Here, as well as other places, a node is only available when it's slot has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
>     I think it only makes sense for ordered, to not mess up the order. Changed that. It's also problematic because we'd have to keep track of coordinated notifications.
> 
> Siddharth Seth wrote:
>     Assuming notifications for registered workers behave in the following manner.
>     
>     NewNode - only when the slot has been registered.
>     Node down - when either the slot or the actual node goes away.
>     State changed - I'm not sure anyone is expected to act on this (what are the possible notifications here).

Why? It doesn't seem to provide a lot of benefit compared to existing worker node handling, in presence of crashes/etc.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 554
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line554>
> >
> >     The size of this collection is used to determine the #knownWorkers in HostAffinitySplitLocationProvider. The size at the moment represents active nodes.
> >     
> >     I don't think the intent is to return a fewer number of nodes than what existed if a node goes down? 
> >     1) Return n entries, and one entry would indicate it's not active (and this will need to be acted upon by all clients
> >     2) Add a getNumInstances itnerface, which can differ from actual nodes available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
>     HIVE-14680
> 
> Siddharth Seth wrote:
>     Think this is very incomplete without this change. I don't think it's any worse than it is today though.

Why? It handles consistency in most scenarios as described. There's just a small window during the crash where splits will not get good locations.


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.

> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 554
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line554>
> >
> >     The size of this collection is used to determine the #knownWorkers in HostAffinitySplitLocationProvider. The size at the moment represents active nodes.
> >     
> >     I don't think the intent is to return a fewer number of nodes than what existed if a node goes down? 
> >     1) Return n entries, and one entry would indicate it's not active (and this will need to be acted upon by all clients
> >     2) Add a getNumInstances itnerface, which can differ from actual nodes available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
>     HIVE-14680
> 
> Siddharth Seth wrote:
>     Think this is very incomplete without this change. I don't think it's any worse than it is today though.
> 
> Sergey Shelukhin wrote:
>     Why? It handles consistency in most scenarios as described. There's just a small window during the crash where splits will not get good locations.

The 'small window' is assuming the node comes back up in the fisrt place. Minutes is enough to cause damage to the cache on multiple nodes. What happens on a cluster where a physical node is taken out for maintenance, and there's no capacity to launch another node. What happens in case of an hour long maintenence window for a node/set of nodes.

Will the consisten caching part take care of this ? (The target bucket size will be decreasing).

The bucket size could be based on the number of instances requested by the slider app.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 561
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line561>
> >
> >     Here, as well as other places, a node is only available when it's slot has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
>     I think it only makes sense for ordered, to not mess up the order. Changed that. It's also problematic because we'd have to keep track of coordinated notifications.
> 
> Siddharth Seth wrote:
>     Assuming notifications for registered workers behave in the following manner.
>     
>     NewNode - only when the slot has been registered.
>     Node down - when either the slot or the actual node goes away.
>     State changed - I'm not sure anyone is expected to act on this (what are the possible notifications here).
> 
> Sergey Shelukhin wrote:
>     Why? It doesn't seem to provide a lot of benefit compared to existing worker node handling, in presence of crashes/etc.

It does provide consistency, and makes it easier for someone who does not know this specific detail about the two ZK nodes per instance to reason about potential problems. Eseentially easier to undestand and maintain.


- Siddharth


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 104
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line104>
> >
> >     Move this into a separate path altogether, instead of listing and filtering at the same path each time?

That would require multiple listeners, acl checks, etc.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 538
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line538>
> >
> >     There's a lot of similar code to read records within a loop, and a method to read records. Move into a single method? Can be done in a separate jiras as well.

That code sets several variables, which makes it impossible (or too verbose) to refactor in Java


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 554
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line554>
> >
> >     The size of this collection is used to determine the #knownWorkers in HostAffinitySplitLocationProvider. The size at the moment represents active nodes.
> >     
> >     I don't think the intent is to return a fewer number of nodes than what existed if a node goes down? 
> >     1) Return n entries, and one entry would indicate it's not active (and this will need to be acted upon by all clients
> >     2) Add a getNumInstances itnerface, which can differ from actual nodes available, along with a getNodeAtLocation interface?

HIVE-14680


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 561
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line561>
> >
> >     Here, as well as other places, a node is only available when it's slot has been registered. Otherwise it should not be visible to clients.

I think it only makes sense for ordered, to not mess up the order. Changed that. It's also problematic because we'd have to keep track of coordinated notifications.


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.

> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 538
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line538>
> >
> >     There's a lot of similar code to read records within a loop, and a method to read records. Move into a single method? Can be done in a separate jiras as well.
> 
> Sergey Shelukhin wrote:
>     That code sets several variables, which makes it impossible (or too verbose) to refactor in Java

Think it can be attempted in a separate jira. Looking at this in IDEA with it's duplicate code detection is a lot of squigly lines.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 554
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line554>
> >
> >     The size of this collection is used to determine the #knownWorkers in HostAffinitySplitLocationProvider. The size at the moment represents active nodes.
> >     
> >     I don't think the intent is to return a fewer number of nodes than what existed if a node goes down? 
> >     1) Return n entries, and one entry would indicate it's not active (and this will need to be acted upon by all clients
> >     2) Add a getNumInstances itnerface, which can differ from actual nodes available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
>     HIVE-14680

Think this is very incomplete without this change. I don't think it's any worse than it is today though.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java, line 561
> > <https://reviews.apache.org/r/51312/diff/1/?file=1481607#file1481607line561>
> >
> >     Here, as well as other places, a node is only available when it's slot has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
>     I think it only makes sense for ordered, to not mess up the order. Changed that. It's also problematic because we'd have to keep track of coordinated notifications.

Assuming notifications for registered workers behave in the following manner.

NewNode - only when the slot has been registered.
Node down - when either the slot or the actual node goes away.
State changed - I'm not sure anyone is expected to act on this (what are the possible notifications here).


- Siddharth


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/pom.xml 0243340 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java PRE-CREATION 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java 54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

Posted by Siddharth Seth <ss...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
-----------------------------------------------------------



Possible to add a test (or a few)? (I don' see one in the patch attached to the jira).

Question: How long does it take for a persistent node to go away - in case of a JVM crash / node crash. Is it possible that a new process starts up within the duration it takes for the old node-slot entry to be removed? (It gets added to the end as a result)

In terms of a permanent cluster size reduction - I think we should at least take some steps to handle this scenario, or an equivalent scenario where it takes a long time (minutes) for a node to come back. We have a force locality scheduling mode, this will effectively cause new queries to start in a state where they cannot complete. Check for this and fail the query?, or fallback to a next node for the split. The fallback to the next node is something that is going to come in soon anyway, to control locality if nodes are not available (current is always random as against 'random' being a configurable option :) )


llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 104)
<https://reviews.apache.org/r/51312/#comment214497>

    Move this into a separate path altogether, instead of listing and filtering at the same path each time?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 110)
<https://reviews.apache.org/r/51312/#comment214498>

    woekersPath is no long defined, so the comment can be deleted.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 533)
<https://reviews.apache.org/r/51312/#comment214487>

    There's a lot of similar code to read records within a loop, and a method to read records. Move into a single method? Can be done in a separate jiras as well.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 547)
<https://reviews.apache.org/r/51312/#comment214501>

    The size of this collection is used to determine the #knownWorkers in HostAffinitySplitLocationProvider. The size at the moment represents active nodes.
    
    I don't think the intent is to return a fewer number of nodes than what existed if a node goes down? 
    1) Return n entries, and one entry would indicate it's not active (and this will need to be acted upon by all clients
    2) Add a getNumInstances itnerface, which can differ from actual nodes available, along with a getNodeAtLocation interface?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 554)
<https://reviews.apache.org/r/51312/#comment214496>

    Here, as well as other places, a node is only available when it's slot has been registered. Otherwise it should not be visible to clients.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java (line 577)
<https://reviews.apache.org/r/51312/#comment214495>

    Don't send back a node until it's slot registration has completed. We don't know what position it will take. This logic puts such nodes at the end.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java (line 51)
<https://reviews.apache.org/r/51312/#comment214489>

    Are there tests in curator for the said node, which we could borrow parts from?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java (line 128)
<https://reviews.apache.org/r/51312/#comment214472>

    Nit: else {
      slots.add( ..)
    }
    
    Makes it a little more readable, and maintanable for future changes.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java (line 138)
<https://reviews.apache.org/r/51312/#comment214477>

    Potential divide by 0? (new cluster)
    
    What's the purpose of the 2.0f/approxWorkerCount ?



ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java (line 94)
<https://reviews.apache.org/r/51312/#comment214490>

    This isn't part of the patch uploaded to jira?


- Siddharth Seth


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java 99ead9b 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java e9456f2 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java 64d2617 
>   llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java PRE-CREATION 
>   llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 921e050 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java 17ce69b 
>   llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>