Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2016/08/25 01:03:21 UTC

[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits

    [ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436060#comment-15436060 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:02 AM:
------------------------------------------------------------------

This basically creates ZK nodes for "slots" in the cluster. Each LLAP node tries to take the lowest available slot, starting from 0. Unlike the worker-... nodes, slot numbers are reused, which is the intent. For split assignment, the nodes are always sorted by slot number.
The idea is that as long as a node is running, it retains the same position in the ordering regardless of other nodes restarting. The nodes don't need to know about each other, about their predecessors' location (if restarted in a different place), or about the total count of nodes in the cluster.
Restarting nodes may not take the same positions as their predecessors (e.g. if two nodes restart they can swap slots), but it shouldn't matter because they have lost their cache anyway.
E.g. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots are free, but 3 will stay the 3rd and retain cache locality.
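
To make the 1-2-3-4 example concrete, here is a minimal in-memory sketch of the slot scheme described above. It is not the code from the patch: SlotRegistry, register, unregister and ordered_nodes are hypothetical names, and a plain dict stands in for the ZK slot nodes (the real thing would presumably rely on ZK to arbitrate who gets a slot).

```python
class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (hypothetical, not the patch's API)."""

    def __init__(self):
        self._slots = {}  # slot number -> id of the node holding it

    def register(self, node_id):
        # Take the lowest available slot, starting from 0.
        slot = 0
        while slot in self._slots:
            slot += 1
        self._slots[slot] = node_id
        return slot

    def unregister(self, node_id):
        # A node going away frees its slot so the number can be reused.
        for slot, owner in list(self._slots.items()):
            if owner == node_id:
                del self._slots[slot]

    def ordered_nodes(self):
        # The ordering used for splits: nodes sorted by slot number.
        return [self._slots[s] for s in sorted(self._slots)]


registry = SlotRegistry()
for name in ["n1", "n2", "n3", "n4"]:
    registry.register(name)

# Nuke 1, 2 and 4, then bring replacements back in a different order.
for name in ["n1", "n2", "n4"]:
    registry.unregister(name)
for name in ["n4b", "n1b", "n2b"]:
    registry.register(name)

order_after = registry.ordered_nodes()
print(order_after)
```

The restarted nodes land in whatever slots are free, while n3 keeps its slot and therefore its position in the ordering.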

One case it doesn't handle is permanent cluster size reduction. If the removed nodes held slots in the middle, there will be a gap that persists until some nodes restart, and it will result in misses.
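
A small sketch of the gap case, again under assumptions: the slot set is modelled as a plain Python set, and missing_slots and lowest_free_slot are hypothetical helpers, not anything from the patch. It only shows where the gap sits and how a later restart would fill it; it does not model how splits are mapped to slots.

```python
def missing_slots(occupied):
    """Slot numbers below the highest occupied slot that no node holds."""
    if not occupied:
        return []
    return [s for s in range(max(occupied)) if s not in occupied]


def lowest_free_slot(occupied):
    """Lowest slot number not currently held, starting from 0."""
    slot = 0
    while slot in occupied:
        slot += 1
    return slot


occupied = {0, 1, 2, 3}

# The node holding slot 1 is removed for good (cluster shrinks from 4 to 3).
occupied.discard(1)
gaps_before = missing_slots(occupied)

# Later the node holding slot 3 restarts: it releases 3 and takes the lowest free slot.
occupied.discard(3)
occupied.add(lowest_free_slot(occupied))
gaps_after = missing_slots(occupied)

print(gaps_before, gaps_after, sorted(occupied))
```

Until that restart happens, nothing moves into the empty slot, which is why the gap stays around.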


was (Author: sershe):
This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sort by the slot number for splits.
The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, their predecessors location, or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it doesn't matter as much because they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay 3rd and retain cache locality.

One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>
>                 Key: HIVE-14589
>                 URL: https://issues.apache.org/jira/browse/HIVE-14589
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)