Posted to issues@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2016/10/13 15:18:20 UTC

[jira] [Updated] (AMBARI-18590) RegionServer Registration Checks Fail During Upgrade If rDNS is Not Enabled

     [ https://issues.apache.org/jira/browse/AMBARI-18590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hurley updated AMBARI-18590:
-------------------------------------
    Description: 
During a rolling upgrade, the upgrade orchestration must wait for each RegionServer to register with the HBase master before moving on to the next RS restart. Registration is asynchronous and may complete several minutes after the daemon has actually started.
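
As a rough illustration of that waiting step, here is a minimal polling sketch. It is not the Ambari implementation: the function name {{wait_until}}, the default timeout and interval, and the injectable {{clock}} and {{sleep}} parameters are all assumptions made for the example.

```python
import time


def wait_until(check, timeout_s=600.0, interval_s=10.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s has elapsed.

    Hypothetical helper: the defaults are illustrative, not Ambari's values.
    Returns True if check() succeeded, False if the timeout expired first.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        # Registration is asynchronous, so pause before asking again.
        sleep(interval_s)
```

The orchestration would pass in a callable that performs the actual registration check and would only restart the next RegionServer once it returns True.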

We currently have a check that runs {{status 'simple'}} in {{hbase shell}} and looks for the hostname in the output to determine whether the host has registered.

However, if reverse DNS is not enabled, the servers may be listed by IP address instead of hostname. The hostname is then never found, so the check would always fail during upgrades.

The HBase status command we use is {{status 'simple'}}, which returns output like this:

{noformat}
active master:  10.0.0.8:16000 1475801031124
2 backup masters
    10.0.0.10:16000 1475801061290
    10.0.0.13:16000 1475801046018
2 live servers
    10.0.0.5:16020 1475798271407
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint]
    10.0.0.7:16020 1475872741297
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[SecureBulkLoadEndpoint]
0 dead servers
Aggregate load: 0, regions: 3
{noformat}

If the lookup by hostname fails, we should also try the host's IP address.
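
A minimal sketch of such a fallback, assuming the {{status 'simple'}} output has the layout shown above. The function names ({{registered_servers}}, {{is_registered}}) and the {{ip_addresses}} parameter are hypothetical and are not taken from the Ambari code base.

```python
import re
import socket


def registered_servers(status_output):
    """Return the hosts (names or IPs) listed under 'N live servers'."""
    hosts = set()
    in_live = False
    for line in status_output.splitlines():
        stripped = line.strip()
        if re.match(r"^\d+ live servers$", stripped):
            in_live = True
            continue
        if re.match(r"^\d+ dead servers$", stripped):
            in_live = False
            continue
        if in_live:
            # Server lines look like "<host>:<port> <startcode>";
            # the metrics lines that follow them do not match.
            m = re.match(r"^(\S+):(\d+) (\d+)$", stripped)
            if m:
                hosts.add(m.group(1).lower())
    return hosts


def is_registered(status_output, hostname, ip_addresses=None):
    """Check by hostname first, then fall back to the host's IP addresses."""
    live = registered_servers(status_output)
    if hostname.lower() in live:
        return True
    if ip_addresses is None:
        # Assumption: forward DNS works even though reverse DNS does not.
        try:
            ip_addresses = socket.gethostbyname_ex(hostname)[2]
        except OSError:
            ip_addresses = []
    return any(ip in live for ip in ip_addresses)
```

Comparing whole host fields rather than searching the raw output for a substring avoids false positives such as {{10.0.0.1}} matching {{10.0.0.10}}.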

  was:
During a rolling upgrade, the upgrade orchestration must wait for each RegionServer to register with the HBase master before moving onto the next RS restart. This is a very asynchronous process which may occur several minutes after the daemon has actually started. 

We have a check now which uses {{hbase shell}} along with {{status 'simple'}} to determine if the host has registered by looking for the hostname. 

However, if reverse DNS is not enabled, then this could potentially be IP addresses. As a result, the check would always fail during upgrades:

{code}
Thanks.

The HBase status command we use is {{status simple}}, which returns like so:

{noformat}
active master:  10.0.0.8:16000 1475801031124
2 backup masters
    10.0.0.10:16000 1475801061290
    10.0.0.13:16000 1475801046018
2 live servers
    10.0.0.5:16020 1475798271407
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint]
    10.0.0.7:16020 1475872741297
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[SecureBulkLoadEndpoint]
0 dead servers
Aggregate load: 0, regions: 3
{noformat}

If this lookup fails for the hostname, we should also try by IP address.


> RegionServer Registration Checks Fail During Upgrade If rDNS is Not Enabled
> ---------------------------------------------------------------------------
>
>                 Key: AMBARI-18590
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18590
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.2.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.5.0
>
>
> During a rolling upgrade, the upgrade orchestration must wait for each RegionServer to register with the HBase master before moving on to the next RS restart. Registration is asynchronous and may complete several minutes after the daemon has actually started.
> We currently have a check that runs {{status 'simple'}} in {{hbase shell}} and looks for the hostname in the output to determine whether the host has registered.
> However, if reverse DNS is not enabled, the servers may be listed by IP address instead of hostname. The hostname is then never found, so the check would always fail during upgrades.
> The HBase status command we use is {{status 'simple'}}, which returns output like this:
> {noformat}
> active master:  10.0.0.8:16000 1475801031124
> 2 backup masters
>     10.0.0.10:16000 1475801061290
>     10.0.0.13:16000 1475801046018
> 2 live servers
>     10.0.0.5:16020 1475798271407
>         requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint]
>     10.0.0.7:16020 1475872741297
>         requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[SecureBulkLoadEndpoint]
> 0 dead servers
> Aggregate load: 0, regions: 3
> {noformat}
> If the lookup by hostname fails, we should also try the host's IP address.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)