You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Jeongdae Kim (Jira)" <ji...@apache.org> on 2020/05/20 05:36:00 UTC

[jira] [Comment Edited] (HBASE-24402) Moving the meta region causes MetricsException when using above 2.6.0 hadoop version

    [ https://issues.apache.org/jira/browse/HBASE-24402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111712#comment-17111712 ] 

Jeongdae Kim edited comment on HBASE-24402 at 5/20/20, 5:35 AM:
----------------------------------------------------------------

I think the reason why this warning occurs is that HBASE-9774 was ported to branch-1 HBASE-18060, and HBASE-18060 only fit for hadoop version < 2.6.0 by using reflections to remove some internal data. unfortunately, since hadoop version 2.6.0 HADOOP-11105 , some more internal data is added, which HBase 1.4.13 can't remove, so it makes this warning every 5 minutes and the metric source for coprocessor will be never re-registered.

 

More specifically, It happens in the following scenario.
 * At first, The warning started to appear after the meta region was moved out from a host and then moved back to the host again some minutes after.
{noformat}
2020-04-21 15:08:07,749 DEBUG [RS_CLOSE_META-XXX:11471-0] handler.CloseRegionHandler: Closed hbase:meta,,1.1588230740
...
2020-04-21 15:08:19,169 INFO  [RpcServer.priority.FPBQ.Fifo.handler=18,queue=0,port=11471] regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
{noformat}

 * since HBASE-12016 ported to branch-1, the coprocessor 'MultiRowMutationEndpoint' has added to the meta table.
{noformat}
2020-04-21 15:08:19,172 DEBUG [RS_OPEN_META-XXX:11471-0] coprocessor.CoprocessorHost: Loading coprocessor class org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint with path null and priority 536870911
2020-04-21 15:08:19,173 DEBUG [RS_OPEN_META-XXX:11471-0] regionserver.HRegion: Registered coprocessor service: region=hbase:meta,,1 service=MultiRowMut
ationService
2020-04-21 15:08:19,174 INFO  [RS_OPEN_META-XXX:11471-0] regionserver.RegionCoprocessorHost: Loaded coprocessor org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint from HTD of hbase:meta successfully.
{noformat}

 * Because Coprocessor metrics introduced HBASE-18060, whenever meta region is closed or opened, a new metric source is added to hadoop metrics2 system or removed from it.
{noformat}
// (1) registered a metric source after a region server restarted 
2020-04-20 14:23:04,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R
egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
2020-04-20 14:23:04,580 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers

// (2) removed it because of moving meta out from the region server
2020-04-21 15:08:14,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Removing adapter for the MetricRegistry: RegionServer,sub=Coprocessor.Regi
on.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint

// (3) then trying to re-register it after meta region moved into the same region server
2020-04-21 15:08:24,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R
egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
2020-04-21 15:08:24,579 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers

// and, the log appears every 5 minutes
{noformat}
[https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L151]
 [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L170-L176]

 * Since under 2.6.0 hadoop doesn't provide any interface to unregister existing metric sources (introduced since https://issues.apache.org/jira/browse/HADOOP-10839), branch-1 ported version uses reflections to remove a metric source from metric system. [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L197-L226]

 * after HADOOP-11105 introduced (2.6.0), a new field 'namedCallbacks' was added to re-register the existing sources when metric system restarted. Because current HBase implementation couldn't touch that field by reflection, a callback to re-register a metric source for the coprocessor still remains in the map, even after meta region moved out.

 * a scheduled task restart hadoop metric system every 5 minutes. So, a new metric source for MultiRowMutationEndpoint coprocessor is registered every restart, even after meta region moved out.
 [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L59-L63]
 [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java#L103-L110]

 * When meta region was moved back at the time (3) in above logs, HBase tried to register a new metric source, and that try failed because the source with the same name already exists. However, even though the registration failed, that try replaces existing metric source (re-registered by restart) with new one which have 'mbeanName' == null due to registration failure.
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L267]
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L124]
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java#L222]
{noformat}
2020-04-21 15:08:24,579 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers
2020-04-21 15:08:24,579 WARN  [HBase-Metrics2-1] util.MBeans: Error creating MBean object name: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists!
{noformat}

 * when metric system is stopped by a scheduled task, de-registration process by stop for a metric source should unregister MBean and remove its name from DefaultMetricsSystem("mBeanNames" field), but it couldn't, because existing metric source have null for 'mbeanName' by replacement.
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L105-L114]

 * So, when metric system starts again, registering a metric source for the coprocessor is failed, because its name still exists in DefaultMetricsSystem. and finally we can show the exception in log whenever metric system restart every 5 minutes.


was (Author: jeongdae kim):
I think the reason why this warning occurs is that HBASE-9774 was ported to branch-1 HBASE-18060, and HBASE-18060 only fit for hadoop version < 2.6.0 by using reflections to remove some internal data. unfortunately, since hadoop version 2.6.0 HADOOP-11105 , some more internal data is added, which HBase 1.4.13 can't remove, so it makes this warning every 5 minutes and the metric source for coprocessor will be never re-registered.

 

More specifically, It happens in the following scenario.
 * At first, The warning started to appear after the meta region was moved out from a host and then moved back to the host again some minutes after.
{noformat}
2020-04-21 15:08:07,749 DEBUG [RS_CLOSE_META-XXX:11471-0] handler.CloseRegionHandler: Closed hbase:meta,,1.1588230740
...
2020-04-21 15:08:19,169 INFO  [RpcServer.priority.FPBQ.Fifo.handler=18,queue=0,port=11471] regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
{noformat}

 * since HBASE-12016 ported to branch-1, the coprocessor 'MultiRowMutationEndpoint' has added to the meta table.
{noformat}
2020-04-21 15:08:19,172 DEBUG [RS_OPEN_META-XXX:11471-0] coprocessor.CoprocessorHost: Loading coprocessor class org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint with path null and priority 536870911
2020-04-21 15:08:19,173 DEBUG [RS_OPEN_META-XXX:11471-0] regionserver.HRegion: Registered coprocessor service: region=hbase:meta,,1 service=MultiRowMut
ationService
2020-04-21 15:08:19,174 INFO  [RS_OPEN_META-XXX:11471-0] regionserver.RegionCoprocessorHost: Loaded coprocessor org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint from HTD of hbase:meta successfully.
{noformat}

 * Because Coprocessor metrics introduced HBASE-18060, whenever meta region is closed or opened, a new metric source is added to hadoop metrics2 system or removed from it.
{noformat}
// (1) registered a metric source after a region server restarted 
2020-04-20 14:23:04,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R
egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
2020-04-20 14:23:04,580 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers

// (2) removed it because of moving meta out from the region server
2020-04-21 15:08:14,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Removing adapter for the MetricRegistry: RegionServer,sub=Coprocessor.Regi
on.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint

// (3) then trying to re-register it after meta region moved into the same region server
2020-04-21 15:08:24,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R
egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
2020-04-21 15:08:24,579 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers

// and, the log appears every 5 minutes
{noformat}
[https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L151]
 [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L170-L176]

 * Since under 2.6.0 hadoop doesn't provide any interface to unregister existing metric sources (introduced since https://issues.apache.org/jira/browse/HADOOP-10839), branch-1 ported version uses reflections to remove a metric source from metric system. [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L197-L226]

 * after HADOOP-11105 introduced (2.6.0), a new field 'namedCallbacks' was added to re-register the existing sources when metric system restarted. Because current HBase implementation couldn't touch that field by reflection, a callback to re-register a metric source for the coprocessor still remains in the map, even after meta region moved out.

 * a scheduled task restart hadoop metric system every 5 minutes. So, a new metric source for MultiRowMutationEndpoint coprocessor is registered every restart, even after meta region moved out.
 [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L59-L63]
 [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java#L103-L110]

 * When meta region was moved back at the time (3) in above logs, HBase tried to register a new metric source, and that try failed because the source with the same name already exists. However, even though the registration failed, that try replaces existing metric source (re-registered by restart) with new one which have 'mbeanName' == null due to registration failure.
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L267]
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L124]
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java#L222]
{noformat}
2020-04-21 15:08:24,579 INFO  [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers
2020-04-21 15:08:24,579 WARN  [HBase-Metrics2-1] util.MBeans: Error creating MBean object name: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists!
{noformat}

 * when metric system is stopped by a scheduled task, de-registration process by stop for a metric source should unregister MBean and remove its name from DefaultMetricsSystem("mBeanNames" field), but it couldn't, because existing metric source have null for 'mbeanName' by replacement.
 [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L105-L114]

 * So, when metric system starts again, registering a metric source for the coprocessor is failed, because its name still exists in DefaultMetricsSystem. and finally we can show the exception in log whenever metric system restart every 5 minutes.

> Moving the meta region causes MetricsException when using above 2.6.0 hadoop version
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-24402
>                 URL: https://issues.apache.org/jira/browse/HBASE-24402
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.13
>            Reporter: Jeongdae Kim
>            Assignee: Jeongdae Kim
>            Priority: Minor
>
> When the meta region moved out from a region server then moved in to the one again, we got MetricsException below and the metric source for coprocessor couldn't be re-registered. it happened with hadoop 2.8.5
> {noformat}
> 2020-04-21 15:13:02,956 WARN  [HBase-Metrics2-1] util.MBeans: Error creating MBean object name: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.C
> P_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint
> org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.
> apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists!
>     at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:127)
>     at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:102)
>     at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:121)
>     at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:64)
>     at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222)
>     at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:100)
>     at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268)
>     at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:239)
>     at jdk.internal.reflect.GeneratedMethodAccessor82.invoke(Unknown Source)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:320)
>     at com.sun.proxy.$Proxy10.postStart(Unknown Source)
>     at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:193)
>     at org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:109)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834) 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)