Posted to issues@hbase.apache.org by "guoxiaojiao (Jira)" <ji...@apache.org> on 2023/04/13 10:28:00 UTC

[jira] [Commented] (HBASE-27387) MetricsSource lastShippedTimeStamps ConcurrentModificationException cause RegionServer crash

    [ https://issues.apache.org/jira/browse/HBASE-27387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711818#comment-17711818 ] 

guoxiaojiao commented on HBASE-27387:
-------------------------------------

This issue can occur when a RegionServer starts or when enable_peer is run. Multiple ReplicationSourceShipper threads (hbase.wal.provider=multiwal, hbase.wal.regiongrouping.numgroups=3) start replicating data to the peer cluster at the same time. The per-table replication metrics (Map<String, MetricsReplicationTableSource> singleSourceSourceByTable) must be updated after each batch of entries is replicated, but singleSourceSourceByTable is a HashMap, which is not thread-safe, so the concurrent updates cause a ConcurrentModificationException.
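To illustrate the failure mode, here is a minimal standalone sketch; it is not the HBase code. The class name MetricsMapSketch, the method name, and the "wal-group-N" keys are made up for the example. The update is done inside the iteration loop so that the failure is deterministic; in the RegionServer the update would come from another shipper thread. It shows that a HashMap fails fast when it is modified while its values are being iterated, and that a ConcurrentHashMap, whose iterators are weakly consistent, does not.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetricsMapSketch {

  // Returns true if iterating over the values fails fast when a new key
  // is inserted during the iteration. All names here are hypothetical.
  static boolean failsWhenUpdatedDuringIteration(Map<String, Long> lastShipped) {
    lastShipped.put("wal-group-0", 1L);
    lastShipped.put("wal-group-1", 2L);
    try {
      long max = 0;
      for (long ts : lastShipped.values()) {
        // Stands in for a shipper thread registering a new entry mid-iteration.
        lastShipped.put("wal-group-2", 3L);
        max = Math.max(max, ts);
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("HashMap fails: "
        + failsWhenUpdatedDuringIteration(new HashMap<>()));
    System.out.println("ConcurrentHashMap fails: "
        + failsWhenUpdatedDuringIteration(new ConcurrentHashMap<>()));
  }
}
```

Switching the map to a ConcurrentHashMap is one possible remedy under these assumptions; whether that is the right fix for MetricsSource depends on how the maps are used elsewhere.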

In extreme cases, a WAL has already been replicated when the ConcurrentModificationException is thrown, so the shipper retries. The WAL information in ZooKeeper cannot be updated a second time, so we may then hit a NoNode exception.

>  MetricsSource lastShippedTimeStamps ConcurrentModificationException cause RegionServer crash
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27387
>                 URL: https://issues.apache.org/jira/browse/HBASE-27387
>             Project: HBase
>          Issue Type: Bug
>            Reporter: zhengsicheng
>            Priority: Minor
>
> 2022-09-20 14:14:40,332 ERROR [regionserver/hostname1:16020] regionserver.HRegionServer: ***** ABORTING region server hostname1,16020,1663147531495: Unhandled: null *****
> java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
>     at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
>     at org.apache.hadoop.hbase.replication.regionserver.MetricsSource.getTimestampOfLastShippedOp(MetricsSource.java:321)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationLoad.buildReplicationLoad(ReplicationLoad.java:80)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.buildReplicationLoad(Replication.java:264)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.refreshAndGetReplicationLoad(Replication.java:253)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:1436)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1243)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1065)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)