Posted to issues@ambari.apache.org by "Tamas Payer (Jira)" <ji...@apache.org> on 2021/03/01 10:59:00 UTC
[jira] [Commented] (AMBARI-25611) After purging AMS database "TimelineMetricMetadataKey is null" error is thrown

    [ https://issues.apache.org/jira/browse/AMBARI-25611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292815#comment-17292815 ] 

Tamas Payer commented on AMBARI-25611:
--------------------------------------

This issue appears only if there are multiple Metrics Collectors on the cluster where the data purge was done.
It seems that the metadata caches on the collectors are not in sync. The situation can be worked around with the following purge procedure:
h2. Cleaning up Ambari Metrics System Data
 * Get some configuration values:
 ** Get _hbase.rootdir_. *hbase.rootdir = /user/ams/hbase*
 ** Get _hbase.tmp.dir_. *hbase.tmp.dir = /var/lib/ambari-metrics-collector/hbase-tmp*
 ** Get _hbase.zookeeper.property.datadir_. *hbase.zookeeper.property.datadir = ${hbase.tmp.dir}/zookeeper*
 ** Get _phoenix.spool.directory_. *phoenix.spool.directory = ${hbase.tmp.dir}/phoenix-spool*
 ** Get ‘ZooKeeper Znode Parent’. *ZooKeeper Znode Parent = /ams-hbase-unsecure*
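
Two of the values above are defined relative to _hbase.tmp.dir_, so it can help to resolve them once before running any of the later steps. A minimal sketch, assuming the example values shown in this comment (the variable names are illustrative only; substitute the values from your own cluster):

```shell
# Example values copied from the list above; replace with your cluster's settings.
HBASE_ROOTDIR="/user/ams/hbase"
HBASE_TMP_DIR="/var/lib/ambari-metrics-collector/hbase-tmp"

# These two settings are expressed relative to hbase.tmp.dir.
ZK_DATA_DIR="${HBASE_TMP_DIR}/zookeeper"
PHOENIX_SPOOL_DIR="${HBASE_TMP_DIR}/phoenix-spool"

# Znode that is deleted later in the procedure.
ZNODE_PARENT="/ams-hbase-unsecure"
META_REGION_ZNODE="${ZNODE_PARENT}/meta-region-server"

# Print the resolved paths so they can be checked before anything is removed.
printf '%s\n' "$HBASE_ROOTDIR" "$ZK_DATA_DIR" "$PHOENIX_SPOOL_DIR" "$META_REGION_ZNODE"
```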

 * Stop AMS and set it to Maintenance Mode

 * Remove the HBase database
 ** _su hdfs_
 ** _hdfs dfs -ls /user/ams/hbase_
 ** _hdfs dfs -rm -r /user/ams/hbase_
 ** _exit_ (back to root from hdfs user)

 * Remove the AMS Zookeeper data *on both Collector Nodes*
 ** _su ams_
 ** _ls /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/_
 ** _rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper_

 * Remove any Phoenix spool files *on both Collector Nodes*
 ** _ls /var/lib/ambari-metrics-collector/hbase-tmp/phoenix-spool/_
 ** _rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/phoenix-spool_

 * _exit_ (back to root from ams user)
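
The two local cleanup steps above have to be repeated on every Collector node, so they could be wrapped in a small helper. This is only a sketch: _purge_collector_local_state_ is a hypothetical function name, and it assumes it is run as the ams user with the _hbase.tmp.dir_ value passed as its only argument.

```shell
# Hypothetical helper: removes the ZooKeeper data and Phoenix spool
# directories found under the given hbase.tmp.dir.
purge_collector_local_state() {
    tmp_dir="$1"

    # Refuse obviously dangerous arguments before calling rm -rf.
    case "$tmp_dir" in
        ""|"/")
            echo "refusing to purge '$tmp_dir'" >&2
            return 1
            ;;
    esac

    for sub in zookeeper phoenix-spool; do
        target="${tmp_dir}/${sub}"
        if [ -d "$target" ]; then
            ls "$target"        # show what is about to be removed
            rm -rf "$target"
        fi
    done
}

# Usage on each Collector node (not executed here):
#   purge_collector_local_state /var/lib/ambari-metrics-collector/hbase-tmp
```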

 * Connect to the cluster ZooKeeper instance and delete the _meta-region-server_ node under the ‘ZooKeeper Znode Parent’
 ** _zkCli.sh -server localhost:2181_
 ** _[zk: localhost:2181(CONNECTED) 0] rmr /ams-hbase-unsecure/meta-region-server_
 ** _quit_
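
The znode deletion above can also be scripted instead of typed into the interactive prompt. The sketch below assumes that the installed _zkCli.sh_ is on the PATH and accepts a command after the connection options. The function name and the _DRY_RUN_ variable are hypothetical. On newer ZooKeeper releases _rmr_ is deprecated in favor of _deleteall_, so adjust the command to the client version in use.

```shell
# Hypothetical helper: deletes <znode parent>/meta-region-server.
# DRY_RUN defaults to 1, which only prints the command; set DRY_RUN=0 to run it.
delete_meta_region_znode() {
    zk_server="$1"
    znode_parent="${2%/}"       # drop a trailing slash, if any
    znode="${znode_parent}/meta-region-server"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "zkCli.sh -server $zk_server rmr $znode"
    else
        zkCli.sh -server "$zk_server" rmr "$znode"
    fi
}

# Usage (prints the command because DRY_RUN defaults to 1):
#   delete_meta_region_znode localhost:2181 /ams-hbase-unsecure
```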

 * *If there are two Metrics Collectors on the cluster:*
 ** *Start only one of the Collectors, the primary one. On the Hosts page of Ambari, start the individual Metrics Collector component.*
 ** *Watch the ambari-metrics-collector.log (tail -f ambari-metrics-collector.log) and wait for a few aggregation cycles.*
 ** *Finally, start Ambari Metrics normally from Ambari. That will start the second Collector along with the other components.*

 * Disable the Maintenance Mode
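
While watching the collector log during the staged start, one way to confirm that the workaround took effect is to count occurrences of the error from the issue title. A minimal sketch (the function name is hypothetical, and the log file location is passed as an argument because it depends on the installation):

```shell
# Hypothetical helper: prints how many log lines contain the metadata key error.
count_metadata_key_errors() {
    log_file="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, so mask the status.
    grep -c -F 'TimelineMetricMetadataKey is null' "$log_file" || true
}

# Usage (not executed here):
#   count_metadata_key_errors ambari-metrics-collector.log
```

A count that stays at zero over a few aggregation cycles suggests the error is no longer being logged.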

 

> After purging AMS database "TimelineMetricMetadataKey is null" error is thrown
> ------------------------------------------------------------------------------
>
>                 Key: AMBARI-25611
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25611
>             Project: Ambari
>          Issue Type: Task
>          Components: ambari-metrics
>    Affects Versions: 2.7.5
>            Reporter: Tamas Payer
>            Priority: Minor
>
> After purging the Metrics Collector's database the following error message appears in the logs:
> {code:java}
> 2021-01-15 11:30:48,921 ERROR org.apache.ambari.metrics.core.timeline.discovery.TimelineMetricMetadataManager: TimelineMetricMetadataKey is null for : [-1, -1, -64, -112, -85, -17, 39, -84, -121, 19, -118, -36, -104, -21, -7, 110, 61, -97, -56, 10]{code}
>  
> *The following steps were used to purge the database:*
>  * +Get some configuration values:+
>  * Get _hbase.rootdir_. *hbase.rootdir = /user/ams/hbase*
>  * Get _hbase.tmp.dir_. *hbase.tmp.dir = /var/lib/ambari-metrics-collector/hbase-tmp*
>  * Get _hbase.zookeeper.property.datadir_. *hbase.zookeeper.property.datadir = ${hbase.tmp.dir}/zookeeper*
>  * Get _phoenix.spool.directory_. *phoenix.spool.directory = ${hbase.tmp.dir}/phoenix-spool*
>  * Get ‘ZooKeeper Znode Parent’. *ZooKeeper Znode Parent = /ams-hbase-unsecure*
>  * *Stop AMS and set it to Maintenance Mode*
>  * Remove the HBase database
>  * _su hdfs_
>  * _hdfs dfs -ls /user/ams/hbase_
>  * _hdfs dfs -rm -r /user/ams/hbase_
>  * _exit (back to root from hdfs user)_
>  * Remove the AMS Zookeeper data *on both Collector Nodes*
>  * _su ams_
>  * _ls /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/_
>  * _rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/*_
>  * Remove any Phoenix spool files *on both Collector Nodes*
>  * _ls /var/lib/ambari-metrics-collector/hbase-tmp/phoenix-spool/_
>  * _rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/phoenix-spool/*_
>  * _exit_ (back to root from ams user)
>  * Connect to the cluster zookeeper instance and delete the ‘_ZooKeeper Znode Parent’/meta-region-server_ node
>  * _/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181_
>  * _[zk: localhost:2181(CONNECTED) 0]_
>  * _rmr /ams-hbase-unsecure/meta-region-server_
>  * _quit_
>  * Restart AMS
>  


