Posted to hdfs-dev@hadoop.apache.org by "Elek, Marton (JIRA)" <ji...@apache.org> on 2017/10/10 13:37:00 UTC

[jira] [Created] (HDFS-12624) Ozone: number of keys/values/buckets to KSMMetrics

Elek, Marton created HDFS-12624:
-----------------------------------

             Summary: Ozone: number of keys/values/buckets to KSMMetrics
                 Key: HDFS-12624
                 URL: https://issues.apache.org/jira/browse/HDFS-12624
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ozone
    Affects Versions: HDFS-7240
            Reporter: Elek, Marton
            Assignee: Elek, Marton


During my last Ozone test with a 100-node cluster I found it hard to track how many keys/volumes/buckets I have.

I opened this jira to start a discussion about extending the KSM metrics (but let me know if this is already planned somewhere else) and adding the number of keys/volumes/buckets to the metrics interface.

These counters could be exposed elsewhere (for example via a client call), but I think they are important numbers and worth monitoring.
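
To make the discussion concrete, here is a minimal sketch of what the new gauges could look like, using Hadoop's metrics2 library. The class, field, and method names are illustrative only, not a proposal for the final interface:

```java
import org.apache.hadoop.metrics2.MetricsSystem;
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

// Illustrative only: possible new gauges for KSMMetrics.
@Metrics(about = "Key Space Manager metrics", context = "ozone")
public class KSMMetricsSketch {

  @Metric private MutableGaugeLong numVolumes;
  @Metric private MutableGaugeLong numBuckets;
  @Metric private MutableGaugeLong numKeys;

  // Register with the metrics system so the annotated fields are populated.
  public static KSMMetricsSketch create() {
    MetricsSystem ms = DefaultMetricsSystem.instance();
    return ms.register("KSMMetricsSketch", "KSM object count gauges (sketch)",
        new KSMMetricsSketch());
  }

  public void incNumKeys() { numKeys.incr(); }
  public void decNumKeys() { numKeys.decr(); }
  public long getNumKeys() { return numKeys.value(); }
}
```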

I see multiple ways to achieve it:

1. Extend the `org.apache.hadoop.utils.MetadataStore` class with an additional count() method. As far as I know there is no easy way to implement it with leveldb, but with rocksdb there is a possibility to get the _estimated_ number of keys (see the sketch below).

On the other hand, KSM stores volumes/buckets/keys in the same db, so we can't use it without splitting ksm.db into separate dbs.
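
For reference, with the RocksDB java API the estimate can be read through a db property. A minimal sketch (the wrapper class is hypothetical, not the real MetadataStore):

```java
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

// Sketch only: how a count() could be backed by RocksDB.
// The property returns an *estimated* key count, not an exact one.
public class RocksDBCountSketch {
  private final RocksDB db;

  public RocksDBCountSketch(RocksDB db) {
    this.db = db;
  }

  public long count() throws RocksDBException {
    return db.getLongProperty("rocksdb.estimate-num-keys");
  }
}
```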

2. Create a background task that iterates over all the keys and counts the ozone key/volume/bucket numbers (a rough sketch follows the pro/con list below):

pro: it would be independent from the existing program flow
con: it doesn't provide up-to-date information.
con: it uses more resources, as the whole db has to be scanned frequently
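
A rough sketch of such a background task, assuming RocksDB and assuming the three object types can be told apart by a key prefix (the prefix below is made up; the real ksm.db layout may differ):

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksIterator;

// Sketch only: periodically rescans the whole db and publishes the counts.
public class BackgroundCountTask {
  private final RocksDB db;
  private final AtomicLong numKeys = new AtomicLong();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public BackgroundCountTask(RocksDB db) {
    this.db = db;
  }

  public void start(long intervalMinutes) {
    scheduler.scheduleAtFixedRate(this::scan, 0, intervalMinutes,
        TimeUnit.MINUTES);
  }

  private void scan() {
    long keys = 0;
    RocksIterator it = db.newIterator();
    try {
      for (it.seekToFirst(); it.isValid(); it.next()) {
        // Made-up prefix; volumes/buckets would be counted the same way.
        if (new String(it.key(), StandardCharsets.UTF_8).startsWith("/key/")) {
          keys++;
        }
      }
    } finally {
      it.close();
    }
    numKeys.set(keys);
  }

  public long getNumKeys() {
    return numKeys.get();
  }
}
```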

3. During startup we can iterate over the whole ksm.db and count the current metrics; later we can update the numbers on every create/delete call. This uses additional resources during startup (it should be measured how long it takes to scan a db with millions of keys), but after that it would be fast. We could also introduce a new configuration variable to skip the initial scan; in that case the numbers would only be valid since the last restart, but the startup would be fast. A sketch of the counter part follows below.
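
A minimal sketch of the 3rd approach (the config key and method names are invented for illustration; the gauges from the earlier sketch could be updated the same way):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: counters seeded once at startup, then kept current by the
// create/delete code paths.
public class StartupSeededCounters {
  // Hypothetical config key for skipping the initial scan.
  public static final String SKIP_INITIAL_SCAN =
      "ozone.ksm.metrics.skip.initial.scan";

  private final AtomicLong numKeys = new AtomicLong();

  /** Seed from the one-time db scan at startup (skipped if configured). */
  public void seed(long initialKeyCount) {
    numKeys.set(initialKeyCount);
  }

  /** Hook for the key create path. */
  public void keyCreated() {
    numKeys.incrementAndGet();
  }

  /** Hook for the key delete path. */
  public void keyDeleted() {
    numKeys.decrementAndGet();
  }

  public long getNumKeys() {
    return numKeys.get();
  }
}
```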

I suggest using the 3rd approach. Could you please comment with your opinions?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org