Posted to user@hadoop.apache.org by Anatoly <an...@slobozhanov.ru> on 2024/03/08 16:54:24 UTC

[hdfs] [metrics] RpcAuthenticationSuccesses

Hi

I have a question about two HDFS metrics that came up while I was trying to
estimate the load on the KDC for a production cluster.

The HDFS metrics include two counters:

RpcAuthenticationSuccesses - Total number of successful authentication
attempts

RpcAuthenticationFailures - Total number of authentication failures
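
For reference, a minimal sketch of how these two counters can be read from the
NameNode JMX servlet. Assumptions that are not from my setup description: the
bean name uses RPC port 8020, the web port is 9870, the host name is
hypothetical, and the /jmx endpoint is reachable without SPNEGO. Adjust all of
these for your cluster.

```python
import json
from urllib.request import urlopen

# Assumption: RPC metrics bean for a NameNode listening on RPC port 8020.
BEAN = "Hadoop:service=NameNode,name=RpcActivityForPort8020"


def extract_auth_counters(jmx_payload: dict, bean_name: str = BEAN) -> dict:
    """Pick the two authentication counters out of a parsed /jmx response."""
    for bean in jmx_payload.get("beans", []):
        if bean.get("name") == bean_name:
            return {
                "RpcAuthenticationSuccesses": bean["RpcAuthenticationSuccesses"],
                "RpcAuthenticationFailures": bean["RpcAuthenticationFailures"],
            }
    raise KeyError(f"bean not found: {bean_name}")


def fetch_auth_counters(base_url: str, bean_name: str = BEAN) -> dict:
    """Fetch the counters, e.g. base_url='http://namenode.example:9870'
    (hypothetical host; requires network access to the NameNode)."""
    with urlopen(f"{base_url}/jmx?qry={bean_name}") as resp:
        return extract_auth_counters(json.load(resp), bean_name)


# Offline example with a hand-written payload shaped like a /jmx response.
sample = {
    "beans": [
        {
            "name": BEAN,
            "RpcAuthenticationSuccesses": 208322,
            "RpcAuthenticationFailures": 0,
        }
    ]
}
print(extract_auth_counters(sample))
```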

I expect that any data request in the Hadoop cluster involves

a request to the KDC to get a ticket,

a request to the NameNode,

after which exactly one of the two counters should increase by 1:
RpcAuthenticationSuccesses if authentication succeeded, or
RpcAuthenticationFailures if it failed

However, in a test cluster with 4 DataNodes and 2 NameNodes (HA), I see
values for these metrics that I cannot explain.

By the way, I also noticed that the RpcAuthenticationSuccesses value
steadily increases by 1 every 30 seconds

TEST 1

I made sure that

1. Only the HDFS-{NN,DN,JN,ZKFC} and YARN-{RM,NM} services are running

2. All other components (Hive, Spark HistoryServer) are disabled

3. There are no YARN jobs running and no user requests to HDFS

At the start of the test, RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208322

To generate load, I ran a spark-submit test with
spark-examples_2.12-3.5.0.jar and the number of executors = 1

The job completed in 1 minute and 20 seconds

RpcAuthenticationSuccesses = 208338

In total, the counter increased by 16 during the run

Say 2 of those can be attributed to the +1 every 30 seconds that I mentioned
above. But what do the remaining 14 authentications mean?
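
The arithmetic above can be sketched as follows. This is only an illustration
of how I split the delta; the function name is hypothetical, and the 30-second
background period is my observation, not a documented value.

```python
def split_increments(before, after, elapsed_s, background_period_s=30):
    """Split a counter delta into background ticks and the remainder.

    background_period_s is the observed idle cadence (one increment per
    period); it is an assumption taken from my own measurements.
    """
    total = after - before
    background = elapsed_s // background_period_s
    return total, background, total - background


# TEST 1: 208322 -> 208338 over 1 minute 20 seconds (80 s).
total, background, unexplained = split_increments(208322, 208338, 80)
print(total, background, unexplained)  # 16 2 14
```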

TEST 2

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208388

hdfs dfs -ls /

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208389

The counter increased by 1. Why?

I ran kinit long before the ls request, so I thought the metric should not
have changed, but maybe I'm wrong

TEST 3

Disabled:

- All DNs

- Standby NN

- All YARN services (RM, NM)

Still running:

- Three JNs, ZKFC

- One active NN

The RpcAuthenticationSuccesses metric still continues to increase by 1
every 30 seconds
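
A minimal sketch of how this cadence can be checked from a series of readings.
The sample values below are hypothetical and only illustrate the calculation;
the function name is my own.

```python
def seconds_per_increment(samples):
    """Average seconds between increments.

    samples: list of (unix_time_s, counter_value) pairs, oldest first.
    Returns None if the counter did not change over the window.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if v1 == v0:
        return None
    return (t1 - t0) / (v1 - v0)


# Hypothetical readings taken every 30 seconds on an idle cluster.
readings = [(0, 208400), (30, 208401), (60, 208402), (90, 208403)]
print(seconds_per_increment(readings))  # 30.0
```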

Either I misunderstand the meaning of these metrics, or something is being
counted incorrectly

Can you tell me how these metrics are calculated? Is this an error in the
calculation, or, if I simply misunderstand how these metrics work, what is the
correct way to interpret them?

Thank you very much

--

With best wishes,

Anatoliy