Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/07/01 08:47:00 UTC

[jira] [Work logged] (HDFS-16646) RBF: Support an elastic RouterRpcFairnessPolicyController

     [ https://issues.apache.org/jira/browse/HDFS-16646?focusedWorklogId=786995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-786995 ]

ASF GitHub Bot logged work on HDFS-16646:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jul/22 08:46
            Start Date: 01/Jul/22 08:46
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on PR #4519:
URL: https://github.com/apache/hadoop/pull/4519#issuecomment-1172100167

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 14s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  0s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 39s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 47s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 47s |  |  hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 0 new + 54 unchanged - 1 fixed = 54 total (was 55)  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 42s |  |  hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 0 new + 54 unchanged - 1 fixed = 54 total (was 55)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 39s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 29s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  22m 41s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4519/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 128m  3s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.router.TestRBFConfigFields |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4519/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4519 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 85e057413f1e 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cf2fb3312f5695e158c02d89df973d649c257d3c |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4519/2/testReport/ |
   | Max. process+thread count | 2807 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4519/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 786995)
    Time Spent: 0.5h  (was: 20m)

> RBF: Support an elastic RouterRpcFairnessPolicyController
> ---------------------------------------------------------
>
>                 Key: HDFS-16646
>                 URL: https://issues.apache.org/jira/browse/HDFS-16646
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As we all know, `StaticRouterRpcFairnessPolicyController` is very helpful for RBF in minimizing the impact of clients connecting to healthy vs. unhealthy NameNodes.
> But in a prod environment, the client traffic to each NS and the pressure on the downstream NameNodes change dynamically. So if we only have one static permit configuration, RBF cannot adapt to the changes in traffic to achieve optimal results.
> So here I propose an elastic RouterRpcFairnessPolicyController to help RBF adapt to traffic changes to achieve an optimal result.
> The overall idea is:
> * Each name service can be configured with exclusive permits, as in `StaticRouterRpcFairnessPolicyController`.
> * TotalPermits is greater than sum(NsExclusivePermit); the difference, TotalPermits - sum(NsExclusivePermit), is marked as SharedPermits.
> * Each name service can preempt SharedPermits after its own exclusive permits are used up.
> * But the maximum number of SharedPermits that each nameservice can preempt should be limited, for example to 20% of SharedPermits.
> Suppose we have 200 handlers and 5 name services, and each name service is configured with different exclusive permits, like:
> | NS1 | NS2 | NS3 | NS4 | NS5 | Concurrent NS |
> |-- | -- | -- | -- | -- | -- |
> | 9 | 11 | 8 | 12 | 10 | 50 |
> The `sum(NsExclusivePermit)` is 100 (9 + 11 + 8 + 12 + 10 for the five name services, plus 50 for the Concurrent NS), and the `SharedPermits = TotalPermits(200) - Sum(NsExclusivePermit)(100) = 100`
> Suppose we configure that each nameservice can preempt up to 20% of SharedPermits, marked as `elasticPercent`.
> Then, from the point of view of a single NS, the permits it may use are as follows:
> - Exclusive permits, which cannot be used by other name services.
> - Limited SharedPermits; whether it can actually use that many shared permits depends on the remaining number of SharedPermits, because the SharedPermits are preempted by all nameservices.
> If we configure `elasticPercent=100`, it means one nameservice can use up all SharedPermits.
> If we configure `elasticPercent=0`, it means a nameservice can only use its exclusive permits.
> If we configure `elasticPercent=20`, it means that RBF can tolerate 5 unhealthy name services at the same time.
> In our prod environment, we use the following configuration, and it works well:
> - RBF has 3000 handlers
> - Each nameservice has 10 exclusive permits
> - `elasticPercent` is 30%
> Of course, we need to configure reasonable parameters according to the prod traffic.
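
The permit accounting described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from PR #4519: the class `ElasticPermitSketch`, its methods, and the `Grant` enum are hypothetical names, the per-nameservice cap is assumed to be `SharedPermits * elasticPercent / 100` with integer division, and acquisition is assumed to be non-blocking. The real controller may differ on all of these points.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of elastic permit accounting; not the actual Hadoop class. */
final class ElasticPermitSketch {
  /** Where a granted permit came from, so it can be returned to the right pool. */
  enum Grant { EXCLUSIVE, SHARED, NONE }

  private final Map<String, Semaphore> exclusive = new LinkedHashMap<>();
  private final Map<String, Semaphore> elasticCap = new LinkedHashMap<>();
  private final Semaphore shared;
  private final int sharedPermits;
  private final int maxElasticPerNs;

  ElasticPermitSketch(int totalPermits, Map<String, Integer> exclusivePermits,
      int elasticPercent) {
    int sum = 0;
    for (Map.Entry<String, Integer> e : exclusivePermits.entrySet()) {
      sum += e.getValue();
      exclusive.put(e.getKey(), new Semaphore(e.getValue()));
    }
    if (sum > totalPermits) {
      throw new IllegalArgumentException("exclusive permits exceed total permits");
    }
    // SharedPermits = TotalPermits - sum(NsExclusivePermit)
    this.sharedPermits = totalPermits - sum;
    // Assumed cap: each nameservice may hold at most elasticPercent% of SharedPermits.
    this.maxElasticPerNs = sharedPermits * elasticPercent / 100;
    this.shared = new Semaphore(sharedPermits);
    for (String ns : exclusivePermits.keySet()) {
      elasticCap.put(ns, new Semaphore(maxElasticPerNs));
    }
  }

  int sharedPermits() {
    return sharedPermits;
  }

  int maxElasticPerNs() {
    return maxElasticPerNs;
  }

  /** Try the nameservice's exclusive permits first, then its capped share of SharedPermits. */
  Grant tryAcquire(String ns) {
    Semaphore own = exclusive.get(ns);
    if (own == null) {
      return Grant.NONE; // unknown nameservice
    }
    if (own.tryAcquire()) {
      return Grant.EXCLUSIVE;
    }
    Semaphore cap = elasticCap.get(ns);
    if (cap.tryAcquire()) {
      if (shared.tryAcquire()) {
        return Grant.SHARED;
      }
      cap.release(); // shared pool is exhausted; undo the cap reservation
    }
    return Grant.NONE;
  }

  /** Return a permit to the pool it was taken from. */
  void release(String ns, Grant grant) {
    if (grant == Grant.EXCLUSIVE) {
      exclusive.get(ns).release();
    } else if (grant == Grant.SHARED) {
      shared.release();
      elasticCap.get(ns).release();
    }
  }
}
```

With the numbers from the example (200 total permits, exclusive permits 9/11/8/12/10 plus 50 for the concurrent NS, `elasticPercent=20`), this sketch yields 100 shared permits and a cap of 20 shared permits per nameservice, so NS1 could hold at most 9 + 20 = 29 permits at once.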



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org