Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/11/23 11:16:31 UTC

[GitHub] [hadoop] virajjasani opened a new pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

virajjasani opened a new pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711


   ### Description of PR
   We currently set the Datanode start time when the class is instantiated, but ideally it should be set only after the RPC server starts and the RPC handlers are initialized to serve client requests.
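
A minimal sketch of the idea, not the actual patch: the class and method names below (`SketchRpcServer`, `SketchDaemon`, `runDaemon`) are hypothetical stand-ins and are not Hadoop APIs. It only illustrates recording the start time after the RPC server has been started, rather than in a field initializer or constructor.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for an RPC server; this is not Hadoop's ipc.Server. */
class SketchRpcServer {
  private final AtomicBoolean running = new AtomicBoolean(false);

  void start() {
    running.set(true);
  }

  boolean isRunning() {
    return running.get();
  }
}

/** Hypothetical stand-in for a daemon such as the Datanode. */
class SketchDaemon {
  private final SketchRpcServer rpcServer = new SketchRpcServer();

  // 0 means "not started yet"; assigned only once the RPC server is up.
  private volatile long startTime = 0L;

  void runDaemon() {
    rpcServer.start();
    // Record the start time only after the RPC server has started.
    startTime = System.currentTimeMillis();
  }

  long getStartTime() {
    return startTime;
  }

  boolean isRpcRunning() {
    return rpcServer.isRunning();
  }

  public static void main(String[] args) {
    SketchDaemon daemon = new SketchDaemon();
    System.out.println("before start: " + daemon.getStartTime());
    daemon.runDaemon();
    System.out.println("after start: " + daemon.getStartTime());
  }
}
```

With this ordering, a reader of `getStartTime()` sees `0` until the daemon can actually serve requests, which is one possible way to distinguish "instantiated" from "started".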
   
   ### How was this patch tested?
   Tested locally. UI screenshot:
   
   <img width="643" alt="Screenshot 2021-11-23 at 4 32 04 PM" src="https://user-images.githubusercontent.com/34790606/143014810-4b80aee5-aa10-4bd0-8899-5a40485e5e5a.png">
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] tomscut edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
tomscut edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976471066


   > @tomscut @goiri could you please take a look? Thanks
   
   Thanks @virajjasani for reminding me here. At present, both `Router` and `Namenode` obtain `startTime` in this way. Do you mean there will be a small error?




[GitHub] [hadoop] virajjasani edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976548773


   Thanks @tomscut. Yes, this is a minor improvement in behaviour. Basically, we should set the time according to the actual server startup time. The point is that until the RPC servers are started, the Datanode is not considered started, so we should set the start time accordingly. As I mentioned, it's a minor improvement, but I believe the timing should be more accurate.
   
   > At present, both `Router` and `Namenode` obtain `startTime` in this way.
   
   In that case, they might also require a change.




[GitHub] [hadoop] tomscut commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
tomscut commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976471066


   > @tomscut @goiri could you please take a look? Thanks
   
   Thanks @virajjasani for reminding me here. At present, both router and Namenode obtain startTime in this way. Do you mean there will be a small error?




[GitHub] [hadoop] virajjasani commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-977572315


   @ferhui could you please also take a look? Thanks




[GitHub] [hadoop] virajjasani edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976586442


   The start time calculation for Namenode and Router seems better than in the Datanode case: the start time is not defined in the main classes (i.e. NameNode and DFSRouter) but in other classes that are initialized internally (e.g. FSNamesystem and Router) as part of the main process initialization. In the case of Datanode, the main class itself (i.e. DataNode) defines the start time, and it is initialized as soon as the Datanode is instantiated. Hence, I think we can improve only Datanode's start time to match the actual process start time.
   I understand that for a healthy process initialization there should not be much difference between setting the start time at instance initialization (without this patch) and setting it soon after the RPC server is started (with this patch). My concern is that if, for some reason, the RPC handler start is significantly delayed, we would report an incorrect start time.




[GitHub] [hadoop] hadoop-yetus commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-977195641


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  1s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 32s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 59s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 313m 46s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 421m 35s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3711/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3711 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux aec6689d0c15 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 776b6548240a21d8fc5c08dea40adef542e98852 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3711/5/testReport/ |
   | Max. process+thread count | 2267 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3711/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] tomscut edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
tomscut edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976471066


   > @tomscut @goiri could you please take a look? Thanks
   
   Thanks @virajjasani for reminding me here. At present, both `Router` and `Namenode` obtain `startTime` in this way. Do you mean there will be a small gap?




[GitHub] [hadoop] virajjasani commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976548773


   Thanks @tomscut. Yes, this is a minor improvement in behaviour. Basically, we should set the time according to the actual server startup time. The point is that until the RPC servers are started, the Datanode is not considered started, so we should set the start time accordingly. As I mentioned, it's a minor improvement.
   
   > At present, both `Router` and `Namenode` obtain `startTime` in this way.
   
   In that case, they might also require a change.




[GitHub] [hadoop] ferhui commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
ferhui commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-978750666


   @virajjasani Thanks for the contribution. @tomscut Thanks for the review!




[GitHub] [hadoop] ferhui merged pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
ferhui merged pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711


   




[GitHub] [hadoop] virajjasani commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976414780


   @tomscut @goiri could you please take a look? Thanks




[GitHub] [hadoop] tomscut edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
tomscut edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976471066


   > @tomscut @goiri could you please take a look? Thanks
   
   Thanks @virajjasani for reminding me here. At present, both router and Namenode obtain startTime in this way. Do you mean there will be a small error?




[GitHub] [hadoop] virajjasani commented on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976586442


   The start time calculation for Namenode and Router seems better than in the Datanode case: the start time is not defined in the main classes (i.e. NameNode and DFSRouter) but in other classes that are initialized internally (e.g. FSNamesystem and Router) as part of the main process initialization. In the case of Datanode, the main class itself (i.e. DataNode) defines the start time, and it is initialized as soon as the Datanode is instantiated. Hence, I think we can improve only Datanode's start time to match the actual process start time.
   I understand that for a healthy process initialization there should not be much difference between setting the start time at instance initialization (without this patch) and setting it soon after the RPC server is started (with this patch). My concern is that if, for some reason, the RPC handler start is significantly delayed, we would report an incorrect start time.




[GitHub] [hadoop] virajjasani edited a comment on pull request #3711: HDFS-16350. Datanode start time should be set after RPC server starts successfully

Posted by GitBox <gi...@apache.org>.
virajjasani edited a comment on pull request #3711:
URL: https://github.com/apache/hadoop/pull/3711#issuecomment-976586442


   The start time calculation for Namenode and Router seems better than in the Datanode case: the start time is not defined in the main classes (i.e. NameNode and DFSRouter) but in other classes that are initialized internally (e.g. FSNamesystem and Router) as part of the main process initialization. In the case of Datanode, the main class itself (i.e. DataNode) defines the start time, and it is initialized as soon as the Datanode is instantiated. Hence, I think we should only improve Datanode's start time to match the actual process start time.
   
   I understand that for a healthy process initialization there should not be much difference between setting the start time at instance initialization (without this patch) and setting it after the RPC server is started (with this patch). My concern is that if, for some reason, the RPC handler start is significantly delayed, we would report an incorrect start time.
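
To make the concern concrete, here is a small sketch using a fake clock. Everything in it (`StartTimeGapSketch`, `simulate`, the 90-second delay) is a hypothetical illustration, not Hadoop code or a measured value. It compares the timestamp taken at instantiation with the one taken after a simulated slow RPC startup.

```java
import java.util.function.LongSupplier;

class StartTimeGapSketch {

  /**
   * Returns {timeAtInstantiation, timeAfterRpcStart} for a simulated
   * startup in which the RPC server takes rpcStartDelayMs to come up.
   */
  static long[] simulate(long instantiatedAtMs, long rpcStartDelayMs) {
    long[] now = {instantiatedAtMs};      // fake clock state, in milliseconds
    LongSupplier clock = () -> now[0];

    // Without the patch: timestamp taken when the object is created.
    long atInstantiation = clock.getAsLong();

    // Simulated delay before the RPC handlers are ready.
    now[0] += rpcStartDelayMs;

    // With the patch: timestamp taken after the RPC server has started.
    long afterRpcStart = clock.getAsLong();

    return new long[] {atInstantiation, afterRpcStart};
  }

  public static void main(String[] args) {
    long[] t = simulate(1_000L, 90_000L);
    // prints "start time would be off by 90000 ms"
    System.out.println("start time would be off by " + (t[1] - t[0]) + " ms");
  }
}
```

In a healthy startup the delay is small and both values are nearly equal; the difference only becomes visible when the RPC startup is slow.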

