You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/10/07 12:11:05 UTC

[GitHub] [ozone] navinko opened a new pull request, #3809: HDDS-7088. Removed IOException with log level warn

navinko opened a new pull request, #3809:
URL: https://github.com/apache/ozone/pull/3809

   ## What changes were proposed in this pull request?
   
   (Please fill in changes proposed in this fix)
   
   ## What is the link to the Apache JIRA
   
   (Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, remove this)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1274249596

   > ... In my understanding it's very common in production that SCM and OM wud be collocated ...
   
   Indeed, it is uncommon in production (SCM and OM usually use different directories, or even different machine) but it may happen in testing setup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] navinko commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
navinko commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1271817340

   Hi @szetszwo added the suggested changes.  Kindly review the PR 
   Thanks 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] navinko commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
navinko commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1272993185

   Thanks @szetszwo @myskov for suggestions and reviewing .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1273266025

   @myskov , you are right that the message is misleading.
   
   @navinko , could you update the error message?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo merged pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo merged PR #3809:
URL: https://github.com/apache/ozone/pull/3809


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] navinko commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
navinko commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1275738143

   Thanks @szetszwo @myskov @kerneltime for reviewing the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] myskov commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
myskov commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1273267374

   Let me explain the reasoning behind the requested change. We use Ozone in production and our administrators would be very concerned by the warning: "Ratis group Dir on disk scm does not match with RaftGroupID7f6848f2-63c2-3ce4-a1d2-9e88065dfcc3 generated from service id ozone1. Looks like there is a change to ozone.om.service.ids value after the cluster is setup. Currently change to this value is not supported."
   It sounds like a fatal error, not a warning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] navinko commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
navinko commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1273670157

   Thanks @szetszwo @myskov for reviewing . I have updated the PR 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on a diff in pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo commented on code in PR #3809:
URL: https://github.com/apache/ozone/pull/3809#discussion_r991940929


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1351,13 +1351,10 @@ private void initializeRatisDirs(OzoneConfiguration conf) throws IOException {
         for (File ratisGroupDir : ratisDirFiles) {
           if (ratisGroupDir.isDirectory()) {
             if (!ratisGroupDir.getName().equals(groupIDfromServiceID)) {
-              throw new IOException("Ratis group Dir on disk "
-                  + ratisGroupDir.getName() + " does not match with RaftGroupID"
-                  + groupIDfromServiceID + " generated from service id "
-                  + getOMServiceId() + ". Looks like there is a change to " +
-                  OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY + " value after the " +
-                  "cluster is setup. Currently change to this value is not " +
-                  "supported.");
+              LOG.warn("Another ratis dir is detected "

Review Comment:
   @navinko , Please include the name `ratisGroupDir.getName()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
kerneltime commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1273832372

   @navinko why should the exception be dropped? An incorrectly configured OM and SCM folder should be a catastrophic failure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1272831664

   > ... raft directory gets overridden (by OM or SCM, dependent on whom started later) ...
   
   @myskov , an existing raft directory should never be overridden automatically since it may cause data loss.  Currently, it throws an exception and stop starting.  After this change, it will print a warning and use a different directory because the group ids are different.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] myskov commented on a diff in pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
myskov commented on code in PR #3809:
URL: https://github.com/apache/ozone/pull/3809#discussion_r991243562


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1351,7 +1351,7 @@ private void initializeRatisDirs(OzoneConfiguration conf) throws IOException {
         for (File ratisGroupDir : ratisDirFiles) {
           if (ratisGroupDir.isDirectory()) {
             if (!ratisGroupDir.getName().equals(groupIDfromServiceID)) {
-              throw new IOException("Ratis group Dir on disk "
+              LOG.warn("Ratis group Dir on disk "
                   + ratisGroupDir.getName() + " does not match with RaftGroupID"

Review Comment:
   The message is misleading. It should say  something like this: 
   "Another ratis dir is detected (possible SCM collocation). It's better not to share ratis log dir in the production environment."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] myskov commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
myskov commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1272011571

   Hi @navinko. As far as I understand the jira, the issue is not that OM doesn't start but the raft directory gets overridden (by OM or SCM, dependent on whom started later). In this case, OM should not be started.
   It seems that the fix to the issue should allow sharing ratis dir by OM and SCM.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] navinko commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
navinko commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1274084048

   Hi @kerneltime , In my understanding it's very common in production that SCM and OM wud be collocated and in current design if both the scm and om ratis directory shared and if which ever is started later , it will not just override the existing Raft directory as there will be data loss so it throws an exception and stop starting. After this change, it will print a warning and  create a new directory because the group ids are different.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on a diff in pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
szetszwo commented on code in PR #3809:
URL: https://github.com/apache/ozone/pull/3809#discussion_r992096099


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1351,13 +1351,10 @@ private void initializeRatisDirs(OzoneConfiguration conf) throws IOException {
         for (File ratisGroupDir : ratisDirFiles) {
           if (ratisGroupDir.isDirectory()) {
             if (!ratisGroupDir.getName().equals(groupIDfromServiceID)) {
-              throw new IOException("Ratis group Dir on disk "
-                  + ratisGroupDir.getName() + " does not match with RaftGroupID"
-                  + groupIDfromServiceID + " generated from service id "
-                  + getOMServiceId() + ". Looks like there is a change to " +
-                  OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY + " value after the " +
-                  "cluster is setup. Currently change to this value is not " +
-                  "supported.");
+              LOG.warn("Another ratis dir is detected "

Review Comment:
   @navinko , thanks for the update.  Let's use a similar message for both cases
   ```java
   @@ -1353,16 +1353,13 @@ private void initializeRatisDirs(OzoneConfiguration conf) throws IOException {
            for (File ratisGroupDir : ratisDirFiles) {
              if (ratisGroupDir.isDirectory()) {
                if (!ratisGroupDir.getName().equals(groupIDfromServiceID)) {
   -              throw new IOException("Ratis group Dir on disk "
   -                  + ratisGroupDir.getName() + " does not match with RaftGroupID"
   -                  + groupIDfromServiceID + " generated from service id "
   -                  + getOMServiceId() + ". Looks like there is a change to " +
   -                  OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY + " value after the " +
   -                  "cluster is setup. Currently change to this value is not " +
   -                  "supported.");
   +              LOG.warn("Unknown directory {} exists in ratis storage dir {}."
   +                  + " It is recommended not to share the ratis storage dir.",
   +                  ratisGroupDir, omRatisDir);
                }
              } else {
   -            LOG.warn("Unknown file {} exists in ratis storage dir {}",
   +            LOG.warn("Unknown file {} exists in ratis storage dir {}. "
   +                + " It is recommended not to share the ratis storage dir.",
                    ratisGroupDir, omRatisDir);
              }
            }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] myskov commented on pull request #3809: HDDS-7088. Removed IOException and added log level warn

Posted by GitBox <gi...@apache.org>.
myskov commented on PR #3809:
URL: https://github.com/apache/ozone/pull/3809#issuecomment-1273249691

   > > ... raft directory gets overridden (by OM or SCM, dependent on whom started later) ...
   > 
   > @myskov , an existing raft directory should never be overridden automatically since it may cause data loss. Currently, it throws an exception and stop starting. After this change, it will print a warning and use a different directory because the group ids are different.
   
   Ok, but in this case, the warning message is misleading, and it should be changed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org