You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/11/30 06:40:28 UTC

[GitHub] [ozone] guihecheng opened a new pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

guihecheng opened a new pull request #2875:
URL: https://github.com/apache/ozone/pull/2875


   ## What changes were proposed in this pull request?
   
   Fix too short container scrubber data scan interval.
   1 minute -> 1 day
   With some improvement of the default value definitions and config value validations.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6053
   
   ## How was this patch tested?
   
   no test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759071510



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",

Review comment:
       Ah, this is an incompatible change as you said, so I should revert this type back to LONG, thanks~




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-993081537


   Ah, thanks @adoroszlai @arp7 , I agree that data scanner is for bit rot catching, and 1 week is a good default interval for it according to those similar interval configs of HDFS.
   I'll update the patch soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759037518



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to throttle I/O bandwidth used"
           + " by scrubber per volume.")
-
-  private long bandwidthPerVolume;
-
+  private long bandwidthPerVolume = BANDWIDTH_PER_VOLUME_DEFAULT;
+
+  @PostConstruct
+  public void validate() {
+    if (metadataScanInterval < 0) {
+      LOG.warn(METADATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          metadataScanInterval, METADATA_SCAN_INTERVAL_DEFAULT);
+      metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (dataScanInterval < 0) {
+      LOG.warn(DATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          dataScanInterval, DATA_SCAN_INTERVAL_DEFAULT);
+      dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (bandwidthPerVolume < 0) {

Review comment:
       What is the effect of zero bandwidth?  Does it mean no limit?

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",

Review comment:
       How does type change affect existing config users may have?  Are plain numbers still accepted and handled as if `B` suffix was specified?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-982452686


   @adoroszlai Hmmm, originally I just want to change the default value of the data.scan.interval(I even suspect that this is a typo mistake "1m" -> "1d"), but later I found that this configuration is not so good, so I did some extending work.
   But since I've touched this part, I should add new unit tests for these configurations, thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992602185


   > I recall there were multiple levels of scrubbing. One that opens the container for a quick sanity check
   
   Metadata scan interval is 3 hours by default.
   
   > and one that does a full scan of every data block verifying checksums.
   
   1 minute interval is for this one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992674256


   @guihecheng Based on offline discussion with @arp7 we can increase the default interval, even to 1 week.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992369490


   ping @arp7 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-984208406


   @adoroszlai thanks for the comments~
   Sure, we should discuss about the default values.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759255304



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to throttle I/O bandwidth used"
           + " by scrubber per volume.")
-
-  private long bandwidthPerVolume;
-
+  private long bandwidthPerVolume = BANDWIDTH_PER_VOLUME_DEFAULT;
+
+  @PostConstruct
+  public void validate() {
+    if (metadataScanInterval < 0) {
+      LOG.warn(METADATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          metadataScanInterval, METADATA_SCAN_INTERVAL_DEFAULT);
+      metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (dataScanInterval < 0) {
+      LOG.warn(DATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          dataScanInterval, DATA_SCAN_INTERVAL_DEFAULT);
+      dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (bandwidthPerVolume < 0) {

Review comment:
       This bandwidth only affects data scanner, so I'll disable the data scanners on a 0 bandwidth while keep metadata scanners running.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-993495097


   Thanks @guihecheng for updating the patch.  Thanks @arp7 for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-986702822


   Hi @arp7 , could you please help us review this patch ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] arp7 commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
arp7 commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992637957


   Does it attempt to do a full scan every minute? If so that's pretty bad. Or just make some forward progress every minute?
   
   If former we should definitely update it right away.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai merged pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai merged pull request #2875:
URL: https://github.com/apache/ozone/pull/2875


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759071510



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",

Review comment:
       Ah, this is an incompatible change as we said, so I should revert this type back to LONG, thanks~




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759112256



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to throttle I/O bandwidth used"
           + " by scrubber per volume.")
-
-  private long bandwidthPerVolume;
-
+  private long bandwidthPerVolume = BANDWIDTH_PER_VOLUME_DEFAULT;
+
+  @PostConstruct
+  public void validate() {
+    if (metadataScanInterval < 0) {
+      LOG.warn(METADATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          metadataScanInterval, METADATA_SCAN_INTERVAL_DEFAULT);
+      metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (dataScanInterval < 0) {
+      LOG.warn(DATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          dataScanInterval, DATA_SCAN_INTERVAL_DEFAULT);
+      dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (bandwidthPerVolume < 0) {

Review comment:
       BTW, if we want to put no limit on this item, maybe we could define an extremely large value for it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#discussion_r759085542



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerScrubberConfiguration.java
##########
@@ -17,55 +17,100 @@
  */
 package org.apache.hadoop.ozone.container.ozoneimpl;
 
+import org.apache.hadoop.conf.StorageUnit;
 import org.apache.hadoop.hdds.conf.Config;
 import org.apache.hadoop.hdds.conf.ConfigGroup;
 import org.apache.hadoop.hdds.conf.ConfigTag;
 import org.apache.hadoop.hdds.conf.ConfigType;
+import org.apache.hadoop.hdds.conf.PostConstruct;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
 
 /**
  * This class defines configuration parameters for container scrubber.
  **/
 @ConfigGroup(prefix = "hdds.container.scrub")
 public class ContainerScrubberConfiguration {
 
+  private static final Logger LOG =
+      LoggerFactory.getLogger(ContainerScrubberConfiguration.class);
+
   // only for log
   public static final String HDDS_CONTAINER_SCRUB_ENABLED =
       "hdds.container.scrub.enabled";
+  static final String METADATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.metadata.scan.interval";
+  static final String DATA_SCAN_INTERVAL_KEY =
+      "hdds.container.scrub.data.scan.interval";
+  static final String VOLUME_BYTES_PER_SECOND_KEY =
+      "hdds.container.scrub.volume.bytes.per.second";
+
+  static final long METADATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofHours(3).toMillis();
+  static final long DATA_SCAN_INTERVAL_DEFAULT =
+      Duration.ofDays(1).toMillis();
+  static final long BANDWIDTH_PER_VOLUME_DEFAULT =
+      (long) StorageUnit.MB.toBytes(1);
 
   @Config(key = "enabled",
       type = ConfigType.BOOLEAN,
       defaultValue = "false",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to enable container scrubber.")
-  private boolean enabled;
+  private boolean enabled = false;
 
   @Config(key = "metadata.scan.interval",
       type = ConfigType.TIME,
       defaultValue = "3h",
       tags = {ConfigTag.STORAGE},
-      description = "Config parameter define time interval in milliseconds" +
-          " between two metadata scans by container scrubber.")
-  private long metadataScanInterval;
+      description = "Config parameter define time interval" +
+          " between two metadata scans by container scrubber." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "data.scan.interval",
       type = ConfigType.TIME,
-      defaultValue = "1m",
+      defaultValue = "1d",
       tags = {ConfigTag.STORAGE},
       description = "Minimum time interval between two iterations of container"
           + " data scanning.  If an iteration takes less time than this, the"
-          + " scanner will wait before starting the next iteration."
-  )
-  private long dataScanInterval;
+          + " scanner will wait before starting the next iteration." +
+          " Unit could be defined with postfix (ns,ms,s,m,h,d).")
+  private long dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
 
   @Config(key = "volume.bytes.per.second",
-      type = ConfigType.LONG,
-      defaultValue = "1048576",
+      type = ConfigType.SIZE,
+      defaultValue = "1MB",
       tags = {ConfigTag.STORAGE},
       description = "Config parameter to throttle I/O bandwidth used"
           + " by scrubber per volume.")
-
-  private long bandwidthPerVolume;
-
+  private long bandwidthPerVolume = BANDWIDTH_PER_VOLUME_DEFAULT;
+
+  @PostConstruct
+  public void validate() {
+    if (metadataScanInterval < 0) {
+      LOG.warn(METADATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          metadataScanInterval, METADATA_SCAN_INTERVAL_DEFAULT);
+      metadataScanInterval = METADATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (dataScanInterval < 0) {
+      LOG.warn(DATA_SCAN_INTERVAL_KEY +
+              " must be >= 0 and was set to {}. Defaulting to {}",
+          dataScanInterval, DATA_SCAN_INTERVAL_DEFAULT);
+      dataScanInterval = DATA_SCAN_INTERVAL_DEFAULT;
+    }
+
+    if (bandwidthPerVolume < 0) {

Review comment:
       Here, I think the desired behaviour is make the scrubber disabled.
   Currently, the data scanner will run, but it will just keep throttling on each scan since the limit is always hit. BTW, HDFS has a similar config and will disable the scrubber if 0 is given.
   I think I'll let the scrubber be disabled if this a 0 is given, but 0 should be valid although.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-983428079


   @adoroszlai updated, thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] arp7 commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
arp7 commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992592699


   I recall there were multiple levels of scrubbing. One that opens the container for a quick sanity check and one that does a full scan of every data block verifying checksums. Is the 1 minute interval for the former? It sounds way too often in any case.
   
   The goal is to scrub every container at least once in a reasonable time interval to catch bit rot or other corruptions. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] arp7 commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
arp7 commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992564665


   Hi @guihecheng, if @adoroszlai is +1 for the change please go ahead. I haven't looked at the DN code in a few years and wouldn't be able to review this quickly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2875: HDDS-6053. Fix too short container scrubber data scan interval.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2875:
URL: https://github.com/apache/ozone/pull/2875#issuecomment-992582872


   @arp7 Code-wise it is fine.  The big question is if it is OK to increase the default interval for container data scrubber.  I think the assumption with the current default (1 minute) is that the scrubber should run basically continuously and be limited only by the disk bandwith allocated to it.  This PR proposes to increase the interval to 1 day, so if the scrubber finishes in less time, it will sleep.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org