You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/08/08 08:23:58 UTC

[GitHub] [hadoop-ozone] vivekratnavel opened a new pull request #1305: HDDS-4009. Recon Overview page: The volume, bucket and key counts are not accurate

vivekratnavel opened a new pull request #1305:
URL: https://github.com/apache/hadoop-ozone/pull/1305


   ## What changes were proposed in this pull request?
   
    - Add a new task in Recon to keep track of the count of all the OM DB tables
    - Add / update unit tests
    - Add a new method "getSkipCache" in Table interface to get value from the table by skipping the cache. This is required for Recon, since recon never adds entries to cache of any table.
    - Update ClusterStateEndpoint to get the counts of volume, bucket, and key from global stats table and not rely on "rocksdb.estimate-num-keys"
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4009
   
   ## How was this patch tested?
   
   Manually tested in a real cluster and also added unit tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org

[GitHub] [hadoop-ozone] avijayanhwx merged pull request #1305: HDDS-4009. Recon Overview page: The volume, bucket and key counts are not accurate

Posted by GitBox <gi...@apache.org>.

avijayanhwx merged pull request #1305:
URL: https://github.com/apache/hadoop-ozone/pull/1305


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org

[GitHub] [hadoop-ozone] vivekratnavel commented on pull request #1305: HDDS-4009. Recon Overview page: The volume, bucket and key counts are not accurate

Posted by GitBox <gi...@apache.org>.

vivekratnavel commented on pull request #1305:
URL: https://github.com/apache/hadoop-ozone/pull/1305#issuecomment-672329724


   @avijayanhwx Thanks for the review! I have updated the patch addressing all your comments. Please take another look. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org

[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1305: HDDS-4009. Recon Overview page: The volume, bucket and key counts are not accurate

Posted by GitBox <gi...@apache.org>.

avijayanhwx commented on a change in pull request #1305:
URL: https://github.com/apache/hadoop-ozone/pull/1305#discussion_r468639279



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/TableCountTask.java
##########
@@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.recon.tasks;
+
+import com.google.inject.Inject;
+import org.apache.commons.lang3.tuple.ImmutablePair;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.hadoop.hdds.utils.db.TableIterator;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.recon.ReconUtils;
+import org.apache.hadoop.ozone.recon.recovery.ReconOMMetadataManager;
+import org.hadoop.ozone.recon.schema.tables.daos.GlobalStatsDao;
+import org.hadoop.ozone.recon.schema.tables.pojos.GlobalStats;
+import org.jooq.Configuration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.Map.Entry;
+
+/**
+ * Class to iterate over the OM DB and store the total counts of volumes,
+ * buckets, keys, open keys, deleted keys, etc.
+ */
+public class TableCountTask implements ReconOmTask {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(TableCountTask.class);
+
+  private GlobalStatsDao globalStatsDao;
+  private Configuration sqlConfiguration;
+  private HashMap<String, Long> objectCountMap;

Review comment:
       This can be a method local variable in process().

##########
File path: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java
##########
@@ -82,6 +82,19 @@ void putWithBatch(BatchOperation batch, KEY key, VALUE value)
   VALUE get(KEY key) throws IOException;
 
 
+  /**
+   * Skip checking cache and get the value mapped to the given key in byte
+   * array or returns null if the key is not found.
+   *
+   * @param key metadata key
+   * @return value in byte array or null if the key is not found.
+   * @throws IOException on Failure
+   */
+  default VALUE getSkipCache(KEY key) throws IOException {
+    throw new NotImplementedException("getSkipCache is not implemented");

Review comment:
       We should default to get() method or provide implementation in org.apache.hadoop.hdds.utils.db.RDBTable as well. 

##########
File path: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/DBDefinition.java
##########
@@ -43,4 +47,15 @@
    */
   DBColumnFamilyDefinition[] getColumnFamilies();
 
+  default Optional<Class> getKeyType(String table) {

Review comment:
       Can we add Javadoc for the new methods?

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/OMDBUpdatesHandler.java
##########
@@ -262,12 +270,13 @@ public void markCommit(byte[] bytes) throws RocksDBException {
   }
 
   /**
-   * Return Key type class for a given table name.
-   * @param name table name.
-   * @return String.class by default.
+   * Return Key type class for the given table.
+   *
+   * @return keyType class.
    */
-  private Class<String> getKeyType() {
-    return String.class;
+  @VisibleForTesting
+  Optional<Class> getKeyType(String name) {

Review comment:
       Nit. We can get rid of these single line methods. 

##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/StatsSchemaDefinition.java
##########
@@ -46,22 +48,31 @@
   @Override
   public void initializeSchema() throws SQLException {
     Connection conn = dataSource.getConnection();
+    dslContext = DSL.using(conn);
     if (!TABLE_EXISTS_CHECK.test(conn, GLOBAL_STATS_TABLE_NAME)) {
-      createGlobalStatsTable(conn);
+      createGlobalStatsTable();
     }
   }
 
   /**
    * Create the Ozone Global Stats table.
-   * @param conn connection
    */
-  private void createGlobalStatsTable(Connection conn) {
-    DSL.using(conn).createTableIfNotExists(GLOBAL_STATS_TABLE_NAME)
+  private void createGlobalStatsTable() {
+    dslContext.createTableIfNotExists(GLOBAL_STATS_TABLE_NAME)
         .column("key", SQLDataType.VARCHAR(255))
         .column("value", SQLDataType.BIGINT)
         .column("last_updated_timestamp", SQLDataType.TIMESTAMP)
         .constraint(DSL.constraint("pk_key")
             .primaryKey("key"))
         .execute();
   }
+
+  /**
+   * Returns the DSL context.
+   *
+   * @return dslContext
+   */
+  public DSLContext getDSLContext() {

Review comment:
       The 'dslContext' is common across tables. Do we need to add specific getters inside the individual table definitions? Can we just get the 'datasource' instance from the test injector and get hold of the dslcontext? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org