You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/09 19:49:42 UTC

[GitHub] [ozone] avijayanhwx commented on a change in pull request #1670: HDDS-4403. Update the container replica history to the Recon DB lazily instead of for every report

avijayanhwx commented on a change in pull request #1670:
URL: https://github.com/apache/ozone/pull/1670#discussion_r539567149



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.hadoop.ozone.recon.codec;
+
+import org.apache.hadoop.hdds.utils.db.Codec;
+import org.apache.hadoop.hdds.utils.db.LongCodec;
+import org.apache.hadoop.ozone.recon.scm.ContainerReplicaHistory;
+import org.apache.hadoop.ozone.recon.scm.ContainerReplicaHistoryList;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataInputStream;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.UUID;
+
+/**
+ * Codec for ContainerReplicaHistoryList.
+ */
+public class ContainerReplicaHistoryListCodec
+    implements Codec<ContainerReplicaHistoryList> {
+
+  // UUID takes 2 long to store. Each timestamp takes 1 long to store.
+  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  private final Codec<Long> lc = new LongCodec();
+
+  @Override
+  public byte[] toPersistedFormat(ContainerReplicaHistoryList obj)
+      throws IOException {
+
+    List<ContainerReplicaHistory> lst = obj.getList();

Review comment:
       Based on 16 bytes per container-DN combination, it comes to around 16KB for a degenerate case where a container has been in 1000 Datanodes through its lifecycle. Hence, I think we should be OK on the replica history growth over time.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerSchemaManager.java
##########
@@ -113,40 +142,121 @@ public void insertUnhealthyContainerRecords(List<UnhealthyContainers> recs) {
     unhealthyContainersDao.insert(recs);
   }
 
-  public void upsertContainerHistory(long containerID, String datanode,
-                                     long time) {
-    DSLContext dslContext = containerSchemaDefinition.getDSLContext();
-    Record2<Long, String> recordToFind =
-        dslContext.newRecord(
-        CONTAINER_HISTORY.CONTAINER_ID,
-        CONTAINER_HISTORY.DATANODE_HOST).value1(containerID).value2(datanode);
-    ContainerHistory newRecord = new ContainerHistory();
-    newRecord.setContainerId(containerID);
-    newRecord.setDatanodeHost(datanode);
-    newRecord.setLastReportTimestamp(time);
-    ContainerHistory record = containerHistoryDao.findById(recordToFind);
-    if (record != null) {
-      newRecord.setFirstReportTimestamp(record.getFirstReportTimestamp());
-      containerHistoryDao.update(newRecord);
-    } else {
-      newRecord.setFirstReportTimestamp(time);
-      containerHistoryDao.insert(newRecord);
+  public void upsertContainerHistory(long containerID, UUID uuid, long time) {
+    Map<UUID, ContainerReplicaHistory> tsMap;
+    try {
+      tsMap = dbServiceProvider.getContainerReplicaHistoryMap(containerID);
+      ContainerReplicaHistory ts = tsMap.get(uuid);
+      if (ts == null) {
+        // New entry
+        tsMap.put(uuid, new ContainerReplicaHistory(uuid, time, time));
+      } else {
+        // Entry exists, update last seen time and put it back to DB.
+        ts.setLastSeenTime(time);
+      }
+      dbServiceProvider.storeContainerReplicaHistoryMap(containerID, tsMap);
+    } catch (IOException e) {
+      LOG.debug("Error on DB operations. {}", e.getMessage());
     }
   }
 
   public List<ContainerHistory> getAllContainerHistory(long containerID) {
-    return containerHistoryDao.fetchByContainerId(containerID);
+    // First, get the existing entries from DB
+    Map<UUID, ContainerReplicaHistory> resMap;
+    try {
+      resMap = dbServiceProvider.getContainerReplicaHistoryMap(containerID);
+    } catch (IOException ex) {
+      resMap = new HashMap<>();
+      LOG.debug("Unable to retrieve container replica history from RDB.");
+    }
+
+    // Then, update the entries with the latest in-memory info, if available
+    if (replicaHistoryMap != null) {
+      Map<UUID, ContainerReplicaHistory> replicaLastSeenMap =
+          replicaHistoryMap.get(containerID);
+      if (replicaLastSeenMap != null) {
+        Map<UUID, ContainerReplicaHistory> finalResMap = resMap;
+        replicaLastSeenMap.forEach((k, v) ->
+            finalResMap.merge(k, v, (old, latest) -> latest));
+        resMap = finalResMap;
+      }
+    }
+
+    // Finally, convert map to list for output
+    List<ContainerHistory> resList = new ArrayList<>();
+    for (Map.Entry<UUID, ContainerReplicaHistory> entry : resMap.entrySet()) {
+      final UUID uuid = entry.getKey();
+      String hostname = "N/A";
+      // Attempt to retrieve hostname from NODES table
+      if (nodeDB != null) {
+        try {
+          DatanodeDetails dnDetails = nodeDB.get(uuid);
+          if (dnDetails != null) {
+            hostname = dnDetails.getHostName();
+          }
+        } catch (IOException ex) {
+          LOG.debug("Unable to retrieve from NODES table of node {}. {}",
+              uuid, ex.getMessage());
+        }
+      }
+      final long firstSeenTime = entry.getValue().getFirstSeenTime();
+      final long lastSeenTime = entry.getValue().getLastSeenTime();
+      resList.add(new ContainerHistory(containerID, uuid.toString(), hostname,
+          firstSeenTime, lastSeenTime));
+    }
+    return resList;
   }
 
   public List<ContainerHistory> getLatestContainerHistory(long containerID,
                                                           int limit) {
-    DSLContext dslContext = containerSchemaDefinition.getDSLContext();
-    // Get container history sorted in descending order of last report timestamp
-    return dslContext.select()
-        .from(CONTAINER_HISTORY)
-        .where(CONTAINER_HISTORY.CONTAINER_ID.eq(containerID))
-        .orderBy(CONTAINER_HISTORY.LAST_REPORT_TIMESTAMP.desc())
-        .limit(limit)
-        .fetchInto(ContainerHistory.class);
+    List<ContainerHistory> res = getAllContainerHistory(containerID);
+    res.sort(comparingLong(ContainerHistory::getLastSeenTime).reversed());
+    return res.stream().limit(limit).collect(Collectors.toList());
+  }
+
+  /**
+   * Should only be called once during ReconContainerManager init.
+   */
+  public void setReplicaHistoryMap(
+      Map<Long, Map<UUID, ContainerReplicaHistory>> replicaHistoryMap) {
+    this.replicaHistoryMap = replicaHistoryMap;
+  }
+
+  /**
+   * Should only be called once during ReconContainerManager init.
+   */
+  public void setScmDBStore(DBStore scmDBStore) {
+    this.scmDBStore = scmDBStore;
+    try {
+      this.nodeDB = ReconSCMDBDefinition.NODES.getTable(scmDBStore);
+    } catch (IOException ex) {
+      LOG.debug("Failed to get NODES table. {}", ex.getMessage());
+    }
+  }
+
+  /**
+   * Flush the container replica history in-memory map to DB.
+   * @param clearMap true to clear the in-memory map after flushing completes.
+   */
+  public void flushReplicaHistoryMapToDB(boolean clearMap) {
+    if (replicaHistoryMap == null) {
+      return;
+    }
+    synchronized (replicaHistoryMap) {
+      try {
+        for (Map.Entry<Long, Map<UUID, ContainerReplicaHistory>> entry :
+            replicaHistoryMap.entrySet()) {
+          final long containerId = entry.getKey();
+          final Map<UUID, ContainerReplicaHistory> map = entry.getValue();
+          dbServiceProvider.storeContainerReplicaHistoryMap(containerId, map);

Review comment:
       Can we support a batch interface that uses the RocksDB BatchOperation? That will help with faster flushing.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -171,23 +182,93 @@ public void addNewContainer(long containerId,
 
   /**
    * Add a container Replica for given DataNode.
-   *
-   * @param containerID
-   * @param replica
    */
   @Override
   public void updateContainerReplica(ContainerID containerID,
       ContainerReplica replica)
       throws ContainerNotFoundException {
     super.updateContainerReplica(containerID, replica);
-    // Update container_history table
-    long currentTime = System.currentTimeMillis();
-    String datanodeHost = replica.getDatanodeDetails().getHostName();
-    containerSchemaManager.upsertContainerHistory(containerID.getId(),
-        datanodeHost, currentTime);
+
+    final long currTime = System.currentTimeMillis();
+    final long id = containerID.getId();
+    final DatanodeDetails dnInfo = replica.getDatanodeDetails();
+    final UUID uuid = dnInfo.getUuid();
+
+    // Map from DataNode UUID to replica last seen time
+    final Map<UUID, ContainerReplicaHistory> replicaLastSeenMap =
+        replicaHistoryMap.get(id);
+
+    boolean flushToDB = false;
+
+    // If replica doesn't exist in in-memory map, add to DB and add to map
+    if (replicaLastSeenMap == null) {
+      // putIfAbsent to avoid TOCTOU
+      replicaHistoryMap.putIfAbsent(id,
+          new ConcurrentHashMap<UUID, ContainerReplicaHistory>() {{
+            put(uuid, new ContainerReplicaHistory(uuid, currTime, currTime));
+          }});
+      flushToDB = true;
+    } else {
+      // ContainerID exists, update timestamp in memory
+      final ContainerReplicaHistory ts = replicaLastSeenMap.get(uuid);
+      if (ts == null) {
+        // New Datanode
+        replicaLastSeenMap.put(uuid,
+            new ContainerReplicaHistory(uuid, currTime, currTime));
+        flushToDB = true;
+      } else {
+        // if the object exists, only update the last seen time field
+        ts.setLastSeenTime(currTime);
+      }
+    }
+
+    if (flushToDB) {
+      containerSchemaManager.upsertContainerHistory(id, uuid, currTime);
+    }
   }
 
+  /**
+   * Remove a Container Replica of a given DataNode.
+   */
+  @Override
+  public void removeContainerReplica(ContainerID containerID,
+      ContainerReplica replica) throws ContainerNotFoundException,
+      ContainerReplicaNotFoundException {
+    super.removeContainerReplica(containerID, replica);
+
+    final long id = containerID.getId();
+    final DatanodeDetails dnInfo = replica.getDatanodeDetails();
+    final UUID uuid = dnInfo.getUuid();
+
+    final Map<UUID, ContainerReplicaHistory> replicaLastSeenMap =
+        replicaHistoryMap.get(id);
+    if (replicaLastSeenMap != null) {
+      final ContainerReplicaHistory ts = replicaLastSeenMap.get(uuid);
+      if (ts != null) {
+        // Flush to DB, then remove from in-memory map
+        containerSchemaManager.upsertContainerHistory(id, uuid,
+            ts.getLastSeenTime());
+        replicaLastSeenMap.remove(uuid);
+      }
+    }
+  }
+
+  @VisibleForTesting
   public ContainerSchemaManager getContainerSchemaManager() {
     return containerSchemaManager;
   }
+
+  @VisibleForTesting
+  public Map<Long, Map<UUID, ContainerReplicaHistory>> getReplicaHistoryMap() {

Review comment:
       I don't see any usages for this method.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ContainerDBServiceProviderImpl.java
##########
@@ -184,6 +214,34 @@ public long getKeyCountForContainer(Long containerID) throws IOException {
     return keyCount == null ? 0L : keyCount;
   }
 
+  /**
+   * Get the container replica history of the given containerID.
+   *
+   * @param containerID the given containerId.
+   * @return A map of ContainerReplicaWithTimestamp of the given containerID.
+   * @throws IOException
+   */
+  @Override
+  public Map<UUID, ContainerReplicaHistory> getContainerReplicaHistoryMap(

Review comment:
       Nit. Method name can just be _getContainerReplicaHistory_.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ContainerReplicaHistory.java
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.recon.scm;
+
+import java.util.UUID;
+
+/**
+ * A ContainerReplica timestamp class that tracks first and last seen time.

Review comment:
       Can we add some documentation here on on the ContainerEndpoint API class that Recon can only track first and last seen time of the container replica on a DN. It cannot track intermediate states where a replica can move out a DN and then come back in the future again to the same DN. All Recon guarantees is first and last seen time, and not the whole interval in between the 2 times.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
##########
@@ -65,13 +73,16 @@
   public ReconContainerManager(
       ConfigurationSource conf,
       Table<ContainerID, ContainerInfo> containerStore,
-      BatchOperationHandler batchHandler,
+      DBStore batchHandler,
       PipelineManager pipelineManager,
       StorageContainerServiceProvider scm,
       ContainerSchemaManager containerSchemaManager) throws IOException {
     super(conf, containerStore, batchHandler, pipelineManager);
     this.scmClient = scm;
     this.containerSchemaManager = containerSchemaManager;
+    this.replicaHistoryMap = new ConcurrentHashMap<>();
+    containerSchemaManager.setReplicaHistoryMap(replicaHistoryMap);

Review comment:
       Is there any reason why we cannot maintain the ScmDB and replica history map only in this class and move the related methods (getContainerHistory, upsertContainerHistory, flushDb) from the ContainerSchemaManager to this class? The ContainerEndpoint class already has an instance of the Recon container manager. After this change, maybe we can also rename ContainerSchemaManager to ContainerHealthSchemaManager?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org