Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/08/21 05:22:00 UTC

[jira] [Created] (HBASE-21081) Trim Master memory usage, part 2

stack created HBASE-21081:
-----------------------------

             Summary: Trim Master memory usage, part 2
                 Key: HBASE-21081
                 URL: https://issues.apache.org/jira/browse/HBASE-21081
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.0.1
            Reporter: stack
            Assignee: stack


Good one found by [~misha@cloudera.com] while spelunking with jxray on a 700-node cluster with 500k+ regions. For some reason, there are >1M instances of each column family name when there should be only 500k. (By rights there should be only as many as there are column families in the table, rather than repeating these bytes per region -- TODO.)

The code below, added by HBASE-19496, looks suspicious. It builds HashMaps with byte[]s for keys. byte[] does not override hashCode/equals, so two arrays with the same content are treated as different keys. Usually when we have byte[]s for keys, we use a sorted map (a ConcurrentSkipListMap, say) and pass the constructor a Comparator that knows how to compare byte[]s.

{code}
.setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream()
  .collect(Collectors.toMap(
    (ClusterStatusProtos.StoreSequenceId s) -> s.getFamilyName().toByteArray(),
      ClusterStatusProtos.StoreSequenceId::getSequenceId)))
{code}
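
To make the byte[] key concern concrete, here is a minimal standalone sketch (not HBase code). The class and method names are made up for illustration, and `Arrays::compareUnsigned` (Java 9+) is only an assumed stand-in for a byte[]-aware comparator such as `Bytes.BYTES_COMPARATOR`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of HBase.
final class ByteArrayKeyDemo {
    // Assumed stand-in for a byte[]-aware comparator (lexicographic, unsigned).
    static final Comparator<byte[]> BYTES_CMP = Arrays::compareUnsigned;

    /** Puts two distinct arrays with the same content into a HashMap. */
    static int hashMapSize() {
        Map<byte[], Long> m = new HashMap<>();
        m.put("cf".getBytes(StandardCharsets.UTF_8), 1L);
        m.put("cf".getBytes(StandardCharsets.UTF_8), 2L);
        // byte[] inherits identity-based equals/hashCode from Object,
        // so the two arrays are different keys.
        return m.size();
    }

    /** Same puts, but into a TreeMap that compares keys by content. */
    static int treeMapSize() {
        Map<byte[], Long> m = new TreeMap<>(BYTES_CMP);
        m.put("cf".getBytes(StandardCharsets.UTF_8), 1L);
        m.put("cf".getBytes(StandardCharsets.UTF_8), 2L);
        // The comparator sees equal content, so the second put overwrites the first.
        return m.size();
    }

    public static void main(String[] args) {
        System.out.println(hashMapSize()); // prints 2
        System.out.println(treeMapSize()); // prints 1
    }
}
```

Note that lookups have the same problem: `get` on the HashMap with a fresh array of the same content returns null, while the comparator-backed map finds the entry.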

But looking back through the code, even if it is a HashMap, the map should only have one item in it. Where's the other coming from?

Here's how to get a TreeMap w/ a Comparator into the mix, but I still need to check whether this fixes the issue (I don't think it does).

{code}
@@ -66,12 +70,13 @@ public final class RegionMetricsBuilder {
         .setStoreCount(regionLoadPB.getStores())
         .setStoreFileCount(regionLoadPB.getStorefiles())
         .setStoreFileSize(new Size(regionLoadPB.getStorefileSizeMB(), Size.Unit.MEGABYTE))
-        .setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream()
-          .collect(Collectors.toMap(
-            (ClusterStatusProtos.StoreSequenceId s) -> s.getFamilyName().toByteArray(),
-              ClusterStatusProtos.StoreSequenceId::getSequenceId)))
+        .setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream().collect(
+            Collectors.toMap(s -> s.getFamilyName().toByteArray(),
+                ClusterStatusProtos.StoreSequenceId::getSequenceId,
+                (k1, k2) -> k1, // Should never happen; only one completed sequenceid per Store
+                () -> new TreeMap<byte [], Long>(Bytes.BYTES_COMPARATOR))))
         .setUncompressedStoreFileSize(
-          new Size(regionLoadPB.getStoreUncompressedSizeMB(),Size.Unit.MEGABYTE))
+            new Size(regionLoadPB.getStoreUncompressedSizeMB(), Size.Unit.MEGABYTE))
         .build();
   }
{code}
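
For anyone wanting to try the four-argument `Collectors.toMap` shape from the patch outside of HBase, here is a rough standalone sketch. `StoreSeqId` is a hypothetical stand-in for `ClusterStatusProtos.StoreSequenceId`, and `Arrays::compareUnsigned` (Java 9+) is assumed as a substitute for `Bytes.BYTES_COMPARATOR`; neither is the real HBase type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch class; not part of HBase.
final class StoreSeqIdSketch {
    // Assumed stand-in for Bytes.BYTES_COMPARATOR.
    static final Comparator<byte[]> BYTES_CMP = Arrays::compareUnsigned;

    /** Hypothetical stand-in for the protobuf StoreSequenceId message. */
    static final class StoreSeqId {
        final String family;
        final long seqId;

        StoreSeqId(String family, long seqId) {
            this.family = family;
            this.seqId = seqId;
        }
    }

    /** Collects (family bytes -> sequence id) into a comparator-backed TreeMap. */
    static Map<byte[], Long> collect(List<StoreSeqId> ids) {
        return ids.stream().collect(Collectors.toMap(
            s -> s.family.getBytes(StandardCharsets.UTF_8), // fresh byte[] per element
            s -> s.seqId,
            (v1, v2) -> v1,                                  // on duplicate family, keep the first value
            () -> new TreeMap<byte[], Long>(BYTES_CMP)));    // keys compared by content
    }

    public static void main(String[] args) {
        Map<byte[], Long> m = collect(Arrays.asList(
            new StoreSeqId("a", 1L), new StoreSeqId("b", 2L), new StoreSeqId("a", 3L)));
        System.out.println(m.size()); // prints 2
    }
}
```

The merge function only gets called when the comparator reports two keys as equal, so with a plain HashMap and byte[] keys it would never fire and duplicates would pile up silently.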



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)