You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Sergio Pena <se...@cloudera.com> on 2017/09/01 17:01:30 UTC

Re: Review Request 62007: SENTRY-1909 Improvements for memory usage when full path snapshot is sent from Sentry to NN

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62007/#review184366
-----------------------------------------------------------


Ship it!




The patch looks good. But just wonder, how does this patch have a memory improvement? I see the big change is just to return an ArrayList instead of a HashSet in the getPathStrings(). How does this make us better?

- Sergio Pena


On Aug. 31, 2017, 4:14 p.m., Alexander Kolbasov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62007/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2017, 4:14 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Brian Towles, Misha Dmitriev, Na Li, Sergio Pena, Vamsee Yarlagadda, and Vadim Spector.
> 
> 
> Bugs: SENTRY-1909
>     https://issues.apache.org/jira/browse/SENTRY-1909
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> SENTRY-1909 Improvements for memory usage when full path snapshot is sent from Sentry to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 898c7bec2e35e6f1424478666282ba78222da709 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java 20b3e108cc976207a3809bc6a214a34e10788200 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java e403d7cca5dca5c39785490f92e562dc9a3b4daf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/MAuthzPathsMapping.java fee5279a7cd3d60df4e60a5f25d2e8604c37d04b 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/PathsImage.java 409a557fe89d85df888c1f16b3f5aacdd9aa96b0 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 593b92f96b47f959266280bce3d0809f6a80c362 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/FullUpdateInitializer.java 760a2b55e98ac1fb305666cbbedfba11d26f2fcc 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/FullUpdateModifier.java 2cd18ea9d2ca40b5f46aaafde532fb3bf9769c07 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 4a8fb953ce486e1aeb1042884de56872b5539cd0 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java 3da6a4e998d36cada7e9ea77285b11f07cea5f25 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestFullUpdateInitializer.java 9d7bddd339513180e627286d8a749b7814c824f2 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestFullUpdateModifier.java 7deccb064fd75b39af779796d9e95caf9c718b32 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java f56384a99e128b3e9880cbc5692107d61f2f500f 
> 
> 
> Diff: https://reviews.apache.org/r/62007/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexander Kolbasov
> 
>


Re: Review Request 62007: SENTRY-1909 Improvements for memory usage when full path snapshot is sent from Sentry to NN

Posted by Misha Dmitriev <mi...@cloudera.com>.

> On Sept. 1, 2017, 5:01 p.m., Sergio Pena wrote:
> > The patch looks good. But just wonder, how does this patch have a memory improvement? I see the big change is just to return an ArrayList instead of a HashSet in the getPathStrings(). How does this make us better?

Hi Sergio. HashSets use a lot more memory than ArrayLists because of the difference in their internal design. If you are interested, I can explain more, but for now take a look at the small test below. It creates the same number of HashSets and then ArrayLists, with the same workload, and measures how much memory they take each time. Here are the results:

$ java TestMem
Used memory by  sets: 1240 MB
Used memory by  lists: 156 MB


Here is the code:


import java.util.ArrayList;
import java.util.HashSet;

public class TestMem {
  public static final int NUM_OBJS = 1000 * 1000;
  public static final int NUM_STRS = 30;
  public static final String[] STRINGS = new String[NUM_STRS];

  public static void main(String args[]) {
    // Fill the strings array once
    for (int i = 0; i < NUM_STRS; i++) {
      STRINGS[i] = Integer.toString(i);
    }

    Object[] listsOrSets = new Object[NUM_OBJS];

    System.gc();

    for (int i = 0; i < NUM_OBJS; i++) {
      HashSet<String> set = new HashSet<>(NUM_STRS);
      for (int j = 0; j < NUM_STRS; j++) {
        set.add(STRINGS[j]);
      }
      listsOrSets[i] = set;
    }

    reportMemory("sets");

    System.gc();

    for (int i = 0; i < NUM_OBJS; i++) {
      ArrayList<String> list = new ArrayList<>(NUM_STRS);
      for (int j = 0; j < NUM_STRS; j++) {
        list.add(STRINGS[j]);
      }
      listsOrSets[i] = list;
    }

    reportMemory("lists");
  }

  private static void reportMemory(String s) {
    System.gc();
    Runtime r = Runtime.getRuntime();
    long usedMemInMB = (r.totalMemory() - r.freeMemory()) / 1024 / 1024;
    System.out.println("Used memory by  " + s + ": " + usedMemInMB + " MB");
  }
}


- Misha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62007/#review184366
-----------------------------------------------------------


On Aug. 31, 2017, 4:14 p.m., Alexander Kolbasov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62007/
> -----------------------------------------------------------
> 
> (Updated Aug. 31, 2017, 4:14 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Brian Towles, Misha Dmitriev, Na Li, Sergio Pena, Vamsee Yarlagadda, and Vadim Spector.
> 
> 
> Bugs: SENTRY-1909
>     https://issues.apache.org/jira/browse/SENTRY-1909
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> SENTRY-1909 Improvements for memory usage when full path snapshot is sent from Sentry to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 898c7bec2e35e6f1424478666282ba78222da709 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java 20b3e108cc976207a3809bc6a214a34e10788200 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java e403d7cca5dca5c39785490f92e562dc9a3b4daf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/MAuthzPathsMapping.java fee5279a7cd3d60df4e60a5f25d2e8604c37d04b 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/PathsImage.java 409a557fe89d85df888c1f16b3f5aacdd9aa96b0 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 593b92f96b47f959266280bce3d0809f6a80c362 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/FullUpdateInitializer.java 760a2b55e98ac1fb305666cbbedfba11d26f2fcc 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/FullUpdateModifier.java 2cd18ea9d2ca40b5f46aaafde532fb3bf9769c07 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 4a8fb953ce486e1aeb1042884de56872b5539cd0 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java 3da6a4e998d36cada7e9ea77285b11f07cea5f25 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestFullUpdateInitializer.java 9d7bddd339513180e627286d8a749b7814c824f2 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestFullUpdateModifier.java 7deccb064fd75b39af779796d9e95caf9c718b32 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java f56384a99e128b3e9880cbc5692107d61f2f500f 
> 
> 
> Diff: https://reviews.apache.org/r/62007/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexander Kolbasov
> 
>