Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2018/03/28 21:20:27 UTC

[GitHub] keith-turner opened a new pull request #410: Fixed inefficient auths check

keith-turner opened a new pull request #410: Fixed inefficient auths check
URL: https://github.com/apache/accumulo/pull/410
 
 
   [On the mailing list](https://lists.apache.org/thread.html/31ff119654efe1d2c7c95c544a1634f6cb3c0721108f169986de02dc@%3Cuser.accumulo.apache.org%3E) a performance problem with authorizations was identified.  When a user has a large number of authorizations, scans can be really slow.  For example, if a user has 100,000 auths and does a scan with 90,000 auths, it's very slow.  This is caused by a subset check on the server side that uses lists.  This PR changes the check to use an existing hash set.
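   
   To illustrate why a list-based subset check gets slow, here is a minimal standalone sketch.  It is not the Accumulo server code: the class `SubsetCheckSketch` and its helper methods are hypothetical names made up for this example, and the sizes are scaled down so it runs quickly.  `containsAll` on a list does a linear search for each requested auth, while on a hash set each lookup is an expected constant-time probe.
   
   ```java
   import java.nio.ByteBuffer;
   import java.nio.charset.StandardCharsets;
   import java.util.ArrayList;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;
   
   // Illustrative only: hypothetical names, not the Accumulo server code.
   class SubsetCheckSketch {
   
     // Hypothetical helper: n distinct auths formatted like the "%06x" strings in the test.
     // ByteBuffer.wrap gives content-based equals() and hashCode(), unlike a raw byte[].
     static List<ByteBuffer> makeAuths(int n) {
       List<ByteBuffer> auths = new ArrayList<>(n);
       for (int i = 0; i < n; i++) {
         auths.add(ByteBuffer.wrap(String.format("%06x", i).getBytes(StandardCharsets.UTF_8)));
       }
       return auths;
     }
   
     // List-based check: every requested auth triggers a linear search of the granted list,
     // so the cost grows with requested.size() * granted.size().
     static boolean isSubsetUsingList(List<ByteBuffer> granted, List<ByteBuffer> requested) {
       return granted.containsAll(requested);
     }
   
     // Set-based check: every requested auth is a single hash lookup,
     // so the expected cost grows with requested.size() only.
     static boolean isSubsetUsingSet(Set<ByteBuffer> granted, List<ByteBuffer> requested) {
       return granted.containsAll(requested);
     }
   
     public static void main(String[] args) {
       List<ByteBuffer> granted = makeAuths(10_000);
       List<ByteBuffer> requested = makeAuths(9_000);
       Set<ByteBuffer> grantedSet = new HashSet<>(granted);
   
       long t0 = System.nanoTime();
       boolean viaList = isSubsetUsingList(granted, requested);
       long t1 = System.nanoTime();
       boolean viaSet = isSubsetUsingSet(grantedSet, requested);
       long t2 = System.nanoTime();
   
       System.out.printf("list: %b in %d ms%n", viaList, (t1 - t0) / 1_000_000);
       System.out.printf("set:  %b in %d ms%n", viaSet, (t2 - t1) / 1_000_000);
     }
   }
   ```
   
   Both checks return the same answer; only the lookup cost differs.  Timings from this sketch will vary by machine and are not comparable to the scan times reported in this PR.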
   
   The following is a performance test that was written to explore this problem.  This code creates a user with 100,000 auths and then times scans with 0, 10, 100, 1000, 10000, and 100000 auths.
   
   ```java
     public void testManyAuths(Connector conn) throws Exception {
       conn.securityOperations().createLocalUser("bob", new PasswordToken("bob"));
   
       Authorizations auths = genAuths(100_000, 1);
   
       conn.securityOperations().changeUserAuthorizations("bob", auths);
   
       conn.tableOperations().create("bobsSpecialTable");
       conn.securityOperations().grantTablePermission("bob", "bobsSpecialTable", TablePermission.READ);
   
       conn = conn.getInstance().getConnector("bob", "bob");
   
       runTest(conn, Authorizations.EMPTY);
       runTest(conn, genAuths(100_000, 10_000));
       runTest(conn, genAuths(100_000, 1_000));
       runTest(conn, genAuths(100_000, 100));
       runTest(conn, genAuths(100_000, 10));
       runTest(conn, auths);
     }
   
     void runTest(Connector conn, Authorizations auths) throws Exception {
   
       try (Scanner scanner = conn.createScanner("bobsSpecialTable", auths)) {
         long start = System.currentTimeMillis();
         // Warm up: run up to 50 untimed scans, stopping after 60 seconds
         for (int i = 0; i < 50 && System.currentTimeMillis() - start < 60000; i++) {
           for (Entry<Key,Value> entry : scanner) {
             // drain the scanner; entries are intentionally ignored
           }
         }
   
         // Timed runs: up to 100 scans, stopping once 120 seconds have elapsed,
         // so slow configurations report fewer than 100 samples
         start = System.currentTimeMillis();
         SummaryStatistics stats = new SummaryStatistics();
         for (int i = 0; i < 100 && System.currentTimeMillis() - start < 120000; i++) {
           long t1 = System.currentTimeMillis();
           for (Entry<Key,Value> entry : scanner) {
             // drain the scanner; only the elapsed time matters
           }
           long t2 = System.currentTimeMillis();
           stats.addValue(t2 - t1);
         }
   
         // Note: the value printed as "mean" is the geometric mean of the scan times (ms)
         System.out.printf("auths.size:%,7d  mean:%.2f  stddev:%.2f  min:%.2f  max:%.2f samples:%d\n",
             auths.size(), stats.getGeometricMean(), stats.getStandardDeviation(), stats.getMin(),
             stats.getMax(), stats.getN());
       }
     }
   
     Authorizations genAuths(int max, int step) {
       List<byte[]> al = new ArrayList<>();
   
       for (int i = 0; i < max; i += step) {
         al.add(String.format("%06x", i).getBytes(StandardCharsets.UTF_8));
       }
   
       return new Authorizations(al);
     }
   ```
   
   Before this PR, the test output was:
   
   ```
   auths.size:      0  mean:134.39  stddev:29.88  min:98.00  max:289.00 samples:100
   auths.size:     10  mean:147.82  stddev:33.17  min:104.00  max:242.00 samples:100
   auths.size:    100  mean:219.07  stddev:44.38  min:162.00  max:351.00 samples:100
   auths.size:  1,000  mean:475.67  stddev:44.17  min:419.00  max:620.00 samples:100
   auths.size: 10,000  mean:3516.51  stddev:105.13  min:3319.00  max:3785.00 samples:35
   auths.size:100,000  mean:34615.27  stddev:400.12  min:34167.00  max:35109.00 samples:4
   ```
   
   This shows that a scan with 0 auths took about 134 milliseconds on average, while a scan with 100,000 auths took about 34.6 seconds on average.  After this PR, the test outputs:
   
   ```
   auths.size:      0  mean:1.52  stddev:0.73  min:1.00  max:5.00 samples:100
   auths.size:     10  mean:137.23  stddev:26.14  min:102.00  max:228.00 samples:100
   auths.size:    100  mean:136.70  stddev:28.12  min:104.00  max:247.00 samples:100
   auths.size:  1,000  mean:137.59  stddev:26.18  min:101.00  max:243.00 samples:100
   auths.size: 10,000  mean:154.95  stddev:24.81  min:115.00  max:232.00 samples:100
   auths.size:100,000  mean:347.62  stddev:56.12  min:250.00  max:549.00 samples:100
   ```
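   
   To make the comparison concrete, the following small sketch computes the before/after ratio for each auth count from the means reported above.  The class name `SpeedupSketch` is hypothetical, and the numbers are copied directly from the two listings (the test prints `getGeometricMean()` under the label "mean").  Before the PR, going from 10,000 to 100,000 scan auths raised the mean roughly tenfold (3516.51 ms to 34615.27 ms), which is what one would expect if the check cost grows linearly with the number of scan auths against the user's fixed list of 100,000.
   
   ```java
   // Illustrative arithmetic only; values are copied from the test output in this PR.
   class SpeedupSketch {
     static final int[] AUTH_COUNTS = {0, 10, 100, 1_000, 10_000, 100_000};
     static final double[] BEFORE_MS = {134.39, 147.82, 219.07, 475.67, 3516.51, 34615.27};
     static final double[] AFTER_MS = {1.52, 137.23, 136.70, 137.59, 154.95, 347.62};
   
     // Ratio of the mean scan time before the PR to the mean scan time after it.
     static double speedup(int i) {
       return BEFORE_MS[i] / AFTER_MS[i];
     }
   
     public static void main(String[] args) {
       for (int i = 0; i < AUTH_COUNTS.length; i++) {
         System.out.printf("auths.size:%,7d  before/after: %.1fx%n", AUTH_COUNTS[i], speedup(i));
       }
     }
   }
   ```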
   
   
