You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@impala.apache.org by st...@apache.org on 2021/02/26 13:10:08 UTC

[impala] branch master updated (285cea9 -> 9f1b527)

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.


    from 285cea9  IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix failed in sanitizer builds
     new bead2ed  IMPALA-10533: Fix TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only seems flaky
     new d038e39  IMPALA-10505: Avoid creating misleading audit logs
     new 9f1b527  IMPALA-10531: Fix TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/runtime/tmp-file-mgr.cc                     | 12 ++---
 .../impala/authorization/AuthorizationContext.java |  5 ++
 .../authorization/BaseAuthorizationChecker.java    |  3 ++
 .../ranger/RangerAuthorizationChecker.java         | 21 ++++++--
 .../authorization/ranger/RangerAuditLogTest.java   | 39 ++++++++++++++
 tests/authorization/test_ranger.py                 | 63 ++++++++++++++++++++++
 tests/custom_cluster/test_scratch_disk.py          |  2 +-
 7 files changed, 134 insertions(+), 11 deletions(-)

[impala] 03/03: IMPALA-10531: Fix TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build

Posted by st...@apache.org.

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 9f1b5272a1ed0f1970ace7ba820d1c0dba590f46
Author: Yida Wu <wy...@gmail.com>
AuthorDate: Sun Feb 21 07:59:54 2021 -0800

    IMPALA-10531: Fix TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build
    
    Fixed TmpFileMgrTest testcases failed in exhaustive release build.
    The cause is that in TmpFileGroup::AllocateRemoteSpace(), it uses
    DCHECK to check the return of TmpFileRemote::AllocateSpace(), because
    it always returns true, in the release build, the logic is optimized,
    so TmpFileRemote::AllocateSpace() isn't called in the release build.
    The solution is to use if instead of DCHECK.
    
    Tests:
    Reran TmpFileMgrTest/DiskIoMgrTest/BufferPoolTest in the release build.
    
    Change-Id: I7320ae41957c44151b7ec17886c383e72c2546e4
    Reviewed-on: http://gerrit.cloudera.org:8080/17101
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/runtime/tmp-file-mgr.cc | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/be/src/runtime/tmp-file-mgr.cc b/be/src/runtime/tmp-file-mgr.cc
index cd00f42..738cc72 100644
--- a/be/src/runtime/tmp-file-mgr.cc
+++ b/be/src/runtime/tmp-file-mgr.cc
@@ -1051,17 +1051,17 @@ Status TmpFileGroup::AllocateRemoteSpace(int64_t num_bytes, TmpFile** tmp_file,
         tmp_dir_remote->bytes_limit, tmp_dir_remote->path));
   }
   UpdateScratchSpaceMetrics(file_size, true);
-
-  // It should be a successful return to allocate the first range from the new file.
-  DCHECK(tmp_file_remote->AllocateSpace(num_bytes, file_offset));
-
-  tmp_files_remote_.emplace_back(std::move(tmp_file_remote));
+  tmp_files_remote_.emplace_back(move(tmp_file_remote));
   *tmp_file = tmp_files_remote_.back().get();
+  // It should be a successful return to allocate the first range from the new file.
+  if (!(*tmp_file)->AllocateSpace(num_bytes, file_offset)) {
+    DCHECK(false) << "Should be a successful allocation for the first write range.";
+  }
+  DCHECK_EQ(*file_offset, 0);
   {
     lock_guard<SpinLock> lock(tmp_files_remote_ptrs_lock_);
     tmp_files_remote_ptrs_.emplace(*tmp_file, tmp_files_remote_.back());
   }
-  *file_offset = 0;
 
   // Try to reserve the space for local buffer with a quick return to avoid
   // a long wait, if failed, caller should do the reservation for the buffer.

[impala] 02/03: IMPALA-10505: Avoid creating misleading audit logs

Posted by st...@apache.org.

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit d038e392de8ba453155040f3ca1ae33f03e39b2e
Author: Fang-Yu Rao <fa...@cloudera.com>
AuthorDate: Fri Feb 12 17:31:47 2021 -0800

    IMPALA-10505: Avoid creating misleading audit logs
    
    Before this patch, if a requesting user granted the privilege on a view
    does not have the privilege on the table(s) on which the view is based,
    only an audit log entry indicating a failed authorization with respect
    to an underlying table will be produced, whereas the requesting user is
    actually able to fetch the data from the view since the user is granted
    the privilege to do so. Such an audit log entry, however, is misleading
    and thus should not be produced at all. Moreover, the audit log entry
    corresponding to the successful authorization with respect to the view
    should also be created.
    
    Recall that to authorize a query involving a view, Impala performs
    privilege checks for both the view as well as the underlying table(s).
    Thus, for a user granted the privilege on the view but not the
    underlying tables, the privilege check for the view would succeed but
    those for the underlying table(s) would fail. Each privilege check
    results in an audit log entry produced by Ranger. These audit log
    entries will be collected by Impala and will be sent back to Ranger
    after the query authorization. In the case where there is at least one
    AuthzAuditEvent indicating a failed privilege check, only the
    AuthzAuditEvent corresponding to the first failed check will be sent
    back to Ranger. Refer to RangerBufferAuditHandler#flush() for further
    details. Impala performs checks for both the view as well as the
    underlying table(s) so that it is able to disallow the requesting user
    from accessing the runtime profile or execution summary when the
    requesting user is not granted the privilege on the underlying table(s).
    Note that allowing the requesting user the access to the runtime profile
    would reveal the existence of the underlying tables.
    
    This patch resolves the issue by specifying whether or not we should
    retain the audit log entries when calling
    BaseAuthorizationChecker#authorizePrivilegeRequest() so that Impala will
    not collect the audit log entries resulting from the privilege checks
    for the underlying table(s) of a view.
    
    Testing:
     - Added new FE tests to verify that the correct audit log entry is
       produced after this patch.
     - Added a new E2E test to verify that a user not granted the privilege
       on the underlying table(s) of a view is still not able to access the
       runtime profile or execution summary even though the user is granted
       the privilege on the view.
     - Verified that the patch passes the core tests in the DEBUG build.
    
    Change-Id: I02f40eb96d6ed863cd2cd88d717c354dc351a64c
    Reviewed-on: http://gerrit.cloudera.org:8080/17078
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 .../impala/authorization/AuthorizationContext.java |  5 ++
 .../authorization/BaseAuthorizationChecker.java    |  3 ++
 .../ranger/RangerAuthorizationChecker.java         | 21 ++++++--
 .../authorization/ranger/RangerAuditLogTest.java   | 39 ++++++++++++++
 tests/authorization/test_ranger.py                 | 63 ++++++++++++++++++++++
 5 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/fe/src/main/java/org/apache/impala/authorization/AuthorizationContext.java b/fe/src/main/java/org/apache/impala/authorization/AuthorizationContext.java
index 1930dc2..5972a9a 100644
--- a/fe/src/main/java/org/apache/impala/authorization/AuthorizationContext.java
+++ b/fe/src/main/java/org/apache/impala/authorization/AuthorizationContext.java
@@ -27,6 +27,8 @@ import java.util.Optional;
 public class AuthorizationContext {
   private final Optional<EventSequence> timeline_;
 
+  private boolean retainAudits_ = true;
+
   public AuthorizationContext(Optional<EventSequence> timeline) {
     this.timeline_ = timeline;
   }
@@ -36,4 +38,7 @@ public class AuthorizationContext {
    */
   public Optional<EventSequence> getTimeline() { return timeline_; }
 
+  public void setRetainAudits(boolean retainAudits) { retainAudits_ = retainAudits; }
+
+  public boolean getRetainAudits() { return retainAudits_; }
 }
diff --git a/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java b/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
index 42781d8..09bb780 100644
--- a/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
+++ b/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
@@ -179,6 +179,7 @@ public abstract class BaseAuthorizationChecker implements AuthorizationChecker {
     // 'user_has_profile_access' flag in queryCtx_.
     for (Pair<PrivilegeRequest, String> maskedReq : analyzer.getMaskedPrivilegeReqs()) {
       try {
+        authzCtx.setRetainAudits(false);
         authorizePrivilegeRequest(authzCtx, analysisResult, catalog, maskedReq.first);
       } catch (AuthorizationException e) {
         analysisResult.setUserHasProfileAccess(false);
@@ -186,6 +187,8 @@ public abstract class BaseAuthorizationChecker implements AuthorizationChecker {
           throw new AuthorizationException(maskedReq.second);
         }
         break;
+      } finally {
+        authzCtx.setRetainAudits(true);
       }
     }
   }
diff --git a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
index d3bf9b6..eb62599 100644
--- a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
+++ b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
@@ -455,7 +455,10 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker {
     // Use a temporary audit handler instead of the original audit handler
     // so that we know all the audit events generated by the temporary audit
     // handler are contained to the given resource.
-    RangerBufferAuditHandler tmpAuditHandler = originalAuditHandler == null ?
+    // Recall that in some case, e.g., the DESCRIBE statement, 'originalAuditHandler'
+    // could be null, resulting in 'tmpAuditHandler' being null as well.
+    RangerBufferAuditHandler tmpAuditHandler =
+        (originalAuditHandler == null || !authzCtx.getRetainAudits()) ?
         null : new RangerBufferAuditHandler(originalAuditHandler);
     for (Privilege impliedPrivilege: privilege.getImpliedPrivileges()) {
       if (authorizeResource(authzCtx, resource, user, impliedPrivilege,
@@ -464,7 +467,9 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker {
         break;
       }
     }
-    if (originalAuditHandler != null) {
+    // 'tmpAuditHandler' could be null if 'originalAuditHandler' is null or
+    // authzCtx.getRetainAudits() is false.
+    if (originalAuditHandler != null && tmpAuditHandler != null) {
       updateAuditEvents(tmpAuditHandler, originalAuditHandler, true /*any*/,
           privilege);
     }
@@ -479,7 +484,13 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker {
     // Use a temporary audit handler instead of the original audit handler
     // so that we know all the audit events generated by the temporary audit
     // handler are contained to the given resource.
-    RangerBufferAuditHandler tmpAuditHandler = originalAuditHandler == null ?
+    // Recall that if 'min_privilege_set_for_show_stmts' is specified when the impalad's
+    // are started, this method will be called with the RangerBufferAuditHandler in
+    // 'authzCtx' being null for the SHOW DATABASES and SHOW TABLES statements. Refer to
+    // BaseAuthorizationChecker#hasAccess(User user, PrivilegeRequest request) for
+    // further details.
+    RangerBufferAuditHandler tmpAuditHandler =
+        (originalAuditHandler == null || !authzCtx.getRetainAudits()) ?
         null : new RangerBufferAuditHandler(originalAuditHandler);
     for (Privilege impliedPrivilege: privilege.getImpliedPrivileges()) {
       if (!authorizeResource(authzCtx, resource, user, impliedPrivilege,
@@ -488,7 +499,9 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker {
         break;
       }
     }
-    if (originalAuditHandler != null) {
+    // 'tmpAuditHandler' could be null if 'originalAuditHandler' is null or
+    // authzCtx.getRetainAudits() is false.
+    if (originalAuditHandler != null && tmpAuditHandler != null) {
       updateAuditEvents(tmpAuditHandler, originalAuditHandler, false /*not any*/,
           privilege);
     }
diff --git a/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java b/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
index a23a0c4..06bba79 100644
--- a/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
+++ b/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
@@ -160,6 +160,45 @@ public class RangerAuditLogTest extends AuthorizationTestBase {
     }, "compute stats FUNCTIONAL.ALLTYPES",
         onTable("functional", "alltypes", TPrivilegeLevel.ALTER,
             TPrivilegeLevel.SELECT));
+
+    // Recall that the view 'functional.complex_view' is created based on two underlying
+    // tables, i.e., functional.alltypesagg and functional.alltypestiny. The following 3
+    // test cases make sure the same audit log entry indicating a successful
+    // authorization event is produced whether or not the requesting user is granted the
+    // privileges on any of the underlying tables.
+    authzOk(events -> {
+      assertEquals(1, events.size());
+      assertEventEquals("@table", "select", "functional/complex_view", 1,
+          events.get(0));
+      assertEquals("select count(*) from functional.complex_view",
+          events.get(0).getRequestData());
+    }, "select count(*) from functional.complex_view",
+        onTable("functional", "complex_view", TPrivilegeLevel.SELECT));
+
+    // The same audit log entry is produced if the requesting user is granted the
+    // privilege on one of the underlying tables.
+    authzOk(events -> {
+      assertEquals(1, events.size());
+      assertEventEquals("@table", "select", "functional/complex_view", 1,
+          events.get(0));
+      assertEquals("select count(*) from functional.complex_view",
+          events.get(0).getRequestData());
+    }, "select count(*) from functional.complex_view",
+        onTable("functional", "complex_view", TPrivilegeLevel.SELECT),
+        onTable("functional", "alltypesagg", TPrivilegeLevel.SELECT));
+
+    // The same audit log entry is produced if the requesting user is granted the
+    // privileges on all of the underlying tables.
+    authzOk(events -> {
+      assertEquals(1, events.size());
+      assertEventEquals("@table", "select", "functional/complex_view", 1,
+          events.get(0));
+      assertEquals("select count(*) from functional.complex_view",
+          events.get(0).getRequestData());
+    }, "select count(*) from functional.complex_view",
+        onTable("functional", "complex_view", TPrivilegeLevel.SELECT),
+        onTable("functional", "alltypesagg", TPrivilegeLevel.SELECT),
+        onTable("functional", "alltypestiny", TPrivilegeLevel.SELECT));
   }
 
   @Test
diff --git a/tests/authorization/test_ranger.py b/tests/authorization/test_ranger.py
index 74940e6..791755d 100644
--- a/tests/authorization/test_ranger.py
+++ b/tests/authorization/test_ranger.py
@@ -1069,6 +1069,69 @@ class TestRanger(CustomClusterTestSuite):
 
   @pytest.mark.execute_serially
   @CustomClusterTestSuite.with_args(
+    impalad_args=IMPALAD_ARGS, catalogd_args=CATALOGD_ARGS)
+  def test_profile_protection(self, vector):
+    """Test that a requesting user is able to access the runtime profile or execution
+    summary of a query involving a view only if the user is granted the privileges on all
+    the underlying tables of the view. Recall that the view functional.complex_view we
+    use here is created based on the tables functional.alltypesagg and
+    functional.alltypestiny."""
+    admin_client = self.create_impala_client()
+    non_owner_client = self.create_impala_client()
+    test_db = "functional"
+    test_view = "complex_view"
+    grantee_user = "non_owner"
+    try:
+      admin_client.execute(
+          "grant select on table {0}.{1} to user {2}"
+          .format(test_db, test_view, grantee_user), user=ADMIN)
+
+      admin_client.execute("refresh authorization")
+      # Recall that in a successful execution, result.exec_summary and
+      # result.runtime_profile store the execution summary and runtime profile,
+      # respectively. But when the requesting user does not have the privileges
+      # on the underlying tables, an exception will be thrown from
+      # ImpalaBeeswaxClient.get_runtime_profile().
+      result = self.execute_query_expect_failure(
+          non_owner_client, "select count(*) from {0}.{1}".format(test_db, test_view),
+          user=grantee_user)
+      assert "User {0} is not authorized to access the runtime profile or " \
+          "execution summary".format(grantee_user) in str(result)
+
+      admin_client.execute(
+          "grant select on table {0}.alltypesagg to user {1}"
+          .format(test_db, grantee_user), user=ADMIN)
+
+      admin_client.execute("refresh authorization")
+      self.execute_query_expect_failure(
+          non_owner_client, "select count(*) from {0}.{1}".format(test_db, test_view),
+          user=grantee_user)
+      assert "User {0} is not authorized to access the runtime profile or " \
+          "execution summary".format(grantee_user) in str(result)
+
+      admin_client.execute(
+          "grant select on table {0}.alltypestiny to user {1}"
+          .format(test_db, grantee_user), user=ADMIN)
+
+      admin_client.execute("refresh authorization")
+      self.execute_query_expect_success(
+          non_owner_client, "select count(*) from {0}.{1}".format(test_db, test_view),
+          user=grantee_user)
+    finally:
+      cleanup_statements = [
+          "revoke select on table {0}.{1} from user {2}"
+          .format(test_db, test_view, grantee_user),
+          "revoke select on table {0}.alltypesagg from user {1}"
+          .format(test_db, grantee_user),
+          "revoke select on table {0}.alltypestiny from user {1}"
+          .format(test_db, grantee_user)
+      ]
+
+      for statement in cleanup_statements:
+        admin_client.execute(statement, user=ADMIN)
+
+  @pytest.mark.execute_serially
+  @CustomClusterTestSuite.with_args(
     # We additionally provide impalad and catalogd with the customized user-to-groups
     # mapper since some test cases in grant_revoke.test require Impala to retrieve the
     # groups a given user belongs to and such users might not exist in the underlying

[impala] 01/03: IMPALA-10533: Fix TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only seems flaky

Posted by st...@apache.org.

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit bead2ede1c79fb477d2fb4d2f73703df1e34e97a
Author: Yida Wu <wy...@gmail.com>
AuthorDate: Sun Feb 21 12:00:08 2021 -0800

    IMPALA-10533: Fix TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only seems flaky
    
    The E2E testcase emulates the situation when there are two types of
    scratch directories, the data only spills to the local one when the
    space of local directory is sufficient. The testcase works fine for
    the debug build, however in the release build, the system runs faster
    and more data is spilled from memory which exceeds the setting of the
    local scratch space limit. To solve this, the size limit of local
    scratch space is changed from 100M to 2GB, so that allows all of the
    spilled data is in the local instead of the remote directory.
    
    Tests:
    Reran test_scratch_dirs_mix_local_and_remote_dir_spill_local_only in
    the release build.
    
    Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
    Reviewed-on: http://gerrit.cloudera.org:8080/17102
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/custom_cluster/test_scratch_disk.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/custom_cluster/test_scratch_disk.py b/tests/custom_cluster/test_scratch_disk.py
index 23df692..b1898db 100644
--- a/tests/custom_cluster/test_scratch_disk.py
+++ b/tests/custom_cluster/test_scratch_disk.py
@@ -294,7 +294,7 @@ class TestScratchDir(CustomClusterTestSuite):
        to local in the test'''
     normal_dirs = self.generate_dirs(2)
     normal_dirs[0] = '{0}::{1}'.format(normal_dirs[0], 1)
-    normal_dirs[1] = '{0}:100M:{1}'.format(normal_dirs[1], 0)
+    normal_dirs[1] = '{0}:2GB:{1}'.format(normal_dirs[1], 0)
     normal_dirs.append('hdfs://localhost')
     self._start_impala_cluster([
       '--impalad_args=-logbuflevel=-1 -scratch_dirs={0}'.format(','.join(normal_dirs)),