Posted to commits@impala.apache.org by ta...@apache.org on 2018/05/24 19:26:47 UTC

[1/2] impala git commit: [DOCS] Sentry is required for Impala to enable delegation

Repository: impala
Updated Branches:
  refs/heads/master 1ca077fd0 -> 2362b672c


[DOCS] Sentry is required for Impala to enable delegation

Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb
Reviewed-on: http://gerrit.cloudera.org:8080/10451
Reviewed-by: Sailesh Mukil <sa...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/a22ee641
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/a22ee641
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/a22ee641

Branch: refs/heads/master
Commit: a22ee6419c4fa1ed7dfb04ca9930f1e791a85411
Parents: 1ca077f
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Fri May 18 12:24:02 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Thu May 24 18:26:27 2018 +0000

----------------------------------------------------------------------
 docs/topics/impala_delegation.xml | 93 +++++++++++++++-------------------
 1 file changed, 42 insertions(+), 51 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/a22ee641/docs/topics/impala_delegation.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_delegation.xml b/docs/topics/impala_delegation.xml
index 73ae658..c524bf5 100644
--- a/docs/topics/impala_delegation.xml
+++ b/docs/topics/impala_delegation.xml
@@ -36,67 +36,58 @@ under the License.
   </prolog>
 
   <conbody>
-
     <p>
-<!--
+      <!--
       When users connect to Impala directly through the <cmdname>impala-shell</cmdname> interpreter, the Sentry
       authorization framework determines what actions they can take and what data they can see.
 -->
-      When users submit Impala queries through a separate application, such as Hue or a business intelligence tool,
-      typically all requests are treated as coming from the same user. In Impala 1.2 and higher, authentication is
-      extended by a new feature that allows applications to pass along credentials for the users that connect to
-      them (known as <q>delegation</q>), and issue Impala queries with the privileges for those users. Currently,
-      the delegation feature is available only for Impala queries submitted through application interfaces such as
-      Hue and BI tools; for example, Impala cannot issue queries using the privileges of the HDFS user.
-    </p>
-
-    <p>
-      The delegation feature is enabled by a startup option for <cmdname>impalad</cmdname>:
-      <codeph>--authorized_proxy_user_config</codeph>. When you specify this option, users whose names you specify
-      (such as <codeph>hue</codeph>) can delegate the execution of a query to another user. The query runs with the
-      privileges of the delegated user, not the original user such as <codeph>hue</codeph>. The name of the
-      delegated user is passed using the HiveServer2 configuration property <codeph>impala.doas.user</codeph>.
-    </p>
-
-    <p>
-      You can specify a list of users that the application user can delegate to, or <codeph>*</codeph> to allow a
-      superuser to delegate to any other user. For example:
-    </p>
-
-<codeblock>impalad --authorized_proxy_user_config 'hue=user1,user2;admin=*' ...</codeblock>
-
-    <note>
-      Make sure to use single quotes or escape characters to ensure that any <codeph>*</codeph> characters do not
-      undergo wildcard expansion when specified in command-line arguments.
-    </note>
-
-    <p>
-      See <xref href="impala_config_options.xml#config_options"/> for details about adding or changing
-      <cmdname>impalad</cmdname> startup options. See
-      <xref keyref="how-hiveserver2-brings-security-and-concurrency-to-apache-hive">this
-      blog post</xref> for background information about the delegation capability in HiveServer2.
-    </p>
-    <p>
-      To set up authentication for the delegated users:
-    </p>
-
+      When users submit Impala queries through a separate application, such as
+      Hue or a business intelligence tool, typically all requests are treated as
+      coming from the same user. In Impala 1.2 and higher, Impala supports
+      applications to pass along credentials for the users that connect to them,
+      known as <q>delegation</q>, and to issue Impala queries with the
+      privileges for those users. Currently, the delegation feature is available
+      only for Impala queries submitted through application interfaces such as
+      Hue and BI tools. For example, Impala cannot issue queries using the
+      privileges of the HDFS user. </p>
+    <note type="attention">Impala requires Apache Sentry on the cluster to
+      enable delegation. Without Apache Sentry installed, the delegation feature
+      will fail with the following error: User <i>user1</i> is not authorized to
+      delegate to <i>user2</i> User delegation is disabled.</note>
+    <p> The delegation feature is enabled by a startup option for
+        <cmdname>impalad</cmdname>:
+        <codeph>--authorized_proxy_user_config</codeph>. When you specify this
+      option, users whose names you specify (such as <codeph>hue</codeph>) can
+      delegate the execution of a query to another user. The query runs with the
+      privileges of the delegated user, not the original user such as
+        <codeph>hue</codeph>. The name of the delegated user is passed using the
+      HiveServer2 configuration property <codeph>impala.doas.user</codeph>. </p>
+    <p> You can specify a list of users that the application user can delegate
+      to, or <codeph>*</codeph> to allow a superuser to delegate to any other
+      user. For example: </p>
+    <codeblock>impalad --authorized_proxy_user_config 'hue=user1,user2;admin=*' ...</codeblock>
+    <note> Make sure to use single quotes or escape characters to ensure that
+      any <codeph>*</codeph> characters do not undergo wildcard expansion when
+      specified in command-line arguments. </note>
+    <p> See <xref href="impala_config_options.xml#config_options"/> for details
+      about adding or changing <cmdname>impalad</cmdname> startup options. See
+        <xref
+        keyref="how-hiveserver2-brings-security-and-concurrency-to-apache-hive"
+        >this blog post</xref> for background information about the delegation
+      capability in HiveServer2. </p>
+    <p> To set up authentication for the delegated users: </p>
     <ul>
       <li>
-        <p>
-          On the server side, configure either user/password authentication through LDAP, or Kerberos
-          authentication, for all the delegated users. See <xref href="impala_ldap.xml#ldap"/> or
-          <xref href="impala_kerberos.xml#kerberos"/> for details.
-        </p>
+        <p> On the server side, configure either user/password authentication
+          through LDAP, or Kerberos authentication, for all the delegated users.
+          See <xref href="impala_ldap.xml#ldap"/> or <xref
+            href="impala_kerberos.xml#kerberos"/> for details. </p>
       </li>
-
       <li>
-        <p>
-          On the client side, to learn how to enable delegation, consult the documentation
-          for the ODBC driver you are using.
-        </p>
+        <p> On the client side, to learn how to enable delegation, consult the
+          documentation for the ODBC driver you are using. </p>
       </li>
     </ul>
-
   </conbody>
 
 </concept>


[2/2] impala git commit: IMPALA-7055: fix race with DML errors

Posted by ta...@apache.org.
IMPALA-7055: fix race with DML errors

Error statuses could be lost because backend_exec_complete_barrier_
went to 0 before the query was transitioned to an error state.
Reordering the UpdateExecState() and backend_exec_complete_barrier_
calls prevents this race.
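
The ordering described above can be illustrated with a minimal,
self-contained sketch. It is not the Impala code: CountdownBarrier,
ExecState, and RunFailingBackendOnce are hypothetical, simplified
stand-ins invented for this illustration. The point it tries to show is
that recording the error before notifying the barrier guarantees that a
thread woken by the barrier observes the error state.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, simplified countdown barrier (an assumption, not Impala's class).
class CountdownBarrier {
 public:
  explicit CountdownBarrier(int count) : count_(count) {}

  // Decrements the count and wakes all waiters once it reaches zero.
  void Notify() {
    std::lock_guard<std::mutex> l(mu_);
    assert(count_ > 0);
    if (--count_ == 0) cv_.notify_all();
  }

  // Blocks until the count has reached zero.
  void Wait() {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [this] { return count_ == 0; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int count_;
};

// Hypothetical query states, loosely modelled on the names in the commit.
enum class ExecState { EXECUTING, RETURNED_RESULTS, ERROR };

// Simulates one query whose single backend sends a final report carrying an
// error. Returns the state seen by a thread that waits for backend completion.
ExecState RunFailingBackendOnce() {
  CountdownBarrier backends_done(1);
  std::mutex state_mu;
  ExecState state = ExecState::EXECUTING;
  ExecState observed = ExecState::EXECUTING;

  // Stands in for any thread that waits for all backends, then reads the state.
  std::thread waiter([&] {
    backends_done.Wait();
    std::lock_guard<std::mutex> l(state_mu);
    observed = state;
  });

  // Fixed ordering: apply the error transition first...
  {
    std::lock_guard<std::mutex> l(state_mu);
    state = ExecState::ERROR;
  }
  // ...and only then let the barrier reach zero.
  backends_done.Notify();

  waiter.join();
  return observed;
}
```

If the two steps were swapped, the waiter could wake up and read the
state before the error was recorded, which is the kind of lost error
status this commit describes.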

Change-Id: Idafd0b342e77a065be7cc28fa8c8a9df445622c2
Reviewed-on: http://gerrit.cloudera.org:8080/10491
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/2362b672
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/2362b672
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/2362b672

Branch: refs/heads/master
Commit: 2362b672ccd94ed97331fe9c84ac1603ecb3772f
Parents: a22ee64
Author: Tim Armstrong <ta...@cloudera.com>
Authored: Wed May 23 14:03:12 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Thu May 24 19:09:49 2018 +0000

----------------------------------------------------------------------
 be/src/runtime/coordinator.cc | 24 ++++++++++++++++--------
 be/src/runtime/coordinator.h  |  3 +++
 2 files changed, 19 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/2362b672/be/src/runtime/coordinator.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator.cc b/be/src/runtime/coordinator.cc
index 998fee2..90e2390 100644
--- a/be/src/runtime/coordinator.cc
+++ b/be/src/runtime/coordinator.cc
@@ -669,25 +669,33 @@ Status Coordinator::UpdateBackendExecStatus(const TReportExecStatusParams& param
 
   if (backend_state->ApplyExecStatusReport(params, &exec_summary_, &progress_)) {
     // This backend execution has completed.
+    if (VLOG_QUERY_IS_ON) {
+      // Don't log backend completion if the query has already been cancelled.
+      int pending_backends = backend_exec_complete_barrier_->pending();
+      if (pending_backends >= 1) {
+        VLOG_QUERY << "Backend completed:"
+                   << " host=" << TNetworkAddressToString(backend_state->impalad_address())
+                   << " remaining=" << pending_backends
+                   << " query_id=" << PrintId(query_id());
+        BackendState::LogFirstInProgress(backend_states_);
+      }
+    }
     bool is_fragment_failure;
     TUniqueId failed_instance_id;
     Status status = backend_state->GetStatus(&is_fragment_failure, &failed_instance_id);
-    int pending_backends = backend_exec_complete_barrier_->Notify();
-    if (VLOG_QUERY_IS_ON && pending_backends >= 0) {
-      VLOG_QUERY << "Backend completed:"
-                 << " host=" << TNetworkAddressToString(backend_state->impalad_address())
-                 << " remaining=" << pending_backends
-                 << " query_id=" << PrintId(query_id());
-      BackendState::LogFirstInProgress(backend_states_);
-    }
     if (!status.ok()) {
       // We may start receiving status reports before all exec rpcs are complete.
       // Can't apply state transition until no more exec rpcs will be sent.
       exec_rpcs_complete_barrier_->Wait();
+      // Transition the status if we're not already in a terminal state. This won't block
+      // because either this transitions to an ERROR state or the query is already in
+      // a terminal state.
       discard_result(UpdateExecState(status,
               is_fragment_failure ? &failed_instance_id : nullptr,
               TNetworkAddressToString(backend_state->impalad_address())));
     }
+    // We've applied all changes from the final status report - notify waiting threads.
+    backend_exec_complete_barrier_->Notify();
   }
   // If all results have been returned, return a cancelled status to force the fragment
   // instance to stop executing.

http://git-wip-us.apache.org/repos/asf/impala/blob/2362b672/be/src/runtime/coordinator.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator.h b/be/src/runtime/coordinator.h
index ae85bcd..5bb399f 100644
--- a/be/src/runtime/coordinator.h
+++ b/be/src/runtime/coordinator.h
@@ -350,6 +350,9 @@ class Coordinator { // NOLINT: The member variables could be re-ordered to save
   /// the Coordinator object), then finalizes execution (cancels remaining backends if
   /// transitioning to CANCELLED; in all cases releases resources and calls
   /// ComputeQuerySummary()). Must not be called if exec RPCs are pending.
+  /// Will block waiting for backends to complete if transitioning to the
+  /// RETURNED_RESULTS terminal state. Does not block if already in terminal state or
+  /// transitioning to ERROR or CANCELLED.
   void HandleExecStateTransition(const ExecState old_state, const ExecState new_state);
 
   /// Return true if 'exec_state_' is RETURNED_RESULTS.