You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by bo...@apache.org on 2022/08/26 13:46:04 UTC

[impala] branch master updated (bf1034140 -> da3d6fc7f)

This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


    from bf1034140 IMPALA-11523: Fix test_http_socket_timeout in Docker-based tests
     new 40b9b9cd7 IMPALA-11521: Fix test_binary_type
     new 423b08776 IMPALA-11520: Remove functional.unsupported_types misc test
     new da3d6fc7f IMPALA-11429: Set table owner after creating an Iceberg table

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/impala/service/CatalogOpExecutor.java   | 32 ++++++++++++++++-
 .../queries/QueryTest/binary-type.test             |  2 +-
 .../functional-query/queries/QueryTest/misc.test   | 12 -------
 tests/query_test/test_iceberg.py                   | 42 +++++++++++++++++++---
 tests/query_test/test_queries.py                   |  1 +
 5 files changed, 71 insertions(+), 18 deletions(-)


[impala] 03/03: IMPALA-11429: Set table owner after creating an Iceberg table

Posted by bo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit da3d6fc7f7c656b118bb3570cedf7d7c3158bd0b
Author: Gabor Kaszab <ga...@cloudera.com>
AuthorDate: Tue Aug 9 16:02:17 2022 +0200

    IMPALA-11429: Set table owner after creating an Iceberg table
    
    Iceberg tables are created using Apache Iceberg's API. However,
    currently Iceberg gets the owner of the process running Iceberg for the
    owner of the newly created tables. In our case it's the user running
    catalogd and not the user running the CREATE TABLE statement.
    
    Until the Iceberg API is enhanced to accept an owner parameter for
    table creation, as a workaround this patch adds an extra alter table
    step right after Iceberg table creation.
    
    TestIcebergTable.test_drop_incomplete_table test had to be skipped
    with this implementation because this would run into a known bug where
    Impala runs into an infinite loop when the table location is being
    dropped somewhere during the execution of the first command after
    CREATE TABLE. In this case the ALTER TABLE SET OWNER is going to be
    the first command, so IMPALA-11509 would be triggered in case we
    dropped the table location as this test would.
    
    Testing:
      - Manually creating Iceberg tables and checking the owner.
      - Added one automated test to create Iceberg tables with different
        users and checking the output of DESCRIBE FORMATTED to verify the
        owner.
      - Turned off TestIcebergTable.test_drop_incomplete_table as described
        above.
      - Manually tested when ALTER TABLE fails (hacked Impala code) that
        the table is created properly.
    
    Change-Id: I5cac198a4a53be3599cb582864ee5f8c269202c0
    Reviewed-on: http://gerrit.cloudera.org:8080/18837
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 .../apache/impala/service/CatalogOpExecutor.java   | 32 ++++++++++++++++-
 tests/query_test/test_iceberg.py                   | 42 +++++++++++++++++++---
 2 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
index 1181f2b2e..7e8f866e0 100644
--- a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
+++ b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
@@ -209,6 +209,7 @@ import org.apache.impala.thrift.THdfsFileFormat;
 import org.apache.impala.thrift.TIcebergCatalog;
 import org.apache.impala.thrift.TImpalaTableType;
 import org.apache.impala.thrift.TIcebergPartitionSpec;
+import org.apache.impala.thrift.TOwnerType;
 import org.apache.impala.thrift.TPartitionDef;
 import org.apache.impala.thrift.TPartitionKeyValue;
 import org.apache.impala.thrift.TPartitionStats;
@@ -3617,9 +3618,26 @@ public class CatalogOpExecutor {
       Table newTbl = catalog_.addIncompleteTable(newTable.getDbName(),
           newTable.getTableName(), TImpalaTableType.TABLE, tblComment,
           createEventId);
-      LOG.debug("Created a iceberg table {} in catalog with create event Id {} ",
+      LOG.debug("Created an iceberg table {} in catalog with create event Id {} ",
           newTbl.getFullName(), createEventId);
       addTableToCatalogUpdate(newTbl, wantMinimalResult, response.result);
+
+      try {
+        // Currently we create Iceberg tables using the Iceberg API, however, table owner
+        // is hardcoded to be the user running the Iceberg process. In our case it's the
+        // user running Catalogd and not the user running the create table DDL. Hence, an
+        // extra "ALTER TABLE SET OWNER" step is required.
+        setIcebergTableOwnerAfterCreateTable(newTable.getDbName(),
+            newTable.getTableName(), newTable.getOwner());
+        LOG.debug("Table owner has been changed to " + newTable.getOwner());
+      } catch (Exception e) {
+        LOG.warn("Failed to set table owner after creating " +
+            "Iceberg table but the table {} has been created successfully. Reason: {}",
+            newTable.toString(), e);
+        addSummary(response, "Table has been created. However, unable to change table " +
+            "owner to " + newTable.getOwner());
+        return true;
+      }
     } catch (Exception e) {
       if (e instanceof AlreadyExistsException && ifNotExists) {
         addSummary(response, "Table already exists.");
@@ -3635,6 +3653,18 @@ public class CatalogOpExecutor {
     return true;
   }
 
+  private void setIcebergTableOwnerAfterCreateTable(String dbName, String tableName,
+      String newOwner) throws ImpalaException {
+    TAlterTableOrViewSetOwnerParams setOwnerParams = new TAlterTableOrViewSetOwnerParams(
+        TOwnerType.USER, newOwner);
+    TAlterTableParams alterTableParams = new TAlterTableParams(
+        TAlterTableType.SET_OWNER, new TTableName(dbName, tableName));
+    alterTableParams.setSet_owner_params(setOwnerParams);
+    TDdlExecResponse dummyResponse = new TDdlExecResponse();
+    dummyResponse.result = new TCatalogUpdateResult();
+
+    alterTable(alterTableParams, null, true, dummyResponse);
+  }
   /**
    * Creates a new table in the metastore based on the definition of an existing table.
    * No data is copied as part of this process, it is a metadata only operation. If the
diff --git a/tests/query_test/test_iceberg.py b/tests/query_test/test_iceberg.py
index 0fe7c8358..d1995f9a7 100644
--- a/tests/query_test/test_iceberg.py
+++ b/tests/query_test/test_iceberg.py
@@ -31,16 +31,14 @@ import json
 
 from tests.beeswax.impala_beeswax import ImpalaBeeswaxException
 from tests.common.iceberg_test_suite import IcebergTestSuite
-
 from tests.common.skip import SkipIf
-
+from tests.common.file_utils import create_iceberg_table_from_directory
+from tests.shell.util import run_impala_shell_cmd
 from tests.util.filesystem_utils import get_fs_path, IS_HDFS
 from tests.util.get_parquet_metadata import get_parquet_metadata
-from tests.common.file_utils import create_iceberg_table_from_directory
 
 LOG = logging.getLogger(__name__)
 
-
 class TestIcebergTable(IcebergTestSuite):
   """Tests related to Iceberg tables."""
 
@@ -121,10 +119,16 @@ class TestIcebergTable(IcebergTestSuite):
   def test_truncate_iceberg_tables(self, vector, unique_database):
     self.run_test_case('QueryTest/iceberg-truncate', vector, use_db=unique_database)
 
+  # With IMPALA-11429 there is an extra "ALTER TABLE SET OWNER" right after executing
+  # "CREATE TABLE". As a result dropping the table location right after CREATE TABLE will
+  # trigger a known bug: IMPALA-11509. Hence, turning this test off until there is a fix
+  # for this issue. Note, we could add a sleep righ after table creation that could
+  # workaround the above mentioned bug but then we would hit another issue: IMPALA-11502.
   @SkipIf.not_hdfs
   def test_drop_incomplete_table(self, vector, unique_database):
     """Test DROP TABLE when the underlying directory is deleted. In that case table
     loading fails, but we should be still able to drop the table from Impala."""
+    pytest.skip()
     tbl_name = unique_database + ".synchronized_iceberg_tbl"
     cat_location = "/test-warehouse/" + unique_database
     self.client.execute("""create table {0} (i int) stored as iceberg
@@ -745,3 +749,33 @@ class TestIcebergTable(IcebergTestSuite):
   def test_create_table_like_table(self, vector, unique_database):
     self.run_test_case('QueryTest/iceberg-create-table-like-table', vector,
                        use_db=unique_database)
+
+  def test_table_owner(self, vector, unique_database):
+    self.run_table_owner_test(vector, unique_database, "some_random_user")
+    self.run_table_owner_test(vector, unique_database, "another_random_user")
+
+  def run_table_owner_test(self, vector, db_name, user_name):
+    # Create Iceberg table with a given user running the query.
+    tbl_name = "iceberg_table_owner"
+    sql_stmt = 'CREATE TABLE {0}.{1} (i int) STORED AS ICEBERG'.format(
+      db_name, tbl_name)
+    args = ['-u', user_name, '-q', sql_stmt]
+    run_impala_shell_cmd(vector, args)
+
+    # Run DESCRIBE FORMATTED to get the owner of the table.
+    args = ['-q', 'DESCRIBE FORMATTED {0}.{1}'.format(db_name, tbl_name)]
+    results = run_impala_shell_cmd(vector, args)
+    result_rows = results.stdout.strip().split('\n')
+
+    # Find the output row with the owner.
+    owner_row = ""
+    for row in result_rows:
+      if "Owner:" in row:
+        owner_row = row
+    assert owner_row != "", "DESCRIBE output doesn't contain owner" + results.stdout
+    # Verify that the user running the query is the owner of the table.
+    assert user_name in owner_row, "Unexpected owner of Iceberg table. " + \
+      "Expected user name: {0}. Actual output row: {1}".format(user_name, owner_row)
+
+    args = ['-q', 'DROP TABLE {0}.{1}'.format(db_name, tbl_name)]
+    results = run_impala_shell_cmd(vector, args)


[impala] 02/03: IMPALA-11520: Remove functional.unsupported_types misc test

Posted by bo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 423b0877621b26eb5d198b7258ee1b67b9f51ddb
Author: Tamas Mate <tm...@apache.org>
AuthorDate: Tue Aug 23 11:26:04 2022 +0200

    IMPALA-11520: Remove functional.unsupported_types misc test
    
    IMPALA-9482 added support to the remaining Hive types and removed the
    functional.unsupported_types table. There was a reference remaining in a
    misc test. test_misc is not marked as exhaustive but it only runs in
    exhaustive builds.
    
    Change-Id: I65b6ea5ac742fbcc427ad41741d347558cb7d110
    Reviewed-on: http://gerrit.cloudera.org:8080/18896
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 .../workloads/functional-query/queries/QueryTest/misc.test   | 12 ------------
 tests/query_test/test_queries.py                             |  1 +
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/testdata/workloads/functional-query/queries/QueryTest/misc.test b/testdata/workloads/functional-query/queries/QueryTest/misc.test
index 4bd6db4d4..23df8e706 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/misc.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/misc.test
@@ -118,18 +118,6 @@ SELECT 1 as "``"
 tinyint
 ====
 ---- QUERY
-#  Select from table that contains unsupported primitive types
-SELECT int_col, str_col, bigint_col from functional.unsupported_types
----- RESULTS
-0,'aaaa',0
-1,'bbbb',10
-2,'cccc',20
-NULL,'NULL',NULL
-4,'eeee',40
----- TYPES
-int, string, bigint
-====
----- QUERY
 # where clause is a SlotRef
 SELECT count(*) from alltypes where bool_col
 ---- RESULTS
diff --git a/tests/query_test/test_queries.py b/tests/query_test/test_queries.py
index afd7354e9..7b19dd23a 100644
--- a/tests/query_test/test_queries.py
+++ b/tests/query_test/test_queries.py
@@ -183,6 +183,7 @@ class TestQueries(ImpalaTestSuite):
       pytest.xfail("TODO: Enable with clause tests for hbase")
     self.run_test_case('QueryTest/with-clause', vector)
 
+  # TODO: Although it is not specified this test only runs in exhaustive.
   def test_misc(self, vector):
     table_format = vector.get_value('table_format')
     if table_format.file_format in ['hbase', 'rc', 'parquet', 'kudu']:


[impala] 01/03: IMPALA-11521: Fix test_binary_type

Posted by bo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 40b9b9cd7525298f80ebab5cc0bd198d507785e5
Author: Tamas Mate <tm...@apache.org>
AuthorDate: Tue Aug 23 11:03:48 2022 +0200

    IMPALA-11521: Fix test_binary_type
    
    Fix test_binary_type typo, it should not reference to an HBase table.
    
    Change-Id: Id41049094b632af6326f6ee9f3886577d1fc5ee6
    Reviewed-on: http://gerrit.cloudera.org:8080/18895
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 testdata/workloads/functional-query/queries/QueryTest/binary-type.test | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/testdata/workloads/functional-query/queries/QueryTest/binary-type.test b/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
index 3b145baa4..1462b4820 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
@@ -136,7 +136,7 @@ BIGINT, BIGINT
 ====
 ---- QUERY
 select bb1.id, bb2.id, bb1.binary_col_with_nulls, bb2.binary_col_with_nulls
-  from binary_tbl_big bb1 left join functional_hbase.binary_tbl_big bb2
+  from binary_tbl_big bb1 left join binary_tbl_big bb2
   on bb1.binary_col_with_nulls = bb2.binary_col_with_nulls
   where bb1.id < 3 and bb2.id < 3;
 ---- TYPES