You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org> on 2024/03/12 18:58:32 UTC

[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21141


Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................

WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - TODO: update planer tests for data source tables.
 - TODO: pass core -tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
5 files changed, 22 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/1
-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - Fixed planer tests for data source tables.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 33 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/2
-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:01:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
> IOException is subclass of Exception. We found the exception was not catche
nit: I'd suggest to narrow it down to subclass of Exception that is actually thrown, but I guess this is fine too to catch all.


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29: 
            : class TestExtDataSources(CustomClusterTestSuite):
            :   """Impala query tests for external data sources."""
You can add_test_dimension to apply exec_single_node_rows_threshold = 100 for all tests here.

from tests.common.test_dimensions import create_exec_option_dimension

...

  @classmethod
  def add_test_dimensions(cls):
    super(TestExtDataSources, cls).add_test_dimensions()
    cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension(
      exec_single_node_option=[100]))

Similarly for TestImpalaExtJdbcTables.
L52 and L254 then can be removed.



-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Mar 2024 23:12:11 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
> nit: I'd suggest to narrow it down to subclass of Exception that is
 > actually thrown, but I guess this is fine too to catch all.

I will keep the code to catch all exception.


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29: 
            : 
            : class TestExtDataSources(CustomClusterTestSuite):
> You can add_test_dimension to apply exec_single_node_rows_threshold = 100 f
Fixed as suggested. Thanks



-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Mar 2024 02:42:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/15488/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:23:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - Fixed planer tests for data source tables.
 - Ran end-to-end tests of JDBC tables with query option
   'exec_single_node_rows_threshold' as default value 100.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Reviewed-on: http://gerrit.cloudera.org:8080/21141
Reviewed-by: Riza Suminto <ri...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 42 insertions(+), 13 deletions(-)

Approvals:
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
Why this is widen from only catching IOException to all Exception?


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36
PS1, Line 36: // Max number of rows processed across all instances of a plan node.
            :   private long maxRowsProcessed_ ;
            : 
            :   // Max number of rows processed per backend impala daemon for a plan node.
            :   private long maxRowsProcessedPerNode_;
This a good chance to explicitly init these to 0.


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49
PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator.
            :       maxRowsProcessed_ = Math.max(maxRowsProcessed_, 0);
            :       maxRowsProcessedPerNode_ = Math.max(maxRowsProcessedPerNode_, 0);
Is this basically intended to be a no-op?
If yes, probably better to leave it empty and just add Preconditions that numNodes == 1.

Just like the very last else branch, this should check if caller is valid and return early when it is invalid.

      if ((in == -1) || (out == -1)) {
        valid_ = false;
        return;
      }

At the end of this method, you can add instead add Preconditions that verify both maxRowsProcessed_ and maxRowsProcessedPerNode_ are always >= 0 after each visit.


http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@51
PS1, Line 51: vector.get_value('exec_option')
Use deepcopy


http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@254
PS1, Line 254: vector.get_value('exec_option')
Use deepcopy



-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:25:00 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10374/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Mar 2024 03:09:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/15495/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Mar 2024 03:05:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/15491/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Mar 2024 23:23:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - Fixed planer tests for data source tables.
 - Ran end-to-end tests of JDBC tables with query option
   'exec_single_node_rows_threshold' as default value 100.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 42 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/3
-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
......................................................................


Patch Set 3: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36
PS1, Line 36: // Max number of rows processed across all instances of a plan node.
            :   private long maxRowsProcessed_ = 0;
            : 
            :   // Max number of rows processed per backend impala daemon for a plan node.
            :   private long maxRowsProcessedPerNode_ 
> fixed
Done


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49
PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator.
            :       if (fragment == null) {
            :         numNodes = ((DataSourceScanNode)caller).getNumNodes();
> fixed as suggested
Done


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29: 
            : 
            : class TestExtDataSources(CustomClusterTestSuite):
> Fixed as suggested. Thanks
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Mar 2024 02:55:55 +0000
Gerrit-HasComments: Yes