You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Qifan Chen (Code Review)" <ge...@cloudera.org> on 2021/10/19 17:35:06 UTC

[Impala-ASF-CR] [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout

Qifan Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17955


Change subject: [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

[WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

Testing: [TBD]

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
2 files changed, 67 insertions(+), 29 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 4:

Rebase.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Sat, 23 Oct 2021 00:57:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9:

Fix qualified_table_name in test_load.py.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:24:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live for hs2, hs2-http and
     beeswax three clients;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 192 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/8
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

Testing:
  1. Added a new test in load.test to verify that the asynchronous
     execution in BE keeps the session live;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M testdata/workloads/functional-query/queries/QueryTest/load.test
3 files changed, 88 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live for hs2, hs2-http and
     beeswax three clients;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Reviewed-on: http://gerrit.cloudera.org:8080/17955
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 192 insertions(+), 31 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live for hs2, hs2-http and
     beeswax three clients;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 193 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/7
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7569/


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Oct 2021 03:07:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9667/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:39:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9: Code-Review+2

(1 comment)

Thanks for the changes, this is looking good to me.

http://gerrit.cloudera.org:8080/#/c/17955/9/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/9/tests/metadata/test_load.py@139
PS9, Line 139:     # Form a fully qualified table name with '-' in protocol 'hs2-http' dropped as
             :     # '-' is not allowed in Impala table name even delimited with ``.
             :     qualified_table_name = '{0}.{1}_{2}_{3}'.format(unique_database, TEST_TBL_NOPART,
             :         protocol if protocol != 'hs2-http' else 'hs2http', enable_async_load_data)
             : 
             :     # Form a staging path that is protocol and enable_async_load_data dependent to
             :     # allow parallel creating distinct HDFS directories for each test object.
             :     staging_path = "{0}_{1}_{2}".format(STAGING_PATH, protocol, enable_async_load_data)
Just so you know: the unique_database name has one part that is based on the test name and another part that is a unique identifier based on the complete pytest description (which includes parameters from test dimensions).

https://github.com/apache/impala/blob/master/tests/conftest.py#L265

The second part makes it generally safe to assume that multiple runs of the same test with different parameters will have different unique databases. Tests often rely on that and keep the table names simple. Tests can also use the actual unique database directory as a unique place. Here's an example of some code that is using the database directory name:
https://github.com/apache/impala/blob/master/tests/common/file_utils.py#L49-L50

This is purely informational, in case it is useful later.



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:56:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 8:

(3 comments)

Fix format errors.

http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@23
PS7, Line 23: from tests.common.impala_test_suite import ImpalaTestSuite
> flake8: F401 'pytest' imported but unused
Done


http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@122
PS7, Line 122: #
> flake8: E122 continuation line missing indentation or outdented
Done


http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@142
PS7, Line 142: )
> flake8: E502 the backslash is redundant between brackets
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:17:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9658/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 20:10:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9627/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Oct 2021 19:38:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@20
PS5, Line 20: import sys
flake8: F401 'sys' imported but unused


http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@110
PS5, Line 110: @SkipIfLocal.hdfs_client
flake8: E302 expected 2 blank lines, found 1



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7572/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Oct 2021 12:49:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17955/9/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/9/tests/metadata/test_load.py@139
PS9, Line 139:     # Form a fully qualified table name with '-' in protocol 'hs2-http' dropped as
             :     # '-' is not allowed in Impala table name even delimited with ``.
             :     qualified_table_name = '{0}.{1}_{2}_{3}'.format(unique_database, TEST_TBL_NOPART,
             :         protocol if protocol != 'hs2-http' else 'hs2http', enable_async_load_data)
             : 
             :     # Form a staging path that is protocol and enable_async_load_data dependent to
             :     # allow parallel creating distinct HDFS directories for each test object.
             :     staging_path = "{0}_{1}_{2}".format(STAGING_PATH, protocol, enable_async_load_data)
> Just so you know: the unique_database name has one part that is based on th
Noted.



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Oct 2021 12:46:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 7:

(3 comments)

Thanks a lot Joe for the review comments!

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc@769
PS6, Line 769:   if (exec_in_worker_thread) {
             :     DCHECK(exec_state() == ExecState::PENDING);
> Nit: My only thought here is that I do like it when these statements are ri
Done


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@111
PS6, Line 111: @SkipIfLocal.hdfs_client
> One thing about subclassing TestLoadData is that TestAsyncLoadData will get
Made this class a direct subclass of ImpalaTestSuite as it helps reduce some unnecessary dependency too. 

Done.


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@122
PS6, Line 122:     create_uncompressed_text_dimension(cls.get_workload()))
> Nice to have: When possible, we want to structure tests to allow them to ex
Yeah, the test time is longer with serial mode. Good to know the limitation with unique_database + setup/teardown. 

Formed protocol and async_ddl_exec dependent staging path name which worked pretty well. It has to be within each test_async_load() to eliminate the sharing of staging path completely.  

The test time is about 1min now.



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:15:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 5:

Added the logic to disable the feature, and a new state/timing test in test_load.py.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9666/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:36:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9668/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:45:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 170 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/5
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Oct 2021 19:01:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9625/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Oct 2021 17:57:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 2:

Add a new test in load.test.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Oct 2021 19:16:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: [WIP] IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 1:

1st draft.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Oct 2021 17:35:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 6:

Fix format error in test_load.py.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:50:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9657/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:59:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live for hs2, hs2-http and
     beeswax three clients;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 192 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/9
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@23
PS7, Line 23: import pytest
flake8: F401 'pytest' imported but unused


http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@122
PS7, Line 122: c
flake8: E122 continuation line missing indentation or outdented


http://gerrit.cloudera.org:8080/#/c/17955/7/tests/metadata/test_load.py@142
PS7, Line 142: \
flake8: E502 the backslash is redundant between brackets



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 18:14:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7569/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 20:58:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

Testing:
  1. Add a new test in load.test to verify the asynchronous execution in
     BE keeps the session live;
  2. Core tests [TBD]

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M testdata/workloads/functional-query/queries/QueryTest/load.test
3 files changed, 88 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9645/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Sat, 23 Oct 2021 01:19:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses Impala client hang due to AWS network load balancer
timeout which is fixed at 350s. When some long data loading operations
are executing and the timeout happens, AWS silently drops the connection
and the Impala client enters the hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Since this thread is waited for in a
wait thread which runs asynchronously, the execution of the entire
operation will not cause a wait on the Impala client. The Impala client
can check the status of the operation via repeated GetOperationStatus()
call.

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
     true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
     execution in BE keeps the session live;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 170 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/6
-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>

[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
......................................................................


Patch Set 6:

(3 comments)

Thanks for adding the tests and query option.

This is looking good, I only have a couple small comments.

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc@769
PS6, Line 769:   DebugActionNoFail(
             :       exec_request_->query_options, "CRS_DELAY_BEFORE_LOAD_DATA");
Nit: My only thought here is that I do like it when these statements are right next to the statement that we are simulating the delay about (in this case frontend_->LoadData()).


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@111
PS6, Line 111: class TestAsyncLoadData(TestLoadData):
One thing about subclassing TestLoadData is that TestAsyncLoadData will get its own copy of test_load() from TestLoadData. When those copies execute in parallel, things can go wrong.

One way out is to create a TestLoadDataBase that contains the pieces you need to share, and then subclass for both TestLoadData and TestAsyncLoadData.


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@122
PS6, Line 122:   @pytest.mark.execute_serially   # To avoid file copy failure: dst file does not exist
Nice to have: When possible, we want to structure tests to allow them to execute in parallel. I ran test_async_load locally in its 6 variations, and it took about 6 minutes. I think a decent chunk of that was setup/teardown and not the test itself.

In this case, it would involve replacing STAGING_PATH with something under the unique_database directory (and populating it with some files, etc). Unfortunately, unique_database doesn't really work with setup_method/teardown_method, so it would need some rework of populating the directory.



-- 
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 21:51:31 +0000
Gerrit-HasComments: Yes