You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Yida Wu (Code Review)" <ge...@cloudera.org> on 2021/10/05 23:25:06 UTC

[Impala-ASF-CR] IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

Yida Wu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17906


Change subject: IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails
......................................................................

IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

The impalad can start successfully with an uninitialized TmpFileMgr
when abort_on_config_error is set to false. However in some cases,
this could lead to the use of the uninitialized TmpFileMgr and then
the crash.

The purpose of keeping a failed TmpFileMgr is to allow the users to
use impala even with minor configuration issues because only
spilling the data needs the TmpFileMgr and queries might have a small
chance to spill the data.

Previously we have a DCHECK for the initialization status of the
TmpFileMgr, however it doesn't work for the release version, and
the way of using DCHECK to handle this situation is not good
enough if we need to keep the impala running.

Instead of putting verifications to every function of the
TmpFileMgr, this patch fixes the bug by verifying the initialization
status when creating the TmpFileGroup. It returns an error if
TmpFileMgr fails to initialize. When the QueryState instance
fails to create the TmpFileGroup, it stores a nullptr to the local
temporary file group and fails the query only when it needs to
spill the data.

Moreover, TmpFileGroup can be only created by the TmpFileMgr after
this change.

Tests:
Ran core tests and exhaustive e2e tests.
Added a e2e test: test_scratch_dirs_spill_fails_due_to_init_failure.
Added a unit test: BufferPoolTest::FailInitTmpFileMgr.

Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/query-state.cc
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M tests/custom_cluster/test_scratch_disk.py
9 files changed, 337 insertions(+), 217 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17906/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
Gerrit-Change-Number: 17906
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu <wy...@gmail.com>

[Impala-ASF-CR] IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17906 )

Change subject: IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9562/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
Gerrit-Change-Number: 17906
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Oct 2021 23:46:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17906 )

Change subject: IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9572/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
Gerrit-Change-Number: 17906
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Oct 2021 12:41:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

Posted by "Yida Wu (Code Review)" <ge...@cloudera.org>.
Yida Wu has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17906 )

Change subject: IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails
......................................................................

IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

The impalad can start successfully with an uninitialized TmpFileMgr
when abort_on_config_error is set to false. However in some cases,
this could lead to the use of the uninitialized TmpFileMgr and then
the crash.

The purpose of keeping a failed TmpFileMgr is to allow the users to
use impala even with minor configuration issues because only
spilling the data needs the TmpFileMgr and queries might have a small
chance to spill the data.

Previously we have a DCHECK for the initialization status of the
TmpFileMgr, however it doesn't work for the release version, and
the way of using DCHECK to handle this situation is not good
enough if we need to keep the impala running.

Instead of putting verifications to every function of the
TmpFileMgr, this patch fixes the bug by verifying the initialization
status when creating the TmpFileGroup. It returns an error if
TmpFileMgr fails to initialize. When the QueryState instance
fails to create the TmpFileGroup, it stores a nullptr to the local
temporary file group and fails the query only when it needs to
spill the data.

Moreover, TmpFileGroup can be only created by the TmpFileMgr after
this change.

Tests:
Ran core tests and exhaustive e2e tests.
Added a e2e test: test_scratch_dirs_spill_fails_due_to_init_failure.
Added a unit test: BufferPoolTest::FailInitTmpFileMgr.

Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/query-state.cc
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M tests/custom_cluster/test_scratch_disk.py
9 files changed, 336 insertions(+), 217 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17906/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
Gerrit-Change-Number: 17906
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17906 )

Change subject: IMPALA-10953 Fix impalad crashes due to TmpFileMgr initialization fails
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@456
PS1, Line 456: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@456
PS1, Line 456:  
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@456
PS1, Line 456:  
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@466
PS1, Line 466:  
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@474
PS1, Line 474: r
flake8: F841 local variable 'result' is assigned to but never used


http://gerrit.cloudera.org:8080/#/c/17906/1/tests/custom_cluster/test_scratch_disk.py@476
PS1, Line 476: 
flake8: W391 blank line at end of file



-- 
To view, visit http://gerrit.cloudera.org:8080/17906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7852bf4206a603e6e2652ce9232077777a06ce67
Gerrit-Change-Number: 17906
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Oct 2021 23:25:52 +0000
Gerrit-HasComments: Yes