You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2018/02/22 20:13:37 UTC

[Impala-ASF-CR] IMPALA-6564: avoid scanner thread race leading to query failure

Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9401


Change subject: IMPALA-6564: avoid scanner thread race leading to query failure
......................................................................

IMPALA-6564: avoid scanner thread race leading to query failure

Scanner threads can start running while initial ranges are still being
issued. It is possible that all ranges will be done before all the
initial ranges are issued. In that case, a scanner thread may notice
completion and call SetDone(), which cancels the ReaderContext. This can
cause adding new scan ranges to fail with status CANCELLED.

This scenario is most common when all files are pruned so the scan
finishes very quickly.

This is a minimal fix to the problem that detects this scenario and
does not propagate the status. Ideally we would refactor the
interaction between the threads to be easier to understand.

Testing:
I could reproduce this fairly easily with test_basic_filters on my
scanner buffer pool feature branch. Ran that test in a loop for a while
to make sure it didn't reproduce.

Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
---
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node.cc
2 files changed, 14 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/9401/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
Gerrit-Change-Number: 9401
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-6564: avoid scanner thread race leading to query failure

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9401

to look at the new patch set (#2).

Change subject: IMPALA-6564: avoid scanner thread race leading to query failure
......................................................................

IMPALA-6564: avoid scanner thread race leading to query failure

Scanner threads can start running while initial ranges are still being
issued. It is possible that all ranges will be done before all the
initial ranges are issued. In that case, a scanner thread may notice
completion and call SetDone(), which cancels the ReaderContext. This can
cause adding new scan ranges to fail with status CANCELLED.

This scenario is most common when all files are pruned so the scan
finishes very quickly.

This is a minimal fix to the problem that detects this scenario and
does not propagate the status. Ideally we would refactor the
interaction between the threads to be easier to understand.

Testing:
I could reproduce this fairly easily with test_basic_filters on my
scanner buffer pool feature branch. Ran that test in a loop for a while
to make sure it didn't reproduce.

Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
---
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node.cc
2 files changed, 14 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/9401/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
Gerrit-Change-Number: 9401
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>

[Impala-ASF-CR] IMPALA-6564: avoid scanner thread race leading to query failure

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/9401 )

Change subject: IMPALA-6564: avoid scanner thread race leading to query failure
......................................................................


Abandoned

The bug was actually a subtle issue in my other patch
-- 
To view, visit http://gerrit.cloudera.org:8080/9401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
Gerrit-Change-Number: 9401
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-6564: avoid scanner thread race leading to query failure

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9401 )

Change subject: IMPALA-6564: avoid scanner thread race leading to query failure
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9401/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9401/2//COMMIT_MSG@13
PS2, Line 13: cause adding new scan ranges to fail with status CANCELLED.
I don't follow this given this code:

 // Case 4. We have not issued the initial ranges so don't start a scanner thread.
  // Issuing ranges will call this function and we'll start the scanner threads then.
  // TODO: It would be good to have a test case for that.
  if (!initial_ranges_issued_) return;



-- 
To view, visit http://gerrit.cloudera.org:8080/9401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
Gerrit-Change-Number: 9401
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 20:26:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6564: avoid scanner thread race leading to query failure

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9401 )

Change subject: IMPALA-6564: avoid scanner thread race leading to query failure
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9401/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9401/2//COMMIT_MSG@13
PS2, Line 13: cause adding new scan ranges to fail with status CANCELLED.
> I don't follow this given this code:
initial_ranges_issued_ is set at the start of IssueInitialRanges() before the ranges are actually issued (and there's no locking around it).

Effectively now it is preventing scanner threads from starting before GetNext() is called, but doesn't prevent scanner threads from starting before all the ranges are issued.

I think there's a better fix in restructuring that bit of logic but I didn't want to start pulling on that bit of thread right now



-- 
To view, visit http://gerrit.cloudera.org:8080/9401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1a17b8a3586a2f41988f3da673addf2b4565408
Gerrit-Change-Number: 9401
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:01:49 +0000
Gerrit-HasComments: Yes