You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Michael Ho (Code Review)" <ge...@cloudera.org> on 2019/02/23 18:08:41 UTC

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12567


Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................

IMPALA-8239: Fix handling of failed deserialization of row batch

Previously, when a row batch failed to be deserialized in the
data stream receiver, we will return the error status to the
sender of the row batch without inserting the row batch. The
receiver will continue to operate without flagging any error.
The assumption is that the sender will eventually cancel the
query upon receiving the failed status.

Normally, when a caller of GetBatch() successfully dequeues a row
batch from the batch queue, it will kick off the draining of the
row batches from the deferred queue into the normal batch queue,
which will further continue the cycle of draining the deferred queue
upon the next call to GetBatch() until the deferred queue becomes empty.

When an error is hit when deserializing a deferred batch to be inserted
into the batch queue, the existing code will simply not insert the row
batch or flag any error. This breaks the cycle of the deferred queue
draining as the batch queue may become empty forever. The caller of
GetBatch() will block indefinitely until the query is cancelled.

The existing code works fine as the expectation is that the query will
be cancelled once the sender receives the error status from the RPC
response. However, this behavior is still not ideal as it lets a query
which has hit a fatal error to hold on to resources for extended period
of time.

This patch fixes the problem by explicitly recording any error during
row batch insertion in an error status object. Callers of GetBatch()
will now also poll for this status object while waiting for row batch
to show up and bail out early if there is any error.

A new test case has been added to simulate the problematic case above.

Change-Id: Iaa74937b046d95484887533be548249e96078755
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_deferred_batches.py
7 files changed, 139 insertions(+), 41 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/12567/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................

IMPALA-8239: Fix handling of failed deserialization of row batch

Previously, when a row batch failed to be deserialized in the
data stream receiver, we will return the error status to the
sender of the row batch without inserting the row batch. The
receiver will continue to operate without flagging any error.
The assumption is that the sender will eventually cancel the
query upon receiving the failed status.

Normally, when a caller of GetBatch() successfully dequeues a row
batch from the batch queue, it will kick off the draining of the
row batches from the deferred queue into the normal batch queue,
which will further continue the cycle of draining the deferred queue
upon the next call to GetBatch() until the deferred queue becomes empty.

When an error is hit when deserializing a deferred batch to be inserted
into the batch queue, the existing code will simply not insert the row
batch or flag any error. This breaks the cycle of the deferred queue
draining as the batch queue may become empty forever. The caller of
GetBatch() will block indefinitely until the query is cancelled.

The existing code works fine as the expectation is that the query will
be cancelled once the sender receives the error status from the RPC
response. However, this behavior is still not ideal as it lets a query
which has hit a fatal error to hold on to resources for extended period
of time.

This patch fixes the problem by explicitly recording any error during
row batch insertion in an error status object. Callers of GetBatch()
will now also poll for this status object while waiting for row batch
to show up and bail out early if there is any error.

A new test case has been added to simulate the problematic case above.

Change-Id: Iaa74937b046d95484887533be548249e96078755
Reviewed-on: http://gerrit.cloudera.org:8080/12567
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Reviewed-by: Thomas Marshall <tm...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_deferred_batches.py
7 files changed, 157 insertions(+), 47 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Thomas Marshall: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12567/2/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/2/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS2, Line 21: 
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS3, Line 21: 
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py@24
PS3, Line 24: @
> flake8: E303 too many blank lines (2)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:00:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Hello Thomas Marshall, Lars Volker, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12567

to look at the new patch set (#4).

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................

IMPALA-8239: Fix handling of failed deserialization of row batch

Previously, when a row batch failed to be deserialized in the
data stream receiver, we will return the error status to the
sender of the row batch without inserting the row batch. The
receiver will continue to operate without flagging any error.
The assumption is that the sender will eventually cancel the
query upon receiving the failed status.

Normally, when a caller of GetBatch() successfully dequeues a row
batch from the batch queue, it will kick off the draining of the
row batches from the deferred queue into the normal batch queue,
which will further continue the cycle of draining the deferred queue
upon the next call to GetBatch() until the deferred queue becomes empty.

When an error is hit when deserializing a deferred batch to be inserted
into the batch queue, the existing code will simply not insert the row
batch or flag any error. This breaks the cycle of the deferred queue
draining as the batch queue may become empty forever. The caller of
GetBatch() will block indefinitely until the query is cancelled.

The existing code works fine as the expectation is that the query will
be cancelled once the sender receives the error status from the RPC
response. However, this behavior is still not ideal as it lets a query
which has hit a fatal error to hold on to resources for extended period
of time.

This patch fixes the problem by explicitly recording any error during
row batch insertion in an error status object. Callers of GetBatch()
will now also poll for this status object while waiting for row batch
to show up and bail out early if there is any error.

A new test case has been added to simulate the problematic case above.

Change-Id: Iaa74937b046d95484887533be548249e96078755
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_deferred_batches.py
7 files changed, 157 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/12567/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 20:01:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Feb 2019 03:38:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS3, Line 21: class TestExchangeDeferredBatches(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12567/3/tests/custom_cluster/test_exchange_deferred_batches.py@24
PS3, Line 24: @
flake8: E303 too many blank lines (2)



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 22:31:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@118
PS1, Line 118:   // accounting for various counters.
> Maybe document that 'lock_' must be held via 'lock' here and below. Ok to i
Done


http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@161
PS1, Line 161: inesrting
> inserting.
Done


http://gerrit.cloudera.org:8080/#/c/12567/1/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/1/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS1, Line 21: class TestExchangeDeferredBatches(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 22:22:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Hello Thomas Marshall, Lars Volker, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12567

to look at the new patch set (#2).

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................

IMPALA-8239: Fix handling of failed deserialization of row batch

Previously, when a row batch failed to be deserialized in the
data stream receiver, we will return the error status to the
sender of the row batch without inserting the row batch. The
receiver will continue to operate without flagging any error.
The assumption is that the sender will eventually cancel the
query upon receiving the failed status.

Normally, when a caller of GetBatch() successfully dequeues a row
batch from the batch queue, it will kick off the draining of the
row batches from the deferred queue into the normal batch queue,
which will further continue the cycle of draining the deferred queue
upon the next call to GetBatch() until the deferred queue becomes empty.

When an error is hit when deserializing a deferred batch to be inserted
into the batch queue, the existing code will simply not insert the row
batch or flag any error. This breaks the cycle of the deferred queue
draining as the batch queue may become empty forever. The caller of
GetBatch() will block indefinitely until the query is cancelled.

The existing code works fine as the expectation is that the query will
be cancelled once the sender receives the error status from the RPC
response. However, this behavior is still not ideal as it lets a query
which has hit a fatal error to hold on to resources for extended period
of time.

This patch fixes the problem by explicitly recording any error during
row batch insertion in an error status object. Callers of GetBatch()
will now also poll for this status object while waiting for row batch
to show up and bail out early if there is any error.

A new test case has been added to simulate the problematic case above.

Change-Id: Iaa74937b046d95484887533be548249e96078755
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_deferred_batches.py
7 files changed, 156 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/12567/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2233/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:05:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Hello Thomas Marshall, Lars Volker, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12567

to look at the new patch set (#3).

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................

IMPALA-8239: Fix handling of failed deserialization of row batch

Previously, when a row batch failed to be deserialized in the
data stream receiver, we will return the error status to the
sender of the row batch without inserting the row batch. The
receiver will continue to operate without flagging any error.
The assumption is that the sender will eventually cancel the
query upon receiving the failed status.

Normally, when a caller of GetBatch() successfully dequeues a row
batch from the batch queue, it will kick off the draining of the
row batches from the deferred queue into the normal batch queue,
which will further continue the cycle of draining the deferred queue
upon the next call to GetBatch() until the deferred queue becomes empty.

When an error is hit when deserializing a deferred batch to be inserted
into the batch queue, the existing code will simply not insert the row
batch or flag any error. This breaks the cycle of the deferred queue
draining as the batch queue may become empty forever. The caller of
GetBatch() will block indefinitely until the query is cancelled.

The existing code works fine as the expectation is that the query will
be cancelled once the sender receives the error status from the RPC
response. However, this behavior is still not ideal as it lets a query
which has hit a fatal error to hold on to resources for extended period
of time.

This patch fixes the problem by explicitly recording any error during
row batch insertion in an error status object. Callers of GetBatch()
will now also poll for this status object while waiting for row batch
to show up and bail out early if there is any error.

A new test case has been added to simulate the problematic case above.

Change-Id: Iaa74937b046d95484887533be548249e96078755
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_deferred_batches.py
7 files changed, 157 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/12567/3
-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2234/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:07:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2235/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:14:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1:

(3 comments)

Thanks for fixing this. I had one serious question about data_arrival_cv_ and some minor comments about comments.

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@118
PS1, Line 118:   // accounting for various counters.
Maybe document that 'lock_' must be held via 'lock' here and below. Ok to ignore since it is fairly self-explanatory.


http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@161
PS1, Line 161: inesrting
inserting.

Maybe document that this is for errors that occur within KrpcDataStreamRecvr, whereas 'is_cancelled_' tracks cancellations from outside KrpcDataStreamRecvr.


http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@346
PS1, Line 346:   status_.MergeStatus(status);
Should we signal data_arrival_cv_ here? Since the loop above depends on this.



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 19:30:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:05:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2223/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 23 Feb 2019 18:49:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/12567/1/be/src/runtime/krpc-data-stream-recvr.cc@346
PS1, Line 346:   status_.MergeStatus(status);
> Should we signal data_arrival_cv_ here? Since the loop above depends on thi
Good point. I think that's a more general implementation than what's implemented in this patch.



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 19:45:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:16:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3826/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:25:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 2: Code-Review+1

Carry Lars' +1


-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 22:22:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12567/1/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/1/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS1, Line 21: class TestExchangeDeferredBatches(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 23 Feb 2019 18:09:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8239: Fix handling of failed deserialization of row batch

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12567 )

Change subject: IMPALA-8239: Fix handling of failed deserialization of row batch
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12567/2/tests/custom_cluster/test_exchange_deferred_batches.py
File tests/custom_cluster/test_exchange_deferred_batches.py:

http://gerrit.cloudera.org:8080/#/c/12567/2/tests/custom_cluster/test_exchange_deferred_batches.py@21
PS2, Line 21: class TestExchangeDeferredBatches(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1



-- 
To view, visit http://gerrit.cloudera.org:8080/12567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa74937b046d95484887533be548249e96078755
Gerrit-Change-Number: 12567
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Feb 2019 22:23:23 +0000
Gerrit-HasComments: Yes