You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Michael Ho (Code Review)" <ge...@cloudera.org> on 2019/08/19 18:11:22 UTC

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14101


Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
5 files changed, 91 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/14101/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14101/4/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/4/tests/custom_cluster/test_exchange_eos.py@41
PS4, Line 41: test_exchange_eos
Might be worth it to add a few comments about how this test works, eg. why do we need a cluster size of 9?

And if I understand correctly, the point of the metrics checking below is that eventually after the query is run but before fetch is called all fragments should complete and exit except the coordinator fragment?



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 20:47:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4372/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 22:29:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4329/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Aug 2019 06:51:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Hello Thomas Tauber-Marshall, Lars Volker, Sahil Takiar, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14101

to look at the new patch set (#5).

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
6 files changed, 103 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/14101/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 4: Code-Review+1

Carrying Sahil's +1


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Aug 2019 06:09:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14101/2/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/14101/2/be/src/exec/exchange-node.cc@162
PS2, Line 162:   stream_recvr_->CancelStream();
> does this mess up some of the logging done in KrpcDataStreamMgr::Cancel? it
Reworded that log statement a bit to clarify the situation.

We want to close the receiver without unregistering it so incoming senders won't hang. In fact, the same can already happen today if a fragment instance is first cancelled before being closed.


http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@40
PS2, Line 40: @CustomClusterTestSuite.with_args(cluster_size=9, num_exclusive_coordinators=1)
> is there a simpler cluster setup that re-produces this issue? e.g. with a s
I tried with a smaller cluster size but couldn't quite reproduce it reliably.


http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@57
PS2, Line 57:       results = client.fetch(query, handle)
> nit: validate the fetch was successful and close the query handle?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Aug 2019 21:51:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 5: Code-Review+2

Carry Thomas' +2


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 21:51:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
6 files changed, 96 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/14101/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py@22
PS1, Line 22: from tests.common.test_vector import ImpalaTestDimension
flake8: F401 'tests.common.test_vector.ImpalaTestDimension' imported but unused


http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py@25
PS1, Line 25: class TestExchangeEos(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Aug 2019 18:11:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Reviewed-on: http://gerrit.cloudera.org:8080/14101
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
6 files changed, 103 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@57
PS2, Line 57:       results = client.fetch(query, handle)
> Done
need to close the query handle? otherwise it will leak, probably not a big deal in tests, but good practice



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Aug 2019 18:54:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4849/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 21:58:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
5 files changed, 91 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/14101/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4313/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Aug 2019 22:31:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 2:

(3 comments)

mostly minor comments, in general LGTM

http://gerrit.cloudera.org:8080/#/c/14101/2/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/14101/2/be/src/exec/exchange-node.cc@162
PS2, Line 162:   stream_recvr_->CancelStream();
does this mess up some of the logging done in KrpcDataStreamMgr::Cancel? it prints out "cancelling all streams for..." but it doesn't really cancel anything because the streams have already been cancelled


http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@40
PS2, Line 40: @CustomClusterTestSuite.with_args(cluster_size=9, num_exclusive_coordinators=1)
is there a simpler cluster setup that re-produces this issue? e.g. with a smaller cluster_size


http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@57
PS2, Line 57:       client.fetch(query, handle)
nit: validate the fetch was successful and close the query handle?



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Aug 2019 19:04:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py@22
PS1, Line 22: from tests.verifiers.metric_verifier import MetricVerifier
> flake8: F401 'tests.common.test_vector.ImpalaTestDimension' imported but un
Done


http://gerrit.cloudera.org:8080/#/c/14101/1/tests/custom_cluster/test_exchange_eos.py@25
PS1, Line 25: class TestExchangeEos(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Aug 2019 21:04:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 6: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 27 Aug 2019 02:06:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/2/tests/custom_cluster/test_exchange_eos.py@57
PS2, Line 57:       results = client.fetch(query, handle)
> need to close the query handle? otherwise it will leak, probably not a big 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Aug 2019 06:09:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4306/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Aug 2019 21:47:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Hello Sahil Takiar, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14101

to look at the new patch set (#3).

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................

IMPALA-8845: Cancel receiver's streams on exchange node's EOS

When an exchange node reaches its row count limit,
the current code will not notify the sender fragments
about it. Consequently, sender fragments may keep sending
row batches to the exchange node but they won't be dequeued
anymore. The sender fragments may end up blocking in the
RPC indefinitely until either the query is cancelled or
closed.

This change fixes the problem above by cancelling the
underlying receiver's streams of an exchange node once it
reaches the row count limit. This will unblock all senders
whose TransmitData() RPCs haven't been replied to yet. Any
future row batches sent to this receiver will also be immediately
replied to with a response indicating that this receiver is
already closed so the sender will stop sending any more row
batches to it.

Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
---
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A tests/custom_cluster/test_exchange_eos.py
6 files changed, 95 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/14101/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 21:58:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14101/4/tests/custom_cluster/test_exchange_eos.py
File tests/custom_cluster/test_exchange_eos.py:

http://gerrit.cloudera.org:8080/#/c/14101/4/tests/custom_cluster/test_exchange_eos.py@41
PS4, Line 41: test_exchange_eos
> Might be worth it to add a few comments about how this test works, eg. why 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Aug 2019 21:51:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8845: Cancel receiver's streams on exchange node's EOS

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14101 )

Change subject: IMPALA-8845: Cancel receiver's streams on exchange node's EOS
......................................................................


Patch Set 1:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/4300/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/14101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c805e9d63ed8af9f458bf71e8ef5ea9376b939
Gerrit-Change-Number: 14101
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Aug 2019 18:37:13 +0000
Gerrit-HasComments: No