You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2021/04/27 14:44:09 UTC

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17346


Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................

IMPALA-10682: Add buffering to hs2-http client in impala-shell

This change reduces to following command from 8.5s to 1.5s on my
machine:
shell/impala_shell.py -B -q "select * from tpch_parquet.lineitem limit 100000;" --protocol hs2-http > /dev/null

This nearly eliminates the speed difference between hs2 and hs-http.

The root cause of the original slowness is the large number of
calls to socket.recv(). The query above used to call it 2809090 times,
now it is only 9007.

Testing:
- ran shell tests

Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
---
M shell/impala_client.py
1 file changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/17346/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 2: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 28 Apr 2021 06:33:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 1: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 27 Apr 2021 20:38:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 2:

There is no new commit since the jenkins +1, so I am merging this by hand.


-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 28 Apr 2021 06:34:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17346

to look at the new patch set (#2).

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................

IMPALA-10682: Add buffering to hs2-http client in impala-shell

This change reduces to following command from 8.5s to 1.5s on my
machine:
shell/impala_shell.py -B -q "select * from tpch_parquet.lineitem limit 100000;" --protocol hs2-http > /dev/null

This nearly eliminates the speed difference between hs2 and hs2-http.

The root cause of the original slowness is the large number of
calls to socket.recv(). The query above used to call it 2809090 times,
now it is only 9007.

Testing:
- ran shell tests

Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
---
M shell/impala_client.py
1 file changed, 3 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/17346/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7106/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 27 Apr 2021 14:45:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................

IMPALA-10682: Add buffering to hs2-http client in impala-shell

This change reduces to following command from 8.5s to 1.5s on my
machine:
shell/impala_shell.py -B -q "select * from tpch_parquet.lineitem limit 100000;" --protocol hs2-http > /dev/null

This nearly eliminates the speed difference between hs2 and hs2-http.

The root cause of the original slowness is the large number of
calls to socket.recv(). The query above used to call it 2809090 times,
now it is only 9007.

Testing:
- ran shell tests

Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Reviewed-on: http://gerrit.cloudera.org:8080/17346
Reviewed-by: Quanlong Huang <hu...@gmail.com>
Tested-by: Csaba Ringhofer <cs...@cloudera.com>
---
M shell/impala_client.py
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Quanlong Huang: Looks good to me, approved
  Csaba Ringhofer: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17346/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/17346/2/shell/impala_client.py@479
PS2, Line 479:     elif self.use_kerberos:
             :       transport = TSaslClientTransport(sasl_factory, "GSSAPI", sock)
             :     else:
             :       transport = TSaslClientTransport(sasl_factory, "PLAIN", sock)
> Should we wrap this two cases with TBufferedTransport as well?
Hmm, thrift's TSaslClientTransport is interesting. It uses puresasl and Attila Jeges also has plans to switch to puresasl. There are many differences in the code (including having a buffer), so this needs some investigation. Dropping thrift_sasl would be a great simplification for Impyla.



-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 28 Apr 2021 06:25:29 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8646/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 27 Apr 2021 15:04:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10682: Add buffering to hs2-http client in impala-shell

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17346 )

Change subject: IMPALA-10682: Add buffering to hs2-http client in impala-shell
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

Nice finding! LGTM.

http://gerrit.cloudera.org:8080/#/c/17346/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/17346/2/shell/impala_client.py@479
PS2, Line 479:     elif self.use_kerberos:
             :       transport = TSaslClientTransport(sasl_factory, "GSSAPI", sock)
             :     else:
             :       transport = TSaslClientTransport(sasl_factory, "PLAIN", sock)
Should we wrap this two cases with TBufferedTransport as well?

EDIT: I see buffers in thrift_sasl.TSaslClientTransport. So maybe don't need this. BTW, it seems thrift also has TSaslClientTransport. Maybe we should consider using that, or merging special features of thrift_sasl.TSaslClientTransport to thrift.TSaslClientTransport



-- 
To view, visit http://gerrit.cloudera.org:8080/17346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If11f287be65b10bee2b0afffea118e3dc70fdbbd
Gerrit-Change-Number: 17346
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 28 Apr 2021 02:46:09 +0000
Gerrit-HasComments: Yes