Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2023/05/30 14:34:32 UTC

[Impala-ASF-CR] IMPALA-12171: Optimize delimited output

Hello Jason Fehr, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19894

to look at the new patch set (#5).

Change subject: IMPALA-12171: Optimize delimited output
......................................................................

IMPALA-12171: Optimize delimited output

The change adds a CSV generator that handles simple cases
(no characters that need quoting) and falls back to Python's
built-in csv module if special characters are found.
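
A minimal sketch of the fast-path/fallback idea described above, not
the code from the patch itself. The names needs_quoting and
format_rows are hypothetical, fields are assumed to already be
strings, and the sketch targets Python 3 only:

```python
import csv
import io

# Hypothetical set of characters that force quoting (besides the delimiter).
SPECIAL_CHARS = ('"', '\n', '\r')


def needs_quoting(value, delimiter):
    """Return True if the field cannot be written verbatim."""
    return delimiter in value or any(c in value for c in SPECIAL_CHARS)


def format_rows(rows, delimiter=','):
    """Format a list of rows (lists of str) as delimited text.

    Fast path: plain string joins when no field needs quoting.
    Fallback: Python's csv module for the whole batch otherwise.
    """
    fast_lines = []
    for row in rows:
        if any(needs_quoting(field, delimiter) for field in row):
            break  # special character found, give up on the fast path
        fast_lines.append(delimiter.join(row))
    else:
        # Loop finished without break: every row was simple.
        return '\n'.join(fast_lines)

    # Slow path: let the csv module handle quoting and escaping.
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator='\n',
                        quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return buf.getvalue().rstrip('\n')
```

The sketch rescans the whole batch on fallback for simplicity; the
real implementation may detect special characters differently.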

Improvement of ClientFetchWaitTimer for
select * from tpch_parquet.lineitem:

python2 + hs2: 42s516ms -> 22s335ms
python2 + beeswax: 1m4s -> 22s126ms
python3 + hs2: 30s844ms -> 22s173ms
python3 + beeswax: 20s502ms -> 11s860ms

The difference in improvement across protocols and Python versions
probably comes from the varying number of UTF-8 conversions that
the patch avoids. Converting one large string after concatenation
seems to be much faster than converting a lot of small strings.
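
To illustrate the two conversion strategies compared above, here is
a small sketch. The function names are hypothetical and the code is
not taken from the patch. Both variants produce the same bytes; they
differ only in how many encode() calls they make:

```python
def encode_per_field(rows, delimiter=u','):
    """Encode every field separately, then join the byte strings."""
    sep = delimiter.encode('utf-8')
    return b'\n'.join(
        sep.join(field.encode('utf-8') for field in row) for row in rows)


def encode_after_join(rows, delimiter=u','):
    """Join the text first, then encode the large string once."""
    text = u'\n'.join(delimiter.join(row) for row in rows)
    return text.encode('utf-8')
```

The standard timeit module can be used to compare the two variants
on a realistic result set.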

Testing:
- added some shell tests with special characters and ran them

Change-Id: I671c6f538c588f8ad4ef4067f7bc8a6b8a5220cb
---
M shell/shell_output.py
M tests/shell/test_shell_commandline.py
2 files changed, 114 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/19894/5
-- 
To view, visit http://gerrit.cloudera.org:8080/19894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I671c6f538c588f8ad4ef4067f7bc8a6b8a5220cb
Gerrit-Change-Number: 19894
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com>