Posted to reviews@kudu.apache.org by "Attila Bukor (Code Review)" <ge...@cloudera.org> on 2019/10/03 10:43:01 UTC

[kudu-CR] KUDU-1938 Make UTF-8 truncation faster pt 2

Hello Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14354

to look at the new patch set (#6).

Change subject: KUDU-1938 Make UTF-8 truncation faster pt 2
......................................................................

KUDU-1938 Make UTF-8 truncation faster pt 2

Uses Intel intrinsics (up to SSE4.2) to speed up UTF-8 character
counting for ASCII-only chunks (the fast path) by doubling the chunk
size processed in a single pass from 64 to 128 bits.

Also optimizes the way slow-path chunks are processed.
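
For readers unfamiliar with the technique, the fast path described above
can be sketched roughly as follows. This is only an illustration and not
the code in char_util.cc: the function name CountUtf8Chars is
hypothetical, the sketch counts characters rather than truncating, it
uses only SSE2 intrinsics, and it falls back to a plain byte loop when
SSE2 is not available.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2: _mm_loadu_si128, _mm_movemask_epi8
#endif

// Hypothetical helper for illustration; not the function in char_util.cc.
// Counts UTF-8 characters by counting every byte that is not a
// continuation byte (continuation bytes have the bit pattern 10xxxxxx).
inline size_t CountUtf8Chars(const char* data, size_t len) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
  size_t count = 0;
  size_t i = 0;
#if defined(__SSE2__)
  // Process 128 bits (16 bytes) per iteration.
  for (; i + 16 <= len; i += 16) {
    __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p + i));
    // movemask gathers the high bit of each byte; zero means all ASCII.
    if (_mm_movemask_epi8(chunk) == 0) {
      count += 16;  // Fast path: 16 ASCII bytes are 16 characters.
      continue;
    }
    // Slow path: at least one non-ASCII byte, so inspect each byte.
    for (size_t j = i; j < i + 16; ++j) {
      count += (p[j] & 0xC0) != 0x80;
    }
  }
#endif
  // Tail bytes (or the whole input when SSE2 is unavailable).
  for (; i < len; ++i) {
    count += (p[i] & 0xC0) != 0x80;
  }
  return count;
}
```

The sketch assumes well-formed UTF-8 input; it does not validate the
byte sequence.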

Before:

[==========] Running 4 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from CharUtilTest
[ RUN      ] CharUtilTest.CorrectnessTestUtf8
[       OK ] CharUtilTest.CorrectnessTestUtf8 (2 ms)
[ RUN      ] CharUtilTest.CorrectnessTestAscii
[       OK ] CharUtilTest.CorrectnessTestAscii (0 ms)
[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (3235 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (290 ms)
[----------] 4 tests from CharUtilTest (3527 ms total)

After:

[==========] Running 4 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from CharUtilTest
[ RUN      ] CharUtilTest.CorrectnessTestUtf8
[       OK ] CharUtilTest.CorrectnessTestUtf8 (2 ms)
[ RUN      ] CharUtilTest.CorrectnessTestAscii
[       OK ] CharUtilTest.CorrectnessTestAscii (1 ms)
[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (2713 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (226 ms)
[----------] 4 tests from CharUtilTest (2942 ms total)

In these runs, StressTestUtf8 drops from 3235 ms to 2713 ms (about 16%
less time) and StressTestAscii from 290 ms to 226 ms (about 22% less
time).

Change-Id: I9a491157dd5c8b4815030bbda921a0afc0bafd28
---
M src/kudu/util/char_util.cc
1 file changed, 41 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/14354/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9a491157dd5c8b4815030bbda921a0afc0bafd28
Gerrit-Change-Number: 14354
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)