You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Attila Bukor (Code Review)" <ge...@cloudera.org> on 2021/10/26 07:49:20 UTC

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Hello Alexey Serbin, Andrew Wong, Bankim Bhavsar,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/17974

to review the following change.


Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
20 files changed, 258 insertions(+), 92 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has removed a vote on this change.

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@19
PS3, Line 19: The default is
            : now true
> I'm working on fixing this stuff.
| Can you elaborate on this? I'm not sure I understand.

The idea was to have a way to validate the resulting files under the kudu-master and kudu-tserver directories after running those processes with --enable_encryption flag set.   Basically, that's about making sure that different types of files are encrypted or non-encrypted as expected (think of some sort of checklist).  Not sure what's the best way of doing that in our test framework, though.

After thinking about this a bit more, I guess it's just enough to validate that once manually given that the code regarding encrypting/non-encrypting particular types of files isn't supposed to change often.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 16:04:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 9: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@133
PS7, Line 133: I also wanted to run dense_node-itest with -num_seconds=240 and
> It seems the results are all over the place even with rowset compaction dis
Yeah, I agree. It doesn't seem too useful to include.

Perhaps we can consider a similar test though, perhaps one that copies data from an encrypted server onto a non-encrypted server, and then bootstraps from the non-encrypted server (or vice versa). Though I'm fine leaving that for a later patch


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
nit: Is this actually meant to determine whether the file is sensitive or not?


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

PS9: 
Do you think it's worth adding a test that checks that we can't read a sensitive file when encryption is disabled? Or that we can't read a sensitive unencrypted file (e.g. written when encryption was disabled) when encryption is enabled?

I suppose the error scenarios may change when you add the headers, but it's worth thinking about early what it looks like when we're looking at mismatched encryption statuses



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 9
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 17 Dec 2021 21:04:35 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#8).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool be used to dump encrypted (metadata) and unencrypted
files (instance) as well, it needs to be able to determine if a file is
encrypted or not. As encryption headers are not yet implemented, I
introduced a hack which checks the file name and treats the file as
unencrypted if it ends with "instance" and encrypted otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes. All tests below were ran in release
mode.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

I also wanted to run dense_node-itest with -num_seconds=240 and
-num_tablets=1000, but the amount of data written (both in terms of
number of blocks and bytes) was all over the place, and so was the time
spent reopening the tablet server (seemingly without correlation to the
amount of data under management by the LBM). This was true with and
without encryption being enabled, so it's hard to draw any conclusions
from this benchmark.

$ for x in {0..1}; do perf stat -r 5 --log-fd=3 ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/$x 3>&1 2> >(grep "dense_node-itest.cc") > >(grep "===="); done
[==========] Running 1 test from 1 test suite.
I1110 05:23:51.307804 107446 dense_node-itest.cc:223] Time spent restarting master: real 0.083s user 0.000s     sys 0.001s
I1110 05:24:19.118248 107446 dense_node-itest.cc:226] Time spent restarting tserver: real 27.810s       user 0.017s     sys 0.100s
I1110 05:24:19.118268 107446 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:24:19.465764 107446 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 38136
I1110 05:24:19.465798 107446 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 250252129
I1110 05:24:19.465804 107446 dense_node-itest.cc:242] log_block_manager_containers: 9892
I1110 05:24:19.465808 107446 dense_node-itest.cc:242] log_block_manager_full_containers: 4017
I1110 05:24:19.465812 107446 dense_node-itest.cc:242] threads_running: 1334
[==========] 1 test from 1 test suite ran. (608499 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:33:58.751773 121520 dense_node-itest.cc:223] Time spent restarting master: real 0.053s user 0.000s     sys 0.001s
I1110 05:41:09.045049 121520 dense_node-itest.cc:226] Time spent restarting tserver: real 430.293s      user 0.291s     sys 1.523s
I1110 05:41:09.045073 121520 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:41:09.257555 121520 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 29376
I1110 05:41:09.257593 121520 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 199511663
I1110 05:41:09.257599 121520 dense_node-itest.cc:242] log_block_manager_containers: 11464
I1110 05:41:09.257604 121520 dense_node-itest.cc:242] log_block_manager_full_containers: 1319
I1110 05:41:09.257608 121520 dense_node-itest.cc:242] threads_running: 226
[==========] 1 test from 1 test suite ran. (947330 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:50:01.919380 143199 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.000s     sys 0.003s
I1110 05:51:13.537168 143199 dense_node-itest.cc:226] Time spent restarting tserver: real 71.618s       user 0.057s     sys 0.249s
I1110 05:51:13.537186 143199 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:51:13.581818 143199 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32557
I1110 05:51:13.581848 143199 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 189720720
I1110 05:51:13.581853 143199 dense_node-itest.cc:242] log_block_manager_containers: 9910
I1110 05:51:13.581857 143199 dense_node-itest.cc:242] log_block_manager_full_containers: 2070
I1110 05:51:13.581861 143199 dense_node-itest.cc:242] threads_running: 225
[==========] 1 test from 1 test suite ran. (630376 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:00:18.327955 161886 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.001s     sys 0.001s
I1110 06:00:36.625900 161886 dense_node-itest.cc:226] Time spent restarting tserver: real 18.298s       user 0.011s     sys 0.063s
I1110 06:00:36.625918 161886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:00:36.779762 161886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:00:36.779801 161886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 236467612
I1110 06:00:36.779808 161886 dense_node-itest.cc:242] log_block_manager_containers: 9880
I1110 06:00:36.779814 161886 dense_node-itest.cc:242] log_block_manager_full_containers: 1410
I1110 06:00:36.779817 161886 dense_node-itest.cc:242] threads_running: 1332
[==========] 1 test from 1 test suite ran. (481328 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:08:12.651594 176648 dense_node-itest.cc:223] Time spent restarting master: real 0.084s user 0.001s     sys 0.002s
I1110 06:09:14.436517 176648 dense_node-itest.cc:226] Time spent restarting tserver: real 61.785s       user 0.045s     sys 0.222s
I1110 06:09:14.436537 176648 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:09:14.732786 176648 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32334
I1110 06:09:14.732823 176648 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225235870
I1110 06:09:14.732829 176648 dense_node-itest.cc:242] log_block_manager_containers: 10119
I1110 06:09:14.732833 176648 dense_node-itest.cc:242] log_block_manager_full_containers: 5070
I1110 06:09:14.732837 176648 dense_node-itest.cc:242] threads_running: 1313
[==========] 1 test from 1 test suite ran. (506357 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/0' (5 runs):

    2995824.938107      task-clock (msec)         #    4.719 CPUs utilized            ( +-  5.08% )
        51,210,606      context-switches          #    0.017 M/sec                    ( +-  8.77% )
        10,794,492      cpu-migrations            #    0.004 M/sec                    ( +- 12.18% )
         5,334,419      page-faults               #    0.002 M/sec                    ( +-  9.46% )
 6,394,831,534,947      cycles                    #    2.135 GHz                      ( +-  9.25% )
 3,981,645,181,601      instructions              #    0.62  insn per cycle           ( +-  9.86% )
   792,397,266,216      branches                  #  264.501 M/sec                    ( +-  9.75% )
     6,963,950,058      branch-misses             #    0.88% of all branches          ( +-  7.25% )

     634.804501857 seconds time elapsed                                          ( +- 13.11% )

[==========] Running 1 test from 1 test suite.
I1110 06:16:46.444833 194734 dense_node-itest.cc:223] Time spent restarting master: real 0.075s user 0.000s     sys 0.002s
I1110 06:17:29.511780 194734 dense_node-itest.cc:226] Time spent restarting tserver: real 43.067s       user 0.031s     sys 0.151s
I1110 06:17:29.511799 194734 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:17:29.575773 194734 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 33354
I1110 06:17:29.575809 194734 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 162500445
I1110 06:17:29.575814 194734 dense_node-itest.cc:242] log_block_manager_containers: 9946
I1110 06:17:29.575817 194734 dense_node-itest.cc:242] log_block_manager_full_containers: 2847
I1110 06:17:29.575821 194734 dense_node-itest.cc:242] threads_running: 228
[==========] 1 test from 1 test suite ran. (468746 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:24:34.692034 14506 dense_node-itest.cc:223] Time spent restarting master: real 4.389s  user 0.003s     sys 0.017s
I1110 06:25:16.629412 14506 dense_node-itest.cc:226] Time spent restarting tserver: real 41.937s        user 0.027s     sys 0.146s
I1110 06:25:16.629434 14506 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:25:22.050498 14506 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 47634
I1110 06:25:22.050537 14506 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 409166624
I1110 06:25:22.050544 14506 dense_node-itest.cc:242] log_block_manager_containers: 9779
I1110 06:25:22.050549 14506 dense_node-itest.cc:242] log_block_manager_full_containers: 3405
I1110 06:25:22.050552 14506 dense_node-itest.cc:242] threads_running: 1342
[==========] 1 test from 1 test suite ran. (434683 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:31:46.989178 26886 dense_node-itest.cc:223] Time spent restarting master: real 0.093s  user 0.000s     sys 0.003s
I1110 06:32:04.775068 26886 dense_node-itest.cc:226] Time spent restarting tserver: real 17.786s        user 0.010s     sys 0.048s
I1110 06:32:04.775091 26886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:32:04.830790 26886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:32:04.830832 26886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225249875
I1110 06:32:04.830838 26886 dense_node-itest.cc:242] log_block_manager_containers: 10352
I1110 06:32:04.830842 26886 dense_node-itest.cc:242] log_block_manager_full_containers: 1198
I1110 06:32:04.830849 26886 dense_node-itest.cc:242] threads_running: 1307
[==========] 1 test from 1 test suite ran. (401113 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:38:28.355651 39523 dense_node-itest.cc:223] Time spent restarting master: real 0.348s  user 0.001s     sys 0.003s
I1110 06:39:14.934329 39523 dense_node-itest.cc:226] Time spent restarting tserver: real 46.579s        user 0.034s     sys 0.161s
I1110 06:39:14.934355 39523 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:39:15.086436 39523 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 42942
I1110 06:39:15.086474 39523 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 337224147
I1110 06:39:15.086480 39523 dense_node-itest.cc:242] log_block_manager_containers: 10706
I1110 06:39:15.086484 39523 dense_node-itest.cc:242] log_block_manager_full_containers: 2511
I1110 06:39:15.086489 39523 dense_node-itest.cc:242] threads_running: 1326
[==========] 1 test from 1 test suite ran. (542365 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:47:30.400804 54331 dense_node-itest.cc:223] Time spent restarting master: real 0.125s  user 0.001s     sys 0.002s
I1110 06:51:28.288359 54331 dense_node-itest.cc:226] Time spent restarting tserver: real 237.888s       user 0.160s     sys 0.855s
I1110 06:51:28.288374 54331 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:51:28.633086 54331 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 35292
I1110 06:51:28.633126 54331 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 198352642
I1110 06:51:28.633132 54331 dense_node-itest.cc:242] log_block_manager_containers: 10015
I1110 06:51:28.633139 54331 dense_node-itest.cc:242] log_block_manager_full_containers: 2514
I1110 06:51:28.633143 54331 dense_node-itest.cc:242] threads_running: 1300
[==========] 1 test from 1 test suite ran. (709361 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/1' (5 runs):

    3115638.620173      task-clock (msec)         #    6.094 CPUs utilized            ( +-  2.07% )
        50,595,541      context-switches          #    0.016 M/sec                    ( +-  7.87% )
        10,972,126      cpu-migrations            #    0.004 M/sec                    ( +-  9.15% )
         5,985,924      page-faults               #    0.002 M/sec                    ( +- 10.89% )
 6,073,807,994,303      cycles                    #    1.949 GHz                      ( +- 10.17% )
 4,094,332,719,732      instructions              #    0.67  insn per cycle           ( +-  7.76% )
   807,628,648,546      branches                  #  259.218 M/sec                    ( +-  7.83% )
     7,158,875,684      branch-misses             #    0.89% of all branches          ( +-  6.38% )

     511.281338094 seconds time elapsed                                          ( +- 10.71% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 457 insertions(+), 187 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/8
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 8
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 8:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@32
PS7, Line 32: be us
> nit: seems to be an extra word?
Done


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@45
PS7, Line 45: tes. All tests below wer
> Just to double-check: the results below are for RELEASE builds, right?
Yes, added a note about it.


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@133
PS7, Line 133: I also wanted to run dense_node-itest with -num_seconds=240 and
> Actually I misunderstood the design of this test: to maximize the number of
It seems the results are all over the place even with rowset compaction disabled. Do you think I should just remove this part from the commit message?


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@134
PS7, Line 134: but the amount of data written (both in terms of
             : number of blocks and bytes) was a
> nit: missing some words?
Done


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/fs/log_block_manager-test.cc@a1173
PS7, Line 1173: 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
> Was this removal intentional?
Nope, thanks.


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc@492
PS6, Line 492: 
> What about explicitly controlling the encryption for the result file here?
Done


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc@798
PS6, Line 798: 
> What about explicitly controlling the encryption for the result file here?
Done


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc@203
PS7, Line 203: 
> nit: since now the terminology is more stable (encrypted vs sensitive, etc.
Done


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc@1306
PS7, Line 1306: 
> nit: would it make sense to introduce a helper function/method ShouldEncryp
I don't know about that, each file type uses separate *FileOptions types, so the helpers would need to be duplicated. Or should I use templates for this?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc
File src/kudu/util/rolling_log.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc@163
PS6, Line 163: opts.is_sensitive = false;
> Just to make sure this assignment is left here intentionally even if 'is_se
It's not necessarily needed, but it doesn't hurt. Also, it makes things simpler if down the line it turns out it's easier/better to have is_sensitive default to true.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 8
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 30 Nov 2021 18:50:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#2).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
20 files changed, 259 insertions(+), 93 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@20
PS3, Line 20: not having to complicate
            : FileCache.
> Are you suggesting that the file cache would still only support encrypted f
Yeah, that's what I'm suggesting, the assumption being that the FileCache only supports sensitive files.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 06:53:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#5).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool needs can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled:

$ KUDU_ALLOW_SLOW_TESTS=1 ./bin/log_block_manager-test --gtest_filter="*StartupBenchmark*" 2>startup-bench.txt &&  grep "Time spent reopening" startup-bench.txt
Note: Google Test filter = *StartupBenchmark*
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from EncryptionEnabled/LogBlockManagerTest
[ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0
[       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 (30192 ms)
[ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1
[       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1 (147586 ms)
[----------] 2 tests from EncryptionEnabled/LogBlockManagerTest (177779 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (177779 ms total)
[  PASSED  ] 2 tests.
I1103 17:40:00.268719 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.835s     user 0.110s     sys 0.041s
I1103 17:40:02.084342 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.816s     user 0.142s     sys 0.000s
I1103 17:40:04.027940 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.944s     user 0.127s     sys 0.051s
I1103 17:40:05.838649 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.811s     user 0.147s     sys 0.003s
I1103 17:40:07.780369 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.942s     user 0.140s     sys 0.054s
I1103 17:40:09.581300 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.801s     user 0.156s     sys 0.001s
I1103 17:40:11.495551 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.914s     user 0.142s     sys 0.057s
I1103 17:40:13.302351 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.807s     user 0.147s     sys 0.002s
I1103 17:40:15.189821 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.870s     user 0.136s     sys 0.046s
I1103 17:40:17.010360 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.820s     user 0.135s     sys 0.002s
I1103 17:40:31.574581 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.258s     user 0.118s     sys 0.001s
I1103 17:40:33.885696 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.311s     user 0.133s     sys 0.054s
I1103 17:40:36.116078 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.218s     user 0.132s     sys 0.015s
I1103 17:40:38.349752 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.234s     user 0.125s     sys 0.043s
I1103 17:40:40.558445 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.209s     user 0.130s     sys 0.001s
I1103 17:40:42.805191 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.247s     user 0.131s     sys 0.039s
I1103 17:40:44.995265 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.190s     user 0.146s     sys 0.000s
I1103 17:40:47.269686 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.274s     user 0.137s     sys 0.046s
I1103 17:40:49.522470 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.253s     user 0.134s     sys 0.000s
I1103 17:40:51.816761 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.294s     user 0.128s     sys 0.044s

$ ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 2>dense-node.txt | grep "Time spent restarting tserver"
I1103 17:00:33.196427 14424 dense_node-itest.cc:226] Time spent restarting tserver: real 74.861s        user 0.044s     sys 0.177s
I1103 17:06:01.772245 14424 dense_node-itest.cc:226] Time spent restarting tserver: real 69.005s        user 0.041s     sys 0.156s

$ perf stat --log-fd 3 -r 10  ./bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark 3>&1 2>/dev/null >/dev/null

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (10 runs):

            968.79 msec task-clock:u              #    1.102 CPUs utilized            ( +-  0.27% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            11,216      page-faults:u             #    0.012 M/sec                    ( +-  0.14% )
     1,192,386,602      cycles:u                  #    1.231 GHz                      ( +-  0.50% )  (83.24%)
     1,042,831,264      stalled-cycles-frontend:u #   87.46% frontend cycles idle     ( +-  0.43% )  (83.26%)
       858,989,227      stalled-cycles-backend:u  #   72.04% backend cycles idle      ( +-  0.33% )  (66.55%)
     1,360,615,269      instructions:u            #    1.14  insn per cycle
                                                  #    0.77  stalled cycles per insn  ( +-  0.38% )  (83.13%)
       291,921,003      branches:u                #  301.326 M/sec                    ( +-  0.27% )  (83.50%)
         8,332,352      branch-misses:u           #    2.85% of all branches          ( +-  0.24% )  (83.45%)

           0.87885 +- 0.00573 seconds time elapsed  ( +-  0.65% )

$ perf stat --log-fd 3 -r 10  ./bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark --encrypt_data_at_rest 3>&1 2>/dev/null >/dev/null

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark --encrypt_data_at_rest' (10 runs):

            988.61 msec task-clock:u              #    1.099 CPUs utilized            ( +-  0.62% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            11,229      page-faults:u             #    0.011 M/sec                    ( +-  0.13% )
     1,223,443,798      cycles:u                  #    1.238 GHz                      ( +-  0.58% )  (83.40%)
     1,066,930,822      stalled-cycles-frontend:u #   87.21% frontend cycles idle     ( +-  0.53% )  (83.78%)
       875,815,760      stalled-cycles-backend:u  #   71.59% backend cycles idle      ( +-  0.45% )  (66.88%)
     1,389,694,090      instructions:u            #    1.14  insn per cycle
                                                  #    0.77  stalled cycles per insn  ( +-  0.38% )  (83.26%)
       296,238,671      branches:u                #  299.651 M/sec                    ( +-  0.33% )  (83.18%)
         8,548,361      branch-misses:u           #    2.89% of all branches          ( +-  0.37% )  (82.77%)

           0

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 426 insertions(+), 166 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/5
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
> I wouldn't mind leaving it like this, so long as it's clear what's what. I 
You're right, temp files would cause a problem with this approach. I guess it's better to open the file to see if it's encrypted after all. In this case, this should stay IsFileEncrypted() as we can't tell if a file is sensitive based on the header, only if it's encrypted. I'll change this behavior in the encryption header patch.


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

PS9: 
> Missed this?
I wanted to figure out the expected behavior of the IsFileEncrypted before addressing it. With the header-based decision making the second case wouldn't be true, we would be able to read sensitive unencrypted files with encryption enabled. The first case would still be true, but I'm not sure there needs to be a test for that specifically, TestEncryption in env-test already has a similar scenario. Or do you think it's worth adding a tooling-specific test for it? If yes, I can add it in the encryption header patch.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 07 Jan 2022 09:38:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] [security] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [security] KUDU-3331 Encrypt file system
......................................................................

[security] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes. All tests below were ran in release
mode.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Reviewed-on: http://gerrit.cloudera.org:8080/17974
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 458 insertions(+), 188 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 12
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@19
PS3, Line 19: The default is
            : now true
I'm working on fixing this stuff.

> files created at a single fresh run of kudu-tserver/kudu-master are properly encrypted once kudu-tserver/kudu-master is restarted?

Can you elaborate on this? I'm not sure I understand.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@32
PS3, Line 32: and tablet metadata files.
> +1
Makes sense, I'll change it to encrypt everything but instance files.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 00:54:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@19
PS3, Line 19: The default is
            : now true
> From 'grep -r "FileOptions" . --exclude="*test*" --exclude="*env_posix*"', 
It seems the question about changing the default for the 'is_encrypred' is now resolved since the semantics for that field is clarified and now it's named 'is_sensitive'.

However, I think it's worth clarifying on the items that Andrew pointed at.  Do you think it's possible to add some sort of test scenario (or maybe even DCHECK()) to make sure the files created at a single fresh run of kudu-tserver/kudu-master are properly encrypted once kudu-tserver/kudu-master is restarted?


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@32
PS3, Line 32: and tablet metadata files.
> I don't think we should treat our redaction policy as the gold standard for
+1

I guess everything related to the user's data (and that includes metadata) should be considered sensitive.  As far as I can see, we do redact partitioning-related information at least in the logs (such as begin/end for range keys).


http://gerrit.cloudera.org:8080/#/c/17974/4/src/kudu/util/env.h
File src/kudu/util/env.h:

http://gerrit.cloudera.org:8080/#/c/17974/4/src/kudu/util/env.h@421
PS4, Line 421: is_sensitive
Ah, that's now much clearer!



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 00:02:07 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@15
PS3, Line 15: I also changed the way the "encrypted" flag in *FileOptions works:
nit: maybe it's worth calling this "is_sensitive" to avoid confusion


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@19
PS3, Line 19: The default is
            : now true
From 'grep -r "FileOptions" . --exclude="*test*" --exclude="*env_posix*"', a few stand out as things we should consider disabling encryption for (based on this commit message, anyway):
- fs/dir_manager.cc when copying instance metadata files
- fs/dir_util.cc when creating instance metadata files
- consensus/log_index.cc for the log index -- it doesn't store data, so do we actually care to encrypt it?
- util/env_util.cc in OpenFileForWrite (without 'opts' input), though this is only used in tests right now

In general I'm still not sold on changing the default, though I'd like to better understand the complications that you've run into.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@20
PS3, Line 20: not having to complicate
            : FileCache.
Could you elaborate on this a bit more -- what would go wrong if we left the default as false and relied on callers manually request encryption? Wouldn't that just mean the FileCache::ReopenFileIfNecessary() needs to set the encryption option to true?


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@28
PS3, Line 28: we can keep FileCache simpler contains block
            :   location
nit: I'm having trouble parsing this sentence. Mind rephrasing? I'm guessing the point is that it's simpler to implement when the entire FileCache is either encrypted or not.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@32
PS3, Line 32: and tablet metadata files.
Why is this the case again? There's a lot you can infer from just a tablet metadata file alone (table schema, partitioning, data size, etc.).

The design doc mentions both needing to be encrypted. I can see an argument for consensus metadata, but tablet metadata seems critical.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Oct 2021 20:51:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#9).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool be used to dump encrypted (metadata) and unencrypted
files (instance) as well, it needs to be able to determine if a file is
encrypted or not. As encryption headers are not yet implemented, I
introduced a hack which checks the file name and treats the file as
unencrypted if it ends with "instance" and encrypted otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes. All tests below were ran in release
mode.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

I also wanted to run dense_node-itest with -num_seconds=240 and
-num_tablets=1000, but the amount of data written (both in terms of
number of blocks and bytes) was all over the place, and so was the time
spent reopening the tablet server (seemingly without correlation to the
amount of data under management by the LBM). This was true with and
without encryption being enabled, so it's hard to draw any conclusions
from this benchmark.

$ for x in {0..1}; do perf stat -r 5 --log-fd=3 ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/$x 3>&1 2> >(grep "dense_node-itest.cc") > >(grep "===="); done
[==========] Running 1 test from 1 test suite.
I1110 05:23:51.307804 107446 dense_node-itest.cc:223] Time spent restarting master: real 0.083s user 0.000s     sys 0.001s
I1110 05:24:19.118248 107446 dense_node-itest.cc:226] Time spent restarting tserver: real 27.810s       user 0.017s     sys 0.100s
I1110 05:24:19.118268 107446 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:24:19.465764 107446 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 38136
I1110 05:24:19.465798 107446 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 250252129
I1110 05:24:19.465804 107446 dense_node-itest.cc:242] log_block_manager_containers: 9892
I1110 05:24:19.465808 107446 dense_node-itest.cc:242] log_block_manager_full_containers: 4017
I1110 05:24:19.465812 107446 dense_node-itest.cc:242] threads_running: 1334
[==========] 1 test from 1 test suite ran. (608499 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:33:58.751773 121520 dense_node-itest.cc:223] Time spent restarting master: real 0.053s user 0.000s     sys 0.001s
I1110 05:41:09.045049 121520 dense_node-itest.cc:226] Time spent restarting tserver: real 430.293s      user 0.291s     sys 1.523s
I1110 05:41:09.045073 121520 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:41:09.257555 121520 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 29376
I1110 05:41:09.257593 121520 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 199511663
I1110 05:41:09.257599 121520 dense_node-itest.cc:242] log_block_manager_containers: 11464
I1110 05:41:09.257604 121520 dense_node-itest.cc:242] log_block_manager_full_containers: 1319
I1110 05:41:09.257608 121520 dense_node-itest.cc:242] threads_running: 226
[==========] 1 test from 1 test suite ran. (947330 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:50:01.919380 143199 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.000s     sys 0.003s
I1110 05:51:13.537168 143199 dense_node-itest.cc:226] Time spent restarting tserver: real 71.618s       user 0.057s     sys 0.249s
I1110 05:51:13.537186 143199 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:51:13.581818 143199 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32557
I1110 05:51:13.581848 143199 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 189720720
I1110 05:51:13.581853 143199 dense_node-itest.cc:242] log_block_manager_containers: 9910
I1110 05:51:13.581857 143199 dense_node-itest.cc:242] log_block_manager_full_containers: 2070
I1110 05:51:13.581861 143199 dense_node-itest.cc:242] threads_running: 225
[==========] 1 test from 1 test suite ran. (630376 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:00:18.327955 161886 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.001s     sys 0.001s
I1110 06:00:36.625900 161886 dense_node-itest.cc:226] Time spent restarting tserver: real 18.298s       user 0.011s     sys 0.063s
I1110 06:00:36.625918 161886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:00:36.779762 161886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:00:36.779801 161886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 236467612
I1110 06:00:36.779808 161886 dense_node-itest.cc:242] log_block_manager_containers: 9880
I1110 06:00:36.779814 161886 dense_node-itest.cc:242] log_block_manager_full_containers: 1410
I1110 06:00:36.779817 161886 dense_node-itest.cc:242] threads_running: 1332
[==========] 1 test from 1 test suite ran. (481328 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:08:12.651594 176648 dense_node-itest.cc:223] Time spent restarting master: real 0.084s user 0.001s     sys 0.002s
I1110 06:09:14.436517 176648 dense_node-itest.cc:226] Time spent restarting tserver: real 61.785s       user 0.045s     sys 0.222s
I1110 06:09:14.436537 176648 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:09:14.732786 176648 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32334
I1110 06:09:14.732823 176648 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225235870
I1110 06:09:14.732829 176648 dense_node-itest.cc:242] log_block_manager_containers: 10119
I1110 06:09:14.732833 176648 dense_node-itest.cc:242] log_block_manager_full_containers: 5070
I1110 06:09:14.732837 176648 dense_node-itest.cc:242] threads_running: 1313
[==========] 1 test from 1 test suite ran. (506357 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/0' (5 runs):

    2995824.938107      task-clock (msec)         #    4.719 CPUs utilized            ( +-  5.08% )
        51,210,606      context-switches          #    0.017 M/sec                    ( +-  8.77% )
        10,794,492      cpu-migrations            #    0.004 M/sec                    ( +- 12.18% )
         5,334,419      page-faults               #    0.002 M/sec                    ( +-  9.46% )
 6,394,831,534,947      cycles                    #    2.135 GHz                      ( +-  9.25% )
 3,981,645,181,601      instructions              #    0.62  insn per cycle           ( +-  9.86% )
   792,397,266,216      branches                  #  264.501 M/sec                    ( +-  9.75% )
     6,963,950,058      branch-misses             #    0.88% of all branches          ( +-  7.25% )

     634.804501857 seconds time elapsed                                          ( +- 13.11% )

[==========] Running 1 test from 1 test suite.
I1110 06:16:46.444833 194734 dense_node-itest.cc:223] Time spent restarting master: real 0.075s user 0.000s     sys 0.002s
I1110 06:17:29.511780 194734 dense_node-itest.cc:226] Time spent restarting tserver: real 43.067s       user 0.031s     sys 0.151s
I1110 06:17:29.511799 194734 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:17:29.575773 194734 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 33354
I1110 06:17:29.575809 194734 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 162500445
I1110 06:17:29.575814 194734 dense_node-itest.cc:242] log_block_manager_containers: 9946
I1110 06:17:29.575817 194734 dense_node-itest.cc:242] log_block_manager_full_containers: 2847
I1110 06:17:29.575821 194734 dense_node-itest.cc:242] threads_running: 228
[==========] 1 test from 1 test suite ran. (468746 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:24:34.692034 14506 dense_node-itest.cc:223] Time spent restarting master: real 4.389s  user 0.003s     sys 0.017s
I1110 06:25:16.629412 14506 dense_node-itest.cc:226] Time spent restarting tserver: real 41.937s        user 0.027s     sys 0.146s
I1110 06:25:16.629434 14506 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:25:22.050498 14506 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 47634
I1110 06:25:22.050537 14506 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 409166624
I1110 06:25:22.050544 14506 dense_node-itest.cc:242] log_block_manager_containers: 9779
I1110 06:25:22.050549 14506 dense_node-itest.cc:242] log_block_manager_full_containers: 3405
I1110 06:25:22.050552 14506 dense_node-itest.cc:242] threads_running: 1342
[==========] 1 test from 1 test suite ran. (434683 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:31:46.989178 26886 dense_node-itest.cc:223] Time spent restarting master: real 0.093s  user 0.000s     sys 0.003s
I1110 06:32:04.775068 26886 dense_node-itest.cc:226] Time spent restarting tserver: real 17.786s        user 0.010s     sys 0.048s
I1110 06:32:04.775091 26886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:32:04.830790 26886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:32:04.830832 26886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225249875
I1110 06:32:04.830838 26886 dense_node-itest.cc:242] log_block_manager_containers: 10352
I1110 06:32:04.830842 26886 dense_node-itest.cc:242] log_block_manager_full_containers: 1198
I1110 06:32:04.830849 26886 dense_node-itest.cc:242] threads_running: 1307
[==========] 1 test from 1 test suite ran. (401113 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:38:28.355651 39523 dense_node-itest.cc:223] Time spent restarting master: real 0.348s  user 0.001s     sys 0.003s
I1110 06:39:14.934329 39523 dense_node-itest.cc:226] Time spent restarting tserver: real 46.579s        user 0.034s     sys 0.161s
I1110 06:39:14.934355 39523 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:39:15.086436 39523 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 42942
I1110 06:39:15.086474 39523 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 337224147
I1110 06:39:15.086480 39523 dense_node-itest.cc:242] log_block_manager_containers: 10706
I1110 06:39:15.086484 39523 dense_node-itest.cc:242] log_block_manager_full_containers: 2511
I1110 06:39:15.086489 39523 dense_node-itest.cc:242] threads_running: 1326
[==========] 1 test from 1 test suite ran. (542365 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:47:30.400804 54331 dense_node-itest.cc:223] Time spent restarting master: real 0.125s  user 0.001s     sys 0.002s
I1110 06:51:28.288359 54331 dense_node-itest.cc:226] Time spent restarting tserver: real 237.888s       user 0.160s     sys 0.855s
I1110 06:51:28.288374 54331 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:51:28.633086 54331 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 35292
I1110 06:51:28.633126 54331 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 198352642
I1110 06:51:28.633132 54331 dense_node-itest.cc:242] log_block_manager_containers: 10015
I1110 06:51:28.633139 54331 dense_node-itest.cc:242] log_block_manager_full_containers: 2514
I1110 06:51:28.633143 54331 dense_node-itest.cc:242] threads_running: 1300
[==========] 1 test from 1 test suite ran. (709361 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/1' (5 runs):

    3115638.620173      task-clock (msec)         #    6.094 CPUs utilized            ( +-  2.07% )
        50,595,541      context-switches          #    0.016 M/sec                    ( +-  7.87% )
        10,972,126      cpu-migrations            #    0.004 M/sec                    ( +-  9.15% )
         5,985,924      page-faults               #    0.002 M/sec                    ( +- 10.89% )
 6,073,807,994,303      cycles                    #    1.949 GHz                      ( +- 10.17% )
 4,094,332,719,732      instructions              #    0.67  insn per cycle           ( +-  7.76% )
   807,628,648,546      branches                  #  259.218 M/sec                    ( +-  7.83% )
     7,158,875,684      branch-misses             #    0.89% of all branches          ( +-  6.38% )

     511.281338094 seconds time elapsed                                          ( +- 10.71% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 458 insertions(+), 188 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/9
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 9
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 10: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
> I guess at this point it's to determine if it's sensitive, but if we switch
I wouldn't mind leaving it like this, so long as it's clear what's what. I suppose we'd also need to think about temp files too though.

Maybe we ought to add a --force_file_is_sensitive=true/false/"" or somesuch for users to ignore this condition? Perhaps in a separate patch though


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

PS9: 
> Do you think it's worth adding a test that checks that we can't read a sens
Missed this?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 07 Jan 2022 00:47:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@32
PS7, Line 32: needs
nit: seems to be an extra word?


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@45
PS7, Line 45: ronment set-up.
Just to double-check: the results below are for RELEASE builds, right?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc@492
PS6, Line 492: WritableFileOptions()
What about explicitly controlling the encryption for the result file here?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/tserver/tablet_copy_client.cc@798
PS6, Line 798: WritableFileOptions
What about explicitly controlling the encryption for the result file here?


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc@203
PS7, Line 203: Whether file system should be encrypted.
nit: since now the terminology is more stable (encrypted vs sensitive, etc.), maybe update this description to be a bit more specific, i.e.

Whether to encrypt sensitive files


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc@1306
PS7, Line 1306: opts.is_sensitive && IsEncryptionEnabled()
nit: would it make sense to introduce a helper function/method ShouldEncrypt(const SequentialFileOptions&) to use here and below?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc
File src/kudu/util/rolling_log.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc@163
PS6, Line 163: opts.is_sensitive = false;
Just to make sure this assignment is left here intentionally even if 'is_sensitive' is 'false' by default?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 17 Nov 2021 01:00:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#7).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool needs can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

I also wanted to run dense_node-itest with -num_seconds=240 and
-num_tablets=1000, but the amount of data written (both in terms of
number of blocks and bytes), and so was the time spent reopening the
tablet server (seemingly without correlation to the amount of data under
management by the LBM). This was true with and without encryption being
enabled, so it's hard to draw any conclusions from this benchmark.

$ for x in {0..1}; do perf stat -r 5 --log-fd=3 ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/$x 3>&1 2> >(grep "dense_node-itest.cc") > >(grep "===="); done
[==========] Running 1 test from 1 test suite.
I1110 05:23:51.307804 107446 dense_node-itest.cc:223] Time spent restarting master: real 0.083s user 0.000s     sys 0.001s
I1110 05:24:19.118248 107446 dense_node-itest.cc:226] Time spent restarting tserver: real 27.810s       user 0.017s     sys 0.100s
I1110 05:24:19.118268 107446 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:24:19.465764 107446 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 38136
I1110 05:24:19.465798 107446 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 250252129
I1110 05:24:19.465804 107446 dense_node-itest.cc:242] log_block_manager_containers: 9892
I1110 05:24:19.465808 107446 dense_node-itest.cc:242] log_block_manager_full_containers: 4017
I1110 05:24:19.465812 107446 dense_node-itest.cc:242] threads_running: 1334
[==========] 1 test from 1 test suite ran. (608499 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:33:58.751773 121520 dense_node-itest.cc:223] Time spent restarting master: real 0.053s user 0.000s     sys 0.001s
I1110 05:41:09.045049 121520 dense_node-itest.cc:226] Time spent restarting tserver: real 430.293s      user 0.291s     sys 1.523s
I1110 05:41:09.045073 121520 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:41:09.257555 121520 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 29376
I1110 05:41:09.257593 121520 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 199511663
I1110 05:41:09.257599 121520 dense_node-itest.cc:242] log_block_manager_containers: 11464
I1110 05:41:09.257604 121520 dense_node-itest.cc:242] log_block_manager_full_containers: 1319
I1110 05:41:09.257608 121520 dense_node-itest.cc:242] threads_running: 226
[==========] 1 test from 1 test suite ran. (947330 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:50:01.919380 143199 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.000s     sys 0.003s
I1110 05:51:13.537168 143199 dense_node-itest.cc:226] Time spent restarting tserver: real 71.618s       user 0.057s     sys 0.249s
I1110 05:51:13.537186 143199 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 05:51:13.581818 143199 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32557
I1110 05:51:13.581848 143199 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 189720720
I1110 05:51:13.581853 143199 dense_node-itest.cc:242] log_block_manager_containers: 9910
I1110 05:51:13.581857 143199 dense_node-itest.cc:242] log_block_manager_full_containers: 2070
I1110 05:51:13.581861 143199 dense_node-itest.cc:242] threads_running: 225
[==========] 1 test from 1 test suite ran. (630376 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:00:18.327955 161886 dense_node-itest.cc:223] Time spent restarting master: real 0.094s user 0.001s     sys 0.001s
I1110 06:00:36.625900 161886 dense_node-itest.cc:226] Time spent restarting tserver: real 18.298s       user 0.011s     sys 0.063s
I1110 06:00:36.625918 161886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:00:36.779762 161886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:00:36.779801 161886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 236467612
I1110 06:00:36.779808 161886 dense_node-itest.cc:242] log_block_manager_containers: 9880
I1110 06:00:36.779814 161886 dense_node-itest.cc:242] log_block_manager_full_containers: 1410
I1110 06:00:36.779817 161886 dense_node-itest.cc:242] threads_running: 1332
[==========] 1 test from 1 test suite ran. (481328 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:08:12.651594 176648 dense_node-itest.cc:223] Time spent restarting master: real 0.084s user 0.001s     sys 0.002s
I1110 06:09:14.436517 176648 dense_node-itest.cc:226] Time spent restarting tserver: real 61.785s       user 0.045s     sys 0.222s
I1110 06:09:14.436537 176648 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:09:14.732786 176648 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 32334
I1110 06:09:14.732823 176648 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225235870
I1110 06:09:14.732829 176648 dense_node-itest.cc:242] log_block_manager_containers: 10119
I1110 06:09:14.732833 176648 dense_node-itest.cc:242] log_block_manager_full_containers: 5070
I1110 06:09:14.732837 176648 dense_node-itest.cc:242] threads_running: 1313
[==========] 1 test from 1 test suite ran. (506357 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/0' (5 runs):

    2995824.938107      task-clock (msec)         #    4.719 CPUs utilized            ( +-  5.08% )
        51,210,606      context-switches          #    0.017 M/sec                    ( +-  8.77% )
        10,794,492      cpu-migrations            #    0.004 M/sec                    ( +- 12.18% )
         5,334,419      page-faults               #    0.002 M/sec                    ( +-  9.46% )
 6,394,831,534,947      cycles                    #    2.135 GHz                      ( +-  9.25% )
 3,981,645,181,601      instructions              #    0.62  insn per cycle           ( +-  9.86% )
   792,397,266,216      branches                  #  264.501 M/sec                    ( +-  9.75% )
     6,963,950,058      branch-misses             #    0.88% of all branches          ( +-  7.25% )

     634.804501857 seconds time elapsed                                          ( +- 13.11% )

[==========] Running 1 test from 1 test suite.
I1110 06:16:46.444833 194734 dense_node-itest.cc:223] Time spent restarting master: real 0.075s user 0.000s     sys 0.002s
I1110 06:17:29.511780 194734 dense_node-itest.cc:226] Time spent restarting tserver: real 43.067s       user 0.031s     sys 0.151s
I1110 06:17:29.511799 194734 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:17:29.575773 194734 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 33354
I1110 06:17:29.575809 194734 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 162500445
I1110 06:17:29.575814 194734 dense_node-itest.cc:242] log_block_manager_containers: 9946
I1110 06:17:29.575817 194734 dense_node-itest.cc:242] log_block_manager_full_containers: 2847
I1110 06:17:29.575821 194734 dense_node-itest.cc:242] threads_running: 228
[==========] 1 test from 1 test suite ran. (468746 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:24:34.692034 14506 dense_node-itest.cc:223] Time spent restarting master: real 4.389s  user 0.003s     sys 0.017s
I1110 06:25:16.629412 14506 dense_node-itest.cc:226] Time spent restarting tserver: real 41.937s        user 0.027s     sys 0.146s
I1110 06:25:16.629434 14506 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:25:22.050498 14506 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 47634
I1110 06:25:22.050537 14506 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 409166624
I1110 06:25:22.050544 14506 dense_node-itest.cc:242] log_block_manager_containers: 9779
I1110 06:25:22.050549 14506 dense_node-itest.cc:242] log_block_manager_full_containers: 3405
I1110 06:25:22.050552 14506 dense_node-itest.cc:242] threads_running: 1342
[==========] 1 test from 1 test suite ran. (434683 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:31:46.989178 26886 dense_node-itest.cc:223] Time spent restarting master: real 0.093s  user 0.000s     sys 0.003s
I1110 06:32:04.775068 26886 dense_node-itest.cc:226] Time spent restarting tserver: real 17.786s        user 0.010s     sys 0.048s
I1110 06:32:04.775091 26886 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:32:04.830790 26886 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 34068
I1110 06:32:04.830832 26886 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 225249875
I1110 06:32:04.830838 26886 dense_node-itest.cc:242] log_block_manager_containers: 10352
I1110 06:32:04.830842 26886 dense_node-itest.cc:242] log_block_manager_full_containers: 1198
I1110 06:32:04.830849 26886 dense_node-itest.cc:242] threads_running: 1307
[==========] 1 test from 1 test suite ran. (401113 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:38:28.355651 39523 dense_node-itest.cc:223] Time spent restarting master: real 0.348s  user 0.001s     sys 0.003s
I1110 06:39:14.934329 39523 dense_node-itest.cc:226] Time spent restarting tserver: real 46.579s        user 0.034s     sys 0.161s
I1110 06:39:14.934355 39523 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:39:15.086436 39523 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 42942
I1110 06:39:15.086474 39523 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 337224147
I1110 06:39:15.086480 39523 dense_node-itest.cc:242] log_block_manager_containers: 10706
I1110 06:39:15.086484 39523 dense_node-itest.cc:242] log_block_manager_full_containers: 2511
I1110 06:39:15.086489 39523 dense_node-itest.cc:242] threads_running: 1326
[==========] 1 test from 1 test suite ran. (542365 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:47:30.400804 54331 dense_node-itest.cc:223] Time spent restarting master: real 0.125s  user 0.001s     sys 0.002s
I1110 06:51:28.288359 54331 dense_node-itest.cc:226] Time spent restarting tserver: real 237.888s       user 0.160s     sys 0.855s
I1110 06:51:28.288374 54331 dense_node-itest.cc:237] not waiting for bootstrapping tablets (flag disabled)
I1110 06:51:28.633086 54331 dense_node-itest.cc:242] log_block_manager_blocks_under_management: 35292
I1110 06:51:28.633126 54331 dense_node-itest.cc:242] log_block_manager_bytes_under_management: 198352642
I1110 06:51:28.633132 54331 dense_node-itest.cc:242] log_block_manager_containers: 10015
I1110 06:51:28.633139 54331 dense_node-itest.cc:242] log_block_manager_full_containers: 2514
I1110 06:51:28.633143 54331 dense_node-itest.cc:242] threads_running: 1300
[==========] 1 test from 1 test suite ran. (709361 ms total)

 Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/1' (5 runs):

    3115638.620173      task-clock (msec)         #    6.094 CPUs utilized            ( +-  2.07% )
        50,595,541      context-switches          #    0.016 M/sec                    ( +-  7.87% )
        10,972,126      cpu-migrations            #    0.004 M/sec                    ( +-  9.15% )
         5,985,924      page-faults               #    0.002 M/sec                    ( +- 10.89% )
 6,073,807,994,303      cycles                    #    1.949 GHz                      ( +- 10.17% )
 4,094,332,719,732      instructions              #    0.67  insn per cycle           ( +-  7.76% )
   807,628,648,546      branches                  #  259.218 M/sec                    ( +-  7.83% )
     7,158,875,684      branch-misses             #    0.89% of all branches          ( +-  6.38% )

     511.281338094 seconds time elapsed                                          ( +- 10.71% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 450 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/7
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [security] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [security] KUDU-3331 Encrypt file system
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
> Sure. Just making sure I understand your proposal: for each file we'll alwa
Basically, yes.


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

PS9: 
> I was thinking a pb-specific test, to ensure other developers can know what
Ok, I'll add one to the encryption header patch then.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 11
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 11 Jan 2022 07:59:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#4).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also changed the way the "encrypted" flag in *FileOptions works:
previously, the default was false, as files were not encrypted by
default. As this commit introduces the "--encrypt_data_at_rest" flag,
which is false by default, the encrypted flag now means that a file is
encrypted or not if data at rest encryption is enabled. The default is
now true, which results in less code change and not having to complicate
FileCache.

When encryption is enabled, the following files are encrypted:

- WAL and blocks as they contain user data, which is considered
  sensitive.
- LBM block metadata as FileCache is used to access them, so by
  encrypting these files, we can keep FileCache simpler as it doesn't
  have to be able to handle both encrypted and non-encrypted files.

Instance and block manager instance files don't need to be encrypted,
and neither do consensus and tablet metadata files.

As most PBC files are non-encrypted, PBUtil disables encryption by
default. This leads to an edge-case in the PBC CLI tool, which can dump
and edit PBC files. This needs to be able to recognize whether the file
is encrypted. As encryption headers are not yet implemented, I added a
hack which checks the file name and treats the file as encrypted if it
ends with ".metadata" (LBM block metadata extension) and non-encrypted
otherwise.

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
20 files changed, 282 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 7: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@133
PS7, Line 133: I also wanted to run dense_node-itest with -num_seconds=240 and
I guess you're saying it's based on a random workload so the results are also random. Maybe it's worth running the test with a limit on the number of rows instead of a timeout, changing the write pattern to sequential, and disabling compaction?


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@134
PS7, Line 134: but the amount of data written (both in terms of
             : number of blocks and bytes), and 
nit: missing some words?


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/fs/log_block_manager-test.cc@a1173
PS7, Line 1173: 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
Was this removal intentional?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 16 Nov 2021 20:33:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#6).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool needs can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled:

$ KUDU_ALLOW_SLOW_TESTS=1 ./bin/log_block_manager-test --gtest_filter="*StartupBenchmark*" 2>startup-bench.txt &&  grep "Time spent reopening" startup-bench.txt
Note: Google Test filter = *StartupBenchmark*
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from EncryptionEnabled/LogBlockManagerTest
[ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0
[       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 (30192 ms)
[ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1
[       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1 (147586 ms)
[----------] 2 tests from EncryptionEnabled/LogBlockManagerTest (177779 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (177779 ms total)
[  PASSED  ] 2 tests.
I1103 17:40:00.268719 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.835s     user 0.110s     sys 0.041s
I1103 17:40:02.084342 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.816s     user 0.142s     sys 0.000s
I1103 17:40:04.027940 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.944s     user 0.127s     sys 0.051s
I1103 17:40:05.838649 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.811s     user 0.147s     sys 0.003s
I1103 17:40:07.780369 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.942s     user 0.140s     sys 0.054s
I1103 17:40:09.581300 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.801s     user 0.156s     sys 0.001s
I1103 17:40:11.495551 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.914s     user 0.142s     sys 0.057s
I1103 17:40:13.302351 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.807s     user 0.147s     sys 0.002s
I1103 17:40:15.189821 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.870s     user 0.136s     sys 0.046s
I1103 17:40:17.010360 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.820s     user 0.135s     sys 0.002s
I1103 17:40:31.574581 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.258s     user 0.118s     sys 0.001s
I1103 17:40:33.885696 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.311s     user 0.133s     sys 0.054s
I1103 17:40:36.116078 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.218s     user 0.132s     sys 0.015s
I1103 17:40:38.349752 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.234s     user 0.125s     sys 0.043s
I1103 17:40:40.558445 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.209s     user 0.130s     sys 0.001s
I1103 17:40:42.805191 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.247s     user 0.131s     sys 0.039s
I1103 17:40:44.995265 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.190s     user 0.146s     sys 0.000s
I1103 17:40:47.269686 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.274s     user 0.137s     sys 0.046s
I1103 17:40:49.522470 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.253s     user 0.134s     sys 0.000s
I1103 17:40:51.816761 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.294s     user 0.128s     sys 0.044s

$ ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 2>dense-node.txt | grep "Time spent restarting tserver"
I1103 17:00:33.196427 14424 dense_node-itest.cc:226] Time spent restarting tserver: real 74.861s        user 0.044s     sys 0.177s
I1103 17:06:01.772245 14424 dense_node-itest.cc:226] Time spent restarting tserver: real 69.005s        user 0.041s     sys 0.156s

$ perf stat --log-fd 3 -r 10  ./bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark 3>&1 2>/dev/null >/dev/null

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (10 runs):

            968.79 msec task-clock:u              #    1.102 CPUs utilized            ( +-  0.27% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            11,216      page-faults:u             #    0.012 M/sec                    ( +-  0.14% )
     1,192,386,602      cycles:u                  #    1.231 GHz                      ( +-  0.50% )  (83.24%)
     1,042,831,264      stalled-cycles-frontend:u #   87.46% frontend cycles idle     ( +-  0.43% )  (83.26%)
       858,989,227      stalled-cycles-backend:u  #   72.04% backend cycles idle      ( +-  0.33% )  (66.55%)
     1,360,615,269      instructions:u            #    1.14  insn per cycle
                                                  #    0.77  stalled cycles per insn  ( +-  0.38% )  (83.13%)
       291,921,003      branches:u                #  301.326 M/sec                    ( +-  0.27% )  (83.50%)
         8,332,352      branch-misses:u           #    2.85% of all branches          ( +-  0.24% )  (83.45%)

           0.87885 +- 0.00573 seconds time elapsed  ( +-  0.65% )

$ perf stat --log-fd 3 -r 10  ./bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark --encrypt_data_at_rest 3>&1 2>/dev/null >/dev/null

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark --encrypt_data_at_rest' (10 runs):

            988.61 msec task-clock:u              #    1.099 CPUs utilized            ( +-  0.62% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            11,229      page-faults:u             #    0.011 M/sec                    ( +-  0.13% )
     1,223,443,798      cycles:u                  #    1.238 GHz                      ( +-  0.58% )  (83.40%)
     1,066,930,822      stalled-cycles-frontend:u #   87.21% frontend cycles idle     ( +-  0.53% )  (83.78%)
       875,815,760      stalled-cycles-backend:u  #   71.59% backend cycles idle      ( +-  0.45% )  (66.88%)
     1,389,694,090      instructions:u            #    1.14  insn per cycle
                                                  #    0.77  stalled cycles per insn  ( +-  0.38% )  (83.26%)
       296,238,671      branches:u                #  299.651 M/sec                    ( +-  0.33% )  (83.18%)
         8,548,361      branch-misses:u           #    2.89% of all branches          ( +-  0.37% )  (82.77%)

           0

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 430 insertions(+), 167 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/6
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@42
PS6, Line 42: The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
            : set to true, which uses a block count of 1,000,000.
            : 
            : It seems enabling encryption adds around 20% overhead on startup in a
            : typical use-case with no deletes.
            : 
            :  Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):
            : 
            :       40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            :             11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
            :                280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
            :            593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
            :    110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
            :     90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
            :     16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
            :         67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )
            : 
            :       19.988553457 seconds time elapsed                                          ( +-  0.58% )
            : 
            :  Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):
            : 
            :       51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            :             13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
            :                292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
            :            737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
            :    144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
            :    126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
            :     24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
            :        106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )
            : 
            :       24.055824830 seconds time elapsed                                          ( +-  0.89% )
            : 
            : With deletes, the difference seems to decrease to about 14% when 90% of
            : the blocks are deleted.
> If you've drawn conclusions or summarizations about these numbers, it'd be 
Done


http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@77
PS6, Line 77:  Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):
> Probably worth looping a few times to see if the behavior holds across seve
Done


http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@81
PS6, Line 81:                530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
            :            399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
            :    145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
> Same here, any conclusions?
Done


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/consensus/log_index.cc
File src/kudu/consensus/log_index.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/consensus/log_index.cc@112
PS6, Line 112:     opts.is_sensitive = true;
> This is interesting to think about -- log indexes don't store any data, but
I don't think there's a security concern here, but in the design I wrote everything but instance files would be encrypted, so after you pointing this out w.r.t. tablet metadata, I changed all these to sensitive as well. I don't think there's a benefit either way regarding these two file types.


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/env.cc
File src/kudu/util/env.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/env.cc@30
PS6, Line 30:                                   bool is_sensitive) {
> nit: should be 'is_sensitive' too? Same elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/pb_util-test.cc@76
PS6, Line 76: SensitivityMode sensitivi
> nit: should we introduce some Sensitivity enum for this or somesuch?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 10 Nov 2021 15:20:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@45
PS7, Line 45: adds around 20% overhead
> Cool, thanks!
Done


http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@133
PS7, Line 133: 
> Yeah, I agree. It doesn't seem too useful to include.
Done


http://gerrit.cloudera.org:8080/#/c/17974/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/9//COMMIT_MSG@32
PS9, Line 32: ca
> nit: 'be' --> 'is' or maybe add 'can'?
Done


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
> nit: Is this actually meant to determine whether the file is sensitive or n
I guess at this point it's to determine if it's sensitive, but if we switch to read the encryption header, it would be to determine if it's encrypted.

Also, now I'm thinking it may not be too hacky, and don't really know if there's a clear benefit to check the encryption header. That way we need to actually open the file and read the first 8 bytes instead of just checking the filename. How would you feel about leaving this like this?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 06 Jan 2022 19:51:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] [security] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [security] KUDU-3331 Encrypt file system
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/tools/tool_action_pbc.cc@106
PS9, Line 106: IsFileEncrypted
> You're right, temp files would cause a problem with this approach. I guess 
Sure. Just making sure I understand your proposal: for each file we'll always try to read an encryption header, and if we can deserialize one for whatever reason, we'll assume there was no encryption and read decrypted. Is that the case?


http://gerrit.cloudera.org:8080/#/c/17974/9/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

PS9: 
> I wanted to figure out the expected behavior of the IsFileEncrypted before 
I was thinking a pb-specific test, to ensure other developers can know what to expect when using the SensitivityModes.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 11
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 11 Jan 2022 06:17:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] [security] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [security] KUDU-3331 Encrypt file system
......................................................................


Patch Set 11: Code-Review+2

LGTM, given the discussed changes in later patches


-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 11
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 11 Jan 2022 06:20:47 +0000
Gerrit-HasComments: No

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@133
PS7, Line 133: I also wanted to run dense_node-itest with -num_seconds=240 and
> I guess you're saying it's based on a random workload so the results are al
Actually I misunderstood the design of this test: to maximize the number of blocks and containers created with as little data as possible. Random writes across a wide partition space seems to do the trick effectively in this case.

That said, perhaps disabling compactions with --enable_rowset_compaction=false for such a long-running test, especially given the random workload, would make the results a bit more repeatable, or at least ensure we're spending all our efforts creating new blocks rather than compacting?

Also worth ensuring nothing surprising is running in the background while running these benchmarks.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 19 Nov 2021 02:37:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG@12
PS2, Line 12: ntegrates this encryption support into the project
nit: does it make sense to mention that there is at least one TODO to properly implement detection of whether a file is encrypted/unencrypted in tool_action_pbc?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/consensus/consensus_meta-test.cc
File src/kudu/consensus/consensus_meta-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/consensus/consensus_meta-test.cc@82
PS2, Line 82: SetEncryptionFlags(bool encryption_enabled)
nit: why not just 'EnableEncryption(bool enable)' ?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager-test.cc@229
PS2, Line 229: SetEncryptionFlags(bool enable)
nit: why not just 'EnableEncryption(bool enable)' ?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager.cc@3005
PS2, Line 3005:   RWFileOptions opts;
              :   string tmp_file_name;
              :   RETURN_NOT_OK_LBM_DISK_FAILURE_PREPEND(env_->NewTempRWFile(opts, tmpl,
              :                                                              &tmp_file_name, &tmp_file),
Looked at this a bit, but could not understand why this change was necessary :)

Could you please clarify on this just for me, maybe in this gerrit thread?

Maybe, something is missing here (like setting the 'encryped' field)?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/integration-tests/raft_consensus-itest.cc@828
PS2, Line 828: FLAGS_encryption = true;
nit (here and below): could you add a comment explaining why adding flags to the set of tserver flags is not enough, so setting FLAGS_encryption is necessary as well?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h
File src/kudu/util/env.h:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h@383
PS2, Line 383: IsEncryptionEnabled
Do you see it ever be requiring some non-const stuff in the implementation?  If no, maybe consider adding 'const' specifier for this method?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h@421
PS2, Line 421: true
Ah, interesting: I'd expect that everything is kept non-encrypted by default for the sake of backwards compatibility.  At least that way it's not necessary to update the setting in so many files.

Any special idea behind setting this to 'true' by default?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc@203
PS2, Line 203: encryption
nit: maybe, name this flag 'enable_encryption' ?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc@1997
PS2, Line 1997: virtual
nit: drop 'virtual' since there is 'override' already



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 26 Oct 2021 23:52:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG
Commit Message:

PS2: 
> It'd be nice to see some benchmarks for how this performs. Perhaps we could
Thanks for the tips, I'll add some benchmarks (currently working on it). I wanted to push my changes so you can do another pass before I add the benchmark results to the commit message (and some relevant code changes).


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/consensus/consensus_meta-test.cc
File src/kudu/consensus/consensus_meta-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/consensus/consensus_meta-test.cc@82
PS2, Line 82: EnableEncryption(bool enable) {
> nit: why not just 'EnableEncryption(bool enable)' ?
Done


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager-test.cc@229
PS2, Line 229: EnableEncryption(bool enable) {
> nit: why not just 'EnableEncryption(bool enable)' ?
Done


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/fs/log_block_manager.cc@3005
PS2, Line 3005:   string tmp_file_name;
              :   RETURN_NOT_OK_LBM_DISK_FAILURE_PREPEND(env_->NewTempRWFile(RWFileOptions(), tmpl,
              :                                                              &tmp_file_name, &tmp_file),
              :                                          "could not create temporary metadata file");
> Looked at this a bit, but could not understand why this change was necessar
I guess this wasn't necessary at the end. I tried different approaches, it seems I left this here by mistake.


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/integration-tests/raft_consensus-itest.cc@828
PS2, Line 828: // process, as both of t
> nit (here and below): could you add a comment explaining why adding flags t
Done


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/tools/tool_action_pbc.cc@105
PS2, Line 105: 
             : bool IsFileEncrypted(Env* env, const std::string& fname) {
             :   // TODO(abukor): replace with real encryption check. As LBM metadata files are
             :   // the only PBC files that are encrypted right now, this check will suffice
             :  
> nit: maybe add a comment mentioning why this is the case? I don't quite fol
thanks, added an explanation.


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h
File src/kudu/util/env.h:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h@383
PS2, Line 383:  bool IsEncryptionE
> Do you see it ever be requiring some non-const stuff in the implementation?
Done


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h@421
PS2, Line 421: 
> +1, I found this this surprising. In general we've been security-disabled b
I changed the behavior somewhat, mostly to require fewer changes in the code. This now means that the file is/should be encrypted, but it only takes effect when encryption is enabled. This way, when you create a file, you don't need to know if encryption is enabled. If encryption is disabled (which is the case by default), it doesn't matter what this flag is set to. When encryption is enabled, all files, except those where this is explicitly set to false, are encrypted.

I added a note to the flag.


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc@203
PS2, Line 203: encrypt_da
> nit: maybe, name this flag 'enable_encryption' ?
How about encrypt_data_at_rest to make sure it's not going to be confused with TLS?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env_posix.cc@1997
PS2, Line 1997: const b
> nit: drop 'virtual' since there is 'override' already
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Oct 2021 19:53:42 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@19
PS3, Line 19: ypted:
            : 
> | Can you elaborate on this? I'm not sure I understand.
Yea, I verified it manually, but wasn't sure if there's a good way to do that in tests.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@20
PS3, Line 20: 
            : - WAL segm
> Yeah, that's what I'm suggesting, the assumption being that the FileCache o
Yep, that's a good idea, I changed it to that.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 18:06:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@42
PS6, Line 42: $ KUDU_ALLOW_SLOW_TESTS=1 ./bin/log_block_manager-test --gtest_filter="*StartupBenchmark*" 2>startup-bench.txt &&  grep "Time spent reopening" startup-bench.txt
            : Note: Google Test filter = *StartupBenchmark*
            : [==========] Running 2 tests from 1 test suite.
            : [----------] Global test environment set-up.
            : [----------] 2 tests from EncryptionEnabled/LogBlockManagerTest
            : [ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0
            : [       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 (30192 ms)
            : [ RUN      ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1
            : [       OK ] EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/1 (147586 ms)
            : [----------] 2 tests from EncryptionEnabled/LogBlockManagerTest (177779 ms total)
            : 
            : [----------] Global test environment tear-down
            : [==========] 2 tests from 1 test suite ran. (177779 ms total)
            : [  PASSED  ] 2 tests.
            : I1103 17:40:00.268719 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.835s     user 0.110s     sys 0.041s
            : I1103 17:40:02.084342 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.816s     user 0.142s     sys 0.000s
            : I1103 17:40:04.027940 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.944s     user 0.127s     sys 0.051s
            : I1103 17:40:05.838649 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.811s     user 0.147s     sys 0.003s
            : I1103 17:40:07.780369 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.942s     user 0.140s     sys 0.054s
            : I1103 17:40:09.581300 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.801s     user 0.156s     sys 0.001s
            : I1103 17:40:11.495551 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.914s     user 0.142s     sys 0.057s
            : I1103 17:40:13.302351 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.807s     user 0.147s     sys 0.002s
            : I1103 17:40:15.189821 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.870s     user 0.136s     sys 0.046s
            : I1103 17:40:17.010360 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 1.820s     user 0.135s     sys 0.002s
            : I1103 17:40:31.574581 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.258s     user 0.118s     sys 0.001s
            : I1103 17:40:33.885696 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.311s     user 0.133s     sys 0.054s
            : I1103 17:40:36.116078 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.218s     user 0.132s     sys 0.015s
            : I1103 17:40:38.349752 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.234s     user 0.125s     sys 0.043s
            : I1103 17:40:40.558445 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.209s     user 0.130s     sys 0.001s
            : I1103 17:40:42.805191 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.247s     user 0.131s     sys 0.039s
            : I1103 17:40:44.995265 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.190s     user 0.146s     sys 0.000s
            : I1103 17:40:47.269686 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.274s     user 0.137s     sys 0.046s
            : I1103 17:40:49.522470 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.253s     user 0.134s     sys 0.000s
            : I1103 17:40:51.816761 31598 log_block_manager-test.cc:1074] Time spent reopening block manager: real 2.294s     user 0.128s     sys 0.044s
If you've drawn conclusions or summarizations about these numbers, it'd be nice to share those conclusions here. Otherwise it forces reviewers to look through all of these tests.

Looking at it myself, it seems that encryption is negligible wrt performance, at least when stressing many small blocks (and more metadata). Perhaps it's worth increasing the number of blocks too and seeing if the behavior holds.


http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@77
PS6, Line 77: $ ./bin/dense_node-itest -num_tablets=1000 -num_seconds=240 2>dense-node.txt | grep "Time spent restarting tserver"
Probably worth looping a few times to see if the behavior holds across several runs.


http://gerrit.cloudera.org:8080/#/c/17974/6//COMMIT_MSG@81
PS6, Line 81: $ perf stat --log-fd 3 -r 10  ./bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark 3>&1 2>/dev/null >/dev/null
            : 
            :  Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (10 runs):
Same here, any conclusions?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/consensus/log_index.cc
File src/kudu/consensus/log_index.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/consensus/log_index.cc@112
PS6, Line 112:     opts.is_sensitive = true;
This is interesting to think about -- log indexes don't store any data, but rather store a map from index to location in a WAL file. As such, it's surprising to see it encrypted, since it stores no user data.

I suppose the same line of thinking goes for consensus metadata in general.

Is there a security concern with exposing consensus metadata and log indexing?


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/env.cc
File src/kudu/util/env.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/env.cc@30
PS6, Line 30:                                   bool encrypted) {
nit: should be 'is_sensitive' too? Same elsewhere


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/pb_util-test.cc
File src/kudu/util/pb_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/pb_util-test.cc@76
PS6, Line 76: bool is_sensitive = false
nit: should we introduce some Sensitivity enum for this or somesuch?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 03 Nov 2021 19:59:24 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@32
PS3, Line 32: and tablet metadata files.
> So far, we haven't treated this metadata as sensitive (e.g. when you run wa
I don't think we should treat our redaction policy as the gold standard for whether files should be considered sensitive. IMO we'd be better off likening our mindset here to our mindset for fine-grained authz, where all user data/metadata is considered sensitive by default.

That aside, tablet metadata has more than just the table schema in it -- in its current form it includes live row count, and could one day include more user data like min/max values (provided we don't redo the metadata layer first).

...which isn't to say that this patch needs to implement tablet metadata changes. Maybe that could be a separate patch, assuming we can agree on whether to treat metadata as sensitive.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Nov 2021 19:53:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG@12
PS2, Line 12: ntegrates this encryption support into the project
> nit: does it make sense to mention that there is at least one TODO to prope
Done


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@15
PS3, Line 15: I also changed the way the "encrypted" flag in *FileOptions works:
> nit: maybe it's worth calling this "is_sensitive" to avoid confusion
Done


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@20
PS3, Line 20: not having to complicate
            : FileCache.
> Could you elaborate on this a bit more -- what would go wrong if we left th
Are you suggesting that the file cache would still only support encrypted files, but it would be set inside the file cache instead of in env? I guess that wouldn't introduce much more complextity.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@28
PS3, Line 28: we can keep FileCache simpler as it doesn't
            :   have to 
> nit: I'm having trouble parsing this sentence. Mind rephrasing? I'm guessin
Yep, this must have been an editing error.


http://gerrit.cloudera.org:8080/#/c/17974/3//COMMIT_MSG@32
PS3, Line 32: and tablet metadata files.
> Why is this the case again? There's a lot you can infer from just a tablet 
So far, we haven't treated this metadata as sensitive (e.g. when you run wal dump -redact=all, table schema info is still shown.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Nov 2021 15:16:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17974/2//COMMIT_MSG
Commit Message:

PS2: 
It'd be nice to see some benchmarks for how this performs. Perhaps we could select some benchmark tests to run with and without encryption to compare, and post the runtimes before and after?

Here are some that come to mind as fairly disk-IO-intensive (or at least seem like they'd be affected by this patch, and may be exercise hot paths), though feel free to not use these and 'grep -r "Benchmark" . --include="*test*"' to pick some yourself:
- LogBlockManagerTest.StartupBenchmark
- DenseNodeTest.RunTest with various flags set (see the original commit for some example runs)
- TestEnv.TestHolePunchBenchmark and/or TabletServerTest.TestDeleteTabletBenchmark
- some test that exercises election storms (none come to mind, but maybe one exists already)


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/tools/tool_action_pbc.cc@105
PS2, Line 105: 
             : bool IsFileEncrypted(Env* env, const std::string& fname) {
             :   // TODO(abukor): replace with real encryption check
             :   return fname.length() > 9 && fname.compare(fname.length() - 9, 9, ".metadata") == 0;
             : }
nit: maybe add a comment mentioning why this is the case? I don't quite follow it as is -- seems like only log block metadata is unencrypted? What about instance and path instance files?


http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h
File src/kudu/util/env.h:

http://gerrit.cloudera.org:8080/#/c/17974/2/src/kudu/util/env.h@421
PS2, Line 421: true
> Ah, interesting: I'd expect that everything is kept non-encrypted by defaul
+1, I found this this surprising. In general we've been security-disabled by default. Why change that now?



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Oct 2021 00:18:21 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#3).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also changed the way the "encrypted" flag in *FileOptions works:
previously, the default was false, as files were not encrypted by
default. As this commit introduces the "--encrypt_data_at_rest" flag,
which is false by default, the encrypted flag now means that a file is
encrypted or not if data at rest encryption is enabled. The default is
now true, which results in less code change and not having to complicate
FileCache.

When encryption is enabled, the following files are encrypted:

- WAL and blocks as they contain user data, which is considered
  sensitive.
- LBM block metadata as FileCache is used to access them, so by
  encrypting these files, we can keep FileCache simpler contains block
  location

Instance and block manager instance files don't need to be encrypted,
and neither do consensus and tablet metadata files.

As most PBC files are non-encrypted, PBUtil disables encryption by
default. This leads to an edge-case in the PBC CLI tool, which can dump
and edit PBC files. This needs to be able to recognize whether the file
is encrypted. As encryption headers are not yet implemented, I added a
hack which checks the file name and treats the file as encrypted if it
ends with ".metadata" (LBM block metadata extension) and non-encrypted
otherwise.

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
19 files changed, 272 insertions(+), 93 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [security] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#11).

Change subject: [security] KUDU-3331 Encrypt file system
......................................................................

[security] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes. All tests below were ran in release
mode.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 458 insertions(+), 188 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/11
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 11
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17974 )

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................


Patch Set 9: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/7//COMMIT_MSG@45
PS7, Line 45: tes. All tests below wer
> Yes, added a note about it.
Cool, thanks!


http://gerrit.cloudera.org:8080/#/c/17974/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17974/9//COMMIT_MSG@32
PS9, Line 32: be
nit: 'be' --> 'is' or maybe add 'can'?


http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

http://gerrit.cloudera.org:8080/#/c/17974/7/src/kudu/util/env_posix.cc@1306
PS7, Line 1306: 
> I don't know about that, each file type uses separate *FileOptions types, s
Ah, I was under impression FileOptions structs/classes have a common ancestor, but they aren't.

With that I guess it's totally fine to keep it as is (it's just a nit, after all).


http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc
File src/kudu/util/rolling_log.cc:

http://gerrit.cloudera.org:8080/#/c/17974/6/src/kudu/util/rolling_log.cc@163
PS6, Line 163: opts.is_sensitive = false;
> It's not necessarily needed, but it doesn't hurt. Also, it makes things sim
SGTM, thanks for the clarification.



-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 9
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 17 Dec 2021 22:27:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] [encryption] KUDU-3331 Encrypt file system

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17974

to look at the new patch set (#10).

Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................

[encryption] KUDU-3331 Encrypt file system

de02a34 introduced encryption support to Env in a self-contained way,
but it's not used across Kudu.

This commit integrates this encryption support into the project and
modifies several test suites to also run tests with encryption enabled.

I also renamed "encrypted" to "is_sensitive" in *FileOption as a file
with this flag will be encrypted only if encryption is enabled for the
process.

When encryption is enabled, the following files are encrypted:

- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files,
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool can be used to dump encrypted (metadata) and
unencrypted files (instance) as well, it needs to be able to determine
if a file is encrypted or not. As encryption headers are not yet
implemented, I introduced a hack which checks the file name and treats
the file as unencrypted if it ends with "instance" and encrypted
otherwise.

I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000.

It seems enabling encryption adds around 20% overhead on startup in a
typical use-case with no deletes. All tests below were ran in release
mode.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316      task-clock (msec)         #    2.021 CPUs utilized            ( +-  1.05% )
            11,089      context-switches          #    0.275 K/sec                    ( +-  9.87% )
               280      cpu-migrations            #    0.007 K/sec                    ( +-  1.58% )
           593,982      page-faults               #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391      cycles                    #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722      instructions              #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957      branches                  #  407.249 M/sec                    ( +-  0.15% )
        67,169,915      branch-misses             #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                          ( +-  0.58% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606      task-clock (msec)         #    2.133 CPUs utilized            ( +-  0.90% )
            13,214      context-switches          #    0.257 K/sec                    ( +-  4.03% )
               292      cpu-migrations            #    0.006 K/sec                    ( +-  1.76% )
           737,815      page-faults               #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536      cycles                    #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070      instructions              #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584      branches                  #  469.947 M/sec                    ( +-  0.05% )
       106,793,688      branch-misses             #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                          ( +-  0.89% )

With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.

 Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289      task-clock (msec)         #    1.494 CPUs utilized            ( +-  0.69% )
            94,868      context-switches          #    0.002 M/sec                    ( +-  0.13% )
               530      cpu-migrations            #    0.010 K/sec                    ( +-  1.48% )
           399,284      page-faults               #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046      cycles                    #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444      instructions              #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753      branches                  #  491.434 M/sec                    ( +-  0.04% )
        59,986,442      branch-misses             #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                          ( +-  1.40% )

 Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642      task-clock (msec)         #    1.737 CPUs utilized            ( +-  0.81% )
            95,082      context-switches          #    0.001 M/sec                    ( +-  0.28% )
               523      cpu-migrations            #    0.007 K/sec                    ( +-  1.69% )
           679,834      page-faults               #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244      cycles                    #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267      instructions              #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095      branches                  #  573.202 M/sec                    ( +-  0.07% )
       133,637,310      branch-misses             #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                          ( +-  1.52% )

Delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
costs about 30% of overhead in this case.

 Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649      task-clock (msec)         #    0.994 CPUs utilized            ( +-  0.33% )
             3,613      context-switches          #    0.005 M/sec                    ( +-  0.15% )
               178      cpu-migrations            #    0.242 K/sec                    ( +-  0.29% )
            10,722      page-faults               #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469      cycles                    #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550      instructions              #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107      branches                  #  459.062 M/sec                    ( +-  0.19% )
         6,340,956      branch-misses             #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                          ( +-  2.33% )

 Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354      task-clock (msec)         #    0.792 CPUs utilized            ( +-  0.34% )
             3,633      context-switches          #    0.005 M/sec                    ( +-  0.13% )
               183      cpu-migrations            #    0.238 K/sec                    ( +-  0.29% )
            10,737      page-faults               #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815      cycles                    #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270      instructions              #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840      branches                  #  439.662 M/sec                    ( +-  0.06% )
         6,486,125      branch-misses             #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                          ( +-  2.42% )

Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 458 insertions(+), 188 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/10
-- 
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)