You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2019/12/05 20:58:10 UTC

[kudu-CR] fs: make some fields optional

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14856


Change subject: fs: make some fields optional
......................................................................

fs: make some fields optional

Some fields in fs.proto were defined as required. This isn't good
practice; I expect these definitions to be reused for the implementation
of a new directory manager for WALs, so let's make them optional. I've
only updated the ones I expect to be relevant for adding a new directory
manager (i.e. those that involve path instances, etc.).

Is this backwards compatible? Assuming the DataDirManager continues to
use the changed fields (which I expect), yes.

Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
---
M src/kudu/fs/fs.proto
1 file changed, 6 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/14856/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] fs: mark 'filesystem block size bytes' optional

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: mark 'filesystem_block_size_bytes' optional
......................................................................

fs: mark 'filesystem_block_size_bytes' optional

While the field is useful when dealing with data directories, if we want
to reuse the path instance files for WAL directories, the field won't
always be necessary. As such, this marks it 'optional' instead of
'required'.

Is this safe? While not strictly backwards compatible as far as protobuf
goes, I'm assuming that the DataDirManager will continue to use the now
optional field. Taken in that context, the switch to optional shouldn't
pose backwards compatibility issues for older clusters. While I can't
guarantee that, I've at least added a comment indicating the appropriate
usage of the field.

Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Reviewed-on: http://gerrit.cloudera.org:8080/14856
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/fs/fs.proto
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] fs: make some fields optional

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: make some fields optional
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@17
PS2, Line 17: yes.
I wouldn't claim backward compatibility here since apparently it's not backward compatible -- components of prior versions don't expect fields to be missing (that's what optional is about, right?)


http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@16
PS2, Line 16: continues to
            : use
It's not only use, but populate all of them.  How to guarantee that?  And is it necessary to guarantee that all?

Also, where are these structures persisted?  On the disk?



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 02:58:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: mark 'filesystem block size bytes' optional

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: mark 'filesystem_block_size_bytes' optional
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14856/4/src/kudu/fs/fs.proto
File src/kudu/fs/fs.proto:

http://gerrit.cloudera.org:8080/#/c/14856/4/src/kudu/fs/fs.proto@71
PS4, Line 71:   // This must be present if this instance belongs to a data directory.
FWIW, it's cheap to figure this out, and it could be useful if we ever wanted block boundary alignment in the WAL.



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 19:17:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: make some fields optional

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: make some fields optional
......................................................................


Patch Set 2: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 06:31:27 +0000
Gerrit-HasComments: No

[kudu-CR] fs: make some fields optional

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: make some fields optional
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@17
PS2, Line 17: yes.
> I wouldn't claim backward compatibility here since apparently it's not back
To Andrew's point, it's important to remember the (narrow) way in which we currently use these PB messages: the DataDirManager loads them into memory, each from its own 'block_manager_instance' file found in a Kudu data directory. This is the "old reader with an old schema" that we have to be careful about breaking. There are no others. Even 'kudu pbc dump' isn't an "old reader" per se; it uses the schema information in the PBC file to figure out how to read the contents of the file.

With that in mind, here's how we _could_ break backwards compatibility:
1. Mark these fields as optional.
2. Omit them from newly created block_manager_instance files written out to data directories.
3. Try to open those files using an older version of Kudu by loading the filesystem from disk (i.e. starting a server or running certain CLI commands).

Andrew is staking the following position: the DataDirManager will _always_ include these fields when it writes out these messages, even though they're now going to be optional. Only the new WAL-based directory manager will omit the fields, and the instance files it creates are guaranteed NOT to be loaded by a DataDirManager, since they're going to be in the WAL directories rather than the data directories, and presumably they won't be called 'block_manager_instance' either.

You can argue that Andrew can't guarantee this for all time. Maybe someone will slip up and start omitting the fields from new block_manager_instance files created by the DataDirManager. Maybe the WAL directory manager will somehow write block_manager_instance files in data directories. My take is that with decent documentation I think we can avoid both of these mistakes.



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 05:30:18 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: make some fields optional

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: make some fields optional
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@17
PS2, Line 17: yes.
> To Andrew's point, it's important to remember the (narrow) way in which we 
Adar, thank you for the explanation.  I'm just trying to understand what the statement on the backward compatibility means in this context.

I'm OK with making this change if it's really needed if we keep an eye on the code working with these used-to-be-mandatory fields.  At least, maybe add comments into the modified .proto file for the affected fields mentioning that these fields have to be populated when used in the DataDirManager.  Otherwise, since we don't prohibit running older Kudu binaries with newer data (like in case of rolled-back upgrade), it's possible to hit the introducing incompatibility.



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 06:31:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: make some fields optional

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14856

to look at the new patch set (#2).

Change subject: fs: make some fields optional
......................................................................

fs: make some fields optional

Some fields involved in directory tracking were defined as required.
This isn't good practice, and I expect these definitions to be reused
for the implementation of a new directory manager for WALs. So, let's
make them optional. I've only updated the ones I expect to be relevant
for adding a new directory manager (i.e. those that involve path
instances).

Is this backwards compatible? Assuming the DataDirManager continues to
use the changed fields (which I expect), yes.

Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
---
M src/kudu/fs/fs.proto
1 file changed, 4 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/14856/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] fs: mark 'filesystem block size bytes' optional

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: mark 'filesystem_block_size_bytes' optional
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14856/4/src/kudu/fs/fs.proto
File src/kudu/fs/fs.proto:

http://gerrit.cloudera.org:8080/#/c/14856/4/src/kudu/fs/fs.proto@71
PS4, Line 71:   // This must be present if this instance belongs to a data directory.
> FWIW, it's cheap to figure this out, and it could be useful if we ever want
Yeah it's not a dealbreaker to keep this 'required', but I don't want to force that on further directory management.



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 21:52:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: mark 'filesystem block size bytes' optional

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14856 )

Change subject: fs: mark 'filesystem_block_size_bytes' optional
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@17
PS2, Line 17: lder
> Adar, thank you for the explanation.  I'm just trying to understand what th
Thanks Adar.

Right, I don't expect the now-optional fields to go away any time soon for data directories. And even for the WAL directory, I think most of the fields will remain (only 'filesystem_block_size_bytes', since I'm planning on renaming 'block_manager_type' to 'dir_type'). I made this change in the spirit of trying to avoid using required fields if possible, as recommended by protobuf.

To your point about safety, I really can't guarantee complete safety with this change to optional beyond documentation. That said, I know I was a bit aggressive in switching over to optional. I don't strictly need all of them to be optional -- the aim was to do a bit of cleanup before using these messages more. I've cut back some of the switches, leaving the main one that I expect will actually not need to be used in the WAL directory manager.


http://gerrit.cloudera.org:8080/#/c/14856/2//COMMIT_MSG@16
PS2, Line 16: nal shouldn't
            : pos
> It's not only use, but populate all of them.  How to guarantee that?  And i
Right, they're persisted to disk when creating new data directories. There isn't a strict guarantee other than making sure we fill them out. Generally that's what the 'required' tag was for, but it seems like protobuf docs warn against using 'required'.

The absence of these fields would probably lead to a crash when starting up Kudu, whether these are marked required or not.



-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Dec 2019 07:39:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] fs: mark 'filesystem block size bytes' optional

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14856

to look at the new patch set (#3).

Change subject: fs: mark 'filesystem_block_size_bytes' optional
......................................................................

fs: mark 'filesystem_block_size_bytes' optional

While the field is useful when dealing with data directories, if we want
to reuse the path instance files for WAL directories, the field won't
always be necessary. As such, this marks it 'optional' instead of
'required'.

Is this safe? While not strictly backwards compatible as far as protobuf
goes, I'm assuming that the DataDirManager will continue to use the now
optional field. Taken in that context, the switch to optional shouldn't
pose backwards compatibility issues for older clusters. While I can't
guarantee that, I've at least added a comment indicating the appropriate
usage of the field.

Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
---
M src/kudu/fs/fs.proto
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/14856/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5378391a94e5faa54a11a277d9191d151e0225d5
Gerrit-Change-Number: 14856
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)