You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "ZhangYao (Code Review)" <ge...@cloudera.org> on 2019/08/01 05:38:19 UTC

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

ZhangYao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13975


Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................

KUDU-2911: Consider the available space when selecting data dirs for tablets

Use AvailableByte / TabletNumbers to consider both the tablet number and the
available space when selecting data dirs for tablet.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
4 files changed, 35 insertions(+), 20 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 1
Gerrit-Owner: ZhangYao <tr...@gmail.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 9:

> Patch Set 8:
> 
> > (9 comments)
>  > 
>  > I'll look into the test failure some more. If I check out this
>  > patch, I see it failing a small percentage (2-6%) of the time.
> 
> About the failure in DiskErrorITest.TestFailDuringScanWorkload. IIUC, it only inject the disk failure in data_dir[1], and after the reflush check change, it may be remove from the candidate dirs. So it may not trigger the failure and so the test will failure.

I added some logging and think I understand what's going on. When we first create the tablets, we always refresh the space, and this necessarily isn't atomic between the data dirs, so we can sometimes end up with the data directories registering that they have different amounts of available space, even though they share the same disk.

When this happens, because there are only three directories, this implementation of PO2C might end up completely ignoring the data dir with the least amount of space in it.

So I see two paths forward for this. Either:
1) update the implementation of PO2C to sometimes select the data dir with the least space. For example, select two random indices (may be the same) and compare the available space (compared to what we have now, which always compares two different data directories). OR...
2) update disk_failure-itest to inject failures into two data directories instead of one. With the current PO2C implementation, it's a safe bet that killing two data dirs will touch blocks.


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 9
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 09 Aug 2019 20:58:12 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: vailable disk space of directory.
             :   // The resulting behavior f
> How will this behave during the creation of a very large table, or at the s
Here is more complex if tablet were created but only start filling up. Under that condition, disk space will not change so althought
PO2C may not let all tablet select the same directory but it still have risk to make one directory handle too much tablets.
If use the (available space) / (tablets count):
1?When create tablets with data writing soon. It works like just use available space with the assumption that all tablets with same size.It do not need to assume that the disks are same size.
2?When create a lot of tablets once. It can prevent selecting the same directory which more available space.


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@988
PS3, Line 988:     int uuid_idx = (*group_uuid_indices)[i];
> It looks like PS4 works like this:
Yeah, this patch I just simply revert the code, I will update later.
It can use PO2C to randomly select two directory and then choose the one with more available space.
And the strategy here is not that complex because when we create block we will write data.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 02 Aug 2019 06:40:15 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 10:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/13975/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13975/10//COMMIT_MSG@7
PS10, Line 7: Consider the available space when selecting data dirs for blocks.
Could you tag KUDU-2901 here?


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@338
PS10, Line 338:   // Returns a random directory and gives preference to those with more free
Nit: "...directory, giving preference to..."


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@472
PS10, Line 472:   // If those two directories have same load, will prefer the one with more
              :   // free space. 
Nit: Rephrase as "Ties are broken by choosing the directory with more free space."


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@90
PS10, Line 90:              "Number of seconds we cache the disk available space in the block manager. ");
Nit: "Number of seconds we cache available disk space in the block manager."


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@269
PS10, Line 269:       int64_t available_bytes;
Nit: keeping in with "is_full_" and "is_full_new", perhaps name this "available_bytes_new"?


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@984
PS10, Line 984:   vector<int> candidate_indices;
Wouldn't this be simpler as a vector<DataDir*>? Seems like we could avoid a few FindOrDie() calls that way.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@999
PS10, Line 999:   shuffle(candidate_indices.begin(), candidate_indices.end(), default_random_engine(rng_.Next()));
This can be moved into the if block to avoid a no-op call when candidate_indices is empty.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h@56
PS10, Line 56: // If available_bytes is set, will get the free bytes of the disk space.
Nit: "If 'available_bytes' is not null, it will contain the amount of free disk space (in bytes) in 'path' when the function finishes".


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc
File src/kudu/util/env_util.cc:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc@159
PS10, Line 159:   if (available_bytes != nullptr) {
              :     *available_bytes = free_bytes;
              :   }
Don't we want this to happen even if we return an ENOSPC IOError on L154? Otherwise won't 'available_bytes' (in data_dirs.cc) contain a garbage value?

Curious why this wasn't caught in a unit test; could you make sure we cover this case?



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 10
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 06:18:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#3).

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................

KUDU-2911: Consider the available space when selecting data dirs for tablets

Use AvailableByte / TabletNumbers to consider both the tablet number and the
available space when selecting data dirs for tablet. Measured by available
bytes when selecting dir for new blocks.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
4 files changed, 52 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/3
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@988
PS3, Line 988:     int uuid_idx = (*group_uuid_indices)[i];
> Thanks a lot. I have restored it to the original. This made me very trouble
It looks like PS4 works like this:
1. When creating a new tablet, take available disk space into account when choosing data dirs.
2. When creating a new block, continue to randomly select amongst the tablet's data dirs for the new block's home.

IIUC #2 was previously addressed by PS3, but that was reverted for PS4. And that's what Andrew had commented on (i.e. all the blocks stacking on one disk because it had more available space). I like Andrew's idea of extending power-of-2-choice to this problem. Would it work like this?
1. First choose two data dirs at random from the tablet's data dir group.
2. Then choose whichever of the two data dirs has more free space.

Would it be possible to profile the performance difference, perhaps using YCSB?



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 02 Aug 2019 04:14:41 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 9:

> Patch Set 9:
> 
> > Patch Set 8:
> > 
> > > (9 comments)
> >  > 
> >  > I'll look into the test failure some more. If I check out this
> >  > patch, I see it failing a small percentage (2-6%) of the time.
> > 
> > About the failure in DiskErrorITest.TestFailDuringScanWorkload. IIUC, it only inject the disk failure in data_dir[1], and after the reflush check change, it may be remove from the candidate dirs. So it may not trigger the failure and so the test will failure.
> 
> I added some logging and think I understand what's going on. When we first create the tablets, we always refresh the space, and this necessarily isn't atomic between the data dirs, so we can sometimes end up with the data directories registering that they have different amounts of available space, even though they share the same disk.
> 
> When this happens, because there are only three directories, this implementation of PO2C might end up completely ignoring the data dir with the least amount of space in it.
> 
> So I see two paths forward for this. Either:
> 1) update the implementation of PO2C to sometimes select the data dir with the least space. For example, select two random indices (may be the same) and compare the available space (compared to what we have now, which always compares two different data directories). OR...
> 2) update disk_failure-itest to inject failures into two data directories instead of one. With the current PO2C implementation, it's a safe bet that killing two data dirs will touch blocks.

I tried both, and both seem to reduce the flakiness of the test (either fix would pass 500/500 times instead of ~480/500 times).


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 9
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 09 Aug 2019 20:59:20 +0000
Gerrit-HasComments: No

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#6).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
6 files changed, 88 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/6
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 11:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@207
PS10, Line 207: ilable_bytes_.
> // Protects 'last_space_check_', 'is_full_' and 'available_bytes_'.
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@338
PS10, Line 338: e one
> the one ?
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@338
PS10, Line 338:   // Returns a random directory, giving preference to the one with more free
> Nit: "...directory, giving preference to..."
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@340
PS10, Line 340: , 
> nit: keep the blank space.
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@472
PS10, Line 472:   // Ties are broken by choosing the directory with more free space. The
              :   // resulting be
> Nit: Rephrase as "Ties are broken by choosing the directory with more free 
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@90
PS10, Line 90:              "Number of seconds we cache the available disk space in the block manager. ");
> Nit: "Number of seconds we cache available disk space in the block manager.
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@261
PS10, Line 261: last_space_check_
> How about change it to expiry meaning, then we don't need to add FLAGS_fs_d
If change it to expiry meaning, it will need to add FLAGS_fs_data_dirs_available_space_cache_seconds in ALWAYS mode. Maybe last_space_check_ is clear enough and what's more it can work when FLAGS_fs_data_dirs_available_space_cache_seconds is change by unit test during expire time.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@269
PS10, Line 269:       int64_t available_bytes_new;
> Nit: keeping in with "is_full_" and "is_full_new", perhaps name this "avail
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@984
PS10, Line 984:   vector<DataDir*> candidate_dirs;
> Wouldn't this be simpler as a vector<DataDir*>? Seems like we could avoid a
Yeah. Done.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@999
PS10, Line 999:   if (candidate_dirs.size() >= 2) {
> This can be moved into the if block to avoid a no-op call when candidate_in
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@1072
PS10, Line 1072:     } else {
> Seems some redundant with L994~L1006, how about do some refactor?
Yeah:) And after this patch, above L994~L1006 was changed. Maybe there is no need to refactor since they are different?


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.h@56
PS10, Line 56: // If 'available_bytes' is not null, it will contain the amount of free disk space (in bytes)
> Nit: "If 'available_bytes' is not null, it will contain the amount of free 
Done


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc
File src/kudu/util/env_util.cc:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/util/env_util.cc@159
PS10, Line 159:                                       "($2 bytes available vs $3 bytes reserved)",
              :                                       requested_bytes, path, free_bytes, reserved_bytes),
              :    
> Don't we want this to happen even if we return an ENOSPC IOError on L154? O
Yeah... You are right. I originally put it here because if there is an IOERROR, this dir will not be selected when creating tablets, so its available_bytes will not be  used. I think it would cause bug because is_full is judged before available_bytes and creating tablets(using ALWAYS) is before creating block(using EXPIRED). So this garbage available_bytes have no chance to be used. However, I think it should update the available_bytes even if return IOError for the functional semantic integrity.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 11
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 11:44:14 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#11).

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................

KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

When creating tablets:
If randomly selected two dirs have same tablets count, gives prefrence to the
one with more free space.

When creating blocks:
Randomly select two not full dir in tablet's data dir group and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
9 files changed, 105 insertions(+), 86 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/11
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 11
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@207
PS7, Line 207: ailable_bytes_.
> nit: wrap this in ''s like the other variable names.
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@212
PS7, Line 212: , updated by RefreshAvailab
> nit: remove this, or update it to RefreshAvailableSpace
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@338
PS7, Line 338: and gives preference to those with more free
             :   // space in the specified option's data dir gr
> nit: Could you rephrase a bit?
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@471
PS7, Line 471: as the number of unique tablets in the directory.
             :   // If those two directories have same load, will prefer the one with more
             :   // free space. The resulting behavior fills directories that hav
> nit: update this?
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@90
PS7, Line 90:  ");
            : TAG_FLAG(fs_data_dirs_available_space_cache_seconds, advanced);
> nit: I would remove this sentence. These descriptions usually try to be use
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@110
PS7, Line 110: DECLARE_bool(enable_data_block_fsync);
             : DECLARE_string(block_manager);
             : 
> nit: can you remove this extraneous newline?
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@264
PS7, Line 264: 
> nit: maybe we should rename this to last_space_check_ or something similar
Done


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1002
PS7, Line 1002:                                          candidate_indices[0]);
              :     DataDir* candidate_second = FindOrDie(data_dir_by_uu
> Could we just shuffle candidate_indices directly? And then compare dirs can
Of course :)


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1083
PS7, Line 1083:         int64_t space_in_second = FindOrDie(data_dir_by_uuid_idx_,
              :                                             candidate_indices[1])->available_bytes();
              :         selected_index = space_in_first > space_in_second ? 0 : 1;
              :       } else {
              :         selected_index = tablets_in_first < tablets_in_second ? 0 : 1;
              :       }
              :       group_indices->push_back(candidate_indices[selected_index]);
              :       candidate_indices.erase(candidate_indices.begin() + selected_index);
              :     }
              :   }
              : }
              : 
              : DataDir* DataDirManager::FindDataDirByUuidIndex(int uuid_idx) const {
              :   DCHECK_LT(uuid_idx, data_dirs_.size());
              :   ret
> To be clear, this is selecting by the tablet count first, and then breaking
If we select by space first, it may assign a lot of tablets in the directory with more space when creating a lot of tablets at once and doesn't fill up so quick just like what Adar said. But we have PO2C, it may not happen. But I still afraid of that :(



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 8
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 09 Aug 2019 03:34:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 6:

(1 comment)

The itself code looks fine, but I don't love the idea of having that test-only flag change the behavior like that. Could you explain a bit more why it's necessary?

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc@991
PS6, Line 991: 
             :   if (FLAGS_refresh_is_full_with_expired_only_for_testing) {
             :     // This currently should only be reached by disk_failure-itest.
             :     refresh_mode = DataDir::RefreshMode::EXPIRED_ONLY;
             :   }
I'm confused about why this is necessary. Why do we need this here?

Also, I'm curious, in most tests that use a minicluster, the data directories all share a single disk, even though they are different directories. If that's the case, do they all have the same available space? If so, I wonder if we should add a way to break ties. For example, select two, and if they have the same space, select the one that has fewer tablets. What do you think?



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 06 Aug 2019 04:52:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h@56
PS11, Line 56: // If 'available_bytes' is not null, it will contain the amount of free disk space (in bytes)
> Should doc that this behavior (writing to available_bytes) occurs even if t
Done


http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h@57
PS11, Line 57: .
> Drop this double quote.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 12
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Wed, 14 Aug 2019 02:17:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#8).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
8 files changed, 102 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/8
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 8
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "Yingchun Lai (Code Review)" <ge...@cloudera.org>.
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/fs/data_dirs.cc@1041
PS1, Line 1041: int
As a 'score', how about make it float/doube type?


http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h@52
PS1, Line 52: bytes
Nit: not your fault, could you change it to 'requested_bytes'


http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h@56
PS1, Line 56: ,
nit: ,



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 1
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Comment-Date: Thu, 01 Aug 2019 07:06:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#5).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
4 files changed, 59 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/5
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 5
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 7:

(9 comments)

I'll look into the test failure some more. If I check out this patch, I see it failing a small percentage (2-6%) of the time.

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@207
PS7, Line 207: available_bytes_
nit: wrap this in ''s like the other variable names.


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@212
PS7, Line 212: , updated by RefreshIsFull.
nit: remove this, or update it to RefreshAvailableSpace


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@338
PS7, Line 338: and give preference to which with more free space
             :   // from the specified option's data dir group.
nit: Could you rephrase a bit?

"and gives preference to those with more free space in the specified option's data dir group."


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.h@471
PS7, Line 471: as the number of unique tablets in the directory.
             :   // The resulting behavior fills directories that have fewer tablets stored on
             :   // them while not completely neglecting those with more tablets.
nit: update this?


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@90
PS7, Line 90:  "
            :              "During this time, will not really check disk space when using EXPIRED_ONLY."
nit: I would remove this sentence. These descriptions usually try to be user-understandable, and EXPIRED_ONLY isn't really user-facing.


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@110
PS7, Line 110: 
             : 
             : 
nit: can you remove this extraneous newline?


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@264
PS7, Line 264: last_check_is_full_
nit: maybe we should rename this to last_space_check_ or something similar


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1002
PS7, Line 1002:   vector<int> random_indices(candidate_indices.size());
              :   iota(random_indices.begin(), random_indices.end(), 0);
Could we just shuffle candidate_indices directly? And then compare dirs candidate_indices[0] and candidate_indices[1]?


http://gerrit.cloudera.org:8080/#/c/13975/7/src/kudu/fs/data_dirs.cc@1083
PS7, Line 1083:       int tablets_in_first = FindOrDie(tablets_by_uuid_idx_map_, candidate_indices[0]).size();
              :       int tablets_in_second = FindOrDie(tablets_by_uuid_idx_map_, candidate_indices[1]).size();
              :       int selected_index = 0;
              :       if (tablets_in_first == tablets_in_second) {
              :         int64_t space_in_first = FindOrDie(data_dir_by_uuid_idx_,
              :                                            candidate_indices[0])->available_bytes();
              :         int64_t space_in_second = FindOrDie(data_dir_by_uuid_idx_,
              :                                             candidate_indices[1])->available_bytes();
              :         selected_index = space_in_first > space_in_second ? 0 : 1;
              :       } else {
              :         selected_index = tablets_in_first < tablets_in_second ? 0 : 1;
              :       }
              :       group_indices->push_back(candidate_indices[selected_index]);
              :       candidate_indices.erase(candidate_indices.begin() + selected_index);
              :     }
To be clear, this is selecting by the tablet count first, and then breaking ties by space.

Is that preferred over selecting by space, and then breaking ties by tablet count?

This second scenario isn't as likely in production, but it definitely happens in tests since internal mini clusters share a single disk.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 7
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Thu, 08 Aug 2019 18:20:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#10).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
9 files changed, 108 insertions(+), 87 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/10
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 10
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#4).

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................

KUDU-2911: Consider the available space when selecting data dirs for tablets

Use available space to select data dirs for tablet. It can cover the
condition that each tablet or disk is not same size.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
4 files changed, 45 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/4
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: less load
We want a higher score though. Maybe reword that to clarify this.


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: (available disk space) / (tablets
             :   // number in the directory)
It's great that this metric tries to take into account disk space _and_ tablet count, but it's not clear to me that we need both. The PO2C algorithm balances whatever metric it is using to quantify load. Here, we're saying that we want to balance the amount of free disk space available per existing tablet, and it's difficult to understand what that metric actually is.

How about just using available disk space? If we assume that each tablet is roughly the same size, and that each disk has roughly the same space (which are assumptions that existed before), just using available disk space is roughly equivalent to using the tablet count only. But free disk space alone seems better because it accounts for the case where those assumptions aren't true.


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@988
PS3, Line 988:     Status s = candidate->RefreshIsFull(DataDir::RefreshMode::EXPIRED_ONLY);
Using EXPIRED_ONLY means that we will only refresh the available bytes if the directory was previously full or if it has been a long time (i.e. "expired).

This means that if we call this function very quickly in succession, all the candidates will have the same size, and we'd end up always picking the first data directory. If we wanted this to take into account space, we'd have to use RefreshMode::ALWAYS here.

I think this is what has been making disk_failure-itest unstable, since all the tablet blocks are being put on a single disk.


Also, this might negatively affect parallelism for block IO in exchange for more balance of space, right? For example, certain tablets may entirely be stored on a single disk. Not sure the greedy approach is the best. Maybe we can reuse PO2C here too, or continue having this be randomly selected among the non-full candidates. I'm curious what others think about this.


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@1047
PS3, Line 1047: candidate_indices
nit: would be clearer to name this something like candidates_index_and_score or something.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Thu, 01 Aug 2019 23:17:47 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 10:

> > Patch Set 9:
 > >
 > > > Patch Set 8:
 > > >
 > > > > (9 comments)
 > > >  >
 > > >  > I'll look into the test failure some more. If I check out
 > this
 > > >  > patch, I see it failing a small percentage (2-6%) of the
 > time.
 > > >
 > > > About the failure in DiskErrorITest.TestFailDuringScanWorkload.
 > IIUC, it only inject the disk failure in data_dir[1], and after the
 > reflush check change, it may be remove from the candidate dirs. So
 > it may not trigger the failure and so the test will failure.
 > >
 > > I added some logging and think I understand what's going on. When
 > we first create the tablets, we always refresh the space, and this
 > necessarily isn't atomic between the data dirs, so we can sometimes
 > end up with the data directories registering that they have
 > different amounts of available space, even though they share the
 > same disk.
 > >
 > > When this happens, because there are only three directories, this
 > implementation of PO2C might end up completely ignoring the data
 > dir with the least amount of space in it.
 > >
 > > So I see two paths forward for this. Either:
 > > 1) update the implementation of PO2C to sometimes select the data
 > dir with the least space. For example, select two random indices
 > (may be the same) and compare the available space (compared to what
 > we have now, which always compares two different data directories).
 > OR...
 > > 2) update disk_failure-itest to inject failures into two data
 > directories instead of one. With the current PO2C implementation,
 > it's a safe bet that killing two data dirs will touch blocks.
 > 
 > I tried both, and both seem to reduce the flakiness of the test
 > (either fix would pass 500/500 times instead of ~480/500 times).

Thanks a lot :)  Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 10
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 11 Aug 2019 15:31:32 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 11:

(4 comments)

Looks good, just a few doc nits.

http://gerrit.cloudera.org:8080/#/c/13975/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13975/11//COMMIT_MSG@10
PS11, Line 10: prefrence
preference


http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc@a89
PS11, Line 89: 
> Since I have changed the behaviors under this flag, so it is not suitable t
FWIW, I think we're comfortable renaming or deleting any flag tagged as 'evolving'. We only maintain compatibility for 'stable' flags.


http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h@56
PS11, Line 56: // If 'available_bytes' is not null, it will contain the amount of free disk space (in bytes)
Should doc that this behavior (writing to available_bytes) occurs even if the function returns IOError with ENOSPC, but not on any other error.


http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/util/env_util.h@57
PS11, Line 57: "
Drop this double quote.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 11
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:50:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Yingchun Lai (Code Review)" <ge...@cloudera.org>.
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 10:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@207
PS10, Line 207: available_bytes_
// Protects 'last_space_check_', 'is_full_' and 'available_bytes_'.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@338
PS10, Line 338: those
the one ?


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.h@340
PS10, Line 340: ,r
nit: keep the blank space.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@261
PS10, Line 261: last_space_check_
How about change it to expiry meaning, then we don't need to add FLAGS_fs_data_dirs_available_space_cache_seconds every time.


http://gerrit.cloudera.org:8080/#/c/13975/10/src/kudu/fs/data_dirs.cc@1072
PS10, Line 1072:     shuffle(candidate_indices.begin(), candidate_indices.end(), default_random_engine(rng_.Next()));
Seems some redundant with L994~L1006, how about do some refactor?
Add a function:
Input: candidate_indices, rule(prefer space or tablet count)
Output: selected dir, candidate_indices remains



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 10
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 05:30:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 1:

1. Seleted the dir with more available space in tablets' dir group selecting when the two dirs have same tablets on them.
2. Seleted the dir with more available space when selecting dir for block creating.


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 1
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Thu, 08 Aug 2019 15:34:45 +0000
Gerrit-HasComments: No

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: (available disk space) / (tablets
             :   // number in the directory)
> Here is more complex if tablet were created but only start filling up. Unde
Oh I thought about this metrics again. It doesn't work like only use available space when create tablets with data writing soon because this metrics consider the existing tablets and it's not fair for the newly creating tablet. Just like what Andrew Wong said, this metrics doesn't have clear meaning.
I think it's not suitable here. But only consider available space can lead to hot directory when create a lot tablets once. I haven't figure out a better method to measure the directory when creating tablets so just hold the original strategy.
But I think it's clear for use available space when selecting directory for newly creating blocks :)



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 02 Aug 2019 12:58:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 12
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Wed, 14 Aug 2019 03:28:53 +0000
Gerrit-HasComments: No

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc@991
PS6, Line 991: 
             :   if (FLAGS_refresh_is_full_with_expired_only_for_testing) {
             :     // This currently should only be reached by disk_failure-itest.
             :     refresh_mode = DataDir::RefreshMode::EXPIRED_ONLY;
             :   }
> I prefer to change the EXPIRED_ONLY behaviors in RefreshIsFull such as igno
Yeah Andrew's proposal makes sense to me. We could probably also reduce the caching time from 30s to something like 5s if you're worried about holding onto a stale disk space value for too long. BTW, you might find https://github.com/apache/kudu/commit/2a802f9f376de5175170a933bc8c35154f6eda92 interesting.

FWIW, I don't find the test-only gflag terribly offensive if the alternatives are worse. Just make sure you tag it as HIDDEN.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Wed, 07 Aug 2019 04:23:46 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: vailable disk space of directory.
             :   // The resulting behavior f
> Emmmm, the available space is more clear to understand. Done
How will this behave during the creation of a very large table, or at the start of a rereplication storm? That is, how will it behave when we have to create many tablets all at once? Won't it bias all of those tablets' data dir groups towards the emptier data dirs? At least with the status quo, each created tablet affected the power-of-2 choice of the next created tablet (by increasing the number of tablets in the chosen data dirs); here the available disk space won't change until the tablets actually start filling up.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 02 Aug 2019 04:18:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc@991
PS6, Line 991: 
             :   if (FLAGS_refresh_is_full_with_expired_only_for_testing) {
             :     // This currently should only be reached by disk_failure-itest.
             :     refresh_mode = DataDir::RefreshMode::EXPIRED_ONLY;
             :   }
> I'm confused about why this is necessary. Why do we need this here?
1. FLAGS_refresh_is_full_with_expired_only_for_testing
That is necessary because DiskErrorITest.TestFailDuringScanWorkload(with DISK_FAILURE) wanted all tablets fail because DISK_FAILURE and then recovery. It only inject that diskerror in one dir. When the FlushMrs running backend try to write data it will get injected disk failure while preallocate space, so all tablets in that dir will fail. However, after my change I will really check dirs space so that "disk error dir" will not be selected, so all tablets will not fail. That why I have to change the behaviors when doing test.

2. Dirs in same disk have the same available space. In my docker machine it work like that. So if all dir located in the same disk, it do need try to select one which have fewer tablets :)



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 06 Aug 2019 05:43:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 8:

> (9 comments)
 > 
 > I'll look into the test failure some more. If I check out this
 > patch, I see it failing a small percentage (2-6%) of the time.

About the failure in DiskErrorITest.TestFailDuringScanWorkload. IIUC, it only inject the disk failure in data_dir[1], and after the reflush check change, it may be remove from the candidate dirs. So it may not trigger the failure and so the test will failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 8
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 09 Aug 2019 03:39:23 +0000
Gerrit-HasComments: No

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#7).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
8 files changed, 102 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/7
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 7
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 10: Code-Review+2

LGTM!

Will let this sit for a little while in case Adar has further comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 10
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 03:14:37 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................

KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

When creating tablets:
If randomly selected two dirs have same tablets count, gives preference to the
one with more free space.

When creating blocks:
Randomly select two not full dir in tablet's data dir group and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Reviewed-on: http://gerrit.cloudera.org:8080/13975
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
9 files changed, 106 insertions(+), 86 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 13
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/fs/data_dirs.cc@1041
PS1, Line 1041: 
> As a 'score', how about make it float/doube type?
Nice, I used to swing in double and int QAQ, because I think the space is much large than tablet number and when it come to that situation each tablet can not have one bytes ordering has no meaning. Anyway double/float is better than int :)


http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h@52
PS1, Line 52: reque
> Nit: not your fault, could you change it to 'requested_bytes'
Done


http://gerrit.cloudera.org:8080/#/c/13975/1/src/kudu/util/env_util.h@56
PS1, Line 56: ,
> nit: ,
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Thu, 01 Aug 2019 09:02:50 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#12).

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................

KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

When creating tablets:
If randomly selected two dirs have same tablets count, gives preference to the
one with more free space.

When creating blocks:
Randomly select two not full dir in tablet's data dir group and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
9 files changed, 106 insertions(+), 86 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/12
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 12
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc@991
PS6, Line 991: 
             :   if (FLAGS_refresh_is_full_with_expired_only_for_testing) {
             :     // This currently should only be reached by disk_failure-itest.
             :     refresh_mode = DataDir::RefreshMode::EXPIRED_ONLY;
             :   }
> 1. FLAGS_refresh_is_full_with_expired_only_for_testing
I see. What if we took a slightly different approach and:
- continue to _always_ use EXPIRED_ONLY here. That way, we hit the disk error when we expect to in the test, i.e. we won't check for available space when we create blocks, but we _will_ hit a disk error when we write to those blocks.
- update the EXPIRED_ONLY behavior to not take into account fullness, i.e. remove the is_full_ check for EXPIRED_ONLY. That way, we're not constantly running this space check, which might be expensive, especially if run for every block creation. AFAICT the RefreshIsFull() was originally written with fullness only in mind. Since we're trying to use it for more than just fullness, updating it to be "RefreshAvailableSpace()" or something might be worth doing.


BTW another way to avoid this specific check for disk_failure-itest would be to have disk_failure-itest fail the glob "/data/*data" so it matches only to *.data and *.metadata files, rather than the entire directory. That said, I would prefer we consider the proposal above.

Also curious if Adar agrees here.



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Wed, 07 Aug 2019 00:43:31 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc@a89
PS11, Line 89: 
Is this compatibility break okay? If someone was using this flag will it cause potential issues that it is now "missing"?



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 11
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 12:34:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2901: Consider the available space when selecting data dirs for tablets and blocks.
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/11/src/kudu/fs/data_dirs.cc@a89
PS11, Line 89: 
> Is this compatibility break okay? If someone was using this flag will it ca
Since I have changed the behaviors under this flag, so it is not suitable to keep the original name. And after change, this evolving flag need be suitably release-noted when release new versions :)



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 11
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:44:57 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#2).

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................

KUDU-2911: Consider the available space when selecting data dirs for tablets

Use AvailableByte / TabletNumbers to consider both the tablet number and the
available space when selecting data dirs for tablet. Measured by available
bytes when selecting dir for new blocks.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
4 files changed, 49 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/6/src/kudu/fs/data_dirs.cc@991
PS6, Line 991: 
             :   if (FLAGS_refresh_is_full_with_expired_only_for_testing) {
             :     // This currently should only be reached by disk_failure-itest.
             :     refresh_mode = DataDir::RefreshMode::EXPIRED_ONLY;
             :   }
> I see. What if we took a slightly different approach and:
I prefer to change the EXPIRED_ONLY behaviors in RefreshIsFull such as ignore the is_full_ condition and focus on expired time. I will ask adr for advices :)



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Wed, 07 Aug 2019 02:32:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2911: Consider the available space when selecting data dirs for tablets

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13975 )

Change subject: KUDU-2911: Consider the available space when selecting data dirs for tablets
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: more avai
> We want a higher score though. Maybe reword that to clarify this.
Done


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.h@470
PS3, Line 470: vailable disk space of directory.
             :   // The resulting behavior f
> It's great that this metric tries to take into account disk space _and_ tab
Emmmm, the available space is more clear to understand. Done


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@988
PS3, Line 988:     int uuid_idx = (*group_uuid_indices)[i];
> Using EXPIRED_ONLY means that we will only refresh the available bytes if t
Thanks a lot. I have restored it to the original. This made me very troubled yesterday and I always think it may be something wrong with the lock.


http://gerrit.cloudera.org:8080/#/c/13975/3/src/kudu/fs/data_dirs.cc@1047
PS3, Line 1047: ata_dirs_, e.firs
> nit: would be clearer to name this something like candidates_index_and_scor
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 02 Aug 2019 04:03:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] Consider the available space when selecting data dirs for blocks.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Yingchun Lai, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13975

to look at the new patch set (#9).

Change subject: Consider the available space when selecting data dirs for blocks.
......................................................................

Consider the available space when selecting data dirs for blocks.

Randomly select two not full dir in tablet's data dir group  and then choose
the dir which has more free space.

Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
9 files changed, 103 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/13975/9
-- 
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 9
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>