You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org> on 2017/06/20 18:42:05 UTC

[Impala-ASF-CR] IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type

Bikramjeet Vig has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7232

Change subject: IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type
......................................................................

IMPALA-5240: Allow configuration of number of disk I/O threads based
on disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for flash disks using "--num_threads_per_flash_disk"
and for rotational disks using "--num_threads_per_rotational_disk"
as startup parameters.

Testing:
Tested manually. Unfortunately, there is no way using DiskIoMgr class
to confirm these variables were set, therefore no automated backend
tests were added.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
3 files changed, 36 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 9:

(2 comments)

Overall looks good to me, just a couple of minor things.

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS9, Line 53: string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads"
            :     " per rotational disk. Has priority over num_threads_per_disk. If neither is set, "
            :     "defaults to $0 thread(s) per rotational disk", THREADS_PER_ROTATIONAL_DISK);
            : DEFINE_int32(num_io_threads_per_rotational_disk, 0,
            :     num_io_threads_per_rotational_disk_help_msg.c_str());
> hm I just played around with this and it doesn't actually work, which makes
You could maybe define the constants with a #define and then stringify the values:

https://gcc.gnu.org/onlinedocs/gcc-3.4.3/cpp/Stringification.html


PS9, Line 271: deafult_val
typo: default


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 11: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/1//COMMIT_MSG
Commit Message:

PS1, Line 7: IMPALA-5240: Allow configuration of number of disk I/O threads based
           : on disk type
long line, please keep to 90chars


Line 20: Tested manually. Unfortunately, there is no way using DiskIoMgr class
did you see how we test num_threads_per_disk in disk-io-mgr-test ? it'd be good to do something similar.

at a minimum setting both of the new vars to the same value to reproduce the same behavior we have, but exercising the new code paths.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS9, Line 53: string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads"
            :     " per rotational disk. Has priority over num_threads_per_disk. If neither is set, "
            :     "defaults to $0 thread(s) per rotational disk", THREADS_PER_ROTATIONAL_DISK);
            : DEFINE_int32(num_io_threads_per_rotational_disk, 0,
            :     num_io_threads_per_rotational_disk_help_msg.c_str());
@Matthew: as discussed, switched to a static const string instead which works as intended.
@Tim: I wanted to avoid using macro constants as they are also being used during member initialization in the constructor


PS9, Line 271: deafult_val
> typo: default
Done


http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61:     if (disk_id >= disks_.size()) return false;
> It shouldn't be in practice, but this is necessary for the tests which test
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 9: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS9, Line 53: string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads"
            :     " per rotational disk. Has priority over num_threads_per_disk. If neither is set, "
            :     "defaults to $0 thread(s) per rotational disk", THREADS_PER_ROTATIONAL_DISK);
            : DEFINE_int32(num_io_threads_per_rotational_disk, 0,
            :     num_io_threads_per_rotational_disk_help_msg.c_str());
hm I just played around with this and it doesn't actually work, which makes sense: the preprocessor expands the macro, but the string value doesn't get created until runtime.

Sorry for the noise, can you just revert and mention the constant. Please just put a TODO above l48 to say if the constants are updated, the help message should be updated as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/6/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

PS6, Line 1083: &client_buffer[0]
no idea where this came from, probably when i checked out this file from master but strangely even master doesn't have this.
reverting back in next patch ASAP


PS6, Line 1121: &client_buffer[0]
same as above


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#4).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

TODO (Request comments on this additional change):
Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never be
used. As a part of this effort, existing test cases in disk-io-mgr-test
were identified that exploited this bug and hence have been fixed in
this commit. Moreover, after this fix, if num_disks is set to more than
the number of logical disks then it will default to max available disks
and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 393 insertions(+), 341 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7232/1//COMMIT_MSG
Commit Message:

PS1, Line 7: IMPALA-5240: Allow configuration of number of disk I/O threads based
           : on disk type
> long line, please keep to 90chars
Done


Line 20: Tested manually. Unfortunately, there is no way using DiskIoMgr class
> did you see how we test num_threads_per_disk in disk-io-mgr-test ? it'd be 
Unfortunately there are no tests that check if num_threads_per_disk has been set correctly. Also, I have not been able to find any indicators in behavior of this class that we can check when these parameters are changed.
@Tim, do you have any thoughts on this? Thanks!


http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 47: num_threads_per_rotational_disk
> Maybe io_threads instead of threads_ to be more specific. We can leave the 
Done


PS1, Line 50: num_threads_per_flash_disk
> Not your change but we probably shouldn't call this "flash", since that's a
Done


http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 637:   ///  - threads_per_rotational_disk: number of read threads to create per rotational
             :   ///    disk. This is also the max queue depth.
             :   ///  - threads_per_flash_disk: number of read threads to create per flash disk. This is
             :   ///    also the max queue depth.
             :   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size, int threads_per_rotational_disk = 0,
             :       int threads_per_flash_disk = 0)
> I believe this constructor only exists for the backend tests. I think we sh
I think the best thing would be to remove them because I don't think we have use for any test that is specifically aimed at the type of disk used.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 7:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/7232/7//COMMIT_MSG
Commit Message:

PS7, Line 24: TODO (Request comments on this additional change):
> remove TODO line and update the message below to indicate this all works, t
Done


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/runtime/disk-io-mgr-stress.cc
File be/src/runtime/disk-io-mgr-stress.cc:

Line 75:           MIN_READ_BUFFER_SIZE, MAX_READ_BUFFER_SIZE));
> nit 4 spaces tab
Done


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 48:     " disk. Has priority over num_threads_per_disk.");
> It'd be good to indicate what this defaults to if this and num_threads_per_
Done


Line 51: DEFINE_int32(num_io_threads_per_solid_state_disk, 0, "Number of I/O threads per solid"
> same
Done


Line 262: static inline int GetFirstPositiveVal(const int first_val, const int second_val,
> 1 line comment why this exists
Done


PS7, Line 273: FLASH
> inconsistent FLASH vs ssd naming
Done


PS7, Line 284: &&
> this should be ||
Done


PS7, Line 375: e
> add space
Done


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 56: 
> add a comment that this returns true if the disk w/ disk_id exists and is a
Done


PS7, Line 59:     //TODO: temporarily removed DCHECK due to an issue tracked in IMPALA-5574, put it back
            :     // after its resolved
> remove, I don't think this needs to be temporary. If someone wants to check
I believe having the DCHECK back makes it consistent with the semantics of this method, like if is_rotational is false, it should mean its another type of disk, but calling it with a non-existent disk_id should be illegal and should be caught by the DCHECK (like it also is in device_name())


PS7, Line 64:   
> nit remove whitespace
Done


PS7, Line 80:  
> remove trailing whitespace
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7232/4//COMMIT_MSG
Commit Message:

Line 34: 
Comment that we are changing the tests to explicitly set num_disks=1 for now because the test code isn't yet aware of how many disks are actually on the test system and the test code also isn't updated to actually use multiple disks.
We should also have the JIRA to reference.


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr-stress-test.cc
File be/src/runtime/disk-io-mgr-stress-test.cc:

Line 31: const int NUM_DISKS = 1;
comment why we are setting this to 1 for now, i.e. that the test code isn't yet aware of how many disks are actually on the test system and the test code also isn't updated to actually use multiple disks.

We should also have the JIRA to reference.


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

Line 212:   const int num_disks = 1;
leave a TODO with the JIRA where we'll change this, same for the other cases below.


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS4, Line 87: // Rotational disks should have 1 thread per disk to minimize seeks.  Non-rotational
            : // don't have this penalty and benefit from multiple concurrent IO requests.
            : static const int THREADS_PER_ROTATIONAL_DISK = 1;
            : static const int THREADS_PER_FLASH_DISK = 8;
I don't think there's any reason to keep these separately and then check if FLAGS_x are non-zero. Let's just make these the default value of those flags, then we can simplify the logic in Init()


PS4, Line 263:     num_io_threads_per_rotational_disk_(FLAGS_num_io_threads_per_rotational_disk),
             :     num_io_threads_per_solid_state_disk_(FLAGS_num_io_threads_per_solid_state_disk),
I think we can configure num_io_threads_per_rotational_disk_ and num_io_threads_per_solid_state_disk_ based on the flags once, rather than having to have the logic in Init() multiple times.

E.g.

if (FLAGS_num_io_threads_per_rotational_disk > 0) {
  num_io_threads_per_rotational_disk_ = FLAGS_num_io_threads_per_rotational_disk;
} else if (FLAGS_num_io_threads_per_disk > 0) {
  num_io_threads_per_rotational_disk_ = FLAGS_num_io_threads_per_disk;
} else {
  num_io_threads_per_rotational_disk_ = DEFAULT_NUM_PER_ROT_DISK;
}

same for ssd.


Line 278:         " disks";
To be consistent we should also warn if the value is negative


PS4, Line 367:     } else if (num_io_threads_per_rotational_disk_ != 0 && DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = num_io_threads_per_rotational_disk_;
             :     } else if (num_io_threads_per_solid_state_disk_ != 0 && !DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = num_io_threads_per_solid_state_disk_;
             :     } else if (FLAGS_num_threads_per_disk != 0) {
             :       num_threads_per_disk = FLAGS_num_threads_per_disk;
             :     } else if (DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = THREADS_PER_ROTATIONAL_DISK;
             :     } else {
             :       num_threads_per_disk = THREADS_PER_FLASH_DISK;
             :     }
After my suggestion in the constructor this becomes greatly simplified.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#7).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

TODO (Request comments on this additional change):
Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used. As a part of this effort, existing test cases in
disk-io-mgr-test were identified that would fail after this bug fix.
This issue will be tracked in IMPALA-5574, but until then,
DiskInfo::is_rotational() had to be changed to make these tests run as
before. Moreover, after this fix, if num_disks is set to more than
the number of logical disks then it will default to max available disks
and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 108 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 8:

(7 comments)

looks good, very close now, thanks

http://gerrit.cloudera.org:8080/#/c/7232/8//COMMIT_MSG
Commit Message:

PS8, Line 24: TODO (Request comments on this additional change):
? remove


PS8, Line 29: used
, by the actual code. They were actually being used by the test code.

see disk-io-mgr-test:

  DiskIoMgr::ScanRange* InitRange(int num_buffers, const char* file_path, int offset,
      int len, int disk_id, int64_t mtime, void* meta_data = NULL,
      bool is_cached = false)

the disk_id is explicitly set.


PS8, Line 29: As a part of this effort, existing test cases in
            : disk-io-mgr-test were identified that exploited this bug and will be
            : tracked in IMPALA-5574.
This is overstating the problem. We need to call out IMPALA-5574, it will just distract from the core issue being fixed here. Please just say:

The tests have been updated to use the new DiskIoMgr test constructor, but retain the same behavior as before.


http://gerrit.cloudera.org:8080/#/c/7232/8/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS8, Line 48: 1
If we change the constants later, we'll probably forget to update this. You can just define the static const ints above this and use Substitute to print them in to the msg.


Line 54:     "8 threads per solid state disk");
same


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/util/disk-info.h
File be/src/util/disk-info.h:

PS7, Line 59:   static bool is_rotational(int disk_id) {
            :     DCHECK_GE(disk_id, 0)
> I believe having the DCHECK back makes it consistent with the semantics of 
I think that these new semantics make sense, but if you really wanted to handle this differently, a DCHECK isn't the right way to address the problem that the API allows you to give it bad values without a way to report them: i.e. a better API is something like

enum DiskType {...};

Status GetDiskType(int disk_id, DiskType* out);

Then this can return an error (e.g. disk doesn't exist, or maybe we just can't tell). Avoiding bool would also be more clear.

Anyway, we won't actually do this now.


http://gerrit.cloudera.org:8080/#/c/7232/8/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61:     if(disk_id >= disks_.size()) return false;
space after if


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#5).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

TODO (Request comments on this additional change):
Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used. As a part of this effort, existing test cases in
disk-io-mgr-test were identified that exploited this bug and hence have
been fixed by explicitly setting num_disks=1 for now because the test
code isn't yet aware of how many disks are actually on the test system.
Test code will be updated to use multiple disks as a part of
IMPALA-5574. Moreover, after this fix, if num_disks is set to more than
the number of logical disks then it will default to max available disks
and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 410 insertions(+), 345 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7232

to look at the new patch set (#10).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately. Existing tests have been updated to use the new
DiskIoMgr test constructor, but retain the same behavior as before.

Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used by the actual code. Moreover, after this fix, if num_disks is
set to more than the number of logical disks then it will default to
max available disks and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 124 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 637:   ///  - threads_per_rotational_disk: number of read threads to create per rotational
             :   ///    disk. This is also the max queue depth.
             :   ///  - threads_per_flash_disk: number of read threads to create per flash disk. This is
             :   ///    also the max queue depth.
             :   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size, int threads_per_rotational_disk = 0,
             :       int threads_per_flash_disk = 0)
> You can remove threads_per_disk and instead change the tests to pass the sa
You can also change DiskQueue to know how many threads it has, and update DiskIoMgr::DebugString() to include that info in the output when it iterates over the DiskQueues. That'd be easy to check from the test.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 11:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/852/

-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 10: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/10/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 53: static const string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads"
Long line


Line 60: static const string num_io_threads_per_solid_state_disk_help_msg = Substitute("Number of I/O threads "
Long line.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#9).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately. Existing tests have been updated to use the new
DiskIoMgr test constructor, but retain the same behavior as before.

Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used by the actual code. Moreover, after this fix, if num_disks is
set to more than the number of logical disks then it will default to
max available disks and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 122 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Hello Matthew Jacobs, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7232

to look at the new patch set (#11).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately. Existing tests have been updated to use the new
DiskIoMgr test constructor, but retain the same behavior as before.

Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used by the actual code. Moreover, after this fix, if num_disks is
set to more than the number of logical disks then it will default to
max available disks and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 125 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61:     if (disk_id >= disks_.size()) return false;
> why is this the right thing to assume? why should disk_id ever be more than
It shouldn't be in practice, but this is necessary for the tests which test having # disk queues greater than the number of physical disks. Bikram filed a separate JIRA to make the tests more realistic. We should put a TODO with that JIRA number here to switch this back to a DCHECK later.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/7232/4//COMMIT_MSG
Commit Message:

Line 34: 
> Comment that we are changing the tests to explicitly set num_disks=1 for no
Done


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr-stress-test.cc
File be/src/runtime/disk-io-mgr-stress-test.cc:

Line 31: const int NUM_DISKS = 1;
> comment why we are setting this to 1 for now, i.e. that the test code isn't
I have added a TODO here as well as per your other comment in disk-io-mgr-test


http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

Line 1142:   const int num_io_threads_per_rotational_or_ssd = 2;
> const
Done


PS3, Line 1165: 
> weird indentation, just put the operator on the previous line and use 4 spa
Done


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

Line 212:   const int num_disks = 1;
> leave a TODO with the JIRA where we'll change this, same for the other case
Done


http://gerrit.cloudera.org:8080/#/c/7232/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS4, Line 87: // Rotational disks should have 1 thread per disk to minimize seeks.  Non-rotational
            : // don't have this penalty and benefit from multiple concurrent IO requests.
            : static const int THREADS_PER_ROTATIONAL_DISK = 1;
            : static const int THREADS_PER_FLASH_DISK = 8;
> I don't think there's any reason to keep these separately and then check if
We need the default value of the flags to be zero in order to figure out if they were set or not. That is needed to follow the priority order of :
1. FLAG_num_io_threads_per_rotational_disk
2. num_threads_per_disk
3. THREADS_PER_ROTATIONAL_DISK


PS4, Line 263:     num_io_threads_per_rotational_disk_(FLAGS_num_io_threads_per_rotational_disk),
             :     num_io_threads_per_solid_state_disk_(FLAGS_num_io_threads_per_solid_state_disk),
> I think we can configure num_io_threads_per_rotational_disk_ and num_io_thr
Done


Line 278:         " disks";
> To be consistent we should also warn if the value is negative
Done


PS4, Line 367:     } else if (num_io_threads_per_rotational_disk_ != 0 && DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = num_io_threads_per_rotational_disk_;
             :     } else if (num_io_threads_per_solid_state_disk_ != 0 && !DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = num_io_threads_per_solid_state_disk_;
             :     } else if (FLAGS_num_threads_per_disk != 0) {
             :       num_threads_per_disk = FLAGS_num_threads_per_disk;
             :     } else if (DiskInfo::is_rotational(i)) {
             :       num_threads_per_disk = THREADS_PER_ROTATIONAL_DISK;
             :     } else {
             :       num_threads_per_disk = THREADS_PER_FLASH_DISK;
             :     }
> After my suggestion in the constructor this becomes greatly simplified.
Done


http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS3, Line 631: ///
> comment this is for testing only
Done


PS3, Line 837: o the disk.
> the naming of this should probably be num_io_threads_per_disk_ to be consis
Went ahead with your latter suggestion and removed it altogether


http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/util/thread.cc
File be/src/util/thread.cc:

PS3, Line 335: const
> nit: space after const
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/10/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 53: static const string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads"
> Long line
Done


Line 60: static const string num_io_threads_per_solid_state_disk_help_msg = Substitute("Number of I/O threads "
> Long line.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 637:   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size);
             : 
             :   /// Create DiskIoMgr with default configs.
             :   DiskIoMgr();
             : 
             :   /// Clean up all threads and resour
> You can remove threads_per_disk and instead change the tests to pass the sa
Done


PS1, Line 637:   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size);
             : 
             :   /// Create DiskIoMgr with default configs.
             :   DiskIoMgr();
             : 
             :   /// Clean up all threads and resour
> You can also change DiskQueue to know how many threads it has, and update D
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 3:

(7 comments)

nice! thanks for adding the test, I think it looks good. I had a few small additional comments, and I realized I think we can simplify disk-io-mgr by removing one of the members.

http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

Line 1142:   int num_io_threads_for_remote_disks = FLAGS_num_remote_hdfs_io_threads
const


PS3, Line 1160: 0
can you also set this to something else, e.g. 100 to show that it gets overwritten by the other parameters?


PS3, Line 1165: == 
weird indentation, just put the operator on the previous line and use 4 spaces to tab


http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS3, Line 631: ///
comment this is for testing only


PS3, Line 638:   ///  - threads_per_rotational_disk: number of read threads to create per rotational
             :   ///    disk. This is also the max queue depth.
             :   ///  - threads_per_solid_state_disk: number of read threads to create per solid state
             :   ///    disk. This is also the max queue depth.
these overwrite threads_per_disk


PS3, Line 837: num_threads_per_disk_
the naming of this should probably be num_io_threads_per_disk_ to be consistent

or better yet, I think we can remove this member altogether because we can just set num_io_threads_per_{rotational,ssd}_ to the value of FLAGS_num_threads_per_disk if the finer grained flags weren't set.


http://gerrit.cloudera.org:8080/#/c/7232/3/be/src/util/thread.cc
File be/src/util/thread.cc:

PS3, Line 335: const
nit: space after const


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#4).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 393 insertions(+), 341 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5240: Allow configuration of number of disk I/O threads based on disk type
......................................................................


Patch Set 1:

(3 comments)

In addition to Matt's comments.

http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 47: num_threads_per_rotational_disk
Maybe io_threads instead of threads_ to be more specific. We can leave the existing num_threads_per_disk option as-is but it's probably better to match the below options.


PS1, Line 50: num_threads_per_flash_disk
Not your change but we probably shouldn't call this "flash", since that's a specific type of solid state disk. Maybe "solid_state_disk" or "ssd"?


http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 637:   ///  - threads_per_rotational_disk: number of read threads to create per rotational
             :   ///    disk. This is also the max queue depth.
             :   ///  - threads_per_flash_disk: number of read threads to create per flash disk. This is
             :   ///    also the max queue depth.
             :   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size, int threads_per_rotational_disk = 0,
             :       int threads_per_flash_disk = 0)
I believe this constructor only exists for the backend tests. I think we should either remove the unused arguments or pass them in for at least one test. I think the second is better but I don't feel strongly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#6).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

TODO (Request comments on this additional change):
Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used. As a part of this effort, existing test cases in
disk-io-mgr-test were identified that would fail after this bug fix.
This issue will be tracked in IMPALA-5574, but until then,
DiskInfo::is_rotational() had to be changed to make these tests run as
before. Moreover, after this fix, if num_disks is set to more than
the number of logical disks then it will default to max available disks
and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 110 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS9, Line 631: This constructor is only used for testing.
it's unfortunate that we have this at all. you don't have to fix it in this change, but I wonder why we can't just have the test set the FLAG_ variables, like in the real impalad environment.


http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61:     if (disk_id >= disks_.size()) return false;
why is this the right thing to assume? why should disk_id ever be more than the number of disks?


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#2).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Tested manually. Unfortunately, there is no way using DiskIoMgr class
to confirm these variables were set, therefore no automated backend
tests were added.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
3 files changed, 28 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#3).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
6 files changed, 78 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has uploaded a new patch set (#8).

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................

IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately

TODO (Request comments on this additional change):
Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used. As a part of this effort, existing test cases in
disk-io-mgr-test were identified that exploited this bug and will be
tracked in IMPALA-5574. Moreover, after this fix, if num_disks is set to
more than the number of logical disks then it will default to max
available disks and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 113 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


IMPALA-5240: Allow config of number of disk I/O threads per disk type

Currently Impala defaults to 8 threads per flash disk and 1 thread per
rotational disk. This can be overridden with --num_threads_per_disk,
but that sets the thread count independent of disk type.

This would allow control of the number of disk I/O threads spawned
independently for solid state disks using
"--num_io_threads_per_solid_state_disk" and for rotational disks using
"--num_io_threads_per_rotational_disk" as startup parameters.

Testing:
Added backend tests that verify if "num_threads_per_disk",
"num_io_threads_per_solid_state_disk" and
"num_io_threads_per_rotational_disk" are used to spawn threads
appropriately. Existing tests have been updated to use the new
DiskIoMgr test constructor, but retain the same behavior as before.

Additionally made some changes to fix a bug, where impala would crash
if num_disks are set more than the number of logical disks on system and
num_threads_per_disk is not set. This bug also lets the user create more
disk queues than the num of logical disks which would eventually never
be used by the actual code. Moreover, after this fix, if num_disks is
set to more than the number of logical disks then it will default to
max available disks and log a warning.

Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Reviewed-on: http://gerrit.cloudera.org:8080/7232
Reviewed-by: Matthew Jacobs <mj...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/util/disk-info.h
M be/src/util/thread.cc
M be/src/util/thread.h
M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
8 files changed, 125 insertions(+), 50 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 7:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/7232/7//COMMIT_MSG
Commit Message:

PS7, Line 24: TODO (Request comments on this additional change):
remove TODO line and update the message below to indicate this all works, the only future work is to consider changing the BE tests to actually have files on different disks, but that's basically unrelated to your change.


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/runtime/disk-io-mgr-stress.cc
File be/src/runtime/disk-io-mgr-stress.cc:

Line 75:           MIN_READ_BUFFER_SIZE, MAX_READ_BUFFER_SIZE));
nit 4 spaces tab


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 48:     " disk. Has priority over num_threads_per_disk.");
It'd be good to indicate what this defaults to if this and num_threads_per_disk isn't set.


Line 51: DEFINE_int32(num_io_threads_per_solid_state_disk, 0, "Number of I/O threads per solid"
same


Line 262: static inline int GetFirstPositiveVal(const int first_val, const int second_val,
1 line comment why this exists


PS7, Line 273: FLASH
inconsistent FLASH vs ssd naming


PS7, Line 284: &&
this should be ||


PS7, Line 375: e
add space


http://gerrit.cloudera.org:8080/#/c/7232/7/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 56: 
add a comment that this returns true if the disk w/ disk_id exists and is a rotational disk, is false otherwise.


PS7, Line 59:     //TODO: temporarily removed DCHECK due to an issue tracked in IMPALA-5574, put it back
            :     // after its resolved
remove, I don't think this needs to be temporary. If someone wants to check the number of disks they can use num_disks().


PS7, Line 64:   
nit remove whitespace


PS7, Line 80:  
remove trailing whitespace


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7232/5/be/src/runtime/disk-io-mgr-test.cc
File be/src/runtime/disk-io-mgr-test.cc:

Line 909:         DiskIoMgr io_mgr(num_disks, threads_per_disk, threads_per_disk, MIN_BUFFER_SIZE, MAX_BUFFER_SIZE);
long line


http://gerrit.cloudera.org:8080/#/c/7232/5/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS5, Line 262: CheckAndGetFlagInOrder
how about calling this

GetFirstPositiveVal()?


PS5, Line 376: is_rotational
as we discussed, let's have this return false if i is outside the bounds of the number of disks, then we can revert the test changes so that we have the same coverage as before.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/5/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS5, Line 262: CheckAndGetFlagInOrder
> how about calling this
Done


PS5, Line 376: is_rotational
> as we discussed, let's have this return false if i is outside the bounds of
Made changes as suggested and reverted corresponding changes in the backend tests as well


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 11: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7232/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 637:   ///  - threads_per_rotational_disk: number of read threads to create per rotational
             :   ///    disk. This is also the max queue depth.
             :   ///  - threads_per_flash_disk: number of read threads to create per flash disk. This is
             :   ///    also the max queue depth.
             :   DiskIoMgr(int num_disks, int threads_per_disk, int min_buffer_size,
             :       int max_buffer_size, int threads_per_rotational_disk = 0,
             :       int threads_per_flash_disk = 0)
> I think the best thing would be to remove them because I don't think we hav
You can remove threads_per_disk and instead change the tests to pass the same test value to both threads_per_rotational_disk and threads_per_flash_disk. Then you'll exercise the new code paths and get the same behavior that the tests had previously, so it shouldn't change.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes