You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Joe McDonnell (Code Review)" <ge...@cloudera.org> on 2018/02/21 16:55:30 UTC

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9371


Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................

IMPALA-6549: Enable file handle cache by default

The file handle cache was disabled by default
due to two HDFS issues: HDFS-12528 and HDFS-14872.
Both have been fixed and the CDH components in
the toolchain include both fixes.

This reenables the file handle cache by default.

Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
---
M be/src/runtime/io/disk-io-mgr.cc
1 file changed, 2 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/9371/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9371

to look at the new patch set (#2).

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................

IMPALA-6549: Enable file handle cache by default

The file handle cache was disabled by default
due to two HDFS issues: HDFS-12528 and HDFS-14872.
Both have been fixed and the CDH components in
the toolchain include both fixes.

This reenables the file handle cache by default.

Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
---
M be/src/runtime/io/disk-io-mgr.cc
1 file changed, 7 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/9371/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................

IMPALA-6549: Enable file handle cache by default

The file handle cache was disabled by default
due to two HDFS issues: HDFS-12528 and HDFS-14872.
Both have been fixed and the CDH components in
the toolchain include both fixes.

This reenables the file handle cache by default.

Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Reviewed-on: http://gerrit.cloudera.org:8080/9371
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/runtime/io/disk-io-mgr.cc
1 file changed, 7 insertions(+), 9 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1992/


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:56:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 1:

(1 comment)

> (1 comment)
 > 
 > > (1 comment)
 > >
 > > > (1 comment)
 > > >
 > > > Do you have a list of tests you plan to run?
 > >
 > > I ran core tests with no issues. This will get incorporated into
 > > the next run of performance tests. Separately, I plan to do a
 > > warmup test where I run with a table with a large number of files
 > > with a cold cache.
 > >
 > 
 > And the goal of the warmup test be to determine whether IMPALA-6364
 > is a problem in the cache miss case, is that right?

Yes, the goal of the test is to get some idea of the extent of any contention during warm up. This might provide input into an appropriate value for num_file_handle_cache_partitions.

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc@109
PS1, Line 109: 21600
> Maybe incorporate some of that context into the comment?
Added this information to the comment.



-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 00:35:00 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3:

GVO hit IMPALA-6532


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:54:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 05:02:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 1: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Feb 2018 18:23:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 1:

(1 comment)

> (1 comment)
 > 
 > Do you have a list of tests you plan to run?

I ran core tests with no issues. This will get incorporated into the next run of performance tests. Separately, I plan to do a warmup test where I run with a table with a large number of files with a cold cache.

Since remote files don't participate in the cache, I don't forsee any issues with S3 or other platforms.

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc@109
PS1, Line 109: 21600
> how was that chosen?
This is a good point. The concern for aging out file handles was about disk space getting freed up. This is controlled by the fs.trash.interval. We wanted the eviction time to be less than the typical fs.trash.interval so that the disk space usage would not be impacted by the cache. This is reverting back to what we originally merged in IMPALA-5352.

I checked some Cloudera deployments, and they default to 1 day for fs.trash.interval. HDP has a default of 6 hours. I think there is a range of appropriate defaults.



-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 00:05:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 2: Code-Review+2

Thanks. I'm fine with merging it as long as those tests are run soon.


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 18:30:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 1:

(1 comment)

Do you have a list of tests you plan to run?

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc@109
PS1, Line 109: 21600
how was that chosen?



-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Feb 2018 23:41:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3: Code-Review+2

Rebased. Carry +2


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 19:21:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1986/


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 19:21:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 1:

(1 comment)

> (1 comment)
 > 
 > > (1 comment)
 > >
 > > Do you have a list of tests you plan to run?
 > 
 > I ran core tests with no issues. This will get incorporated into
 > the next run of performance tests. Separately, I plan to do a
 > warmup test where I run with a table with a large number of files
 > with a cold cache.
 > 

And the goal of the warmup test be to determine whether IMPALA-6364 is a problem in the cache miss case, is that right?

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9371/1/be/src/runtime/io/disk-io-mgr.cc@109
PS1, Line 109: 21600
> This is a good point. The concern for aging out file handles was about disk
Maybe incorporate some of that context into the comment?



-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 00:12:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
......................................................................


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1986/


-- 
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:53:18 +0000
Gerrit-HasComments: No