You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Matthew Jacobs (Code Review)" <ge...@cloudera.org> on 2017/09/11 23:46:19 UTC

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN resourcemanager

Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8035

Change subject: IMPALA-5920: Remove admission control dependency on YARN resourcemanager
......................................................................

IMPALA-5920: Remove admission control dependency on YARN resourcemanager

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0.
The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
17 files changed, 1,804 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Philip Zeyliger (Code Review)" <ge...@cloudera.org>.
Philip Zeyliger has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8035/3/common/yarn-extras/pom.xml
File common/yarn-extras/pom.xml:

Line 23:   <groupId>org.apache.impala</groupId>
I think we should move the implementing classes here to "org.apache.impala" as well, as opposed to leaving them in org.apache.hadoop. This is a fairly mechanical change, but it'll make it explicit that we've copied these over, and it'll avoid the unfortunate bugs we'd see if two copies of one of the classes appeared on the CLASSPATH, and we didn't know which one was active.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has submitted this change and it was merged.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0 in CDH, e.g.
see [1]. The code is added in packages 'org.apache.impala.*'
instead of 'org.apache.yarn.*'. Some code could be copied
as-is, those files are marked with the comment:
//YARNUTIL: VANILLA
Files that required some modifications are marked with:
//YARNUTIL: MODIFIED
Or, if all code except a dummy interface could be added:
//YARNUTIL: DUMMY IMPL

The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

1: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd_cdh_package_tarball_512.html

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Reviewed-on: http://gerrit.cloudera.org:8080/8035
Reviewed-by: Matthew Jacobs <mj...@apache.org>
Tested-by: Matthew Jacobs <mj...@apache.org>
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/main/java/org/apache/impala/util/RequestPoolService.java
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
18 files changed, 1,690 insertions(+), 17 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 6: Code-Review+2

I'll count the +1s from Tim and Zach, and no further comments from Phil as enough to commit this change.

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Zach Amsden (Code Review)" <ge...@cloudera.org>.
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 4:

(17 comments)

Looks good but I have a few comments - there is a lot more that could be pruned.  Let me know which if any you might pursue and I can give a +1

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 61:   private final float queueMaxAMShareDefault;
Impala doesn't make use of YARN's concept of ApplicationMasters.  Why not kill this.


Line 64:   private final Map<String, Map<QueueACL, AccessControlList>> queueAcls;
Useless (just SUBMIT_APPLICATIONS)


Line 69:   private final Map<String, Long> minSharePreemptionTimeouts;
Unused by Impala


Line 74:   private final Map<String, Long> fairSharePreemptionTimeouts;
Also unused by Impala


Line 80:   private final Map<String, Float> fairSharePreemptionThresholds;
Also unused


Line 82:   private final Map<String, SchedulingPolicy> schedulingPolicies;
The only policy is DEFAULT_POLICY so this is also not needed.


Line 84:   private final SchedulingPolicy defaultSchedulingPolicy;
Ditto


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 30: public class AllocationConfigurationException extends Exception {
The only point of this class is capturing configuration exceptions and throwing, but if you remove most of the throws, maybe this class is not necessary either.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
103-165 duplicate code and functionality provided by impala.util.FileWatchService (minus the ALLOC_RELOAD_WAIT_MS check); we only needed this code before because this is the way yarn worked internally, but it would be better to have just one implementation here - maybe switch this to use FileWatchService.


Line 171:    * classpath, but loaded like a regular File.
This seems like unnecessary functionality that we could deprecate if desired.


Line 223:     Map<String, Long> minSharePreemptionTimeouts = new HashMap<>();
Preemption configuration does nothing in Impala; could also be killed.


Line 226:     Map<String, Map<QueueACL, AccessControlList>> queueAcls = new HashMap<>();
Impala only uses a single queueACL, SUBMIT_APPLICATIONS.  We could pull the entire QueueACL class into impala to get rid of one more Yarnish dependency.


Line 323:         } else if ("defaultQueueSchedulingPolicy".equals(element.getTagName())
Here's the useless code and exception throwing.


Line 533:   public interface Listener {
This can probably go away too.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java:

Line 70:     Pattern pattern = Pattern.compile("(\\d+)(\\.\\d*)?\\s*" + units);
This seems a pretty inefficient way to look up resources. This would be a candidate for memoizing if this is called often with the same 'units'.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java:

Line 25:     return DEFAULT_POLICY;
This whole class is pretty useless and in Impala's case, only serves to throw exceptions when the SchedulingPolicy can't be parsed as a valid policy.

Since that is now impossible, maybe just remove the class?


http://gerrit.cloudera.org:8080/#/c/8035/4/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
File fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java:

Line 124:         new Configuration().get(HADOOP_SECURITY_AUTH_TO_LOCAL, "DEFAULT"));
Not clear to me why this is needed now.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 7:

> The rat-check job actually failed because of a file introduced here
 > - it broke my GVO. https://jenkins.impala.io/job/rat-check/1929/

Sorry about that - I haven't seen the RAT check fail before. Thanks for updating the rat exclude list.

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/8035

to look at the new patch set (#4).

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................

IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0 in CDH, e.g.
see [1]. The code is added in packages 'org.apache.impala.*'
instead of 'org.apache.yarn.*'. Some code could be copied
as-is, those files are marked with the comment:
//YARNUTIL: VANILLA
Files that required some modifications are marked with:
//YARNUTIL: MODIFIED
Or, if all code except a dummy interface could be added:
//YARNUTIL: DUMMY IMPL

The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

1: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd_cdh_package_tarball_512.html

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/main/java/org/apache/impala/util/RequestPoolService.java
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
18 files changed, 1,805 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 4:

(13 comments)

Thanks, Zach - some good observations. I made a bunch of the changes for additional cleanup that you suggested, but some I held off on because I don't wanna make diffing the more complex classes harder later. We can always choose to simplify this further later, if we want. Let me know if you feel strongly about it.

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 61:   private final float queueMaxAMShareDefault;
> Impala doesn't make use of YARN's concept of ApplicationMasters.  Why not k
Done


Line 64:   private final Map<String, Map<QueueACL, AccessControlList>> queueAcls;
> Useless (just SUBMIT_APPLICATIONS)
I don't think it's worth changing this one otherwise I'll have to modify a bunch of code that I don't wanna bother adding extra tests for.


Line 69:   private final Map<String, Long> minSharePreemptionTimeouts;
> Unused by Impala
Done


Line 74:   private final Map<String, Long> fairSharePreemptionTimeouts;
> Also unused by Impala
Done


Line 80:   private final Map<String, Float> fairSharePreemptionThresholds;
> Also unused
Done


Line 82:   private final Map<String, SchedulingPolicy> schedulingPolicies;
> The only policy is DEFAULT_POLICY so this is also not needed.
Done


Line 84:   private final SchedulingPolicy defaultSchedulingPolicy;
> Ditto
Done


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 30: public class AllocationConfigurationException extends Exception {
> The only point of this class is capturing configuration exceptions and thro
We still use the resource parsing code that throws this, e.g. in FairSchedulerConfiguration. I don't think we should change the API of those functions.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
> 103-165 duplicate code and functionality provided by impala.util.FileWatchS
Yeah you're right, though I'm on the fence about how much we should doctor this right now given we know this works and that we don't have great automated end-to-end test coverage.

Unless you feel strongly about cleaning up this now, I'd rather leave this as a cleanup opportunity for later (ramp up JIRA?) when we can add some additional tests which include modifying the allocation file.


Line 223:     Map<String, Long> minSharePreemptionTimeouts = new HashMap<>();
> Preemption configuration does nothing in Impala; could also be killed.
I think it may be worth leaving this code as-is even if it's doing extra work. The AllocationConfiguration will ignore the things we don't care about, but leaving this as-is will make diff-ing this later much easier.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java:

Line 70:     Pattern pattern = Pattern.compile("(\\d+)(\\.\\d*)?\\s*" + units);
> This seems a pretty inefficient way to look up resources. This would be a c
Agreed, though I don't think it's worth spending time improving this code. I'd rather not make diffing harder later. Do you feel strongly?


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java:

Line 25:     return DEFAULT_POLICY;
> This whole class is pretty useless and in Impala's case, only serves to thr
Per my other comments, I'd prefer not to change the parsing code too much yet, in which case I think it'll just be worth keeping this dummy impl here for now.


http://gerrit.cloudera.org:8080/#/c/8035/4/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
File fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java:

Line 124:         new Configuration().get(HADOOP_SECURITY_AUTH_TO_LOCAL, "DEFAULT"));
> Not clear to me why this is needed now.
I'm actually not sure why the testResolvePrincipalName() was working before.

RequestPoolService.resolveRequestPool() -> RPS.assignToPool() -> (impala) User.getShortName() -> (Hadoop) KerberosName.getShortName()

And KerberosName.getShortName() relies on there being some rule set, otherwise a NPE is thrown.

I think that there was some static initialization that just so happened to work before. That said, I think setting this is the right thing to do if we're going to resolve Kerberos principals.

Let me know what you think.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 7:

The rat-check job actually failed because of a file introduced here - it broke my GVO. https://jenkins.impala.io/job/rat-check/1929/

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8035/2//COMMIT_MSG
Commit Message:

Line 23: 'common/yarn-extras', taken from Hadoop 2.6.0.
> Is the original source code that you cribbed this from available publicly s
Done


http://gerrit.cloudera.org:8080/#/c/8035/2/common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 19: //YARNUTIL: MODIFIED
> Is there an easy way I can see the diff? If you point me to the hadoop sour
Yeah, based on the CDH source tarball, e.g.
https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd_cdh_package_tarball_512.html#cm_vd_cdh_package_tarball_512

I'll update the commit message.


http://gerrit.cloudera.org:8080/#/c/8035/2/fe/pom.xml
File fe/pom.xml:

Line 111:       <scope>system</scope>
> Is systemPath necessary? We seem to be able to resolve data-source-api abov
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8035/3/common/yarn-extras/pom.xml
File common/yarn-extras/pom.xml:

Line 23:   <groupId>org.apache.impala</groupId>
> I think we should move the implementing classes here to "org.apache.impala"
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 2:

(3 comments)

It's a bit unfortunate that we have to pull in this much code to get it to work, but it seems like you've already pruned out the unnecessary stuff.

http://gerrit.cloudera.org:8080/#/c/8035/2//COMMIT_MSG
Commit Message:

Line 23: 'common/yarn-extras', taken from Hadoop 2.6.0.
Is the original source code that you cribbed this from available publicly somewhere? Would be handy to diff.


http://gerrit.cloudera.org:8080/#/c/8035/2/common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 19: //YARNUTIL: MODIFIED
Is there an easy way I can see the diff? If you point me to the hadoop source this was derived from I can do it myself.


http://gerrit.cloudera.org:8080/#/c/8035/2/fe/pom.xml
File fe/pom.xml:

Line 111:       <scope>system</scope>
Is systemPath necessary? We seem to be able to resolve data-source-api above without specifying this.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Zach Amsden (Code Review)" <ge...@cloudera.org>.
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8035/5/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 23: import org.apache.hadoop.classification.InterfaceStability.Unstable;
These don't look pretty, since we own this now I guess they could disappear (along with the @Private @Unstable annotations below).


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 5:

(2 comments)

Thanks for the detailed review, Zach. If nobody has further feedback, I'll try to get this in soon.

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
> Can you file a JIRA?  This is actually a well compartmentalized ramp-up or 
https://issues.apache.org/jira/browse/IMPALA-5958


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java:

Line 57:   
> Is it worth fixing the trailing spaces so future diffs don't give the linte
These are their trailing spaces though - I figured leaving them in would make diffing easier later.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 6: Verified+1

The job actually passed, cleaning up the workspace failed

21:19:36 ERROR: Step ‘Delete workspace when build is done’ failed: Cannot delete workspace: remote file operation failed: /home/ubuntu at hudson.remoting.Channel@43c5313:ubuntu-16.04 (i-0f8cf68ee32aebe80): java.io.IOException: Unable to delete '/home/ubuntu'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.


21:19:36 Finished: SUCCESS

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1235/

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/8035

to look at the new patch set (#5).

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................

IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0 in CDH, e.g.
see [1]. The code is added in packages 'org.apache.impala.*'
instead of 'org.apache.yarn.*'. Some code could be copied
as-is, those files are marked with the comment:
//YARNUTIL: VANILLA
Files that required some modifications are marked with:
//YARNUTIL: MODIFIED
Or, if all code except a dummy interface could be added:
//YARNUTIL: DUMMY IMPL

The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

1: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd_cdh_package_tarball_512.html

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/impala/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/main/java/org/apache/impala/util/RequestPoolService.java
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
18 files changed, 1,690 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................

IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0.
The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
17 files changed, 1,804 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1235/

-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has uploaded a new patch set (#3).

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................

IMPALA-5920: Remove admission control dependency on YARN RM jar

Impala's admission controller relies on the YARN
fair-scheduler.xml for configuration. That configuration is
loaded using YARN directly (ie. as a library by the
frontend). In Hadoop 3, a number of changes were made to the
YARN resourcemanager which break Impala. While we eventually
want to rethink the admission control configuration
(IMPALA-4159), in the meantime we at least should avoid
using unsupported YARN APIs.

This patch removes the fe dependency on the YARN artifact
'hadoop-yarn-server-resourcemanager' which contains private
APIs and isn't meant to be used as a library.

A subset of the required code has been added in
'common/yarn-extras', taken from Hadoop 2.6.0 in CDH, e.g.
see [1].

The goal the 'yarn-extras' is to make Impala's handling of
the AC configuration self-sufficient, i.e. it shouldn't
matter what version of Hadoop exists. As-is, this was tested
and found to work when Hadoop 2.6 is installed. Because
the yarn-extras/pom.xml still references hadoop-common,
hadoop-yarn-common, and hadoop-yarn-api, additional testing
will be required to ensure Impala using yarn-extras works
when installed along side Hadoop 3.

That testing for Hadoop 3 will be done later. Future changes
will make any other changes required for existing code to
work when Hadoop 3 is installed.

Testing:
* Ran existing tests on master.

1: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd_cdh_package_tarball_512.html

Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
---
M CMakeLists.txt
A common/yarn-extras/CMakeLists.txt
A common/yarn-extras/README.txt
A common/yarn-extras/pom.xml
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
A common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
17 files changed, 1,802 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8035/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Zach Amsden (Code Review)" <ge...@cloudera.org>.
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 64:   private final Map<String, Map<QueueACL, AccessControlList>> queueAcls;
> I don't think it's worth changing this one otherwise I'll have to modify a 
That's fine.  We may even want extra ACLs sometime in the future - I can see a case for CANCEL_OTHERS


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 30: public class AllocationConfigurationException extends Exception {
> We still use the resource parsing code that throws this, e.g. in FairSchedu
Fair enough.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
> Yeah you're right, though I'm on the fence about how much we should doctor 
Can you file a JIRA?  This is actually a well compartmentalized ramp-up or beginner task (that even someone from outside Cloudera could do)


Line 223:     Map<String, Long> minSharePreemptionTimeouts = new HashMap<>();
> I think it may be worth leaving this code as-is even if it's doing extra wo
Seems fine.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java:

Line 70:     Pattern pattern = Pattern.compile("(\\d+)(\\.\\d*)?\\s*" + units);
> Agreed, though I don't think it's worth spending time improving this code. 
Nope.  I have zero insight into how often this is called.  Obviously if it ends up being called every time we need to check a memory reservation it might be a problem, but that seems very unlikely.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java:

Line 57:   
Is it worth fixing the trailing spaces so future diffs don't give the linter hives?


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
File common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java:

Line 25:     return DEFAULT_POLICY;
> Per my other comments, I'd prefer not to change the parsing code too much y
Seems fine.


http://gerrit.cloudera.org:8080/#/c/8035/4/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
File fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java:

Line 124:         new Configuration().get(HADOOP_SECURITY_AUTH_TO_LOCAL, "DEFAULT"));
> I'm actually not sure why the testResolvePrincipalName() was working before
If I had to guess, I'd guess that kerberos tests were flaky and got disabled or marked xfailed.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
......................................................................


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8035/2/common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 19: //YARNUTIL: MODIFIED
> Yeah, based on the CDH source tarball, e.g.
In case it benefits anyone else I was able to look at the diffs by downloading the tarball and using the following bash:

  for f in $(find common/yarn-extras/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ -name *.java) ; do vimdiff $(echo $f | sed 's/common\/yarn-extras\//\/home\/tarmstrong\/hadoop-2.6.0-cdh5.12.0\/src\/hadoop-yarn-project\/hadoop-yarn\/hadoop-yarn-server\/hadoop-yarn-server-resourcemanager\//') $f; done


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes