Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/06/18 21:09:21 UTC

[GitHub] [hive] belugabehr opened a new pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

belugabehr opened a new pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   
   ### What changes were proposed in this pull request?
   
   
   ### Why are the changes needed?
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   ### How was this patch tested?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.0

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-751765866


   @wangyum Stuck.
   
   There are two big issues here:
   
   1. Hive integration tests fire up Druid, Kafka, HDFS, LLAP, etc. all in the same JVM, and their third-party dependencies are all over the place. Using a higher version of a dependency breaks one product, but using a lower version breaks another.  To make this work well, there probably needs to be a way to launch each service in its own class loader.  In lieu of that, I've been trying to move the ball closer to the goal post by bringing the dependency versions closer together.
   
   https://github.com/apache/druid/pull/10683
   HIVE-24542
   
   
   2. In HDFS 3.3.0, the Hadoop team introduced `ProtobufRpcEngine2` in addition to `ProtobufRpcEngine` (sigh).  Some of the Hive LLAP stuff is using the original Hadoop Protobuf RPC engine (`ProtobufRpcEngine`).  There's some `static` logic in the protocol engines that prohibits loading both RPC engines into the same JVM at the same time; I'm not sure why.  HDFS was migrated to `ProtobufRpcEngine2`.  So, again, in the integration tests, when the HDFS mini cluster is loaded, version 2 of the RPC engine is loaded into the JVM.  When LLAP is later loaded, it fails to start because version 1 cannot be registered at the same time.
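
The failure mode in item 2 can be sketched roughly as follows. This is a hypothetical illustration of a JVM-wide static registry that refuses a second engine for the same RPC kind. The class `RpcEngineRegistrySketch`, its methods, and the `"protobuf"` key are invented for the example and are not Hadoop's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a static RPC engine registry; NOT Hadoop's real code. */
class RpcEngineRegistrySketch {
  // Static, so every component loaded by the same class loader shares this map.
  private static final Map<String, String> ENGINES = new ConcurrentHashMap<>();

  /** Registers an engine for an RPC kind; fails if a different engine already claimed it. */
  static void register(String rpcKind, String engineName) {
    String existing = ENGINES.putIfAbsent(rpcKind, engineName);
    if (existing != null && !existing.equals(engineName)) {
      throw new IllegalStateException(
          rpcKind + " is already registered by " + existing + "; cannot register " + engineName);
    }
  }

  /**
   * Simulates the start-up order described above: the HDFS mini cluster first, LLAP second.
   * Returns the failure message, or null if both registrations succeeded.
   */
  static String simulateStartupOrder() {
    ENGINES.clear();
    register("protobuf", "ProtobufRpcEngine2");   // HDFS mini cluster starts first
    try {
      register("protobuf", "ProtobufRpcEngine");  // LLAP starts later in the same JVM
      return null;
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(simulateStartupOrder());
  }
}
```

In this sketch, whichever component starts first wins, which is why the ordering of services inside a single test JVM matters.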




[GitHub] [hive] ayushtkn commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-979217036


   > or even... upgrade to hadoop 3.1.MAX or 3.2.ANYTHING to grab some of the changes we have to cover in upgrading to 3.3.1
   
   The Hadoop 3.1 line is EOL, and 3.1.MAX and 3.2.ANYTHING don't have Guava shaded, so the Guava version mismatch leads to even more issues.
   Guava and Protobuf are shaded in Hadoop only from 3.3 onward.
   
   > so the issue now is that Hadoop 3.3+ is using org.jline v3 and one of the Hive dependencies of sqline has a dependency on jline v2 which is causing a clash
   
   If I understand @belugabehr's comment correctly, the problem is only with JLine. If JLine v3 and v2 are compatible, and since I see JLine is used only by yarn-client, can't we just exclude the jline dependency when adding yarn-client?
   
   




[GitHub] [hive] belugabehr closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   




[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674886306



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java
##########
@@ -55,14 +56,15 @@ static void internalBeforeClassSetup(Map<String, String> overrides, Class clazz)
       throws Exception {
     conf = new HiveConf(clazz);
     conf.set("dfs.client.use.datanode.hostname", "true");
-    conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + ".hosts", "*");

Review comment:
       Hey @abstractdog, thanks for the review.
   
   Take a look at my notes here:
   
   https://issues.apache.org/jira/browse/HIVE-24484?focusedCommentId=17369708&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17369708
   
   tldr; These unit tests launch two HMS instances within the same JVM (same class loader), so they are able to modify each other's state where it is stored in static variables.  This testing cannot be done any more.
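
A minimal sketch of the effect described above, assuming state held in a static field. `ProxySettings` and `Server` are hypothetical stand-ins invented for this example, not HMS or Hadoop classes.

```java
/** Hypothetical illustration of two in-JVM server instances sharing static state. */
class SharedStaticStateSketch {

  /** Hypothetical process-wide proxy settings kept in a static field. */
  static final class ProxySettings {
    private static volatile String allowedHosts = "";

    static void refresh(String hosts) {
      allowedHosts = hosts;
    }

    static boolean isAllowed(String host) {
      return "*".equals(allowedHosts) || allowedHosts.equals(host);
    }
  }

  /** Hypothetical server that applies its own configuration when it starts. */
  static final class Server {
    Server(String allowedHosts) {
      ProxySettings.refresh(allowedHosts);
    }

    boolean authorize(String host) {
      return ProxySettings.isAllowed(host);
    }
  }

  /** Starts a permissive server, then a restrictive one, and reports what the first still allows. */
  static boolean firstServerStillAllows(String host) {
    Server permissive = new Server("*");
    boolean before = permissive.authorize(host);  // allowed while only this server exists
    new Server("");                               // second instance overwrites the shared static
    return before && permissive.authorize(host);  // the first server is now affected too
  }

  public static void main(String[] args) {
    System.out.println(firstServerStillAllows("example-host"));
  }
}
```

Static fields exist once per class per class loader, so two instances only get independent copies when they run in separate class loaders or separate JVMs.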






[GitHub] [hive] github-actions[bot] commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.0

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-808822021


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




[GitHub] [hive] github-actions[bot] commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-974730579


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674875763



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader
+      return new ZeroRowsInputFormat().getRecordReader(split.getInputSplit(), jobConf, reporter);

Review comment:
       Why is it better to return a no-op record reader instead of letting this code path fail and handling the exception somewhere else? Doesn't this mask issues?






[GitHub] [hive] kgyrtkirk commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-975396279


   This PR is not making much progress. I think that in its current form it will not work, or will not land soon.
   I think it would make sense to consider:
   * split this thing up into some pieces which we could get in...
   * or even... upgrade to hadoop 3.1.MAX or 3.2.ANYTHING to grab some of the changes we have to cover in upgrading to 3.3.1
   instead of waiting for this thing to get in with JDK11 support and everything. What do you guys think?
   




[GitHub] [hive] kgyrtkirk commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-977679326


   good question.... I don't know; but it seems like this PR has stalled!
   
   this patch has:
   * added a lib named `jansi` => this seems like an independent step
   * removed the explicit guava usage from druid => could we upgrade guava separately? probably not...
   * jline / jetty related things
   * lots of other things




[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674869601



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java
##########
@@ -55,14 +56,15 @@ static void internalBeforeClassSetup(Map<String, String> overrides, Class clazz)
       throws Exception {
     conf = new HiveConf(clazz);
     conf.set("dfs.client.use.datanode.hostname", "true");
-    conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + ".hosts", "*");

Review comment:
       this is for impersonation according to https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Superusers.html, don't we want to test this scenario anymore?






[GitHub] [hive] belugabehr edited a comment on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-916319837


   OK, so the issue now is that Hadoop 3.3+ is using `org.jline v3` and one of the Hive dependencies of `sqline` has a dependency on `jline v2` which is causing a clash. Well, not a clash per se: it may be possible to address this by including both versions of jline, since they have different namespaces. Ugh.  I'll see if there is an updated `sqline` or what that is doing exactly.  Maybe it can be replaced with jline totally?  I don't really know much about these libraries, just trying to Rubik's-cube this together.




[GitHub] [hive] pgaref commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r665214848



##########
File path: ql/src/test/results/clientpositive/llap/check_constraint.q.out
##########
@@ -2415,14 +2415,14 @@ STAGE PLANS:
                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                 Statistics: Num rows: 1 Data size: 409 Basic stats: COMPLETE Column stats: NONE
                 Select Operator
-                  expressions: _col1 (type: string), _col0 (type: int), _col4 (type: string), _col5 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col2 (type: string), _col3 (type: int)
+                  expressions: _col0 (type: int), _col4 (type: string), _col3 (type: int), _col1 (type: string), _col5 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col2 (type: string)

Review comment:
       Are these q.out changes expected? What is the root cause?






[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674876607



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java
##########
@@ -483,7 +484,8 @@ HiveAuthorizer createHiveMetaStoreAuthorizer() throws Exception {
   boolean isSuperUser(String userName) {
     Configuration conf      = getConf();
     String        ipAddress = HMSHandler.getIPAddress();
-    return (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(userName, conf, ipAddress));
+    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
+    return (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(userName, ipAddress));

Review comment:
       nit: extra bracket is not needed I guess






[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675041145



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
     boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
     boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
     boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
       So ya, this was done as a separate thing buried in the Hive code.  This makes it much more explicit and less hidden.
   
   Before Hadoop 3.3, it was easy to detect whether `refreshSuperUserGroupsConfiguration` had already been called, because there was a corresponding getter that returned `null` if it had not.  In 3.3 that went away: instead of `null`, you get some sort of default value.  So one can no longer lazily refresh these configurations only when they haven't been refreshed yet; it's better to just refresh them explicitly here and be done with it.






[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r665373095



##########
File path: llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapInputFormat.java
##########
@@ -137,6 +137,8 @@
       // This starts the reader in the background.
       rr.start();
       return result;
+    } catch (IOException ioe) {

Review comment:
       Hey @pgaref,
   
   Ya, this is required.  Based on the `InvalidInputException` (which is a subclass of `IOException`) changes in HDFS, this code is required to pass the `InvalidInputException` up to the caller directly; otherwise, in the `Exception` block, it gets wrapped in yet another `IOException` and the caller is no longer able to detect the `InvalidInputException`.
   
   I hope that makes sense.
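
The wrapping problem can be sketched as below. `InvalidInputSketchException` is a hypothetical stand-in for an `IOException` subclass such as `InvalidInputException`, and the two `open...` methods are invented for illustration; this is not the `LlapInputFormat` code itself.

```java
import java.io.IOException;

/** Illustrates why a dedicated catch (IOException) clause preserves the exception subclass. */
class ExceptionWrappingSketch {

  /** Hypothetical stand-in for an IOException subclass. */
  static class InvalidInputSketchException extends IOException {
    InvalidInputSketchException(String message) {
      super(message);
    }
  }

  /** Only a generic Exception clause: every failure is wrapped in a new IOException. */
  static void openWrappingEverything() throws IOException {
    try {
      throw new InvalidInputSketchException("input path does not exist");
    } catch (Exception e) {
      throw new IOException(e);
    }
  }

  /** With the extra clause, IOExceptions (and their subclasses) reach the caller unchanged. */
  static void openRethrowingIo() throws IOException {
    try {
      throw new InvalidInputSketchException("input path does not exist");
    } catch (IOException ioe) {
      throw ioe;
    } catch (Exception e) {
      throw new IOException(e);
    }
  }

  /** Returns true if the caller can still catch the specific subclass. */
  static boolean callerSeesSubclass(boolean rethrow) {
    try {
      if (rethrow) {
        openRethrowingIo();
      } else {
        openWrappingEverything();
      }
    } catch (InvalidInputSketchException e) {
      return true;
    } catch (IOException e) {
      return false;  // the subclass is hidden inside getCause()
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println("wrapped: " + callerSeesSubclass(false));
    System.out.println("rethrown: " + callerSeesSubclass(true));
  }
}
```

A caller of the wrapping variant could still dig the original out of `getCause()`, but a plain `catch` on the subclass no longer matches.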






[GitHub] [hive] ayushtkn commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-915417026


   Seems the tests are failing with
   ```
   java.lang.NoSuchMethodError: org.jline.reader.impl.completer.StringsCompleter.<init>([Lorg/jline/reader/Candidate;)V
   ```
   Should this be fixable in the Hive code itself?
   If something is required in the Hadoop code, we can get it in now; the 3.3.2 release is being planned.




[GitHub] [hive] belugabehr closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.2.1

Posted by GitBox <gi...@apache.org>.
belugabehr closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   




[GitHub] [hive] belugabehr edited a comment on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-916319837


   OK, so the issue now is that Hadoop 3.3+ is using `org.jline v3` and one of the Hive dependencies of `sqline` has a dependency on `jline v2` which is causing a clash. Well, not a clash per se: it may be possible to address this by including both versions of jline, since they have different namespaces. Ugh.  I'll see if there is an updated `sqline` library or what that is doing exactly.  Maybe it can be replaced with jline totally?  I don't really know much about these libraries, just trying to Rubik's-cube this together.




[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674871945



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
##########
@@ -4117,28 +4118,33 @@ public void testAuthForNotificationAPIs() throws Exception {
     createDB(dbName, driver);
     NotificationEventResponse rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
     assertEquals(1, rsp.getEventsSize());
+
     // Test various scenarios
-    // Remove the proxy privilege and the auth should fail (in reality the proxy setting should not be changed on the fly)
-    hconf.unset(proxySettingName);
-    // Need to explicitly update ProxyUsers
-    ProxyUsers.refreshSuperUserGroupsConfiguration(hconf);
-    // Verify if the auth should fail
-    Exception ex = null;
+    // Remove the proxy privilege by reseting proxy configuration to default value.
+    // The auth should fail (in reality the proxy setting should not be changed on the fly)
+    // Pretty hacky: Affects both instances of HMS
+    ProxyUsers.refreshSuperUserGroupsConfiguration();
+
     try {
       rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
+      Assert.fail("Get Next Nofitication should have failed due to no proxy auth");
     } catch (TException e) {
-      ex = e;

Review comment:
       I have no idea how we can hit this catch, but leaving it empty is always a red flag. Have you checked whether at least a log.debug would be useful here?






[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675041145



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
     boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
     boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
     boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
       So ya, this was done as a separate thing buried in the Hive code.  Moving it here makes it much more explicit and less hidden.
   
   Before Hadoop 3.3, it was easy to detect whether `refreshSuperUserGroupsConfiguration` had already been called, because there was a corresponding getter that returned `null` if it had not.  In 3.3 that went away: instead of `null`, you get some sort of default value.  So one can no longer lazily refresh these configurations only when they haven't been refreshed yet; it's better to just refresh them explicitly here, as part of the server's initialization, and be done with it.
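
As a rough sketch of the difference between the lazy and the explicit approach: everything below is hypothetical and invented for illustration. It does not use the Hadoop `ProxyUsers` API, and the empty-string default is an assumption standing in for "some sort of default value".

```java
/** Hypothetical holder for proxy settings; NOT the Hadoop ProxyUsers API. */
class ProxyRefreshSketch {
  private static String hosts;  // null until refresh(...) has been called

  static void refresh(String configured) {
    hosts = configured;
  }

  static void reset() {
    hosts = null;
  }

  /** Older-style getter: null reveals that no refresh has happened yet. */
  static String getOrNull() {
    return hosts;
  }

  /** Newer-style getter: a default (assumed to be "" here) hides whether a refresh happened. */
  static String getOrDefault() {
    return hosts == null ? "" : hosts;
  }

  /** Lazy pattern: refresh only when the getter says nothing was loaded. Needs the null signal. */
  static String lazyLookup(String configured) {
    if (getOrNull() == null) {
      refresh(configured);
    }
    return getOrNull();
  }

  /** Explicit pattern: refresh once during server start-up, then simply read. */
  static String startServerThenLookup(String configured) {
    refresh(configured);
    return getOrDefault();
  }

  public static void main(String[] args) {
    reset();
    System.out.println("lazy: " + lazyLookup("*"));
    reset();
    System.out.println("explicit: " + startServerThenLookup("*"));
  }
}
```

With only the newer-style getter available, an unrefreshed state and a genuinely empty configuration look the same, which is the reason for refreshing unconditionally at start-up.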






[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674896581



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader
+      return new ZeroRowsInputFormat().getRecordReader(split.getInputSplit(), jobConf, reporter);

Review comment:
       Hey.
   
   So, in my experimentation, this is the least-bad option.  I did this to preserve the previous behavior.  The Hive code is not set up to handle this error condition.  As things currently stand in `master`, if the calling thread was interrupted, the thread would finish fetching the rows regardless and then just ignore them later (throw them away).  The calling code does not handle a `null` return value, and it does not handle this exception.  As currently implemented in Hive `master`, if it gets an exception it simply exits execution with an error message.  Without implementing a lot more code, there is no way to ignore/skip this one specific error type.  So, the cleanest thing to do is to return the `ZeroRowsInputFormat` reader, since the rows are going to be thrown away later anyway.
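
The fallback described above can be sketched in isolation. This is a simplified illustration, not Hive's code: `RowSource`, `FallbackReaderDemo`, and `openOrEmpty` are hypothetical names, and a plain `Iterator<String>` stands in for a record reader.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

/** Hypothetical, simplified reader factory; not Hive's InputFormat API. */
interface RowSource {
  Iterator<String> open() throws IOException;
}

final class FallbackReaderDemo {
  /**
   * Opens the source. If opening is interrupted, returns an empty iterator
   * instead of propagating, on the assumption that the caller would discard
   * the rows anyway. Any other IOException still propagates.
   */
  static Iterator<String> openOrEmpty(RowSource source) throws IOException {
    try {
      return source.open();
    } catch (InterruptedIOException iioe) {
      return Collections.emptyIterator();
    }
  }

  public static void main(String[] args) throws IOException {
    RowSource healthy = () -> Arrays.asList("row1", "row2").iterator();
    RowSource interrupted = () -> {
      throw new InterruptedIOException("interrupted while opening");
    };
    System.out.println("healthy has rows:     " + openOrEmpty(healthy).hasNext());
    System.out.println("interrupted has rows: " + openOrEmpty(interrupted).hasNext());
  }
}
```

The design choice is that the caller sees a valid, empty reader and needs no new error-handling path for the interrupted case.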






[GitHub] [hive] kgyrtkirk commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-987685455


   @ayushtkn: I've done some digging into the jline3 issue in #2617 ([here](https://github.com/apache/hive/pull/2617#issuecomment-978029623)), and I'm not sure whether it was a deliberate move to declare jline3 as a dependency of the `hadoop-yarn-client` artifact.
   
   
   




[GitHub] [hive] github-actions[bot] closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   




[GitHub] [hive] belugabehr closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   




[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674888734



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
##########
@@ -4117,28 +4118,33 @@ public void testAuthForNotificationAPIs() throws Exception {
     createDB(dbName, driver);
     NotificationEventResponse rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
     assertEquals(1, rsp.getEventsSize());
+
     // Test various scenarios
-    // Remove the proxy privilege and the auth should fail (in reality the proxy setting should not be changed on the fly)
-    hconf.unset(proxySettingName);
-    // Need to explicitly update ProxyUsers
-    ProxyUsers.refreshSuperUserGroupsConfiguration(hconf);
-    // Verify if the auth should fail
-    Exception ex = null;
+    // Remove the proxy privilege by reseting proxy configuration to default value.
+    // The auth should fail (in reality the proxy setting should not be changed on the fly)
+    // Pretty hacky: Affects both instances of HMS
+    ProxyUsers.refreshSuperUserGroupsConfiguration();
+
     try {
       rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
+      Assert.fail("Get Next Nofitication should have failed due to no proxy auth");
     } catch (TException e) {
-      ex = e;

Review comment:
       The idea here is that it SHOULD throw an exception.  If `getNextNotification` does not throw, execution will hit the `Assert.fail`.  I can add a comment to clarify that this is the expected behavior.
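
A minimal sketch of the expected-exception pattern discussed above, written without JUnit so it stands alone. `ExpectedFailureDemo`, `fetch`, and `failsAsExpected` are hypothetical names; the `AssertionError` thrown after the call plays the role of `Assert.fail`, and because it is an `Error` rather than an `Exception`, the `catch` block does not swallow it.

```java
final class ExpectedFailureDemo {
  /** Hypothetical call that is rejected when the caller lacks proxy privileges. */
  static String fetch(boolean authorized) throws Exception {
    if (!authorized) {
      throw new Exception("not authorized to fetch notifications");
    }
    return "event";
  }

  /** Returns true only if fetch() threw; reaching the line after the call is a failure. */
  static boolean failsAsExpected(boolean authorized) {
    try {
      fetch(authorized);
      // Stand-in for Assert.fail(...): only reached if no exception was thrown.
      throw new AssertionError("fetch should have failed without proxy auth");
    } catch (Exception e) {
      // Expected path: the call was rejected.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("unauthorized call rejected: " + failsAsExpected(false));
  }
}
```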






[GitHub] [hive] kgyrtkirk commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r677540831



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java
##########
@@ -394,16 +393,13 @@ WarehouseInstance verifyResults(String[] data) throws IOException {
   }
 
   WarehouseInstance verifyFailure(String[] data) throws IOException {
-    List<String> results = getOutput();
-    logger.info("Expecting {}", StringUtils.join(data, ","));
-    logger.info("Got {}", results);
-    boolean dataMatched = (data.length == results.size());
-    if (dataMatched) {
-      for (int i = 0; i < data.length; i++) {
-        dataMatched &= data[i].toLowerCase().equals(results.get(i).toLowerCase());
-      }
-    }
-    assertFalse(dataMatched);
+    final List<String> expectedResults =
+        Arrays.asList(data).stream().map(r -> r.toLowerCase()).collect(Collectors.toList());
+    final List<String> actualResults = getOutput().stream().map(r -> r.toLowerCase()).collect(Collectors.toList());
+
+    assertTrue("Data " + expectedResults + " should not be present in " + actualResults,
+        Collections.disjoint(expectedResults, actualResults));
+

Review comment:
       The old and new code seem to be doing different things.
   The old check is:
   ```
   e[0] != r[0] || ... || e[n] != r[n]
   ```
   The new block checks a set operation between the elements (whether the two lists are disjoint).
   
   Why is this change necessary, and how is it connected to a Hadoop upgrade?
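
To illustrate the difference raised above, here is a small sketch that mirrors the two checks from the diff on plain lists. `VerifyFailureDemo` and its method names are hypothetical; only the comparison logic is modeled, with no Hive test infrastructure.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class VerifyFailureDemo {
  /** Old-style check: passes when the sizes differ or any position differs (case-insensitive). */
  static boolean oldCheckPasses(List<String> expected, List<String> actual) {
    boolean matched = expected.size() == actual.size();
    if (matched) {
      for (int i = 0; i < expected.size(); i++) {
        matched &= expected.get(i).toLowerCase().equals(actual.get(i).toLowerCase());
      }
    }
    return !matched;
  }

  /** New-style check: passes only when no expected value appears anywhere in actual. */
  static boolean newCheckPasses(List<String> expected, List<String> actual) {
    return Collections.disjoint(lower(expected), lower(actual));
  }

  private static List<String> lower(List<String> values) {
    return values.stream().map(String::toLowerCase).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> expected = Arrays.asList("value1", "value2");
    List<String> actual = Arrays.asList("value1", "other");
    // One shared element: the positional check passes, the disjoint check does not.
    System.out.println("old check passes: " + oldCheckPasses(expected, actual));
    System.out.println("new check passes: " + newCheckPasses(expected, actual));
  }
}
```

With one shared value, the positional check is satisfied by a single mismatch, while the disjoint check requires that none of the values be present, so the new assertion is the stricter of the two on this input.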
   

##########
File path: ql/src/test/results/clientpositive/llap/check_constraint.q.out
##########
@@ -2415,14 +2415,14 @@ STAGE PLANS:
                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                 Statistics: Num rows: 1 Data size: 409 Basic stats: COMPLETE Column stats: NONE
                 Select Operator
-                  expressions: _col1 (type: string), _col0 (type: int), _col4 (type: string), _col5 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col2 (type: string), _col3 (type: int)
+                  expressions: _col0 (type: int), _col4 (type: string), _col3 (type: int), _col1 (type: string), _col5 (type: struct<writeid:bigint,bucketid:int,rowid:bigint>), _col2 (type: string)

Review comment:
       The join operand order seems to have changed.






[GitHub] [hive] github-actions[bot] closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.0

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   






[GitHub] [hive] github-actions[bot] commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-1030718952


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




[GitHub] [hive] github-actions[bot] closed pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #1742:
URL: https://github.com/apache/hive/pull/1742


   




[GitHub] [hive] pgaref commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r665213804



##########
File path: llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapInputFormat.java
##########
@@ -137,6 +137,8 @@
       // This starts the reader in the background.
       rr.start();
       return result;
+    } catch (IOException ioe) {

Review comment:
       Is this needed? `Exception` should catch everything, right?






[GitHub] [hive] wangyum commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.0

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-751136951


   Any update?




[GitHub] [hive] belugabehr commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675037187



##########
File path: standalone-metastore/pom.xml
##########
@@ -79,8 +79,8 @@
     <dropwizard-metrics-hadoop-metrics2-reporter.version>0.1.2
     </dropwizard-metrics-hadoop-metrics2-reporter.version>
     <dropwizard.version>3.1.0</dropwizard.version>
-    <guava.version>19.0</guava.version>
-    <hadoop.version>3.1.0</hadoop.version>
+    <guava.version>27.0-jre</guava.version>
+    <hadoop.version>3.2.1</hadoop.version>

Review comment:
       Wow, great catch.  Nooooooooooo!  Ugh.
   
   I hope it doesn't break anything.






[GitHub] [hive] vyaslav commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.0

Posted by GitBox <gi...@apache.org>.
vyaslav commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-756999283


   > @wangyum Stuck.
   > 
   > There are two big issues here:
   > 
   > 1. Hive integration tests fire up Druid, Kafka, HDFS, LLAP, etc. all in the same JVM and their 3rd party dependencies are all over the place. Using a higher version of a dependency breaks one product, but using a lower version breaks the other.  To make this work well, there probably needs to be a way to launch each service in their own JVM class loader.  In lieu of that, I've been trying to move the ball closer to the goal post and getting dependencies closer together.
   > 
   > [apache/druid#10683](https://github.com/apache/druid/pull/10683)
   > [HIVE-24542](https://issues.apache.org/jira/browse/HIVE-24542)
   > 
   > 2. In HDFS 3.3.0, Hadoop team introduced `ProtobufRpcEngine2` in addition to `ProtobufRpcEngine` (sigh).  Some of the Hive LLAP stuff is using this Hadoop Protobuf RPC engine (`ProtobufRpcEngine`).  There's some `static` logic in the protocol engines that prohibits loading both RPC engines into the same JVM at the same time, I'm not sure why.  HDFS was migrated to `ProtobufRpcEngine2`.  So, again, in the integration tests, when the HDFS mini cluster is loaded, version 2 of the RPC engine is loaded into the JVM.  When LLAP is later loaded, it fails to start because version 1 cannot be registered at the same time.
   
   Regarding the first issue, I faced the same problems in my PR for the upgrade to 3.1.3: https://github.com/apache/hive/pull/1638
   Regarding the second, I'm curious whether it would be hard to replace `ProtobufRpcEngine` with `ProtobufRpcEngine2` in Hive itself. As I understand it, they have upgraded from PB2 to PB3.




[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674881535



##########
File path: standalone-metastore/pom.xml
##########
@@ -79,8 +79,8 @@
     <dropwizard-metrics-hadoop-metrics2-reporter.version>0.1.2
     </dropwizard-metrics-hadoop-metrics2-reporter.version>
     <dropwizard.version>3.1.0</dropwizard.version>
-    <guava.version>19.0</guava.version>
-    <hadoop.version>3.1.0</hadoop.version>
+    <guava.version>27.0-jre</guava.version>
+    <hadoop.version>3.2.1</hadoop.version>

Review comment:
       I think we're targeting 3.3.1 here too, right?






[GitHub] [hive] pgaref commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r665212475



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
##########
@@ -119,16 +120,15 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
             .run("insert into table encrypted_table values (2,'value2')")
             .dump(primaryDbName, dumpWithClause);
 
-    replica
-        .run("repl load " + primaryDbName + " into " + replicatedDbName
-                + " with('hive.repl.add.raw.reserved.namespace'='true', "
-                + "'hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-                + "'distcp.options.pugpbx'='', 'distcp.options.skipcrccheck'='')")
-        .run("use " + replicatedDbName)
-        .run("repl status " + replicatedDbName)
-        .verifyResult(tuple.lastReplicationId)
-        .run("select value from encrypted_table")
-        .verifyFailure(new String[] { "value1", "value2" });
+    try {
+      replica.run("repl load " + primaryDbName + " into " + replicatedDbName
+          + " with('hive.repl.add.raw.reserved.namespace'='true', " + "'hive.repl.replica.external.table.base.dir'='"
+          + replica.externalTableWarehouseRoot + "', "
+          + "'distcp.options.pugpbx'='', 'distcp.options.skipcrccheck'='')");
+      Assert.fail("Test should have thrown an exception because cross-encryption-zone is not allowed for RAW");
+    } catch (IOException ioe) {
+      // ignore

Review comment:
       Hey @belugabehr, I just read the detailed comment on the JIRA about this, but I believe we should add some explanation here as well for clarity.






[GitHub] [hive] belugabehr commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-916319837


   OK, so the issue now is that Hadoop 3.3+ uses `org.jline` v3, while one of Hive's dependencies, `sqlline`, depends on `jline` v2, which causes a clash.




[GitHub] [hive] github-actions[bot] commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-974730579


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




[GitHub] [hive] abstractdog commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-975424894


   What are the unresolved blockers of the 3.3.1 upgrade at the moment?






[GitHub] [hive] belugabehr commented on pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #1742:
URL: https://github.com/apache/hive/pull/1742#issuecomment-916087565


   @ayushtkn I'm not 100% sure what's going on there.  I am working on upgrading jline as a separate task, HIVE-25495 (#2617), and I'm hitting a similar issue there even though I thought I had synchronized the versions between Hive and Hadoop, so I need to look into it more.




[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674879041



##########
File path: spark-client/pom.xml
##########
@@ -159,45 +159,10 @@
 
   <build>
     <plugins>
-      <plugin>

Review comment:
       happy to see that we can get rid of these maven magics!






[GitHub] [hive] abstractdog commented on a change in pull request #1742: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674880301



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
     boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
     boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
     boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
       is this done somewhere else implicitly before hadoop 3.3?



