Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/05/11 07:53:41 UTC

[GitHub] [hive] ayushtkn opened a new pull request, #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

ayushtkn opened a new pull request, #3279:
URL: https://github.com/apache/hive/pull/3279

   Exploratory: just to figure out what breaks and what can be fixed here or in the next Hadoop 3.3 release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1198993149

   Got a new test failure due to Jetty (TestSSL); fixed in the latest commit.


[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871357800


##########
standalone-metastore/pom.xml:
##########
@@ -227,6 +227,10 @@
         <artifactId>hadoop-mapreduce-client-core</artifactId>
         <version>${hadoop.version}</version>
         <exclusions>
+          <exclusion>
+            <groupId>org.jline</groupId>
+            <artifactId>jline</artifactId>
+          </exclusion>

Review Comment:
   Yes, the best answer is to upgrade JLine, but that was stuck. So I thought the Hadoop upgrade shouldn't be blocked on it if possible; we are still on 3.1.0, which died long ago.



##########
storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java:
##########
@@ -18,10 +18,10 @@
 
 package org.apache.hadoop.hive.common;
 
-import org.apache.commons.lang.StringUtils;
+import org.apache.commons.lang3.StringUtils;

Review Comment:
   The code doesn't compile with the old import. It is already marked as a banned import, so I guess the ban logic has a flaw:
   https://github.com/apache/hive/blob/master/pom.xml#L1529
   
   The dependency was getting pulled in from Hadoop and now it isn't there, so I have to change the import to make it compile.



[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1131786459

   The build is green with ``3.3.3``
   I built the distro and checked whether it contains reload4j; it doesn't:
   ```
   lib % ls -l | grep reload4j
   lib % 
   ```
   Deployed it with hadoop-3.3.3 and Hive on MR, ran some basic queries, and they worked.
   
   @steveloughran do we need anything more for 3.3.3, or are we good?


[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1143517119

   The Jetty upgrade came in with https://issues.apache.org/jira/browse/HADOOP-17796 and https://github.com/apache/hadoop/pull/3208. There are some security advisories there, so it is probably better to deal with the change than to try to stick with the older version. Sorry.


[GitHub] [hive] abstractdog commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
abstractdog commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1141946273

   > I've deployed
   > 
   > * hadoop-3.3.3
   > * tez 0.10.1
   > * hive from the PR
   >   running a simple insert failed with:
   > 
   > ```
   > Caused by: java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
   > 	at org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:569)
   > 	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:550)
   > 	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:117)
   > 	at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:425)
   > 	at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:341)
   > 	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:432)
   > 	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:428)
   > 	at org.apache.tez.dag.app.web.WebUIService.serviceStart(WebUIService.java:94)
   > 	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
   > 	at org.apache.tez.dag.app.DAGAppMaster$ServiceWithDependency.start(DAGAppMaster.java:1800)
   > 	at org.apache.tez.dag.app.DAGAppMaster$ServiceThread.run(DAGAppMaster.java:1821)
   > 2022-05-31 09:17:19,422 [INFO] [shutdown-hook-0] |app.DAGAppMaster|: DAGAppMasterShutdownHook invoked
   > ```
   > 
   > maybe I've missed something - but it seems like the tez dagappmaster has issues running with the jetty because of hadoop 3.3.3
   > 
   > I've these settings:
   > 
   > ```
   > tez/tez-site/tez.lib.uris ${fs.defaultFS}/apps/tez/tez.tar.gz
   > tez/tez-site/tez.use.cluster.hadoop-libs true
   > ```
   > 
   > @abstractdog is hadoop-3.3.3 supported with tez-0.10.1?
   
   thanks @kgyrtkirk for trying this out, I've just created TEZ-4420, as tez is on hadoop 3.3.1, and I'm not sure about compatibility
   
   checked tez.tar.gz contents before and after bump and I got:
   
   ```
   hadoop 3.3.1
   
   tar tf tez-dist/target/tez-0.10.2-SNAPSHOT.tar.gz | grep jetty
   
   lib/jetty-server-9.4.40.v20210413.jar
   lib/jetty-http-9.4.40.v20210413.jar
   lib/jetty-io-9.4.40.v20210413.jar
   lib/jetty-util-9.4.40.v20210413.jar
   lib/jetty-servlet-9.4.40.v20210413.jar
   lib/jetty-security-9.4.40.v20210413.jar
   lib/jetty-util-ajax-9.4.40.v20210413.jar
   lib/jetty-webapp-9.4.40.v20210413.jar
   lib/jetty-xml-9.4.40.v20210413.jar
   lib/jetty-client-9.4.40.v20210413.jar
   
   hadoop 3.3.3
   
   tar tf tez-dist/target/tez-0.10.2-SNAPSHOT.tar.gz | grep jetty
   
   lib/jetty-server-9.4.43.v20210629.jar
   lib/jetty-http-9.4.43.v20210629.jar
   lib/jetty-io-9.4.43.v20210629.jar
   lib/jetty-util-9.4.43.v20210629.jar
   lib/jetty-servlet-9.4.43.v20210629.jar
   lib/jetty-security-9.4.43.v20210629.jar
   lib/jetty-util-ajax-9.4.43.v20210629.jar
   lib/jetty-webapp-9.4.43.v20210629.jar
   lib/jetty-xml-9.4.43.v20210629.jar
   lib/jetty-client-9.4.43.v20210629.jar
   ``` 
   
   tez packs jetty from hadoop, so a hadoop upgrade means a jetty upgrade too; there is a chance that an old tez.tar.gz can clash with the new hadoop
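
To spot this kind of clash quickly, one could extract the Jetty versions from a tez.tar.gz listing and compare them with what the cluster ships. The following is only a minimal sketch, assuming the input is the output of `tar tf ... | grep jetty` as shown above; the class name, method name, and regular expression are hypothetical and not part of Tez or Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JettyVersionSketch {

  // Matches names such as lib/jetty-util-ajax-9.4.40.v20210413.jar and captures the version.
  // The pattern is an assumption based on the jar names listed in this thread.
  private static final Pattern JETTY_JAR =
      Pattern.compile("jetty-[a-z-]+-(\\d+\\.\\d+\\.\\d+\\.v\\d+)\\.jar$");

  /** Returns the distinct Jetty versions found in a tarball listing. */
  static Set<String> jettyVersions(List<String> entries) {
    Set<String> versions = new TreeSet<>();
    for (String entry : entries) {
      Matcher m = JETTY_JAR.matcher(entry);
      if (m.find()) {
        versions.add(m.group(1));
      }
    }
    return versions;
  }

  public static void main(String[] args) {
    List<String> listing = Arrays.asList(
        "lib/jetty-server-9.4.43.v20210629.jar",
        "lib/jetty-util-ajax-9.4.43.v20210629.jar");
    // More than one version in the result would hint at a mixed classpath.
    System.out.println(jettyVersions(listing));
  }
}
```

A single version in the result matches the two listings above; two or more versions would be a hint that an old tarball is being combined with newer Hadoop jars.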
   
   I've just uploaded the new tez.tar.gz, is there a chance you can give it a try?
   https://drive.google.com/file/d/18RMfh40s6kKdFt77E7HpS-j4EJhd2DKi/view?usp=sharing


[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1191490566

   > But the Hive 3.1.x version is not very old and 4.x still looks like it is in alpha, so we may not be able to upgrade. So with this PR we still have compatibility issues with the Hive 3.x version. Any suggestions? Thanks
   
   @sujith71955 Unfortunately, I don't have a use case for the 3.x line. It should be doable, but it would require changes across Hadoop, Hive & Tez. We did the same here as well.
   
   It is certainly doable; if you folks have a use case, feel free to create a Jira for the 3.1.x line. I am running busy, so I couldn't spare time to check the problem you folks stated above.


[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1125879377

   The only test failure here is intermittent. I have answered/addressed all the comments.
   I have disabled one test. It was not failing itself but was corrupting the XML; it wasn't a functional test but test-infra stuff relying on the hadoop ls command, whose output was changing intermittently. Not a good test to have anyway.
   For the record, it is this:
   http://ci.hive.apache.org/job/hive-precommit/job/PR-3279/7/testReport/junit/TEST-org.apache.hadoop.hive.cli.split0.TestMiniLlapLocalCliDriver/xml/_failed_to_read_/
   
   I can't decode the failure reason here; it was that broken test which was causing this.
   If everything else is good here and only this test is blocking, I will file a follow-up Jira and figure this test out with the original author of the test.
   
   I have tried basic stuff with Hive-On-MR.


[GitHub] [hive] ayushtkn merged pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn merged PR #3279:
URL: https://github.com/apache/hive/pull/3279


[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r895742321


##########
ql/src/test/results/clientpositive/llap/acid_table_directories_test.q.out:
##########
@@ -163,13 +170,6 @@ POSTHOOK: Input: default@acidparttbl@p=200
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
-#### A masked pattern was here ####

Review Comment:
   [HADOOP-12502](https://issues.apache.org/jira/browse/HADOOP-12502) is the culprit: earlier the listing was sorted and now it isn't, so the `ls -R` output changes intermittently. We can't keep this test as it is, hence it is disabled.
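
If a test ever has to compare directory listings whose order is no longer guaranteed, one option is to sort both sides before comparing. This is only a minimal sketch and not what this PR does (the test is disabled here); the class name, method name, and entry names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class StableListingSketch {

  /** Returns a sorted copy so that comparisons do not depend on listing order. */
  static List<String> normalize(List<String> listing) {
    List<String> sorted = new ArrayList<>(listing);
    Collections.sort(sorted);
    return sorted;
  }

  public static void main(String[] args) {
    // Two hypothetical runs returning the same entries in a different order.
    List<String> runOne = Arrays.asList("delta_0000002_0000002", "delta_0000001_0000001");
    List<String> runTwo = Arrays.asList("delta_0000001_0000001", "delta_0000002_0000002");
    System.out.println(normalize(runOne).equals(normalize(runTwo)));
  }
}
```

Sorting a copy keeps the original listing untouched, so the raw order stays available for debugging.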



[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1190200900

   it's targeting 3.3.3+; there is actually a 3.3.4 RC coming out today with specific changes to assist tez (HADOOP-18332).


[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1190200835

   Nopes, not chasing that branch


[GitHub] [hive] kgyrtkirk commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871272145


##########
common/pom.xml:
##########
@@ -195,6 +194,11 @@
       <artifactId>tez-api</artifactId>
       <version>${tez.version}</version>
     </dependency>
+    <dependency>
+      <groupId>org.fusesource.jansi</groupId>
+      <artifactId>jansi</artifactId>
+      <version>2.3.4</version>

Review Comment:
   move version to root pom



##########
itests/pom.xml:
##########
@@ -352,6 +352,12 @@
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-yarn-client</artifactId>
         <version>${hadoop.version}</version>
+        <exclusions>
+          <exclusion>
+            <groupId>org.jline</groupId>
+            <artifactId>jline</artifactId>
+          </exclusion>

Review Comment:
   I'm not sure if this fix will work; it could work for the tests; but you've just excluded the dependency; I think that will not prevent that dep from appearing on the classpath during runtime...
   
   have you tested a dist build as well?
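
One rough way to answer the runtime question is to probe the classpath of the built distribution for a class from the excluded artifact. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, and `org.jline.reader.LineReader` is assumed to be a class shipped in `org.jline:jline`; substitute whatever class is actually of interest.

```java
public class ClasspathProbeSketch {

  /** Returns true if the named class can be located by this class's loader. */
  static boolean isOnClasspath(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers.
      Class.forName(className, false, ClasspathProbeSketch.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumed probe target; pass another class name as the first argument to override.
    String probe = args.length > 0 ? args[0] : "org.jline.reader.LineReader";
    System.out.println(probe + " on classpath: " + isOnClasspath(probe));
  }
}
```

Run with the classpath of the dist build, this reports whether the class is still reachable even though the Maven exclusion removed it from one dependency path.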



##########
ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java:
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader

Review Comment:
   why not simply propagate the `Exception` ?
   This will hide away the exception
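
The propagation suggested here could look roughly like the sketch below: log for context, then rethrow so the caller still sees the interruption. It is a self-contained illustration only; `ReaderSource` is a hypothetical stand-in for the `inputFormat.getRecordReader(...)` call and is not a Hive or Hadoop type.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class PropagateInterruptSketch {

  /** Hypothetical stand-in for the call that creates the underlying reader. */
  interface ReaderSource {
    String open() throws IOException;
  }

  /** Logs the interruption and rethrows it instead of returning a no-op reader. */
  static String openOrPropagate(ReaderSource source) throws IOException {
    try {
      return source.open();
    } catch (InterruptedIOException iioe) {
      System.err.println("Interrupted while opening reader: " + iioe.getMessage());
      throw iioe; // the caller decides how to handle the abort
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(openOrPropagate(() -> "reader"));
    try {
      openOrPropagate(() -> {
        throw new InterruptedIOException("aborted");
      });
    } catch (InterruptedIOException expected) {
      System.out.println("propagated: " + expected.getMessage());
    }
  }
}
```

The difference from the patch under review is that nothing is swallowed: the exception type and message survive up the stack.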



##########
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java:
##########
@@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[] readRequiredPriv, Privilege[] wri
 
   private static boolean userHasProxyPrivilege(String user, Configuration conf) {
     try {
-      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf,
-              HMSHandler.getIPAddress())) {
+      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, HMSHandler.getIPAddress())) {

Review Comment:
   I think max_linelength should be <=100 ; are you using the `dev-support/eclipse-styles.xml` ?



##########
streaming/src/test/org/apache/hive/streaming/TestStreaming.java:
##########
@@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws Exception {
     connection.close();
   }
 
+  /**
+   * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so this
+   * test fails.

Review Comment:
   ok; then I think this test could probably be converted into a test which checks that it works



##########
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java:
##########
@@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable {
 
     // Check permisssion on partition dirs and files created
     for (int i = 0; i < tableNames.length; i++) {
-      Path partitionFile = new Path(warehousedir + "/" + tableNames[i]
-        + "/ds=1/cluster=ag/part-m-00000");
-      FileSystem fs = partitionFile.getFileSystem(mrConf);
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile).getPermission(),
-        new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile.getParent()).getPermission(),
-        new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(),
-        new FsPermission(tablePerms[i]));
+      final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + "/ds=1/cluster=ag/part-m-00000");
+      final Path grandParentOfPartitionFile = partitionFile.getParent();

Review Comment:
   I would expect `grandParent` to be parent-of-parent;
   
   I think this change could be reverted; it was more readable earlier. The last assert now checks the `parent` dir and not `parent.parent`; the second assert was also clobbered....
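
The naming point can be illustrated with a small sketch. It uses `java.nio.file.Path` as a stand-in for Hadoop's `Path` (both offer `getParent()`), and the class name, method name, and example path are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ParentChainSketch {

  /** Parent-of-parent, which is what a variable named grandParent suggests. */
  static Path grandParentOf(Path file) {
    return file.getParent().getParent();
  }

  public static void main(String[] args) {
    Path partitionFile = Paths.get("/warehouse/tbl/ds=1/cluster=ag/part-m-00000");
    Path parent = partitionFile.getParent();          // the cluster=ag directory
    Path grandParent = grandParentOf(partitionFile);  // the ds=1 directory
    System.out.println(parent);
    System.out.println(grandParent);
  }
}
```

Naming each level after what it actually is makes it easier to see that the file, its parent, and its grandparent are three distinct paths to assert on.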



##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
               put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
             }}, "test_key123");
 
-    List<String> dumpWithClause = Arrays.asList(
-            "'hive.repl.add.raw.reserved.namespace'='true'",
-            "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
-                    + replica.externalTableWarehouseRoot + "'",
-            "'distcp.options.skipcrccheck'=''",
-            "'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + "'='false'",
-            "'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
-                    + UserGroupInformation.getCurrentUser().getUserName() +"'");
-    WarehouseInstance.Tuple tuple =
-            primary.run("use " + primaryDbName)
-                    .run("create table encrypted_table (id int, value string)")
-                    .run("insert into table encrypted_table values (1,'value1')")
-                    .run("insert into table encrypted_table values (2,'value2')")
-                    .dump(primaryDbName, dumpWithClause);
-
-    replica
-            .run("repl load " + primaryDbName + " into " + replicatedDbName
-                    + " with('hive.repl.add.raw.reserved.namespace'='true', "
-                    + "'hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-                    + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-            .run("use " + replicatedDbName)
-            .run("repl status " + replicatedDbName)
-            .verifyResult(tuple.lastReplicationId);
-
-    try {
-      replica
-              .run("select value from encrypted_table")
-              .verifyResults(new String[] { "value1", "value2" });
-      Assert.fail("Src EZKey shouldn't be present on target");
-    } catch (IOException e) {
-      Assert.assertTrue(e.getCause().getMessage().contains("KeyVersion name 'test_key@0' does not exist"));
-    }
-
     //read should pass without raw-byte distcp
-    dumpWithClause = Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
+    List<String> dumpWithClause = Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
             + replica.externalTableWarehouseRoot + "'");
-    tuple = primary.run("use " + primaryDbName)
+    WarehouseInstance.Tuple tuple =
+        primary.run("use " + primaryDbName)
             .run("create external table encrypted_table2 (id int, value string)")
             .run("insert into table encrypted_table2 values (1,'value1')")
             .run("insert into table encrypted_table2 values (2,'value2')")
             .dump(primaryDbName, dumpWithClause);
 
     replica
-            .run("repl load " + primaryDbName + " into " + replicatedDbName
-                    + " with('hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-                    + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-            .run("use " + replicatedDbName)
-            .run("repl status " + replicatedDbName)
-            .verifyResult(tuple.lastReplicationId)

Review Comment:
   wasn't this the expected behaviour?



##########
storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java:
##########
@@ -18,10 +18,10 @@
 
 package org.apache.hadoop.hive.common;
 
-import org.apache.commons.lang.StringUtils;
+import org.apache.commons.lang3.StringUtils;

Review Comment:
   these lang/lang3 changes seem unrelated to me; I think they could be done in a separate jira to reduce the amount of work.
   
   if you are moving away from the usage of `org.apache.commons.lang`, could you please also ban it in the root pom.xml?



##########
standalone-metastore/pom.xml:
##########
@@ -227,6 +227,10 @@
         <artifactId>hadoop-mapreduce-client-core</artifactId>
         <version>${hadoop.version}</version>
         <exclusions>
+          <exclusion>
+            <groupId>org.jline</groupId>
+            <artifactId>jline</artifactId>
+          </exclusion>

Review Comment:
   this jline dep just creeps in from multiple hadoop artifacts; the best would be to upgrade jline and not risk our chances with exclusions



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
##########
@@ -9361,7 +9362,8 @@ public NotificationEventsCountResponse get_notification_events_count(Notificatio
   private void authorizeProxyPrivilege() throws TException {
     // Skip the auth in embedded mode or if the auth is disabled
     if (!HiveMetaStore.isMetaStoreRemote() ||
-        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH)) {
+        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH) || conf.getBoolean(HIVE_IN_TEST.getVarname(),
+        false)) {

Review Comment:
   you are turning a feature off based on this `HIVE_IN_TEST` boolean; which means the feature will not be tested during regular hive test; please find another way; and since it seems like this is being turned off multiple places - can you cover it with a test?



##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
               put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
             }}, "test_key123");
 
-    List<String> dumpWithClause = Arrays.asList(

Review Comment:
   seems like a testcase was removed; I wonder if this is not supported anymore? ...and why are we removing this case in the scope of a hadoop upgrade?



##########
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java:
##########
@@ -18,20 +18,8 @@
 
 package org.apache.hadoop.hive.ql.exec;
 
-import java.io.FileNotFoundException;

Review Comment:
   import order is different in your IDE than in existing code; can you configure it to not reorder the imports in every file?
   
   I wonder if we have some agreement what order we want to use....



[GitHub] [hive] abstractdog commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
abstractdog commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1175431162

   > Thanx @abstractdog We have a green build here, so I have updated the PR title, while merging they will get squashed automatically :-)
   
   makes sense, thanks! I wish we could merge this now, hopefully we can release tez in 2 weeks
   
   I can see the PTF boolean patch in the commits, it's not intentional I guess


[GitHub] [hive] abstractdog commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
abstractdog commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1175156597

   > These bunch of failures looks like due to Hadoop only, Since they are passing with my Hadoop Upgrade PR. We can merge that first, then rebase this and merge post that?
   
   If precommit tests cannot pass cleanly without both upgrades (hadoop+tez), we should commit them together as well (because we cannot even revert them one by one later if needed). In that case, the Jira title and commit message should probably mention both the hadoop and the tez upgrade.


[GitHub] [hive] abstractdog commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
abstractdog commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r885344061


##########
ql/src/test/results/clientpositive/llap/acid_table_directories_test.q.out:
##########
@@ -163,13 +170,6 @@ POSTHOOK: Input: default@acidparttbl@p=200
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
-#### A masked pattern was here ####

Review Comment:
   For a Hive patch I usually don't care about a result-ordering change; however, this is a Hadoop upgrade, where it is not expected. Can we explain this change?
   



[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r885395534


##########
ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java:
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader

Review Comment:
   Does this log line work:
   ```
         LOG.info("Interrupted while getting the input reader for {}", split.getInputSplit());
   ```
   For the second point, I suppose it can get interrupted on any abort, but I am not very sure. Do you have any suggestions?





[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872214538


##########
streaming/src/test/org/apache/hive/streaming/TestStreaming.java:
##########
@@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws Exception {
     connection.close();
   }
 
+  /**
+   * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so this
+   * test fails.

Review Comment:
   Sure, I have removed the exception assertion and kept the reason as is.
   Just for code context, this is why hflush support gets rid of the exception:
   ```
               if (!out.hasCapability(StreamCapabilities.HFLUSH)) {
                 throw new ConnectionError(
                     "The backing filesystem only supports transaction batch sizes of 1, but " + transactionBatchSize
                         + " was requested.");
               }
   ```
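
A minimal, self-contained sketch of that capability check, not the actual Hive streaming code. `CapableStream`, `HflushCheckSketch` and `streamWith` are hypothetical stand-ins for the Hadoop output stream and `StreamCapabilities`; the `transactionBatchSize > 1` guard is an assumption inferred from the error message, since the quoted snippet does not show the enclosing condition.

```java
import java.util.Set;

/** Hypothetical stand-in for a stream that can report its capabilities. */
interface CapableStream {
  boolean hasCapability(String capability);
}

final class HflushCheckSketch {
  // Assumed to match the string value behind Hadoop's StreamCapabilities.HFLUSH.
  static final String HFLUSH = "hflush";

  /**
   * Rejects batch sizes above 1 when the stream cannot hflush.
   * The "> 1" guard is an assumption; the quoted snippet omits it.
   */
  static void validateBatchSize(CapableStream out, int transactionBatchSize) {
    if (transactionBatchSize > 1 && !out.hasCapability(HFLUSH)) {
      throw new IllegalStateException(
          "The backing filesystem only supports transaction batch sizes of 1, but "
              + transactionBatchSize + " was requested.");
    }
  }

  /** Builds a fake stream that advertises exactly the given capabilities. */
  static CapableStream streamWith(Set<String> capabilities) {
    return capabilities::contains;
  }
}
```

With a stream that reports `hflush`, the check passes for any batch size, which would explain why a test expecting the exception no longer sees it.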



##########
common/pom.xml:
##########
@@ -195,6 +194,11 @@
       <artifactId>tez-api</artifactId>
       <version>${tez.version}</version>
     </dependency>
+    <dependency>
+      <groupId>org.fusesource.jansi</groupId>
+      <artifactId>jansi</artifactId>
+      <version>2.3.4</version>

Review Comment:
   Done





[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1124857444

   My last run here had 4 errors, of which I think I have now fixed 3. The one remaining is an XML parsing error, which I think might resolve itself or may be an aftereffect.
   
   @kgyrtkirk I have sorted out the JLine issue here as well, which you mentioned in the previous PR. Can you take a look?




[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1130131300

   >test not fully flushing/closing xml file before trying to read it?
   
   Looks like a Maven issue only. It computes the diff between the generated query output file and the stored query output file. If the diff is empty, the test passes; otherwise it fails and the diff is put in the XML.
   I guess the diff has some character that is messing up the XML structure. I need to investigate further, though.
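
The pass/fail rule described here can be sketched as follows. This is a simplified positional comparison for illustration only, not the diff the Hive test framework actually runs; `OutputDiffSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class OutputDiffSketch {
  /** Returns one entry per line position where the outputs differ; an empty list means "pass". */
  static List<String> diff(List<String> expected, List<String> actual) {
    List<String> differences = new ArrayList<>();
    int max = Math.max(expected.size(), actual.size());
    for (int i = 0; i < max; i++) {
      // A missing line on either side is represented as null and counts as a difference.
      String e = i < expected.size() ? expected.get(i) : null;
      String a = i < actual.size() ? actual.get(i) : null;
      if (e == null || !e.equals(a)) {
        differences.add("line " + (i + 1) + ": expected=" + e + " actual=" + a);
      }
    }
    return differences;
  }

  /** The test passes only when there is nothing to report. */
  static boolean passes(List<String> expected, List<String> actual) {
    return diff(expected, actual).isEmpty();
  }
}
```

Whatever ends up in the difference list is what gets written into the report, so any character in it that is not valid in XML has to be escaped or removed first.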
   
   > the move in 3.3.3 to reload4j might add some exclusion complications if hive is declaring its own logging classes.
   
   I just pushed a commit upgrading to 3.3.3. I haven't tested the distro, but I compiled it and ran one test. Do I need to exclude reload4j from every Hadoop dependency?
   If it creates runtime issues, I think I am happy moving from 3.1.0 to 3.3.2 and waiting for 3.4.0 for the rest. Even if I add the exclusions, I would need to take those changes back when I move to 3.4.x or above, right?
   




[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1174768913

   This bunch of failures looks like it is due to Hadoop only, since the tests pass with my Hadoop upgrade PR. Can we merge that first, then rebase this and merge it after that?




[GitHub] [hive] suryanshagnihotri commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
suryanshagnihotri commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1189823384

   @ayushtkn Did you not face this compilation error? I do not see any change in `FairSchedulerShim.java`. I compiled Hive 3.1.2 with Hadoop 3.3.1.
   `Compilation failure
   [ERROR] /Users/suryansh/Documents/BDS/apache_hive/shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java:[31,68] org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy is not public in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair; cannot be accessed from outside package`




[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1143612371

   > I would guess the directory listing order might have changed...
   
   shouldn't have AFAIK




[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1127862030

   > Can't decode the failure reason here, it was that broken test which was causing this
   
   test not fully flushing/closing xml file before trying to read it?
   
   changes look ok to me; the move in 3.3.3 to reload4j might add some exclusion complications if hive is declaring its own logging classes.




[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1190044149

   @suryanshagnihotri If this were the case, the build would have failed, which it didn't. So we are fine here. Nothing is pending apart from awaiting an official Tez release.




[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1175369204

   Thanks @abstractdog. We have a green build here, so I have updated the PR title; the commits will get squashed automatically while merging :-) 




[GitHub] [hive] abstractdog commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
abstractdog commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1203550667

   also, let me grab the opportunity to thank @belugabehr who put enormous efforts into the hadoop upgrade in the early days!




[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1130195121

   excluding reload4j is harmless on versions without those artifacts, so safe to add and not worry too much. 
   
   Bad XML can happen if the test reporter doesn't escape test names properly and you've managed to get some invalid XML in there. Do you have any parameterized tests? Check how the strings are created, as they get included. CI tools generally aren't paranoid enough about test method names, as historically it was only a Java method name.
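
To illustrate the kind of escaping a reporter would need, here is a small sketch that escapes markup characters and drops code points that XML 1.0 does not allow. `XmlEscapeSketch` is a hypothetical helper and not part of Surefire, JUnit or Hive.

```java
final class XmlEscapeSketch {
  /** Escapes markup characters and silently drops characters that are illegal in XML 1.0. */
  static String escape(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    s.codePoints().forEach(cp -> {
      switch (cp) {
        case '<': sb.append("&lt;"); break;
        case '>': sb.append("&gt;"); break;
        case '&': sb.append("&amp;"); break;
        case '"': sb.append("&quot;"); break;
        case '\'': sb.append("&apos;"); break;
        default:
          if (isLegalXmlChar(cp)) {
            sb.appendCodePoint(cp);
          }
      }
    });
    return sb.toString();
  }

  /** Character ranges permitted by the XML 1.0 "Char" production. */
  static boolean isLegalXmlChar(int cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
  }
}
```

A parameterized test name such as `test[a<b & "c"]` would go through `escape` before being written as an attribute value or text node.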




[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1135008475

   LGTM. spark has gone up to the same version last week, incidentally




[GitHub] [hive] kgyrtkirk commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1141893046

   I've deployed
   * hadoop-3.3.3
   * tez 0.10.1
   * hive from the PR
   running a simple insert failed with:
   ```
   Caused by: java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
   	at org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:569)
   	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:550)
   	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:117)
   	at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:425)
   	at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:341)
   	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:432)
   	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:428)
   	at org.apache.tez.dag.app.web.WebUIService.serviceStart(WebUIService.java:94)
   	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
   	at org.apache.tez.dag.app.DAGAppMaster$ServiceWithDependency.start(DAGAppMaster.java:1800)
   	at org.apache.tez.dag.app.DAGAppMaster$ServiceThread.run(DAGAppMaster.java:1821)
   2022-05-31 09:17:19,422 [INFO] [shutdown-hook-0] |app.DAGAppMaster|: DAGAppMasterShutdownHook invoked
   ```
   Maybe I've missed something, but it seems like the Tez DAGAppMaster has issues running with the Jetty version brought in by Hadoop 3.3.3.
   
   I have these settings:
   ```
   tez/tez-site/tez.lib.uris ${fs.defaultFS}/apps/tez/tez.tar.gz
   tez/tez-site/tez.use.cluster.hadoop-libs true
   ```
   
   @abstractdog is hadoop-3.3.3 supported with tez-0.10.1?
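
One way to narrow down a `NoSuchMethodError` like the one above is to probe the runtime classpath for the method by reflection. The sketch below is a generic diagnostic, assuming it runs with the same classpath as the failing process; `ClasspathProbeSketch` is a hypothetical helper, and only the Jetty class and method names are taken from the stack trace.

```java
import java.util.Arrays;

final class ClasspathProbeSketch {
  /** Returns true if the class can be loaded and exposes a public method with the given name. */
  static boolean hasPublicMethod(String className, String methodName) {
    try {
      // Load without initializing, so that no static initializers run.
      Class<?> c = Class.forName(className, false, ClasspathProbeSketch.class.getClassLoader());
      return Arrays.stream(c.getMethods()).anyMatch(m -> m.getName().equals(methodName));
    } catch (ClassNotFoundException | LinkageError e) {
      // Class is absent, or one of its dependencies could not be linked.
      return false;
    }
  }

  public static void main(String[] args) {
    // Names copied from the stack trace; the result depends on which Jetty jar is on the classpath.
    System.out.println(hasPublicMethod(
        "org.eclipse.jetty.server.session.SessionHandler", "getSessionManager"));
  }
}
```

If the probe reports that the method is missing, the Jetty jar visible to the process is not the one the calling Hadoop code was compiled against.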




[GitHub] [hive] suryanshagnihotri commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
suryanshagnihotri commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1190181001

   @ayushtkn your base branch does not seem to be based on the 3.1.x release branch. It does not have `FairSchedulerShim.java`: https://github.com/ayushtkn/hive/tree/HIVE-24484/shims/scheduler.
   The compilation fails if the base branch is cut from 3.1.x...
   Is there any plan to add support in the 3.1.x version of Hive?




[GitHub] [hive] ayushtkn commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1203503443

   Merged. Thanx @abstractdog, @kgyrtkirk and @steveloughran for helping with reviews. :-) 




[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871355509


##########
ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java:
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader

Review Comment:
   The answer is here, and this does fix a couple of tests, so I picked it:
   https://github.com/apache/hive/pull/1742/files#r674896581



##########
itests/pom.xml:
##########
@@ -352,6 +352,12 @@
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-yarn-client</artifactId>
         <version>${hadoop.version}</version>
+        <exclusions>
+          <exclusion>
+            <groupId>org.jline</groupId>
+            <artifactId>jline</artifactId>
+          </exclusion>

Review Comment:
   Just tried. Started a Hive cluster with Derby, initialized the Hive DB, started HS2, then ran the following in Beeline:
   show databases;
   show tables;
   create table emp(id int);
   insert into emp values (1),(2),(3),(4);
   select * from emp;
   show create table emp;
   
   JLine is used in Beeline, so I think that is what the exclusion would have broken. Let me know what else can be tested.



##########
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java:
##########
@@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[] readRequiredPriv, Privilege[] wri
 
   private static boolean userHasProxyPrivilege(String user, Configuration conf) {
     try {
-      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf,
-              HMSHandler.getIPAddress())) {
+      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, HMSHandler.getIPAddress())) {

Review Comment:
   I believe the maximum allowed line length is 120?
   https://github.com/apache/hive/blob/master/checkstyle/checkstyle.xml#L159-L160





[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871355730


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
##########
@@ -9361,7 +9362,8 @@ public NotificationEventsCountResponse get_notification_events_count(Notificatio
   private void authorizeProxyPrivilege() throws TException {
     // Skip the auth in embedded mode or if the auth is disabled
     if (!HiveMetaStore.isMetaStoreRemote() ||
-        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH)) {
+        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH) || conf.getBoolean(HIVE_IN_TEST.getVarname(),
+        false)) {

Review Comment:
   It is covered by the test TestReplicationScenarios#testAuthForNotificationAPIs.
   I suppose this method is also mostly used in the replication context only, for getting NotificationLog entries...





[GitHub] [hive] sujith71955 commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
sujith71955 commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1190280390

   > @ayushtkn your base branch is does not seems to be based on 3.1.x release branch. It does not have `FairSchedulerShim.java` https://github.com/ayushtkn/hive/tree/HIVE-24484/shims/scheduler. It fails if base branch is cut from 3.1.x...
   > 
   > Check https://github.com/apache/hive/blob/release-3.1.3-rc3/shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java#L31 makes reference to `QueuePlacementPolicy` which is package protected final class https://github.com/apache/hadoop/blame/release-3.3.3-RC1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java#L54
   > 
   > Is there any plan to add support in 3.1.x version of hive?
   
   But Hive 3.1.x is not very old and 4.x still appears to be in alpha, so we may not be able to upgrade. So even with this PR we still have compatibility issues with Hive 3.x. Any suggestions? Thanks.
   cc @sankarh 




[GitHub] [hive] abstractdog commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

Posted by GitBox <gi...@apache.org>.
abstractdog commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r935282547


##########
ql/src/test/queries/clientpositive/acid_table_directories_test.q:
##########
@@ -1,3 +1,5 @@
+--! qt:disabled:disabled Tests the output of LS and that changes, Not a functional test, just adds some masking logic

Review Comment:
   please include hadoop upgrade as the cause in this comment, otherwise looks good to me





[GitHub] [hive] abstractdog commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
abstractdog commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r885341338


##########
ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java:
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+     innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader

Review Comment:
   The explanation on the other PR makes sense to me. Considering that we only catch InterruptedIOException here, I'm fine with ZeroRowsInputFormat, but this is confusing for future code readers, so let's do the following:
   1. put a log line here marking this branch
   2. add a comment: in which scenarios does this happen, and who is typically interrupting this codepath?
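
The two requests above can be sketched as follows. This is a simplified, self-contained illustration and not the code in `RecordReaderWrapper`: `RowReader`, `RowReaderFactory` and `InterruptSafeReaderSketch` are hypothetical stand-ins, `java.util.logging` replaces the project's logger, and the scenario named in the comment (an abort of the running task) is only the guess raised elsewhere in this thread.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.logging.Logger;

/** Hypothetical, simplified stand-in for a record reader. */
interface RowReader {
  /** Returns the next row, or null when there are no more rows. */
  String next() throws IOException;
}

/** Hypothetical factory whose open() may be interrupted. */
interface RowReaderFactory {
  RowReader open(String split) throws IOException;
}

final class InterruptSafeReaderSketch {
  private static final Logger LOG = Logger.getLogger(InterruptSafeReaderSketch.class.getName());

  /** A reader that never produces a row. */
  static final RowReader EMPTY = () -> null;

  static RowReader create(RowReaderFactory factory, String split) throws IOException {
    try {
      return factory.open(split);
    } catch (InterruptedIOException iioe) {
      // Assumed scenario: the thread was interrupted while opening the reader,
      // for example because the task was aborted. Only this exception type is
      // handled; any other IOException still propagates to the caller.
      LOG.info("Interrupted while getting the input reader for " + split
          + "; returning an empty reader");
      return EMPTY;
    }
  }
}
```

Catching only `InterruptedIOException` keeps genuine I/O failures visible, while the log line marks the branch for anyone reading the task logs later.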





[GitHub] [hive] kgyrtkirk commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r885363228


##########
ql/src/test/queries/clientpositive/acid_table_directories_test.q:
##########
@@ -1,3 +1,5 @@
+--! qt:disabled:disabled Tests the output of LS and that changes, Not a functional test, just adds some masking logic

Review Comment:
   you may also remove this test....and/or open a jira to remove the things which were added in HIVE-21650;
   I think `qt:replace` could do the same..
   
   hmm...it seems like `hive.qtest.additional.partial.mask.pattern` is only used in this test and nowhere else...



##########
ql/src/test/results/clientpositive/llap/acid_table_directories_test.q.out:
##########
@@ -163,13 +170,6 @@ POSTHOOK: Input: default@acidparttbl@p=200
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
 ### ACID DELTA DIR ###
-#### A masked pattern was here ####

Review Comment:
   I would guess the directory listing order might have changed...
   
   note: I think we should have better masking policies instead of removing the whole lines (mask only the WH part of the path)...it could be important what was the directory...
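
A masking policy of that kind could look like the sketch below, which replaces only the warehouse prefix of a path and keeps the rest of the line. `PathMaskSketch`, the mask text and the example paths are all hypothetical and do not reflect the existing qtest masking code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathMaskSketch {
  // Hypothetical placeholder text for the masked warehouse root.
  static final String MASK = "### WAREHOUSE ###";

  /**
   * Replaces the warehouse root (and an optional scheme://authority prefix)
   * with a placeholder, leaving the table, partition and delta directories visible.
   */
  static String maskWarehouse(String line, String warehouseRoot) {
    Pattern p = Pattern.compile(
        "(?:[a-z][a-z0-9+.-]*://[^/\\s]*)?" + Pattern.quote(warehouseRoot));
    return p.matcher(line).replaceAll(Matcher.quoteReplacement(MASK));
  }
}
```

For example, `maskWarehouse("hdfs://localhost:9000/warehouse/acidparttbl/p=200/delta_1", "/warehouse")` keeps `/acidparttbl/p=200/delta_1` in the output, so a reviewer can still see which directory was listed.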





[GitHub] [hive] TheCodeTracer commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.3

Posted by GitBox <gi...@apache.org>.
TheCodeTracer commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1144935338

   I am sorry if this is the wrong forum, but since this PR tracks the Hadoop 3.3 upgrade in Hive, I just wanted to confirm whether WebHCat might not work with Hadoop 3.3.x yet (https://issues.apache.org/jira/browse/HIVE-24083). Thanks a lot. Right now, it does seem that the WebHCat tests are breaking anyway due to a different issue (https://issues.apache.org/jira/browse/HIVE-26286).




[GitHub] [hive] steveloughran commented on pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1123389005

   i have a 3.3.3 RC1 coming out this week, if that helps




[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872214761


##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
               put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
             }}, "test_key123");
 
-    List<String> dumpWithClause = Arrays.asList(
-            "'hive.repl.add.raw.reserved.namespace'='true'",
-            "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
-                    + replica.externalTableWarehouseRoot + "'",
-            "'distcp.options.skipcrccheck'=''",
-            "'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + "'='false'",
-            "'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
-                    + UserGroupInformation.getCurrentUser().getUserName() +"'");
-    WarehouseInstance.Tuple tuple =
-            primary.run("use " + primaryDbName)
-                    .run("create table encrypted_table (id int, value string)")
-                    .run("insert into table encrypted_table values (1,'value1')")
-                    .run("insert into table encrypted_table values (2,'value2')")
-                    .dump(primaryDbName, dumpWithClause);
-
-    replica
-            .run("repl load " + primaryDbName + " into " + replicatedDbName
-                    + " with('hive.repl.add.raw.reserved.namespace'='true', "
-                    + "'hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-                    + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-            .run("use " + replicatedDbName)
-            .run("repl status " + replicatedDbName)
-            .verifyResult(tuple.lastReplicationId);
-
-    try {
-      replica
-              .run("select value from encrypted_table")
-              .verifyResults(new String[] { "value1", "value2" });
-      Assert.fail("Src EZKey shouldn't be present on target");
-    } catch (IOException e) {
-      Assert.assertTrue(e.getCause().getMessage().contains("KeyVersion name 'test_key@0' does not exist"));
-    }
-
     //read should pass without raw-byte distcp
-    dumpWithClause = Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
+    List<String> dumpWithClause = Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
             + replica.externalTableWarehouseRoot + "'");
-    tuple = primary.run("use " + primaryDbName)
+    WarehouseInstance.Tuple tuple =
+        primary.run("use " + primaryDbName)
             .run("create external table encrypted_table2 (id int, value string)")
             .run("insert into table encrypted_table2 values (1,'value1')")
             .run("insert into table encrypted_table2 values (2,'value2')")
             .dump(primaryDbName, dumpWithClause);
 
     replica
-            .run("repl load " + primaryDbName + " into " + replicatedDbName
-                    + " with('hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-                    + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-            .run("use " + replicatedDbName)
-            .run("repl status " + replicatedDbName)
-            .verifyResult(tuple.lastReplicationId)

Review Comment:
   DistCp itself fails. It runs with hive.repl.add.raw.reserved.namespace, and you can't copy if the key is not present on the target cluster. Earlier I converted this to a failure-case test, but then the next iteration, which runs without hive.repl.add.raw.reserved.namespace, fails because the last load wasn't successful, so I kept only the success case.



##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
               put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
             }}, "test_key123");
 
-    List<String> dumpWithClause = Arrays.asList(

Review Comment:
   Same as above:
   DistCp itself fails. It runs with hive.repl.add.raw.reserved.namespace, and you can't copy if the key is not present on the target cluster. Earlier I converted this to a failure-case test, but then the next iteration, which runs without hive.repl.add.raw.reserved.namespace, fails because the last load wasn't successful, so I kept only the success case.

   I am not sure how it was working before, but with the raw type the key should be there. Earlier this test also used a single instance; while being fixed it became two instances.
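
For readers following the diff: the `with(...)` options in this test are hand-concatenated strings of the form `'key'='value'`, and the two iterations differ mainly in whether `hive.repl.add.raw.reserved.namespace` is among them. A minimal sketch of building such a clause from a map is shown below. The class `WithClauseSketch` and its method `withClause` are hypothetical names used only for illustration; they are not part of Hive or its test utilities, and the sketch does no escaping of quotes inside keys or values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not part of Hive's test code. */
final class WithClauseSketch {

  /**
   * Formats the options as " with('k1'='v1', 'k2'='v2')", in the map's
   * iteration order. Assumes keys and values contain no single quotes.
   */
  static String withClause(Map<String, String> options) {
    StringJoiner joiner = new StringJoiner(", ", " with(", ")");
    for (Map.Entry<String, String> option : options.entrySet()) {
      joiner.add("'" + option.getKey() + "'='" + option.getValue() + "'");
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so the clause is deterministic.
    Map<String, String> rawCopy = new LinkedHashMap<>();
    rawCopy.put("hive.repl.add.raw.reserved.namespace", "true");
    rawCopy.put("distcp.options.skipcrccheck", "");
    System.out.println("repl load src into dst" + withClause(rawCopy));
  }
}
```

The database names `src` and `dst` in `main` are placeholders, not the names used by the test.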





[GitHub] [hive] ayushtkn commented on a diff in pull request #3279: HIVE-24484: Upgrade Hadoop to 3.3.2.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872219527


##########
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java:
##########
@@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable {
 
     // Check permisssion on partition dirs and files created
     for (int i = 0; i < tableNames.length; i++) {
-      Path partitionFile = new Path(warehousedir + "/" + tableNames[i]
-        + "/ds=1/cluster=ag/part-m-00000");
-      FileSystem fs = partitionFile.getFileSystem(mrConf);
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile).getPermission(),
-        new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile.getParent()).getPermission(),
-        new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-        fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(),
-        new FsPermission(tablePerms[i]));
+      final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + "/ds=1/cluster=ag/part-m-00000");
+      final Path grandParentOfPartitionFile = partitionFile.getParent();

Review Comment:
   Changed. I had picked it up as-is from the previous PR when I saw this test failing :-)
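
For readers following the diff: the removed assertions check permissions at three levels, namely the partition file, its parent directory, and its grandparent directory. The sketch below shows that walk with the standard `java.nio.file.Path` rather than Hadoop's `org.apache.hadoop.fs.Path`, so it runs without a cluster. The class `PartitionPathLevels`, its method `fileAndAncestors`, and the `/warehouse/tbl` prefix are hypothetical and only illustrate which directory each `getParent()` call reaches.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Hive or HCatalog. */
final class PartitionPathLevels {

  /**
   * Returns the file itself followed by up to {@code levels} ancestors,
   * nearest first. Stops early if the root is reached.
   */
  static List<Path> fileAndAncestors(Path file, int levels) {
    List<Path> result = new ArrayList<>();
    Path current = file;
    for (int i = 0; i <= levels && current != null; i++) {
      result.add(current);
      current = current.getParent();
    }
    return result;
  }

  public static void main(String[] args) {
    Path partitionFile =
        Paths.get("/warehouse", "tbl", "ds=1", "cluster=ag", "part-m-00000");
    // Index 0 is the file, 1 its parent (cluster=ag), 2 its grandparent (ds=1).
    for (Path level : fileAndAncestors(partitionFile, 2)) {
      System.out.println(level);
    }
  }
}
```

In a test like the one in the diff, each returned path would be the one whose permission is compared against the expected table permission.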


