You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/03/07 20:18:01 UTC

[GitHub] [hive] ayushtkn opened a new pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

ayushtkn opened a new pull request #2044:
URL: https://github.com/apache/hive/pull/2044


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591758640



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1802,57 +1808,73 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
-      qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),
-          queryText);
-      end = doTrace ? System.nanoTime() : 0;
-      MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
-      if (qResult == null) {
-        query.closeAll();
-        return Collections.emptyList();
-      }
       List<String> noExtraColumnNames = new ArrayList<String>();
       Map<String, String[]> extraColumnNameTypeParts = new HashMap<String, String[]>();
-      List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
-      for (Object[] row : list) {
-        String colName = (String) row[0];
-        String colType = (String) row[1];
-        // Extrapolation is not needed for this column if
-        // count(\"PARTITION_NAME\")==partNames.size()
-        // Or, extrapolation is not possible for this column if
-        // count(\"PARTITION_NAME\")<2
-        Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
-        if (count == partNames.size() || count < 2) {
-          noExtraColumnNames.add(colName);
-        } else {
-          extraColumnNameTypeParts.put(colName, new String[] { colType, String.valueOf(count) });
+      try(Query query = pm.newQuery("javax.jdo.query.SQL", queryText);) {
+        qResult = executeWithArray(query,
+            prepareParams(catName, dbName, tableName, partNames, colNames,
+                engine), queryText);
+        end = doTrace ? System.nanoTime() : 0;
+        MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
+        if (qResult == null) {
+          query.closeAll();
+          return Collections.emptyList();
+        }
+
+        List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
+        for (Object[] row : list) {
+          String colName = (String) row[0];
+          String colType = (String) row[1];
+          // Extrapolation is not needed for this column if
+          // count(\"PARTITION_NAME\")==partNames.size()
+          // Or, extrapolation is not possible for this column if
+          // count(\"PARTITION_NAME\")<2
+          Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
+          if (count == partNames.size() || count < 2) {
+            noExtraColumnNames.add(colName);
+          } else {
+            extraColumnNameTypeParts
+                .put(colName, new String[] {colType, String.valueOf(count)});
+          }
+          Deadline.checkTimeout();
+        }
+      } catch (Exception e) {
+        if (e instanceof MetaException) {
+          throw (MetaException) e;
         }
-        Deadline.checkTimeout();
       }
-      query.closeAll();
       // Extrapolation is not needed for columns noExtraColumnNames

Review comment:
       Isn't that only being done here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r594605627



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on pull request #2044:
URL: https://github.com/apache/hive/pull/2044#issuecomment-795038056


   Thanx @kgyrtkirk for the review, I have addressed the review comments.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591623056



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1802,57 +1808,73 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
-      qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),
-          queryText);
-      end = doTrace ? System.nanoTime() : 0;
-      MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
-      if (qResult == null) {
-        query.closeAll();
-        return Collections.emptyList();
-      }
       List<String> noExtraColumnNames = new ArrayList<String>();
       Map<String, String[]> extraColumnNameTypeParts = new HashMap<String, String[]>();
-      List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
-      for (Object[] row : list) {
-        String colName = (String) row[0];
-        String colType = (String) row[1];
-        // Extrapolation is not needed for this column if
-        // count(\"PARTITION_NAME\")==partNames.size()
-        // Or, extrapolation is not possible for this column if
-        // count(\"PARTITION_NAME\")<2
-        Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
-        if (count == partNames.size() || count < 2) {
-          noExtraColumnNames.add(colName);
-        } else {
-          extraColumnNameTypeParts.put(colName, new String[] { colType, String.valueOf(count) });
+      try(Query query = pm.newQuery("javax.jdo.query.SQL", queryText);) {
+        qResult = executeWithArray(query,
+            prepareParams(catName, dbName, tableName, partNames, colNames,
+                engine), queryText);
+        end = doTrace ? System.nanoTime() : 0;
+        MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
+        if (qResult == null) {
+          query.closeAll();
+          return Collections.emptyList();
+        }
+
+        List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
+        for (Object[] row : list) {
+          String colName = (String) row[0];
+          String colType = (String) row[1];
+          // Extrapolation is not needed for this column if
+          // count(\"PARTITION_NAME\")==partNames.size()
+          // Or, extrapolation is not possible for this column if
+          // count(\"PARTITION_NAME\")<2
+          Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
+          if (count == partNames.size() || count < 2) {
+            noExtraColumnNames.add(colName);
+          } else {
+            extraColumnNameTypeParts
+                .put(colName, new String[] {colType, String.valueOf(count)});
+          }
+          Deadline.checkTimeout();
+        }
+      } catch (Exception e) {
+        if (e instanceof MetaException) {
+          throw (MetaException) e;
         }
-        Deadline.checkTimeout();
       }
-      query.closeAll();
       // Extrapolation is not needed for columns noExtraColumnNames

Review comment:
       swallows exceptions which are not metaexceptions




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591622476



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,18 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {
+          colStats.add(prepareCSObjWithAdjustedNDV(row, 0,
+              useDensityFunctionForNDVEstimation, ndvTuner));
+          Deadline.checkTimeout();
+        }
+        return colStats;
+      } catch (Exception e) {

Review comment:
       wouldn't this will swallow some exception types?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593017095



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java
##########
@@ -138,23 +138,31 @@ static void timingTrace(boolean doTrace, String queryText, long start, long quer
     List<Object[]> list = ensureList(result);
     Iterator<Object[]> iter = list.iterator();
     Object[] fields = null;
-    for (Map.Entry<Long, T> entry : tree.entrySet()) {
-      if (fields == null && !iter.hasNext()) break;
-      long id = entry.getKey();
-      while (fields != null || iter.hasNext()) {
-        if (fields == null) {
-          fields = iter.next();
+    int rv = 0;

Review comment:
       there is a `query` in this method - can we fortify it with the usual `try-with-resources` armor?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593013960



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {
+          colStats.add(prepareCSObjWithAdjustedNDV(row, 0,
+              useDensityFunctionForNDVEstimation, ndvTuner));
+          Deadline.checkTimeout();
+        }
+        return colStats;
+      } catch (Exception e) {
+        throwMetaOrRuntimeException(e);
+        return Collections.emptyList();

Review comment:
       I believe control never comes back from the "throwMetaOrRuntimeException" method




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593015393



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1870,30 +1886,33 @@ private long partsFoundForPartitions(
             + " and \"ENGINE\" = ? "
             + " group by \"COLUMN_NAME\"";
         start = doTrace ? System.nanoTime() : 0;
-        query = pm.newQuery("javax.jdo.query.SQL", queryText);
-        List<String> extraColumnNames = new ArrayList<String>();
-        extraColumnNames.addAll(extraColumnNameTypeParts.keySet());
-        qResult = executeWithArray(query,
-            prepareParams(catName, dbName, tableName, partNames, extraColumnNames, engine), queryText);
-        if (qResult == null) {
-          query.closeAll();
-          return Collections.emptyList();
-        }
-        list = MetastoreDirectSqlUtils.ensureList(qResult);
-        // see the indexes for colstats in IExtrapolatePartStatus
-        Integer[] sumIndex = new Integer[] { 6, 10, 11, 15 };
-        for (Object[] row : list) {
-          Map<Integer, Object> indexToObject = new HashMap<Integer, Object>();
-          for (int ind = 1; ind < row.length; ind++) {
-            indexToObject.put(sumIndex[ind - 1], row[ind]);
+        try (Query query = pm.newQuery("javax.jdo.query.SQL", queryText);) {

Review comment:
       we have an extra `;` here and in every other `try ( ) `




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r595858774



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1772,23 +1774,29 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
-      qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593012170



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {

Review comment:
       why do we have this indented 1 more level? please format your code with the hive [formatter](https://github.com/apache/hive/blob/master/dev-support/eclipse-styles.xml) 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593013277



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1772,7 +1772,7 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
+      try(Query query = pm.newQuery("javax.jdo.query.SQL", queryText);) {
       qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),

Review comment:
       I believe this line should be indented




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591759083



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,18 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {
+          colStats.add(prepareCSObjWithAdjustedNDV(row, 0,
+              useDensityFunctionForNDVEstimation, ndvTuner));
+          Deadline.checkTimeout();
+        }
+        return colStats;
+      } catch (Exception e) {

Review comment:
       Yeps, because the main source code throws MetaException only, the Exception came only because of that try-with resources, due to that AutoClosable stuff




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] rbalamohan edited a comment on pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
rbalamohan edited a comment on pull request #2044:
URL: https://github.com/apache/hive/pull/2044#issuecomment-792377866


   @ayushtkn : Should this be fixed in the set of methods in "MetastoreDirectSqlUtils" (e.g MetastoreDirectSqlUtils::loopJoinOrderedResult) as well?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r593014649



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1802,57 +1806,69 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
-      qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),
-          queryText);
-      end = doTrace ? System.nanoTime() : 0;
-      MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
-      if (qResult == null) {
-        query.closeAll();
-        return Collections.emptyList();
-      }
       List<String> noExtraColumnNames = new ArrayList<String>();
       Map<String, String[]> extraColumnNameTypeParts = new HashMap<String, String[]>();
-      List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
-      for (Object[] row : list) {
-        String colName = (String) row[0];
-        String colType = (String) row[1];
-        // Extrapolation is not needed for this column if
-        // count(\"PARTITION_NAME\")==partNames.size()
-        // Or, extrapolation is not possible for this column if
-        // count(\"PARTITION_NAME\")<2
-        Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
-        if (count == partNames.size() || count < 2) {
-          noExtraColumnNames.add(colName);
-        } else {
-          extraColumnNameTypeParts.put(colName, new String[] { colType, String.valueOf(count) });
+      try(Query query = pm.newQuery("javax.jdo.query.SQL", queryText);) {
+        qResult = executeWithArray(query,
+            prepareParams(catName, dbName, tableName, partNames, colNames,
+                engine), queryText);
+        end = doTrace ? System.nanoTime() : 0;
+        MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
+        if (qResult == null) {
+          query.closeAll();
+          return Collections.emptyList();
         }
-        Deadline.checkTimeout();
+
+        List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
+        for (Object[] row : list) {
+          String colName = (String) row[0];
+          String colType = (String) row[1];
+          // Extrapolation is not needed for this column if
+          // count(\"PARTITION_NAME\")==partNames.size()
+          // Or, extrapolation is not possible for this column if
+          // count(\"PARTITION_NAME\")<2
+          Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
+          if (count == partNames.size() || count < 2) {
+            noExtraColumnNames.add(colName);
+          } else {
+            extraColumnNameTypeParts
+                .put(colName, new String[] {colType, String.valueOf(count)});
+          }
+          Deadline.checkTimeout();
+        }
+      } catch (Exception e) {
+        throwMetaOrRuntimeException(e);
       }
-      query.closeAll();
       // Extrapolation is not needed for columns noExtraColumnNames
+      List<Object[]> list;
       if (noExtraColumnNames.size() != 0) {
         queryText = commonPrefix + " and \"COLUMN_NAME\" in ("
             + makeParams(noExtraColumnNames.size()) + ")" + " and \"PARTITION_NAME\" in ("
             + makeParams(partNames.size()) + ")"
             + " and \"ENGINE\" = ? "
             + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
         start = doTrace ? System.nanoTime() : 0;
-        query = pm.newQuery("javax.jdo.query.SQL", queryText);
-        qResult = executeWithArray(query,
-            prepareParams(catName, dbName, tableName, partNames, noExtraColumnNames, engine), queryText);
-        if (qResult == null) {
-          query.closeAll();
-          return Collections.emptyList();
-        }
-        list = MetastoreDirectSqlUtils.ensureList(qResult);
-        for (Object[] row : list) {
-          colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-          Deadline.checkTimeout();
+
+        try {

Review comment:
       we have a try-with-resources inside a try ; can't we just remove the outer try?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591621303



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -1134,6 +1138,7 @@ private Path constructRenamedPath(Path defaultNewPath, Path currentPath) {
             if (found) {

Review comment:
       move the check above this `if`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] rbalamohan commented on pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
rbalamohan commented on pull request #2044:
URL: https://github.com/apache/hive/pull/2044#issuecomment-792377866


   Should this be fixed in the set of methods in "MetastoreDirectSqlUtils" (e.g MetastoreDirectSqlUtils::loopJoinOrderedResult) as well?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on pull request #2044:
URL: https://github.com/apache/hive/pull/2044#issuecomment-792478370


   Thanx @rbalamohan for the review. I have fixed this in `MetastoreDirectSqlUtils ` as well, There were two occurances there, and one had this `Deadline` check, so have added the fix for that one


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r594604694



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {
+          colStats.add(prepareCSObjWithAdjustedNDV(row, 0,
+              useDensityFunctionForNDVEstimation, ndvTuner));
+          Deadline.checkTimeout();
+        }
+        return colStats;
+      } catch (Exception e) {
+        throwMetaOrRuntimeException(e);
+        return Collections.emptyList();

Review comment:
       Compiler wants it. :-(




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r589355466



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -390,13 +394,16 @@ public void alterTable(RawStore msdb, Warehouse wh, String catName, String dbnam
                     part.getValues(), oldCols, oldt, part, null, null);
                 assert (colStats.isEmpty());
                 if (cascade) {

Review comment:
       I think it will enough to have 1 call in this method - either here above the if or keep the one after the block

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java
##########
@@ -138,23 +138,30 @@ static void timingTrace(boolean doTrace, String queryText, long start, long quer
     List<Object[]> list = ensureList(result);
     Iterator<Object[]> iter = list.iterator();
     Object[] fields = null;
-    for (Map.Entry<Long, T> entry : tree.entrySet()) {
-      if (fields == null && !iter.hasNext()) break;
-      long id = entry.getKey();
-      while (fields != null || iter.hasNext()) {
-        if (fields == null) {
-          fields = iter.next();
+    int rv = 0;
+    try {
+      for (Map.Entry<Long, T> entry : tree.entrySet()) {
+        if (fields == null && !iter.hasNext())
+          break;
+        long id = entry.getKey();
+        while (fields != null || iter.hasNext()) {
+          if (fields == null) {
+            fields = iter.next();
+          }
+          long nestedId = extractSqlLong(fields[keyIndex]);
+          if (nestedId < id)

Review comment:
       please follow the hive formatter braces (`{}`) are mandatory for even 1 line if blocks as well

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -355,8 +357,10 @@ public void alterTable(RawStore msdb, Warehouse wh, String catName, String dbnam
                 writeIdList, newt.getWriteId());
           }
         } else {
+          Deadline.checkTimeout();

Review comment:
       I think 1 check should be enough in every loop
   
   a few lines above the `for (Entry<Partition, ColumnStatistics> partColStats : columnStatsNeedUpdated.entries()) {`
   
   doesnt have any checks

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1783,11 +1783,15 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
       List<ColumnStatisticsObj> colStats = new ArrayList<ColumnStatisticsObj>(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+      try {

Review comment:
       instead of try-finally ; can't we switch to try-with-resources for the query object ? 
   note that for example `ensureList` may also throw an exception in some cases - and in that particular case the query will  not be closed...

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -559,11 +567,14 @@ public Partition alterPartition(RawStore msdb, Warehouse wh, String catName, Str
 
         // PartitionView does not have SD. We do not need update its column stats
         if (oldPart.getSd() != null) {
+          Deadline.checkTimeout();

Review comment:
       I think we should be ok to only check this once in every for loop

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1814,22 +1818,26 @@ private long partsFoundForPartitions(
       List<String> noExtraColumnNames = new ArrayList<String>();
       Map<String, String[]> extraColumnNameTypeParts = new HashMap<String, String[]>();
       List<Object[]> list = MetastoreDirectSqlUtils.ensureList(qResult);
-      for (Object[] row : list) {
-        String colName = (String) row[0];
-        String colType = (String) row[1];
-        // Extrapolation is not needed for this column if
-        // count(\"PARTITION_NAME\")==partNames.size()
-        // Or, extrapolation is not possible for this column if
-        // count(\"PARTITION_NAME\")<2
-        Long count = MetastoreDirectSqlUtils.extractSqlLong(row[2]);
-        if (count == partNames.size() || count < 2) {
-          noExtraColumnNames.add(colName);
-        } else {
-          extraColumnNameTypeParts.put(colName, new String[] { colType, String.valueOf(count) });
+      try {
+        for (Object[] row : list) {
+          String colName = (String) row[0];

Review comment:
       try-with-resources for `query`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ayushtkn commented on pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on pull request #2044:
URL: https://github.com/apache/hive/pull/2044#issuecomment-801798182


   Thanx Zoltan for helping with review and commit. :-)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk merged pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk merged pull request #2044:
URL: https://github.com/apache/hive/pull/2044


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r595828736



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -1772,23 +1774,29 @@ private long partsFoundForPartitions(
           + " and \"ENGINE\" = ? "
           + " group by \"COLUMN_NAME\", \"COLUMN_TYPE\"";
       start = doTrace ? System.nanoTime() : 0;
-      query = pm.newQuery("javax.jdo.query.SQL", queryText);
-      qResult = executeWithArray(query, prepareParams(catName, dbName, tableName, partNames, colNames, engine),

Review comment:
       I think we should get rid of forward declaring `qResult` with null/etc values - it may aid the escape of a query bound object
   
   forward declaring `queryText` may also doesn't have any benefits




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #2044: HIVE-24853. HMS leaks queries in case of timeout.

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r591620666



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -390,13 +394,16 @@ public void alterTable(RawStore msdb, Warehouse wh, String catName, String dbnam
                     part.getValues(), oldCols, oldt, part, null, null);
                 assert (colStats.isEmpty());
                 if (cascade) {

Review comment:
       move the deadline check above the if - so that it covers both paths




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org