Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/04/11 13:06:18 UTC

[GitHub] [hive] kasakrisz commented on a diff in pull request #3014: HIVE-25941: Long compilation time of complex query due to analysis fo…

kasakrisz commented on code in PR #3014:
URL: https://github.com/apache/hive/pull/3014#discussion_r847305506


##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/MaterializedViewsCache.java:
##########
@@ -205,4 +212,52 @@ HiveRelOptMaterialization get(String dbName, String viewName) {
   public boolean isEmpty() {
     return materializedViews.isEmpty();
   }
+
+
+  private static class ASTKey {
+    private final ASTNode root;
+
+    public ASTKey(ASTNode root) {
+      this.root = root;
+    }
+
+    @Override
+    public boolean equals(Object o) {
+      if (this == o) return true;
+      if (o == null || getClass() != o.getClass()) return false;
+      ASTKey that = (ASTKey) o;
+      return equals(root, that.root);
+    }
+
+    private boolean equals(ASTNode astNode1, ASTNode astNode2) {
+      if (!(astNode1.getType() == astNode2.getType() &&
+              astNode1.getText().equals(astNode2.getText()) &&
+              astNode1.getChildCount() == astNode2.getChildCount())) {
+        return false;
+      }
+
+      for (int i = 0; i < astNode1.getChildCount(); ++i) {
+        if (!equals((ASTNode) astNode1.getChild(i), (ASTNode) astNode2.getChild(i))) {
+          return false;
+        }
+      }
+
+      return true;
+    }
+
+    @Override
+    public int hashCode() {
+      return hashcode(root);

Review Comment:
   * The hashcode of each AST stored in `MaterializedViewsCache` is calculated only once, when the MVs are loaded at HS2 startup or when a new MV is created, because Java's `HashMap` implementation caches the key's hashcode in the map entry.
   * When we look up a materialization, the hashcode of the lookup key is calculated every time the `get` method is called. This happens only once for the entire tree per query.
   * To find sub-query rewrites, the look-up is done by sub-ASTs, so the hashcode is also calculated for the subtrees. When I ran some performance tests locally, I didn't find this to be a bottleneck.
   
   This solution is still much faster than generating the expanded query text of every possible sub-query using `UnparseTranslator` and `TokenRewriteStream`.
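
To make the hashing cost concrete, here is a minimal, self-contained sketch of a structural tree key used with `java.util.HashMap`. It is not the code from this PR: `Node`, `TreeKey`, `sample()` and the `hashCalls` counter are hypothetical stand-ins for `ASTNode` and `ASTKey`, and the token types and texts are made up. It only illustrates that the stored key is hashed once at insertion, while each look-up hashes the look-up key once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StructuralKeyDemo {
  /** Hypothetical stand-in for ASTNode: a token type, its text, and children. */
  static final class Node {
    final int type;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(int type, String text, Node... kids) {
      this.type = type;
      this.text = text;
      for (Node kid : kids) {
        children.add(kid);
      }
    }
  }

  /** Hypothetical stand-in for ASTKey: equality and hash are structural. */
  static final class TreeKey {
    static int hashCalls = 0; // counts top-level hashCode() invocations
    private final Node root;

    TreeKey(Node root) {
      this.root = root;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (o == null || getClass() != o.getClass()) return false;
      return sameTree(root, ((TreeKey) o).root);
    }

    private static boolean sameTree(Node a, Node b) {
      if (a.type != b.type || !a.text.equals(b.text)
          || a.children.size() != b.children.size()) {
        return false;
      }
      for (int i = 0; i < a.children.size(); ++i) {
        if (!sameTree(a.children.get(i), b.children.get(i))) {
          return false;
        }
      }
      return true;
    }

    @Override
    public int hashCode() {
      hashCalls++;
      return hash(root);
    }

    // Walks the whole subtree, so the cost is linear in the number of nodes.
    private static int hash(Node n) {
      int h = 31 * n.type + n.text.hashCode();
      for (Node child : n.children) {
        h = 31 * h + hash(child);
      }
      return h;
    }
  }

  /** Builds a small tree; each call returns a new but structurally equal tree. */
  static Node sample() {
    return new Node(1, "TOK_QUERY",
        new Node(2, "TOK_FROM", new Node(3, "t1")),
        new Node(4, "TOK_SELECT"));
  }

  public static void main(String[] args) {
    Map<TreeKey, String> cache = new HashMap<>();
    cache.put(new TreeKey(sample()), "mv1"); // stored key is hashed here
    System.out.println("hashCode calls after put: " + TreeKey.hashCalls);

    String hit = cache.get(new TreeKey(sample())); // only the look-up key is hashed
    System.out.println("found: " + hit);
    System.out.println("hashCode calls after get: " + TreeKey.hashCalls);
  }
}
```

Because the map entry keeps the hash of the stored key, the per-query cost in this sketch comes only from hashing the look-up trees, which matches the reasoning in the bullets above.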



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

