You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/12/01 19:04:39 UTC

[GitHub] [hive] omalley opened a new pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

omalley opened a new pull request #1726:
URL: https://github.com/apache/hive/pull/1726


   This pull request provides a method to get a non-normalized copy of the SArg expression. This can be useful for applications where the smaller non-normalized expression is useful.
   
   I've changed the SArg ExpressionTree leaves to keep a pointer to the PredicateLeafImpl rather than just the id. That makes a lot of things easier, including dealing with keeping the normalized and non-normalized versions in sync with the leaves.
   
   It also changes the format of the toString to be easier to read, but to make the old tests easier to port I've provided a toOldString method that generates the old string.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] omalley commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
omalley commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566343100



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -132,6 +130,14 @@ public Object getLiteral() {
       return literalList;
     }
 
+    public int getId() {

Review comment:
       I made getId more visible, but left setId as private. setId only makes sense in the context of the builder.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566325355



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -330,7 +355,7 @@ public Builder lessThan(String column, PredicateLeaf.Type type,
       if (column == null || literal == null) {
         parent.getChildren().add(new ExpressionTree(TruthValue.YES_NO_NULL));
       } else {
-        PredicateLeaf leaf =
+        PredicateLeafImpl leaf =

Review comment:
       same

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -345,7 +370,7 @@ public Builder lessThanEquals(String column, PredicateLeaf.Type type,
       if (column == null || literal == null) {
         parent.getChildren().add(new ExpressionTree(TruthValue.YES_NO_NULL));
       } else {
-        PredicateLeaf leaf =
+        PredicateLeafImpl leaf =

Review comment:
       same on all alteration below..

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -312,12 +338,11 @@ public Builder end() {
       return this;
     }
 
-    private int addLeaf(PredicateLeaf leaf) {
-      Integer result = leaves.get(leaf);
+    private PredicateLeafImpl addLeaf(PredicateLeafImpl leaf) {

Review comment:
       Argument may change to PredicateLeaf?

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/ExpressionTree.java
##########
@@ -31,25 +31,25 @@
   public enum Operator {OR, AND, NOT, LEAF, CONSTANT}
   private final Operator operator;
   private final List<ExpressionTree> children;
-  private int leaf;
+  private final PredicateLeaf leaf;
   private final SearchArgument.TruthValue constant;
 
   ExpressionTree() {
     operator = null;
     children = null;
-    leaf = 0;
+    leaf = null;
     constant = null;
   }
 
   ExpressionTree(Operator op, ExpressionTree... kids) {
     operator = op;
-    children = new ArrayList<ExpressionTree>();
-    leaf = -1;
+    children = new ArrayList<>();
+    leaf = null;
     this.constant = null;
     Collections.addAll(children, kids);
   }
 
-  ExpressionTree(int leaf) {
+  ExpressionTree(SearchArgumentImpl.PredicateLeafImpl leaf) {

Review comment:
       This can now become PredicateLeaf




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk merged pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
kgyrtkirk merged pull request #1726:
URL: https://github.com/apache/hive/pull/1726


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] omalley commented on pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
omalley commented on pull request #1726:
URL: https://github.com/apache/hive/pull/1726#issuecomment-769323606


   @pgaref I think we are ok with two trees, because we were already keeping the larger tree in memory. Adding the compact one should be ok.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566060211



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
##########
@@ -85,100 +89,101 @@ public static PredicateLeaf createPredicateLeaf(PredicateLeaf.Operator operator,
   @Test
   public void testNotPushdown() throws Exception {
     assertEquals("leaf-1", SearchArgumentImpl.BuilderImpl.pushDownNot(leaf(1))
-        .toString());
+        .toOldString());
     assertEquals("(not leaf-1)",
-        SearchArgumentImpl.BuilderImpl.pushDownNot(not(leaf(1))).toString());
+        SearchArgumentImpl.BuilderImpl.pushDownNot(not(leaf(1))).toOldString());
     assertEquals("leaf-1",
         SearchArgumentImpl.BuilderImpl.pushDownNot(not(not(leaf(1))))
-            .toString());
+            .toOldString());
     assertEquals("(not leaf-1)",
         SearchArgumentImpl.BuilderImpl.pushDownNot(not(not(not(leaf(1))))).
-            toString());
+            toOldString());
     assertEquals("(or leaf-1 (not leaf-2))",
         SearchArgumentImpl.BuilderImpl.pushDownNot(not(and(not(leaf(1)),
-            leaf(2)))).toString());
+            leaf(2)))).toOldString());
     assertEquals("(and (not leaf-1) leaf-2)",
         SearchArgumentImpl.BuilderImpl.pushDownNot(not(or(leaf(1),
-            not(leaf(2))))).toString());
+            not(leaf(2))))).toOldString());
     assertEquals("(or (or (not leaf-1) leaf-2) leaf-3)",
         SearchArgumentImpl.BuilderImpl.pushDownNot(or(not(and(leaf(1),
                 not(leaf(2)))),
-            not(not(leaf(3))))).toString());
+            not(not(leaf(3))))).toOldString());
     assertEquals("NO", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.YES))).toString());
+        not(constant(TruthValue.YES))).toOldString());
     assertEquals("YES", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.NO))).toString());
+        not(constant(TruthValue.NO))).toOldString());
     assertEquals("NULL", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.NULL))).toString());
+        not(constant(TruthValue.NULL))).toOldString());
     assertEquals("YES_NO", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.YES_NO))).toString());
+        not(constant(TruthValue.YES_NO))).toOldString());
     assertEquals("YES_NULL", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.NO_NULL))).toString());
+        not(constant(TruthValue.NO_NULL))).toOldString());
     assertEquals("NO_NULL", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.YES_NULL))).toString());
+        not(constant(TruthValue.YES_NULL))).toOldString());
     assertEquals("YES_NO_NULL", SearchArgumentImpl.BuilderImpl.pushDownNot(
-        not(constant(TruthValue.YES_NO_NULL))).toString());
+        not(constant(TruthValue.YES_NO_NULL))).toOldString());
   }
 
   @Test
   public void testFlatten() throws Exception {
-    assertEquals("leaf-1", SearchArgumentImpl.BuilderImpl.flatten(leaf(1)).toString());
+    PredicateLeafImpl leaf1 = new PredicateLeafImpl();

Review comment:
       Unused variable

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/ExpressionTree.java
##########
@@ -31,25 +31,25 @@
   public enum Operator {OR, AND, NOT, LEAF, CONSTANT}
   private final Operator operator;
   private final List<ExpressionTree> children;
-  private int leaf;
+  private final SearchArgumentImpl.PredicateLeafImpl leaf;

Review comment:
       This could be changed to PredicateLeaf -- see below

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -504,28 +493,18 @@ public SearchArgument build() {
         throw new IllegalArgumentException("Failed to end " +
             currentTree.size() + " operations.");
       }
+      ExpressionTree optimized = optimize(root);

Review comment:
       maybe rename vars to compact and normalized ExpressionTree to avoid confusion?

##########
File path: storage-api/src/test/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactoryTest.java
##########
@@ -0,0 +1,160 @@
+package org.apache.hadoop.hive.ql.io.sarg;

Review comment:
       Apache License?

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
##########
@@ -162,6 +162,13 @@ public boolean isNeeded() {
    */
   public ExpressionTree getExpression();
 
+  /**
+   * Get the expression tree without the normalization to conjunctive normal

Review comment:
       Nit: could also explicitly say in **getExpression** above that it normalizes the expression




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566080036



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
##########
@@ -162,6 +162,13 @@ public boolean isNeeded() {
    */
   public ExpressionTree getExpression();
 
+  /**
+   * Get the expression tree without the normalization to conjunctive normal

Review comment:
       Nit: maybe we should also explicitly mention in **getExpression** doc that it normalizes the expression




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] kgyrtkirk commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
kgyrtkirk commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r565500287



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
##########
@@ -162,6 +162,11 @@ public boolean isNeeded() {
    */
   public ExpressionTree getExpression();
 
+  /**
+   * Get the expression tree without the

Review comment:
       unfinished sentece
   

##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
##########
@@ -162,6 +162,11 @@ public boolean isNeeded() {
    */
   public ExpressionTree getExpression();
 
+  /**
+   * Get the expression tree without the
+   */
+  public ExpressionTree getUnexpandedExpression();

Review comment:
       I find `unexpanded` a little strange - a different keyword might fit this thing better (what do you think about compact or  original).
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566096110



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -132,6 +130,14 @@ public Object getLiteral() {
       return literalList;
     }
 
+    public int getId() {

Review comment:
       I believe **Id** setter and getter could be moved to the **PredicateLeaf** interface.
   It would be cleaner and as a result **addLeaf** would need less modifications.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #1726: HIVE-24458: Add a method to provide a non-normalized ExpressionTree

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #1726:
URL: https://github.com/apache/hive/pull/1726#discussion_r566325762



##########
File path: storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
##########
@@ -345,7 +370,7 @@ public Builder lessThanEquals(String column, PredicateLeaf.Type type,
       if (column == null || literal == null) {
         parent.getChildren().add(new ExpressionTree(TruthValue.YES_NO_NULL));
       } else {
-        PredicateLeaf leaf =
+        PredicateLeafImpl leaf =

Review comment:
       same on all alterations below..




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org