You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/05/19 16:07:24 UTC

[GitHub] [iceberg] samredai opened a new pull request, #4815: BooleanExpressionVisitor and visit method

samredai opened a new pull request, #4815:
URL: https://github.com/apache/iceberg/pull/4815

   This adds the `BooleanExpressionVisitor` base class and a `visit` function to the expressions base. The visit function is missing a dispatched function for `UnboundPredicate` and `BoundPredicate`, which we can add once those classes are added. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878986833


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):

Review Comment:
   I see, updated to `BooleanExpressionVisitor(Generic[T], ABC)` 👍 



##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:

Review Comment:
   Fixed!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877594250


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:

Review Comment:
   `BooleanExpression` -> `T`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#issuecomment-1139912694

   Thanks, @samredai! Looks great.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878988217


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(Not)
+def _(obj: Not, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a Not boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_not(result=obj)

Review Comment:
   Thanks, fixed!



##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(Not)
+def _(obj: Not, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a Not boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_not(result=obj)
+
+
+@visit.register(And)
+def _(obj: And, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a And boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_and(left=obj.left, right=obj.right)

Review Comment:
   Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878987770


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):

Review Comment:
   Good idea, updated! I was initially thinking an implementation would just use singledispatchmethod but there's no reason to require that. Using your suggest still leaves open the option the implement a `visit_predicate` singledispatchmethod and just have it called by both the `visit_unbound_predicate` and `visit_bound_predicate` methods.



##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")

Review Comment:
   Ah right, fixed!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877596453


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(Not)
+def _(obj: Not, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a Not boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_not(result=obj)
+
+
+@visit.register(And)
+def _(obj: And, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a And boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_and(left=obj.left, right=obj.right)

Review Comment:
   Same here, this needs to visit the left and right operands.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877595847


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:

Review Comment:
   While the expression itself is `AlwaysFalse`, it's really just a way to talk about the constant expression `False`. For the visitor, I'd just use `visit_true` and `visit_false`. The `Always` is there to avoid conflicting with built-in symbols like `True` and `False`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#issuecomment-1134116897

   @rdblue I incorporated your review so far and also updated the PR to add complete coverage. This is ready for another look, thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877594911


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):

Review Comment:
   I think it would be better to have `visit_unbound_predicate` and `visit_bound_predicate` so the implementation doesn't have to do an `isinstance` check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r883871712


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -257,3 +317,111 @@ def test_bound_reference(table_schema_simple, foo_struct):
     assert bound_ref1.eval(foo_struct) == "foovalue"
     assert bound_ref2.eval(foo_struct) == 123
     assert bound_ref3.eval(foo_struct) == True
+
+
+def test_boolean_expression_visitor():
+    """Test post-order traversal of boolean expression visit method"""
+    expr = base.And(
+        base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+        base.Not(TestExpressionA()),
+        TestExpressionB(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_always_true():

Review Comment:
   This doesn't test the visitor, it is testing `Or.__new__` that simplifies the expression as it is created. I think you can remove these and/or/false/true tests and just go with the one above for now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r883871877


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -257,3 +317,111 @@ def test_bound_reference(table_schema_simple, foo_struct):
     assert bound_ref1.eval(foo_struct) == "foovalue"
     assert bound_ref2.eval(foo_struct) == 123
     assert bound_ref3.eval(foo_struct) == True
+
+
+def test_boolean_expression_visitor():
+    """Test post-order traversal of boolean expression visit method"""
+    expr = base.And(
+        base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+        base.Not(TestExpressionA()),
+        TestExpressionB(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_always_true():
+    """Test visiting an Or expression with AlwaysTrue
+
+    Any Or expression with an AlwaysTrue child should be reduced to a single AlwaysTrue expression and a visitor should only call the `visit_true` method.
+    """
+    expr = base.Or(
+        base.And(
+            base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+            base.Not(TestExpressionA()),
+            TestExpressionB(),
+        ),
+        base.AlwaysTrue(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == ["TRUE"]
+
+
+def test_or_with_always_false():
+    """Test visiting an Or expression with AlwaysFalse
+
+    An Or expression with an AlwaysFalse child should skip calling `visit_false`, yet visit the rest of the child expressions.
+    """
+    expr = base.Or(
+        base.And(
+            base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+            base.Not(TestExpressionA()),
+            TestExpressionB(),
+        ),
+        base.AlwaysFalse(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_only_always_false():
+    """Test visiting an Or expression with only AlwaysFalse expressions
+
+    An Or expression with only AlwaysFalse expressions should be reduced to a single AlwaysFalse expression
+    and only require a single call to the `visit_false` method.
+    """
+    expr = base.Or(
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert isinstance(expr, base.AlwaysFalse)
+    assert result == ["FALSE"]
+
+
+def test_boolean_expression_visit_raise_not_implemented_error():

Review Comment:
   I'd keep this one, but you just need the one test above.



##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -257,3 +317,111 @@ def test_bound_reference(table_schema_simple, foo_struct):
     assert bound_ref1.eval(foo_struct) == "foovalue"
     assert bound_ref2.eval(foo_struct) == 123
     assert bound_ref3.eval(foo_struct) == True
+
+
+def test_boolean_expression_visitor():

Review Comment:
   This test looks good to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878991275


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,112 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(Generic[T], ABC):
+    @abstractmethod
+    def visit_true(self) -> T:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+        """
+
+    @abstractmethod
+    def visit_false(self) -> T:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+        """
+
+    @abstractmethod
+    def visit_not(self, child_result: T) -> T:
+        """Visit method for a Not boolean expression
+
+        Args:
+            result (T): The result of visiting the child of the Not boolean expression
+        """
+
+    @abstractmethod
+    def visit_and(self, left_result: T, right_result: T) -> T:
+        """Visit method for an And boolean expression
+
+        Args:
+            left_result (T): The result of visiting the left side of the expression
+            right_result (T): The result of visiting the right side of the expression
+        """
+
+    @abstractmethod
+    def visit_or(self, left_result: T, right_result: T) -> T:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left_result (T): The result of visiting the left side of the expression
+            right_result (T): The result of visiting the right side of the expression
+        """
+
+    @abstractmethod
+    def visit_unbound_predicate(self, predicate) -> T:
+        """Visit method for an unbound predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate): An instance of an UnboundPredicate
+        """
+
+    @abstractmethod
+    def visit_bound_predicate(self, predicate) -> T:

Review Comment:
   `predicate` has no typehint here because `BoundPredicate` and `UnboundPredicate` have not been added yet, but I'll circle back and add if after PR #4816 gets in (or rebase this PR if that one gets in first).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878990780


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -84,6 +85,65 @@ def __str__(self):
         return "testexprb"
 
 
+class TestBooleanExpressionVisitor(base.BooleanExpressionVisitor[List]):
+    """A test implementation of a BooleanExpressionVisit
+
+    As this visitor visits each node, it appends an element to a `visit_histor` list. This enables testing that a given expression is
+    visited in an expected order by the `visit` method.
+    """
+
+    def __init__(self):
+        self.visit_history: List = []

Review Comment:
   Since we don't have predicates yet (coming in PR #4816), I added this test-only visitor. It's super simple and just appends a string for each visit method that gets called. I use this in the tests added to confirm that the visit methods are called in the expected post-order traversal of a given expression. I think it's useful even after predicates are added since we can use it to test purely the visit pattern of complex expressions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878987092


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:

Review Comment:
   Updated this to `visit_not(self, child_result: T) -> T` 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878988048


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:

Review Comment:
   Makes sense, updated!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877594133


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):

Review Comment:
   I think this should be parameterized. The visitor will return some type, like the schema visitor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877593469


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression

Review Comment:
   Looks like this was copy-pasted and not updated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878989810


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -84,6 +85,65 @@ def __str__(self):
         return "testexprb"
 
 
+class TestBooleanExpressionVisitor(base.BooleanExpressionVisitor[List]):
+    """A test implementation of a BooleanExpressionVisit
+
+    As this visitor visits each node, it appends an element to a `visit_histor` list. This enables testing that a given expression is
+    visited in an expected order by the `visit` method.
+    """
+
+    def __init__(self):
+        self.visit_history: List = []
+
+    def visit_true(self) -> List:
+        self.visit_history.append("TRUE")
+        return self.visit_history
+
+    def visit_false(self) -> List:
+        self.visit_history.append("FALSE")
+        return self.visit_history
+
+    def visit_not(self, child_result: List) -> List:
+        self.visit_history.append("NOT")
+        return self.visit_history
+
+    def visit_and(self, left_result: List, right_result: List) -> List:
+        self.visit_history.append("AND")
+        return self.visit_history
+
+    def visit_or(self, left_result: List, right_result: List) -> List:
+        self.visit_history.append("OR")
+        return self.visit_history
+
+    def visit_unbound_predicate(self, predicate) -> List:
+        self.visit_history.append("UNBOUND PREDICATE")
+        return self.visit_history
+
+    def visit_bound_predicate(self, predicate) -> List:
+        self.visit_history.append("BOUND PREDICATE")
+        return self.visit_history
+
+    def visit_test_expression_a(self) -> List:
+        self.visit_history.append("TestExpressionA")
+        return self.visit_history
+
+    def visit_test_expression_b(self) -> List:
+        self.visit_history.append("TestExpressionB")
+        return self.visit_history
+
+
+@base.visit.register(TestExpressionA)
+def _(obj: TestExpressionA, visitor: TestBooleanExpressionVisitor) -> List:
+    """Visit a TestExpressionA with a TestBooleanExpressionVisitor"""
+    return visitor.visit_test_expression_a()
+
+
+@base.visit.register(TestExpressionB)
+def _(obj: TestExpressionB, visitor: TestBooleanExpressionVisitor) -> List:

Review Comment:
   These are only for tests and dispatch a method for `TestExpressionA` and `TestExpressionB`. Otherwise a `NotImplementedError` would be raised.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877594633


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:

Review Comment:
   The result should have type `T` because this is passed the result of visiting the child.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877595415


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()

Review Comment:
   Should be `false` rather than `true`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877595238


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")

Review Comment:
   I don't think "non-type" is correct. This should be "unsupported expression" right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878987937


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()

Review Comment:
   Thanks for catching this, updated!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r883878143


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -257,3 +317,111 @@ def test_bound_reference(table_schema_simple, foo_struct):
     assert bound_ref1.eval(foo_struct) == "foovalue"
     assert bound_ref2.eval(foo_struct) == 123
     assert bound_ref3.eval(foo_struct) == True
+
+
+def test_boolean_expression_visitor():
+    """Test post-order traversal of boolean expression visit method"""
+    expr = base.And(
+        base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+        base.Not(TestExpressionA()),
+        TestExpressionB(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_always_true():
+    """Test visiting an Or expression with AlwaysTrue
+
+    Any Or expression with an AlwaysTrue child should be reduced to a single AlwaysTrue expression and a visitor should only call the `visit_true` method.
+    """
+    expr = base.Or(
+        base.And(
+            base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+            base.Not(TestExpressionA()),
+            TestExpressionB(),
+        ),
+        base.AlwaysTrue(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == ["TRUE"]
+
+
+def test_or_with_always_false():
+    """Test visiting an Or expression with AlwaysFalse
+
+    An Or expression with an AlwaysFalse child should skip calling `visit_false`, yet visit the rest of the child expressions.
+    """
+    expr = base.Or(
+        base.And(
+            base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+            base.Not(TestExpressionA()),
+            TestExpressionB(),
+        ),
+        base.AlwaysFalse(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_only_always_false():
+    """Test visiting an Or expression with only AlwaysFalse expressions
+
+    An Or expression with only AlwaysFalse expressions should be reduced to a single AlwaysFalse expression
+    and only require a single call to the `visit_false` method.
+    """
+    expr = base.Or(
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+        base.AlwaysFalse(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert isinstance(expr, base.AlwaysFalse)
+    assert result == ["FALSE"]
+
+
+def test_boolean_expression_visit_raise_not_implemented_error():

Review Comment:
   Updated!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r883878302


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -257,3 +317,111 @@ def test_bound_reference(table_schema_simple, foo_struct):
     assert bound_ref1.eval(foo_struct) == "foovalue"
     assert bound_ref2.eval(foo_struct) == 123
     assert bound_ref3.eval(foo_struct) == True
+
+
+def test_boolean_expression_visitor():
+    """Test post-order traversal of boolean expression visit method"""
+    expr = base.And(
+        base.Or(base.Not(TestExpressionA()), base.Not(TestExpressionB()), TestExpressionA(), TestExpressionB()),
+        base.Not(TestExpressionA()),
+        TestExpressionB(),
+    )
+    visitor = TestBooleanExpressionVisitor()
+    result = base.visit(expr, visitor=visitor)
+    assert result == [
+        "TestExpressionA",
+        "NOT",
+        "TestExpressionB",
+        "NOT",
+        "OR",
+        "TestExpressionA",
+        "OR",
+        "TestExpressionB",
+        "OR",
+        "TestExpressionA",
+        "NOT",
+        "AND",
+        "TestExpressionB",
+        "AND",
+    ]
+
+
+def test_or_with_always_true():

Review Comment:
   Removed 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #4815:
URL: https://github.com/apache/iceberg/pull/4815


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877596207


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_and(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an And boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_or(self, left: BooleanExpression, right: BooleanExpression) -> BooleanExpression:
+        """Visit method for an Or boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_predicate(self, predicate):
+        """Visit method for a predicate in an expression tree
+
+        Args:
+            predicate (UnboundPredicate | BoundPredicate): An instance of either an UnboundPredicate or a BoundPredicate,
+            depending on the concrete implementation.
+        """
+
+
+@singledispatch
+def visit(obj, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """A generic function for applying a boolean expression visitor to any point within an expression
+
+    The function traverses the expression in post-order fashion
+
+    Args:
+        obj(BooleanExpression): An instance of a BooleanExpression
+        visitor (BooleanExpressionVisitor): An instance of an implementation of the generic BooleanExpressionVisitor base class
+
+    Raises:
+        NotImplementedError: If attempting to visit an unrecognized object type
+    """
+    raise NotImplementedError(f"Cannot visit non-type: {obj}")
+
+
+@visit.register(AlwaysTrue)
+def _(obj: AlwaysTrue, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysTrue boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(AlwaysFalse)
+def _(obj: AlwaysFalse, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit an AlwaysFalse boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_always_true()
+
+
+@visit.register(Not)
+def _(obj: Not, visitor: BooleanExpressionVisitor) -> BooleanExpression:
+    """Visit a Not boolean expression with a concrete BooleanExpressionVisitor"""
+    return visitor.visit_not(result=obj)

Review Comment:
   This needs to visit the object and then call `visit_not` to pass in the result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r877267681


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.

Review Comment:
   ```suggestion
           Note: This visit method has no arguments since AlwaysFalse instances have no context.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878986645


##########
python/src/iceberg/expressions/base.py:
##########
@@ -346,3 +346,116 @@ def bind(self, schema: Schema, case_sensitive: bool) -> BoundReference:
             raise ValueError(f"Cannot find field '{self.name}' in schema: {schema}")
 
         return BoundReference(field=field, accessor=schema.accessor_for_field(field.field_id))
+
+
+class BooleanExpressionVisitor(ABC):
+    @abstractmethod
+    def visit_always_true(self) -> BooleanExpression:
+        """Visit method for an AlwaysTrue boolean expression
+
+        Note: This visit method has no arguments since AlwaysTrue instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_always_false(self) -> BooleanExpression:
+        """Visit method for an AlwaysFalse boolean expression
+
+        Note: This visit method has no arguments since AlwaysFalse instances have no context.
+
+        Returns:
+            BooleanExpression: The resulting expression after any modifications performed by the visitor
+        """
+
+    @abstractmethod
+    def visit_not(self, result: BooleanExpression) -> BooleanExpression:
+        """Visit method for a Not boolean expression
+
+        Args:
+            left (BooleanExpression): The left side of the expression
+            right (BooleanExpression): The right side of the expression

Review Comment:
   Fixed!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#discussion_r878990780


##########
python/tests/expressions/test_expressions_base.py:
##########
@@ -84,6 +85,65 @@ def __str__(self):
         return "testexprb"
 
 
+class TestBooleanExpressionVisitor(base.BooleanExpressionVisitor[List]):
+    """A test implementation of a BooleanExpressionVisit
+
+    As this visitor visits each node, it appends an element to a `visit_histor` list. This enables testing that a given expression is
+    visited in an expected order by the `visit` method.
+    """
+
+    def __init__(self):
+        self.visit_history: List = []

Review Comment:
   Since we don't have predicates yet (coming in PR #4816), I added this test-only visitor. It's super simple and just appends a string for each visit method that gets called. I use this in the tests added to confirm that the visit methods are called in the expected post-order traversal of a given expression.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on pull request #4815: BooleanExpressionVisitor and visit method

Posted by GitBox <gi...@apache.org>.
samredai commented on PR #4815:
URL: https://github.com/apache/iceberg/pull/4815#issuecomment-1134117966

   cc: @emkornfield @Fokko @kbendick @dramaticlly @dhruv-pratap


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org