You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/01/24 08:18:22 UTC

[GitHub] [iceberg] Fokko opened a new pull request, #6657: Python: Allow to pass in a string as filter

Fokko opened a new pull request, #6657:
URL: https://github.com/apache/iceberg/pull/6657

   Often I have to look up the name of the operator, I think it would be nice to allow the end user to provide a string that we'll parse with the excellent parser that we already have.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on code in PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#discussion_r1093256570


##########
python/pyiceberg/table/__init__.py:
##########
@@ -183,14 +185,17 @@ class TableScan(Generic[S], ABC):
     def __init__(
         self,
         table: Table,
-        row_filter: Optional[BooleanExpression] = None,
+        row_filter: Optional[Union[str, BooleanExpression]] = None,
         selected_fields: Tuple[str] = ("*",),
         case_sensitive: bool = True,
         snapshot_id: Optional[int] = None,
         options: Properties = EMPTY_DICT,
     ):
         self.table = table
-        self.row_filter = row_filter or AlwaysTrue()
+        if row_filter is None:
+            self.row_filter = AlwaysTrue()
+        else:

Review Comment:
   We also do the same logic in the `filter()` method below, therefore I added it to the parse. But I agree, it will clutter up the parse method in the long run.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on code in PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#discussion_r1093256954


##########
python/pyiceberg/table/__init__.py:
##########
@@ -183,14 +185,17 @@ class TableScan(Generic[S], ABC):
     def __init__(
         self,
         table: Table,
-        row_filter: Optional[BooleanExpression] = None,
+        row_filter: Optional[Union[str, BooleanExpression]] = None,
         selected_fields: Tuple[str] = ("*",),
         case_sensitive: bool = True,
         snapshot_id: Optional[int] = None,
         options: Properties = EMPTY_DICT,
     ):
         self.table = table
-        self.row_filter = row_filter or AlwaysTrue()
+        if row_filter is None:
+            self.row_filter = AlwaysTrue()
+        else:
+            self.row_filter = parser.parse(row_filter)

Review Comment:
   Good suggestion, I've split it into a separate method for now, that also takes care of the parsing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#discussion_r1092399725


##########
python/pyiceberg/expressions/parser.py:
##########
@@ -232,6 +233,9 @@ def handle_or(result: ParseResults) -> Or:
 ).set_name("expr")
 
 
-def parse(expr: str) -> BooleanExpression:
+def parse(expr: Union[str, BooleanExpression]) -> BooleanExpression:

Review Comment:
   I can see why you'd do this, but it seems better to make the caller check before calling `parse`. It's a bit confusing to accept an expression here and do nothing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

Posted by "amogh-jahagirdar (via GitHub)" <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#discussion_r1090014264


##########
python/pyiceberg/table/__init__.py:
##########
@@ -183,14 +185,17 @@ class TableScan(Generic[S], ABC):
     def __init__(
         self,
         table: Table,
-        row_filter: Optional[BooleanExpression] = None,
+        row_filter: Optional[Union[str, BooleanExpression]] = None,
         selected_fields: Tuple[str] = ("*",),
         case_sensitive: bool = True,
         snapshot_id: Optional[int] = None,
         options: Properties = EMPTY_DICT,
     ):
         self.table = table
-        self.row_filter = row_filter or AlwaysTrue()
+        if row_filter is None:
+            self.row_filter = AlwaysTrue()
+        else:
+            self.row_filter = parser.parse(row_filter)

Review Comment:
   Nit: Maybe we could use a ternary operator here? 
   
   ```
   self.row_filter = parser.parse(row_filter) if row_filter else AlwaysTrue()
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6657: Python: Allow to pass in a string as filter

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#issuecomment-1421061183

   Thanks for the update, @Fokko! Looks good now so I merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #6657: Python: Allow to pass in a string as filter

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue merged PR #6657:
URL: https://github.com/apache/iceberg/pull/6657


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #6657: Python: Allow to pass in a string as filter

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #6657:
URL: https://github.com/apache/iceberg/pull/6657#discussion_r1092400176


##########
python/pyiceberg/table/__init__.py:
##########
@@ -183,14 +185,17 @@ class TableScan(Generic[S], ABC):
     def __init__(
         self,
         table: Table,
-        row_filter: Optional[BooleanExpression] = None,
+        row_filter: Optional[Union[str, BooleanExpression]] = None,
         selected_fields: Tuple[str] = ("*",),
         case_sensitive: bool = True,
         snapshot_id: Optional[int] = None,
         options: Properties = EMPTY_DICT,
     ):
         self.table = table
-        self.row_filter = row_filter or AlwaysTrue()
+        if row_filter is None:
+            self.row_filter = AlwaysTrue()
+        else:

Review Comment:
   `elif isinstance(row_filter, str):`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org