Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2024/04/17 11:44:15 UTC

Re: [PR] Incremental Append Scan [iceberg-python]

Fokko commented on code in PR #533:
URL: https://github.com/apache/iceberg-python/pull/533#discussion_r1568698784


##########
pyiceberg/table/__init__.py:
##########
@@ -1594,6 +1617,197 @@ def to_ray(self) -> ray.data.dataset.Dataset:
         return ray.data.from_arrow(self.to_arrow())
 
 
+class BaseIncrementalScan(TableScan):
+    to_snapshot_id: Optional[int]
+    from_snapshot_id_exclusive: Optional[int]
+
+    def __init__(
+        self,
+        table: Table,
+        row_filter: Union[str, BooleanExpression] = ALWAYS_TRUE,
+        selected_fields: Tuple[str, ...] = ("*",),
+        case_sensitive: bool = True,
+        options: Properties = EMPTY_DICT,
+        limit: Optional[int] = None,
+        to_snapshot_id: Optional[int] = None,
+        from_snapshot_id_exclusive: Optional[int] = None,

Review Comment:
   I think it is worthwhile to add a docstring here that describes these parameters, especially `to_snapshot_id` and `from_snapshot_id_exclusive`, to help the end user.
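
A rough sketch of what such a docstring could look like. `IncrementalScanSketch` is a hypothetical stand-in, not the class from the PR, and the parameter semantics written here (end snapshot inclusive, start snapshot exclusive, defaults when omitted) are assumptions inferred from the parameter names. They should be checked against the actual implementation before being copied.

```python
from typing import Optional


class IncrementalScanSketch:
    """Hypothetical stand-in for the scan class, used only to show a docstring layout."""

    def __init__(
        self,
        to_snapshot_id: Optional[int] = None,
        from_snapshot_id_exclusive: Optional[int] = None,
    ) -> None:
        """Create an incremental scan over a range of snapshots.

        Args:
            to_snapshot_id: Assumed meaning: ID of the last snapshot to include
                in the scan (inclusive). Assumed to fall back to the table's
                current snapshot when omitted.
            from_snapshot_id_exclusive: Assumed meaning: ID of the snapshot at
                which the scan starts. This snapshot itself is not included.
                Assumed to mean "start from the oldest ancestor" when omitted.
        """
        self.to_snapshot_id = to_snapshot_id
        self.from_snapshot_id_exclusive = from_snapshot_id_exclusive
```

Spelling out which end of the range is inclusive is probably the most useful part for the end user, since only one of the two names carries that information.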



##########
pyiceberg/table/__init__.py:
##########
@@ -3014,3 +3145,35 @@ def _new_field_id(self) -> int:
 
     def _is_duplicate_partition(self, transform: Transform[Any, Any], partition_field: PartitionField) -> bool:
         return partition_field.field_id not in self._deletes and partition_field.transform == transform
+
+
+def ancestors_between(to_snapshot: int, from_snapshot: Optional[int], table: Table) -> Iterable[Snapshot]:

Review Comment:
   This file is getting rather big. How about moving this to `snapshots.py`?
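
For context, a minimal sketch of how a helper like this can walk the snapshot lineage without depending on `Table`, which is what would make it easy to move to a separate module. Only the name `ancestors_between` comes from the diff. `SnapshotSketch`, `ancestors_between_sketch`, and the `snapshots_by_id` lookup are hypothetical, and the exclusive handling of `from_snapshot` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class SnapshotSketch:
    """Hypothetical, minimal snapshot: an ID plus a pointer to its parent."""

    snapshot_id: int
    parent_snapshot_id: Optional[int] = None


def ancestors_between_sketch(
    to_snapshot: int,
    from_snapshot: Optional[int],
    snapshots_by_id: Dict[int, SnapshotSketch],
) -> Iterator[SnapshotSketch]:
    """Yield snapshots from to_snapshot back to, but not including, from_snapshot."""
    current_id: Optional[int] = to_snapshot
    # Follow parent pointers until the (exclusive) start or the root is reached.
    while current_id is not None and current_id != from_snapshot:
        snapshot = snapshots_by_id.get(current_id)
        if snapshot is None:
            # The parent is no longer known (for example, it was expired).
            break
        yield snapshot
        current_id = snapshot.parent_snapshot_id


# Linear history: 1 <- 2 <- 3 <- 4
history = {
    1: SnapshotSketch(1),
    2: SnapshotSketch(2, parent_snapshot_id=1),
    3: SnapshotSketch(3, parent_snapshot_id=2),
    4: SnapshotSketch(4, parent_snapshot_id=3),
}
ids = [s.snapshot_id for s in ancestors_between_sketch(4, 2, history)]
```

With this history, `ids` contains snapshots 4 and 3, newest first. Snapshot 2 is left out because the start of the range is treated as exclusive.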



##########
pyiceberg/manifest.py:
##########
@@ -573,6 +573,17 @@ class ManifestFile(Record):
     def __init__(self, *data: Any, **named_data: Any) -> None:
         super().__init__(*data, **{"struct": MANIFEST_LIST_FILE_STRUCTS[DEFAULT_READ_VERSION], **named_data})
 
+    def __eq__(self, other: Any) -> bool:
+        """Return the equality of two instances of the ManifestFile class."""
+        if not isinstance(other, ManifestFile):
+            return False
+        else:
+            return self.manifest_path == other.manifest_path

Review Comment:
   nit: I always do this in a single statement:
   ```suggestion
           return self.manifest_path == other.manifest_path if isinstance(other, ManifestFile) else False
   ```
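
A small self-contained check of the one-line form, using a hypothetical `ManifestFileSketch` instead of the real `ManifestFile`. The conditional expression binds more loosely than `==`, so the line reads as `(a == b) if isinstance(...) else False`. One general Python detail: a class that defines `__eq__` without `__hash__` becomes unhashable, so the sketch adds a matching `__hash__`. Whether the real class needs one depends on its base class and on whether instances end up in sets or dict keys, which is not visible in this diff.

```python
from typing import Any


class ManifestFileSketch:
    """Hypothetical stand-in that only carries a manifest path."""

    def __init__(self, manifest_path: str) -> None:
        self.manifest_path = manifest_path

    def __eq__(self, other: Any) -> bool:
        """Return the equality of two instances, compared by manifest path."""
        return self.manifest_path == other.manifest_path if isinstance(other, ManifestFileSketch) else False

    def __hash__(self) -> int:
        # Keep hashing consistent with __eq__ so instances work in sets.
        return hash(self.manifest_path)


a = ManifestFileSketch("s3://bucket/metadata/m1.avro")
b = ManifestFileSketch("s3://bucket/metadata/m1.avro")
c = ManifestFileSketch("s3://bucket/metadata/m2.avro")
unique = {a, b, c}  # a and b collapse into one entry
```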



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

