Posted to issues@iceberg.apache.org by "rubenvdg (via GitHub)" <gi...@apache.org> on 2023/01/22 20:43:17 UTC

[GitHub] [iceberg] rubenvdg opened a new pull request, #6644: Python: Add support for static table

rubenvdg opened a new pull request, #6644:
URL: https://github.com/apache/iceberg/pull/6644

   This PR proposes adding support for static tables (i.e., reading a table directly from a metadata file without using a catalog, see also https://github.com/apache/iceberg/issues/6430). Happy to hear if this makes sense.
   
   Regarding unit tests:
   We could run all the existing `Table` tests for `StaticTable` too, but that might be a bit artificial. You'd get something ugly like this:
   
   ```python
   @pytest.fixture
   def table(...) -> Table:
       ...
   
   @pytest.fixture
   def static_table(...) -> StaticTable:
       ...
   
   @pytest.mark.parametrize("the_table", ["table", "static_table"])
   def test_schema(the_table: Table, request: FixtureRequest) -> None:
       assert request.getfixturevalue(the_table).schema() == Schema(...)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #6644: Python: Add support for static table

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on code in PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#discussion_r1083543881


##########
python/pyiceberg/table/__init__.py:
##########
@@ -167,6 +169,32 @@ def __eq__(self, other: Any) -> bool:
         )
 
 
+class StaticTable(Table):
+    """Load a table directly from a metadata file (i.e., without using a catalog)."""
+
+    def refresh(self) -> Table:
+        """StaticTable metadata cannot be refreshed."""
+        raise StaticTableImmutableError("StaticTable metadata cannot be refreshed.")
+
+    @classmethod
+    def from_metadata(cls, metadata_location: str, properties: Properties = EMPTY_DICT) -> StaticTable:
+
+        metadata = cls._load_metadata(metadata_location, properties)
+
+        return cls(
+            identifier=("static-table", metadata_location),
+            metadata_location=metadata_location,
+            metadata=metadata,
+            io=load_file_io({**properties, **metadata.properties}),
+        )
+
+    @staticmethod
+    def _load_metadata(metadata_location: str, properties: Properties) -> TableMetadata:

Review Comment:
   How do you feel about merging this logic into `from_metadata`? It looks like we don't reuse this anywhere.
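
A minimal sketch of what inlining the loading step into `from_metadata` could look like. It is not the PyIceberg implementation: `Table` here is a hypothetical three-field stand-in, the metadata is read with the standard-library `json` module from a local path instead of through a `FileIO`, and the `properties` argument is left out.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class Table:
    """Hypothetical stand-in for a table object, reduced to three fields."""

    identifier: Tuple[str, ...]
    metadata_location: str
    metadata: Dict[str, Any]


class StaticTable(Table):
    """Load a table directly from a metadata file (i.e., without using a catalog)."""

    @classmethod
    def from_metadata(cls, metadata_location: str) -> StaticTable:
        # The loading step lives inline rather than in a private helper,
        # because nothing else would call that helper.
        with open(metadata_location, encoding="utf-8") as f:
            metadata = json.load(f)

        return cls(
            identifier=("static-table", metadata_location),
            metadata_location=metadata_location,
            metadata=metadata,
        )
```

With a single call site, the classmethod stays short enough to read top to bottom, and the helper can be reintroduced later if a second caller appears.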



##########
python/pyiceberg/exceptions.py:
##########
@@ -90,3 +90,7 @@ class SignError(Exception):
 
 class ResolveError(Exception):
     pass
+
+
+class StaticTableImmutableError(Exception):

Review Comment:
   How do you feel about re-using the built-in `NotImplementedError`?



##########
python/pyiceberg/catalog/rest.py:
##########
@@ -175,11 +175,7 @@ class RestCatalog(Catalog):
     session: Session
     properties: Properties
 
-    def __init__(
-        self,
-        name: str,
-        **properties: str,
-    ):
+    def __init__(self, name: str, **properties: str):

Review Comment:
   Why did this change? I always assumed that black was deterministic.





[GitHub] [iceberg] Fokko commented on pull request #6644: Python: Add support for static table

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#issuecomment-1421118698

   @rubenvdg, can you rebase? We were waiting for https://github.com/apache/iceberg/pull/6719 to be merged so that we wouldn't need to update the docs right away.




[GitHub] [iceberg] Fokko commented on a diff in pull request #6644: Python: Add support for static table

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on code in PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#discussion_r1087211802


##########
python/pyiceberg/catalog/rest.py:
##########
@@ -175,11 +175,7 @@ class RestCatalog(Catalog):
     session: Session
     properties: Properties
 
-    def __init__(
-        self,
-        name: str,
-        **properties: str,
-    ):
+    def __init__(self, name: str, **properties: str):

Review Comment:
   Nice!





[GitHub] [iceberg] Fokko merged pull request #6644: Python: Add support for static table

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko merged PR #6644:
URL: https://github.com/apache/iceberg/pull/6644




[GitHub] [iceberg] rubenvdg commented on a diff in pull request #6644: Python: Add support for static table

Posted by "rubenvdg (via GitHub)" <gi...@apache.org>.
rubenvdg commented on code in PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#discussion_r1083761641


##########
python/pyiceberg/exceptions.py:
##########
@@ -90,3 +90,7 @@ class SignError(Exception):
 
 class ResolveError(Exception):
     pass
+
+
+class StaticTableImmutableError(Exception):

Review Comment:
   My reasoning was that for a `StaticTable` refreshing metadata doesn't make sense, hence it's never going to be implemented, hence `NotImplementedError` doesn't make sense. 
   
   I might have misunderstood the whole purpose of `refresh()`. 
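
For illustration, a small sketch of the built-in option from the caller's side. `Table` is a hypothetical stand-in, `try_refresh` is a helper invented for this example, and the error message is illustrative. The Python documentation describes `NotImplementedError` as intended for abstract methods or for functionality that is still to be added, which is the tension raised in the comment above.

```python
class Table:
    """Hypothetical stand-in for a catalog-backed table."""

    def __init__(self, metadata: dict) -> None:
        self.metadata = metadata

    def refresh(self) -> "Table":
        # A real table would ask its catalog for the latest metadata here.
        return self


class StaticTable(Table):
    def refresh(self) -> "Table":
        # Built-in option suggested in review; the message text is illustrative.
        raise NotImplementedError("StaticTable metadata cannot be refreshed.")


def try_refresh(table: Table) -> bool:
    """Return True if the table could be refreshed, False if it is static."""
    try:
        table.refresh()
    except NotImplementedError:
        return False
    return True
```

Switching to a dedicated exception class would change only the `raise` and `except` lines; the trade-off is one more public name for callers to import.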





[GitHub] [iceberg] rubenvdg commented on pull request #6644: Python: Add support for static table

Posted by "rubenvdg (via GitHub)" <gi...@apache.org>.
rubenvdg commented on PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#issuecomment-1400076844

   Added docs and fixed the rest. 




[GitHub] [iceberg] rubenvdg commented on pull request #6644: Python: Add support for static table

Posted by "rubenvdg (via GitHub)" <gi...@apache.org>.
rubenvdg commented on PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#issuecomment-1421340074

   Of cooooourse, maestro




[GitHub] [iceberg] rubenvdg commented on a diff in pull request #6644: Python: Add support for static table

Posted by "rubenvdg (via GitHub)" <gi...@apache.org>.
rubenvdg commented on code in PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#discussion_r1083760745


##########
python/pyiceberg/catalog/rest.py:
##########
@@ -175,11 +175,7 @@ class RestCatalog(Catalog):
     session: Session
     properties: Properties
 
-    def __init__(
-        self,
-        name: str,
-        **properties: str,
-    ):
+    def __init__(self, name: str, **properties: str):

Review Comment:
   I removed the trailing comma 
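
Some background on why the formatter output changed: Black treats a pre-existing trailing comma as a request to keep one argument per line (the "magic trailing comma"), so removing it allows Black to join the signature onto a single line when it fits. The sketch below, using two hypothetical functions, only checks that both layouts define the same signature; it does not invoke Black.

```python
import inspect


def exploded(
    name: str,
    **properties: str,
) -> None:
    """Layout Black keeps while the trailing comma is present."""


def collapsed(name: str, **properties: str) -> None:
    """Layout Black produces once the trailing comma is removed."""


# Both layouts are purely cosmetic: the resulting signatures compare equal.
same_signature = inspect.signature(exploded) == inspect.signature(collapsed)
```

Black stays deterministic in both cases; the input differed by one comma, so the output differed too.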





[GitHub] [iceberg] Fokko commented on pull request #6644: Python: Add support for static table

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on PR #6644:
URL: https://github.com/apache/iceberg/pull/6644#issuecomment-1428597503

   Thanks again @rubenvdg for picking this up 👏🏻 

