You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/08/17 12:48:43 UTC

[GitHub] [iceberg] Fokko opened a new pull request, #5562: Python: First version of the Table API

Fokko opened a new pull request, #5562:
URL: https://github.com/apache/iceberg/pull/5562

   Add some public operations on the Table, similar to the Java implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #5562:
URL: https://github.com/apache/iceberg/pull/5562


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#discussion_r950364276


##########
python/pyiceberg/table/__init__.py:
##########
@@ -14,17 +14,102 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-from typing import Optional, Union
+from functools import cached_property
+from typing import (
+    Dict,
+    List,
+    Optional,
+    Union,
+)
 
 from pydantic import Field
 
+from pyiceberg.schema import Schema
 from pyiceberg.table.metadata import TableMetadataV1, TableMetadataV2
+from pyiceberg.table.partitioning import PartitionSpec
+from pyiceberg.table.snapshots import Snapshot, SnapshotLogEntry
+from pyiceberg.table.sorting import SortOrder
 from pyiceberg.typedef import Identifier
 from pyiceberg.utils.iceberg_base_model import IcebergBaseModel
 
 
 class Table(IcebergBaseModel):
     identifier: Identifier = Field()
     metadata_location: str = Field()
-    metadata: Optional[Union[TableMetadataV1, TableMetadataV2]] = Field()
+    metadata: Union[TableMetadataV1, TableMetadataV2] = Field()
+
+    def refresh(self, catalog) -> "Table":

Review Comment:
   Reverted 👍🏻 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
Fokko commented on PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#issuecomment-1220336846

   @rdblue updated, what did you think of the idea that I coined in the PR description?
   
   > This diverges from Java, but we could also turn the methods .schema, .spec, etc into (cached) properties. The refresh would then return a new Table object (in the case the metadata changed, otherwise return self which would keep the cached properties in place)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#discussion_r950330236


##########
python/pyiceberg/table/__init__.py:
##########
@@ -14,17 +14,102 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-from typing import Optional, Union
+from functools import cached_property
+from typing import (
+    Dict,
+    List,
+    Optional,
+    Union,
+)
 
 from pydantic import Field
 
+from pyiceberg.schema import Schema
 from pyiceberg.table.metadata import TableMetadataV1, TableMetadataV2
+from pyiceberg.table.partitioning import PartitionSpec
+from pyiceberg.table.snapshots import Snapshot, SnapshotLogEntry
+from pyiceberg.table.sorting import SortOrder
 from pyiceberg.typedef import Identifier
 from pyiceberg.utils.iceberg_base_model import IcebergBaseModel
 
 
 class Table(IcebergBaseModel):
     identifier: Identifier = Field()
     metadata_location: str = Field()
-    metadata: Optional[Union[TableMetadataV1, TableMetadataV2]] = Field()
+    metadata: Union[TableMetadataV1, TableMetadataV2] = Field()
+
+    def refresh(self, catalog) -> "Table":

Review Comment:
   Let's throw an exception and discuss this in a separate PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#discussion_r949638636


##########
python/pyiceberg/table/snapshots.py:
##########
@@ -88,3 +88,13 @@ class Snapshot(IcebergBaseModel):
     manifest_list: Optional[str] = Field(alias="manifest-list", description="Location of the snapshot's manifest list file")
     summary: Optional[Summary] = Field()
     schema_id: Optional[int] = Field(alias="schema-id", default=None)
+
+
+class MetadataLog(IcebergBaseModel):

Review Comment:
   `MetadataLogEntry`?



##########
python/pyiceberg/table/snapshots.py:
##########
@@ -88,3 +88,13 @@ class Snapshot(IcebergBaseModel):
     manifest_list: Optional[str] = Field(alias="manifest-list", description="Location of the snapshot's manifest list file")
     summary: Optional[Summary] = Field()
     schema_id: Optional[int] = Field(alias="schema-id", default=None)
+
+
+class MetadataLog(IcebergBaseModel):
+    metadata_file: str = Field(alias="metadata-file")
+    timestamp_ms: int = Field(alias="timestamp-ms")
+
+
+class SnapshotLog(IcebergBaseModel):

Review Comment:
   `SnapshotLogEntry`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#issuecomment-1220820774

   > what did you think of the idea that I coined in the PR description?
   
   I think it would be confusing to change this. While we certainly _could_ do it, I think we should make the two function the same way unless we have a compelling reason not to. Otherwise we're going to cause issues where people try to go between the two APIs and introduce subtle bugs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#issuecomment-1220886974

   Thanks, @Fokko!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #5562: Python: First version of the Table API

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #5562:
URL: https://github.com/apache/iceberg/pull/5562#discussion_r949637656


##########
python/pyiceberg/table/__init__.py:
##########
@@ -15,16 +15,82 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from typing import Optional, Union
+
+from typing import (
+    Dict,
+    List,
+    Optional,
+    Union,
+)
 
 from pydantic import Field
 
+from pyiceberg.schema import Schema
 from pyiceberg.table.metadata import TableMetadataV1, TableMetadataV2
+from pyiceberg.table.partitioning import PartitionSpec
+from pyiceberg.table.snapshots import Snapshot, SnapshotLog
+from pyiceberg.table.sorting import SortOrder
 from pyiceberg.typedef import Identifier
 from pyiceberg.utils.iceberg_base_model import IcebergBaseModel
 
 
 class Table(IcebergBaseModel):
     identifier: Identifier = Field()
     metadata_location: str = Field()
-    metadata: Optional[Union[TableMetadataV1, TableMetadataV2]] = Field()
+    metadata: Union[TableMetadataV1, TableMetadataV2] = Field()
+
+    def refresh(self):
+        """Refresh the current table metadata"""

Review Comment:
   Should we throw here because we don't implement `refresh` currently?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org