You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/26 01:18:46 UTC

[GitHub] [iceberg] jun-he opened a new pull request #2866: [Python] Support schema field ID validation in python

jun-he opened a new pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866


   Support schema field ID validation in python.  Also fix bugs in type_util and add unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r684585045



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       SGTM. updated accordingly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679078145



##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):
+
+    def test_validate_schema_via_index_by_name(self):

Review comment:
       Agree.  it would be great to have some failure cases.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r684585400



##########
File path: python/iceberg/api/schema.py
##########
@@ -49,6 +49,8 @@ def __init__(self, *argv):
         self._lowercase_name_to_id = None
         self._id_to_name = None
 
+        self.lazy_id_to_name()

Review comment:
       Updated




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r689319922



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       cool, can we make a comment in the code? Or see if the tests pass if we remove `RuntimeError`. That exception type is a bit worrying for me




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-895336767


   @rymurr  oops, just force pushed the changes. Sorry about it. Can you take a look? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r684585427



##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       Removed it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679790950



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       Does that mean the `reversed` has to be removed in the link you sent? Not sure I 100% understand, sorry.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r689131513



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       Ah, I probably know why. There are many code paths with `NotImplementedError`. If not ignoring errors, multiple tests fail and the build won't pass.
   Also, looks like the visitor needs some rework to use `before` and `after` methods (similar to Java). I feel we probably can leave it as it is and resolve that after the rework is done and missing functions are implemented.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679267907



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       Use it to match Java implementation. Currently it is a list but the [full name generated](https://github.com/apache/iceberg/blob/master/python/iceberg/api/types/type_util.py#L356) later will be wrong.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679081343



##########
File path: python/iceberg/api/schema.py
##########
@@ -49,6 +49,8 @@ def __init__(self, *argv):
         self._lowercase_name_to_id = None
         self._id_to_name = None
 
+        self.lazy_id_to_name()

Review comment:
       Probably fine to keep this PR not too complex, but if we're no longer lazy loading a lot of the code in this class can be a bit more succinct.  I know this currently matches what's in java, but I just find it a little strange that the functions are named as lazy, but at least in the case of  lazy_id_to_name(), we're eagerly loading in the constructor.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680774069



##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       @jun-he can we add an issue to address python circular imports? I suspect its hiding an underlying structural issue in the python code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r677524926



##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):

Review comment:
       It was originally `unittest` and converted en-masse to `pytest`, so there are still some straggling things that haven't been updated.  It would be appropriate to update anywhere it's still using the older unit tests.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680593384



##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):
+
+    def test_validate_schema_via_index_by_name(self):

Review comment:
       @rymurr @TGooch44  updated the tests. Can you take another look? Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680885590



##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       @rymurr Great idea, I think this actually links to your comment in @jun-he 's other PR.  There is a fair amount of non-idiomatic python code in this repo that should be refactored and would eliminate most if not all if these types of issues.  Starting with the easy ones like get rid of all classes that contain only static methods would be a good place to start, and having an issue to track against would be helpful.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r685833559



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       This looks strange to me. Why do we catch and raise `ValidationException` but swallow `RuntimeError`. Catching a `RuntimeError` smells like a generic Java conversion as `RuntimeError` isn't a catch-all like `RuntimeException`. @TGooch44 do you know the history here? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-886305495


   @TGooch44 and @rymurr can you help to review it? Thanks. 
   Seems types package miss many features and block the update of schema class. Plan to update them and then schema.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679271532



##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):
+
+    def test_validate_schema_via_index_by_name(self):

Review comment:
       👌  will do




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680888202



##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       Issue is opened:
   https://github.com/apache/iceberg/issues/2915




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680586889



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       yep, that will work as well. The current code uses a `list` and `reverse()` and thus does not work. So we either use deque similar to Java or we remove `reverse()`. I did the first option to match Java implementation.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r686255475



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       It might be better to always not swallow any errors. I can remove `except` blocks and let the caller to handle the raised errors.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680772970



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       It seems to me that using `list` and not using `reverse` would be cleaner to read and algorithmically simpler.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r677341450



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       why use a `deque` here?

##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):

Review comment:
       we are using `pytest` is there any reason to wrap this single test in a unittest class?

##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       why the localised import?

##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):
+
+    def test_validate_schema_via_index_by_name(self):

Review comment:
       a few more test cases wouldn't be bad. A passing test case. Maybe an obvious wrong one (eg add `a` to the schema twice)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r677525487



##########
File path: python/iceberg/api/schema.py
##########
@@ -80,6 +82,15 @@ def lazy_lowercase_name_to_id(self):
 
         return self._lowercase_name_to_id
 
+    def lazy_id_to_name(self):
+        from .types import index_by_name

Review comment:
       I haven't looked into it, but it's probably to prevent circular import.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-895092906


   Hey @jun-he did you push the new commits? I dont see the changes you mentioned in the comments. Maybe my GH diff is just not rendering correctly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680586889



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -117,15 +119,17 @@ def visit(arg, visitor): # noqa: ignore=C901
             results = list()
             for field in struct.fields:
                 visitor.field_ids.append(field.field_id)
-                visitor.field_names.append(field.name)
+                visitor.field_names.appendleft(field.name)

Review comment:
       yep, that will work as well. The current implementation uses list but reverse() and will not work. So we either use deque similar to Java or we fix `reverse()`. I did the first option to match Java implementation.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 merged pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 merged pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] TGooch44 commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
TGooch44 commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r685936678



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       I'm going to have to step through this because I don't really remember why this is like this.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679271165



##########
File path: python/iceberg/api/schema.py
##########
@@ -49,6 +49,8 @@ def __init__(self, *argv):
         self._lowercase_name_to_id = None
         self._id_to_name = None
 
+        self.lazy_id_to_name()

Review comment:
       Yep, felt the same thing when I added it. We may update Java implementation as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r679268471



##########
File path: python/tests/api/types/test_type_util.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+from iceberg.api.types import (BooleanType, NestedField, StructType)
+from iceberg.api.types import type_util
+from iceberg.exceptions import ValidationException
+
+
+class TestConversions(unittest.TestCase):

Review comment:
       👌  will update the test to use pytest.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he edited a comment on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he edited a comment on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-903179526


   @TGooch44  can you take a look and let me know if there are any additional comments on it? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-899316329


   @TGooch44 any comments or am I ok to merge?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#issuecomment-903179526


   @TGooch44  any additional comments on it? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r680775623



##########
File path: python/iceberg/api/schema.py
##########
@@ -49,6 +49,8 @@ def __init__(self, *argv):
         self._lowercase_name_to_id = None
         self._id_to_name = None
 
+        self.lazy_id_to_name()

Review comment:
       rather than trying to be 1:1 compliant w/ Java lets try and write idiomatic code. I agree with @TGooch44 that we could simplify this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jun-he commented on a change in pull request #2866: [Python] Support schema field ID validation in python

Posted by GitBox <gi...@apache.org>.
jun-he commented on a change in pull request #2866:
URL: https://github.com/apache/iceberg/pull/2866#discussion_r689639322



##########
File path: python/iceberg/api/types/type_util.py
##########
@@ -121,6 +122,8 @@ def visit(arg, visitor): # noqa: ignore=C901
                 result = None
                 try:
                     result = visit(field.type, visitor)
+                except ValidationException as ve:

Review comment:
       @rymurr  that is a good point. I updated it to only pass `NotImplementedError` errors and put a comment there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org