You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/25 14:11:21 UTC

[GitHub] [iceberg] CircArgs opened a new pull request #3981: generics and associated tests - no literals

CircArgs opened a new pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981


   This PR is in contrast to https://github.com/apache/iceberg/pull/3952 which saw an attempt to use optional arguments to get a semblance of combined type/literal functionality. This one steps back to offer true generics e.g. `FixedType[8]` as opposed to `FixedType(length=8)` with all the intuitive functionality that comes from using true types and not class instances.
   
   Examples:
   
   ```python
   IntegerType==IntegerType
   DecimalType[32, 3]==DecimalType[32, 3]
   StructType[
           NestedField[Decimal[32, 3], 0, "c1", True],
           NestedField[Float, 1, "c2", False],
       ] ==
   StructType[
           NestedField[Decimal[32, 3], 0, "c1", True],
           NestedField[Float, 1, "c2", False],
       ]
   ```
   
   ```python
   >>> issubclass(NestedField[Decimal[32, 3], 0, "c1", True], NestedField)
   True
   ```
   
   It provides the functionality to create further types possibly necessary in the future such as the transforms. For example, 
   
   ```python
   >>> BucketTransform = generic_class(
       "BucketTransform",
      {
               "type": Union[
                   DateType,
                   TimeType,
                   TimestampType,
                   TimestamptzType,
                   StringType,
                   BinaryType,
                   UUIDType,
                   NumberType,
                   FixedType,
               ],
           "num_buckets", int})
   
   >>>BucketTransform[IntegerType, 10]
   BucketTransform[type=IntegerType, num_buckets=10]
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981#issuecomment-1024802697


   @rdblue thanks for the update.  Was this conversation on slack or offline.  I'm trying to ramp up on the with the community is doing and trying to figure out which communication channels are critical to monitor?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] CircArgs closed pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
CircArgs closed pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981#issuecomment-1024788880


   @emkornfield, I just had a discussion with @CircArgs about this yesterday. I think that we're currently evaluating whether integration with Python's type system is necessary and worth the additional complexity. It seems like the main benefit is to be able to have functions like transforms that accept a certain argument type and to make the literals conform to that type. But given how we will process actual data (and not just literal values in in expressions) won't use type safety because the data is in a dataframe or some direct representation (like a python int rather than a wrapper around int), it seems fairly limited.
   
   I think we may want to go with a simpler type implementation, but @CircArgs was going to give it some thought and weigh in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981#issuecomment-1025196798


   @emkornfield, we had an offline chat because I didn't understand the goal of this approach at first. Most of the time, we coordinate through PRs and the #python channel in the apache-iceberg Slack community.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] emkornfield commented on pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981#issuecomment-1024626611


   While this seems like cool functionality, I'm a bit ambivalent of conflating the iceberg type system with the python type system, I can't think of any immediate drawbacks other then it isn't clear if the issubclass relation is really the right relation to be using between types.  For instance the last example:
   ```
   >>>issubclass(StructType[
           NestedField[DecimalType[32, 3], 0, "c1", True],
           NestedField[FloatType, 1, "c2", False],
       ], StructType[
           NestedField[IcebergType, 0, "c1", True],
           NestedField[FloatType, 1, "c2", False],
       ])
   True
   ```
   strikes me as not immediately what the semantics of a StructType inheritance should be and if inheritance in this case really makes sense?
   
   
   Is there a design doc someplace that discusses this approach? Are there other projects that have gone this route that have had good results?  Does this make important constructs much easier to express?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3981: generics and associated tests - no literals

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3981:
URL: https://github.com/apache/iceberg/pull/3981#discussion_r793976069



##########
File path: python/src/iceberg/types.py
##########
@@ -14,187 +14,124 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-"""Data types used in describing Iceberg schemas
 
+"""Data types used in describing Iceberg schemas

Review comment:
       Can you add the newline back so git doesn't think this line changed?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org