Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/22 08:55:46 UTC

[GitHub] [arrow] jorisvandenbossche commented on pull request #8312: ARROW-9941: [Python] Better string representation for extension types

jorisvandenbossche commented on pull request #8312:
URL: https://github.com/apache/arrow/pull/8312#issuecomment-714341688


   Late comment on this PR: for the PyExtensionType subclasses, this is certainly a nice enhancement. But for custom extension types that directly subclass ExtensionType, it might not be needed. Using the PeriodType class from the tests, we now have:
   
   ```
   In [26]: period_type = PeriodType('D')
   
   In [27]: period_type
   Out[27]: PeriodType(DataType(int64))
   
   In [28]: str(period_type)
   Out[28]: 'extension<test.period<PeriodType>>'
   ```
   
   while before this was:
   
   ```
   In [2]: period_type = PeriodType('D')
   
   In [3]: period_type
   Out[3]: PeriodType(extension<test.period>)
   
   In [4]: str(period_type)
   Out[4]: 'extension<test.period>'
   ```
   
   Since the extension name here is already unique (unlike the generic "arrow.py_extension_type"), adding the class name to `__str__` does not seem to add much value and only makes the type name longer.
   For the `__repr__` I am less certain (showing the storage type is useful), but I think it would be useful to also show the identifier name of the extension type.
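
To make the suggestion concrete, here is a rough sketch of one possible behaviour. It uses a plain-Python stand-in class, not the real pyarrow `ExtensionType`, so `StandInExtensionType` and the exact `__repr__` layout are assumptions for illustration only. Only the `PeriodType` name, the `test.period` identifier, and the int64 storage type come from the outputs above.

```python
class StandInExtensionType:
    """Hypothetical stand-in for an extension type base class (NOT pyarrow)."""

    def __init__(self, storage_type, extension_name):
        self.storage_type = storage_type
        self.extension_name = extension_name

    def __str__(self):
        # Short form: the registered extension name is already unique,
        # so the class name is not repeated here.
        return f"extension<{self.extension_name}>"

    def __repr__(self):
        # Longer form (assumed layout): class name, identifier, storage type.
        return f"{type(self).__name__}({self}, storage={self.storage_type})"


class PeriodType(StandInExtensionType):
    def __init__(self, freq):
        self.freq = freq
        # Storage type is written as a plain string in this sketch.
        super().__init__("int64", "test.period")


period_type = PeriodType("D")
print(str(period_type))   # extension<test.period>
print(repr(period_type))  # PeriodType(extension<test.period>, storage=int64)
```

With this split, `str()` stays as short as it was before the PR, while `repr()` carries both the identifier and the storage type.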
   
   (happy to work on this if there is agreement to change this)

