Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2023/01/06 18:31:52 UTC

[GitHub] [arrow] jbrockmendel opened a new issue, #15226: ENH: dictionary_encode support duration types

jbrockmendel opened a new issue, #15226:
URL: https://github.com/apache/arrow/issues/15226

   ### Describe the enhancement requested
   
   In troubleshooting pandas xfails, I'm finding that a chunk of them trace back to calling dictionary_encode on duration types.  @jorisvandenbossche tells me this is straightforward to do by operating on the underlying integers.  It's not obvious to me what a clean way to get those integers is (i.e. without going through .buffers).
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou closed issue #15226: [Python] dictionary_encode support duration types

Posted by GitBox <gi...@apache.org>.
pitrou closed issue #15226: [Python] dictionary_encode support duration types
URL: https://github.com/apache/arrow/issues/15226




[GitHub] [arrow] westonpace commented on issue #15226: [Python] dictionary_encode support duration types

Posted by GitBox <gi...@apache.org>.
westonpace commented on issue #15226:
URL: https://github.com/apache/arrow/issues/15226#issuecomment-1376578323

   Casting from duration to `int64` should be a zero-copy operation:
   
   ```
   >>> import pyarrow as pa
   >>> import datetime
   >>> arr = pa.array([datetime.timedelta(days=1), datetime.timedelta(hours=1)])
   >>> arr
   <pyarrow.lib.DurationArray object at 0x7f6efa8e60e0>
   [
     86400000000,
     3600000000
   ]
   >>> arr.cast(pa.int64())
   <pyarrow.lib.Int64Array object at 0x7f6e6edc86a0>
   [
     86400000000,
     3600000000
   ]
   ```




[GitHub] [arrow] jorisvandenbossche commented on issue #15226: [Python] dictionary_encode support duration types

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on issue #15226:
URL: https://github.com/apache/arrow/issues/15226#issuecomment-1383610267

   While casting is indeed the way to get the int64 values, the relevant point for Arrow is that dictionary_encode on duration types is easy to support directly, so you shouldn't need the cast (we just "forgot" to add the duration type to the list of types supported by the hashing algorithms) -> https://github.com/apache/arrow/pull/33685

