Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/28 09:38:20 UTC

[GitHub] [arrow-julia] bachdavi opened a new issue, #319: (de)serialization of (U)Int128

bachdavi opened a new issue, #319:
URL: https://github.com/apache/arrow-julia/issues/319

   Hey 👋 
   
   `(U)Int128` values are currently serialized as an Arrow `Int` with `bit_width=128`. That works perfectly fine within Julia, but as soon as the serialized Arrow batches are read from another language, such as Python or JS, the reader raises an `unrecognized Int type` error.
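   
   A minimal sketch of the write side, assuming Arrow.jl is installed; the table, column name, and file path are placeholders chosen only for illustration:
   
   ```julia
   using Arrow
   
   # Placeholder table and path, for illustration only.
   tbl = (x = UInt128[1, 2, 3],)
   path = joinpath(mktempdir(), "uint128.arrow")
   
   # Write the `UInt128` column to an Arrow file.
   Arrow.write(path, tbl)
   
   # Reading the file back from Julia works ...
   roundtrip = Arrow.Table(path)
   # ... whereas opening `path` from Python or JS is where the error shows up.
   ```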
   
   We are working around this with custom (de)serialization code, but that very much looks like type piracy to us, since we own neither `UInt128` nor `ArrowTypes`. We were wondering whether arrow-julia should instead send them by default as, e.g., a struct of two `UInt64`s :) What do you think?
   
   Here is what we are currently doing for `UInt128`:
   
   ```julia
   # Splits a `UInt128` into its low and high 64-bit halves,
   # i.e. its least and most significant 64 bits.
   function _split(i::UInt128)
       l, h = i % UInt64, (i >> 64) % UInt64
       return (low = l, high = h)
   end
   
   _merge(l::UInt64, h::UInt64) = UInt128(l) + UInt128(h) << 64
   
   ArrowTypes.ArrowType(::Type{UInt128}) = NamedTuple{(:low, :high)}
   ArrowTypes.toarrow(i::UInt128) = _split(i)
   ArrowTypes.arrowname(::Type{UInt128}) = Symbol("Julia.UInt128")
   ArrowTypes.fromarrow(T::Type{<:UInt128}, low, high) = _merge(low, high)
   ArrowTypes.JuliaType(::Val{Symbol("Julia.UInt128")}) = UInt128
   ```
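   
   The split/merge round trip itself can be checked without Arrow. A minimal sketch, where `split128` and `merge128` are hypothetical stand-ins for the helpers above (not part of Arrow.jl or ArrowTypes) and the split returns a `NamedTuple`:
   
   ```julia
   # Hypothetical helpers mirroring `_split`/`_merge`; not part of Arrow.jl or ArrowTypes.
   split128(i::UInt128) = (low = i % UInt64, high = (i >> 64) % UInt64)
   merge128(low::UInt64, high::UInt64) = UInt128(low) | (UInt128(high) << 64)
   
   # Round-trip a few edge cases: zero, both sides of the 64-bit boundary, and the maximum.
   cases = (UInt128(0), UInt128(typemax(UInt64)), UInt128(typemax(UInt64)) + 1, typemax(UInt128))
   for x in cases
       parts = split128(x)
       @assert merge128(parts.low, parts.high) == x
   end
   ```
   
   The sketch uses `|` rather than `+` to combine the halves; since the two halves occupy disjoint bits, both give the same result.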

