Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/27 22:56:33 UTC

[GitHub] [arrow-julia] TanookiToad opened a new issue, #364: Values in PooledArray are incorrectly saved

TanookiToad opened a new issue, #364:
URL: https://github.com/apache/arrow-julia/issues/364

   If a float-type element in ```PooledVector{Real, UInt32, Vector{UInt32}}``` is replaced by an integer, Arrow will incorrectly save this value.
   
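   ```
   using Arrow, DataFrames, PooledArrays
   
   # Pooled column whose element type is the abstract type Real
   df = DataFrame(x = PooledArray(Vector{Real}([1.0])))
   # Replace the Float64 element with an Int
   df[1, 1] = 2
   Arrow.write("test.arrow", df)
   # Read the file back
   Arrow.Table("test.arrow") |> DataFrame
   ```
   
   The code above incorrectly saves 2 as 1.0e-323.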
   
   ```
   1×1 DataFrame
    Row │ x        
        │ Float64
   ─────┼──────────
      1 │ 1.0e-323
   ```
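
The value 1.0e-323 is what you get when the 64 bits of the integer 2 are read back as a Float64. The thread does not confirm that this is exactly what happens inside Arrow.jl, so treat the following as a sketch of the arithmetic only, using nothing beyond Julia's Base functions:

```julia
# Reinterpret the raw 64 bits of the integer 2 as a Float64.
bits_as_float = reinterpret(Float64, Int64(2))
println(bits_as_float)                          # prints 1.0e-323

# The same value is twice the smallest positive subnormal Float64.
println(bits_as_float == 2 * nextfloat(0.0))    # prints true
```

This is consistent with the integer being stored unconverted in a column that is then declared as Float64.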


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-julia] quinnj commented on issue #364: PooledArray are incorrectly saved

Posted by GitBox <gi...@apache.org>.
quinnj commented on issue #364:
URL: https://github.com/apache/arrow-julia/issues/364#issuecomment-1329872018

   Yeah, this is a pretty bummer bug. PR up: https://github.com/apache/arrow-julia/pull/365.
   
   Yeah, I think I agree that we probably shouldn't have tried to be so loose about turning abstractly typed column vectors into promoted union/concrete types (and thus exposing ourselves to this bug), but it is a pretty convenient usability thing, and this is not super easy to deal with if you're new to Julia or to data in Julia. We can perhaps consider alternative approaches for a 3.0 release.
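
To illustrate what "turning abstractly typed column vectors into promoted concrete types" can look like, here is a minimal sketch in plain Julia. `narrow_eltype` is a hypothetical helper written for this example; it is not an Arrow.jl function and does not claim to mirror Arrow.jl's internals.

```julia
# Hypothetical helper (not part of Arrow.jl): narrow an abstractly typed
# vector to the promoted type of the elements it actually holds.
function narrow_eltype(v::AbstractVector)
    isempty(v) && return v
    T = mapreduce(typeof, promote_type, v)
    # Convert only when promotion yields a concrete type; otherwise leave as-is.
    return isconcretetype(T) ? convert(Vector{T}, v) : v
end

mixed = Real[1.0, 2]
narrowed = narrow_eltype(mixed)
println(eltype(narrowed))   # prints Float64
println(narrowed)           # prints [1.0, 2.0]
```

The point of the sketch is that every element must actually be converted to the promoted type; the bug reported here is what results when the declared type and the stored values disagree.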




[GitHub] [arrow-julia] quinnj closed issue #364: PooledArray are incorrectly saved

Posted by GitBox <gi...@apache.org>.
quinnj closed issue #364: PooledArray are incorrectly saved
URL: https://github.com/apache/arrow-julia/issues/364




[GitHub] [arrow-julia] jrevels commented on issue #364: PooledArray are incorrectly saved

Posted by GitBox <gi...@apache.org>.
jrevels commented on issue #364:
URL: https://github.com/apache/arrow-julia/issues/364#issuecomment-1329395061

   Oof, this is quite a bad bug. Good catch.
   
   ---
   
   A bit of an aside, but FWIW, I wouldn't be opposed to Arrow.jl defaulting to an error when asked to serialize any container with nonconcrete eltype (basically, extend https://github.com/apache/arrow-julia/pull/305 to all containers)
   
   Bugs that can result in silent data loss/corruption are among the most insidious types of bug that a package like Arrow.jl can possibly have; I'd much rather limit functionality to avoid them.
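
A guard of the kind suggested above could be approximated on the caller's side before writing. The sketch below uses only Base functions; `check_concrete_eltype` is a hypothetical name, not an Arrow.jl API, and the rule it applies (ignore `Missing`, then require a concrete type) is an assumption about what a strict check might look like.

```julia
# Hypothetical guard (not an Arrow.jl API): reject columns whose element
# type, ignoring Missing, is not concrete.
function check_concrete_eltype(name, col::AbstractVector)
    T = nonmissingtype(eltype(col))
    isconcretetype(T) ||
        throw(ArgumentError("column $name has non-concrete eltype $(eltype(col))"))
    return col
end

check_concrete_eltype(:a, [1.0, 2.0])                              # passes
check_concrete_eltype(:b, Union{Missing, Float64}[1.0, missing])   # passes

try
    check_concrete_eltype(:x, Real[1.0, 2])
catch err
    println(err.msg)
end
```

Failing loudly here trades convenience for safety: a `Vector{Real}` column is rejected before anything is written, instead of being silently mis-encoded.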
   




[GitHub] [arrow-julia] TanookiToad commented on issue #364: PooledArray are incorrectly saved

Posted by GitBox <gi...@apache.org>.
TanookiToad commented on issue #364:
URL: https://github.com/apache/arrow-julia/issues/364#issuecomment-1328370108

   Seems like it's caused by incorrect serialization of a non-concrete element type, similar to https://github.com/apache/arrow-julia/issues/232. Maybe we could also add a check on the element type of PooledArray?

