Posted to commits@arrow.apache.org by qu...@apache.org on 2023/05/22 22:45:28 UTC
[arrow-julia] branch jq-neg-one-len created (now b622bec)
This is an automated email from the ASF dual-hosted git repository.
quinnj pushed a change to branch jq-neg-one-len
in repository https://gitbox.apache.org/repos/asf/arrow-julia.git
at b622bec Handle len of -1 in "compressed" buffers from other languages
This branch includes the following new commits:
new b622bec Handle len of -1 in "compressed" buffers from other languages
The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
[arrow-julia] 01/01: Handle len of -1 in "compressed" buffers from other languages
Posted by qu...@apache.org.
quinnj pushed a commit to branch jq-neg-one-len
in repository https://gitbox.apache.org/repos/asf/arrow-julia.git
commit b622bec794243a7c98508b1e1cb9d30d68b28b22
Author: Jacob Quinn <qu...@gmail.com>
AuthorDate: Mon May 22 16:40:39 2023 -0600
Handle len of -1 in "compressed" buffers from other languages
Other language implementations sometimes set a compression codec for Arrow
data but then write a length of -1 as a sentinel value indicating that the
buffer is actually _not_ compressed. It's unclear why they do this, but since
they do, we can handle that case pretty easily. I'm basically just adding a
test here from @DrChainsaw's original PR.
---
src/table.jl | 5 +++--
test/java_compress_len_neg_one.arrow | Bin 0 -> 6050 bytes
test/runtests.jl | 8 ++++++++
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/src/table.jl b/src/table.jl
index ff44f05..49b6153 100644
--- a/src/table.jl
+++ b/src/table.jl
@@ -521,11 +521,12 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
     len = unsafe_load(convert(Ptr{Int64}, ptr))
     ptr += 8 # skip past uncompressed length as Int64
     encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
-    if len === -1
+    if len == -1
         # len = -1 means data is not compressed
+        # it's unclear why other language implementations allow this
+        # but we support it to be able to read data produced as such
         return length(encodedbytes), copy(encodedbytes)
     end
-
     decodedbytes = Vector{UInt8}(undef, len)
     if compression.codec === Meta.CompressionTypes.LZ4_FRAME
         transcode(LZ4FrameDecompressor, encodedbytes, decodedbytes)
diff --git a/test/java_compress_len_neg_one.arrow b/test/java_compress_len_neg_one.arrow
new file mode 100644
index 0000000..1d0f864
Binary files /dev/null and b/test/java_compress_len_neg_one.arrow differ
diff --git a/test/runtests.jl b/test/runtests.jl
index 3cdac88..47a137f 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -666,6 +666,14 @@ t2 = Arrow.Table(buf2)
end
+@testset "# 435" begin
+
+t = Arrow.Table(joinpath(dirname(pathof(Arrow)), "../test/java_compress_len_neg_one.arrow"))
+@test length(t) == 15
+@test length(t.isA) == 102
+
+end
+
end # @testset "misc"
end
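
The length-prefix handling in the src/table.jl hunk above can be sketched on its own, without Arrow.jl or a real codec. This is only an illustration under stated assumptions: `read_prefixed`, `prefix`, and `fake_codec` are hypothetical names that do not exist in the package, the prefix is assumed to be an 8-byte little-endian Int64, and the "codec" is a stand-in that merely produces bytes of the requested length.

```julia
# Hypothetical helper for illustration; not part of Arrow.jl.
# `decompress` is a caller-supplied stand-in with the assumed
# signature decompress(payload, len) -> Vector{UInt8}.
function read_prefixed(bytes::Vector{UInt8}, decompress)
    length(bytes) >= 8 || throw(ArgumentError("buffer is shorter than the 8-byte length prefix"))
    # First 8 bytes: uncompressed length as a little-endian Int64 (assumed layout).
    len = ltoh(reinterpret(Int64, bytes[1:8])[1])
    payload = bytes[9:end]
    if len == -1
        # Sentinel: the payload was written as-is, so skip the codec entirely.
        return length(payload), payload
    end
    return Int(len), decompress(payload, len)
end

# Build an 8-byte little-endian prefix for the examples below.
prefix(n::Integer) = collect(reinterpret(UInt8, [htol(Int64(n))]))

# Case 1: len == -1, so the three payload bytes come back unchanged.
raw = UInt8[0x01, 0x02, 0x03]
n1, out1 = read_prefixed(vcat(prefix(-1), raw), (p, len) -> error("codec should not run"))

# Case 2: len == 4, so the stand-in "codec" is asked for 4 bytes.
fake_codec(p, len) = fill(p[1], len)
n2, out2 = read_prefixed(vcat(prefix(4), UInt8[0xaa]), fake_codec)
```

The sketch compares with `==`, as the patched code does. One plausible reason for preferring it over `===` is that `===` also compares types, so an Int64 length would not match the literal -1 on a platform where integer literals are Int32.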