You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "DrChainsaw (via GitHub)" <gi...@apache.org> on 2023/05/04 10:42:48 UTC
[GitHub] [arrow-julia] DrChainsaw opened a new pull request, #436: Add handling of len = -1 in uncompress
DrChainsaw opened a new pull request, #436:
URL: https://github.com/apache/arrow-julia/pull/436
Fix #435
Hipshot PR from the github API. Not sure how to add tests for this if needed. It reads my files correctly at least :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] DrChainsaw commented on a diff in pull request #436: Add handling of len = -1 in uncompress
Posted by "DrChainsaw (via GitHub)" <gi...@apache.org>.
DrChainsaw commented on code in PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#discussion_r1192989917
##########
src/table.jl:
##########
@@ -521,6 +521,11 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
len = unsafe_load(convert(Ptr{Int64}, ptr))
ptr += 8 # skip past uncompressed length as Int64
encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
+ if len === -1
Review Comment:
Comment about the flag which might shed some light: https://github.com/apache/arrow/blob/cd6e2a4d2b9373b942da18b4cc82cb41431764d9/java/vector/src/main/java/org/apache/arrow/vector/compression/CompressionUtil.java#L67-L71
It could be that the java implementation just does its own thing here without regards to any standard, wouldn't that be a pretty irresponsible thing to do given that the repo markets itself as being cross platform.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] codecov-commenter commented on pull request #436: Add handling of len = -1 in uncompress
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#issuecomment-1546658927
## [Codecov](https://app.codecov.io/gh/apache/arrow-julia/pull/436?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
> Merging [#436](https://app.codecov.io/gh/apache/arrow-julia/pull/436?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (5dec5f8) into [main](https://app.codecov.io/gh/apache/arrow-julia/commit/e893c327f177f5a4d5efeab831df0fe93ab4ec5b?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (e893c32) will **decrease** coverage by `82.13%`.
> The diff coverage is `0.00%`.
```diff
@@ Coverage Diff @@
## main #436 +/- ##
==========================================
- Coverage 87.02% 4.89% -82.13%
==========================================
Files 26 25 -1
Lines 3261 3123 -138
==========================================
- Hits 2838 153 -2685
- Misses 423 2970 +2547
```
| [Impacted Files](https://app.codecov.io/gh/apache/arrow-julia/pull/436?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [src/table.jl](https://app.codecov.io/gh/apache/arrow-julia/pull/436?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3JjL3RhYmxlLmps) | `0.00% <0.00%> (-92.86%)` | :arrow_down: |
... and [23 files with indirect coverage changes](https://app.codecov.io/gh/apache/arrow-julia/pull/436/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
:mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] DrChainsaw commented on a diff in pull request #436: Add handling of len = -1 in uncompress
Posted by "DrChainsaw (via GitHub)" <gi...@apache.org>.
DrChainsaw commented on code in PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#discussion_r1192989113
##########
src/table.jl:
##########
@@ -521,6 +521,11 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
len = unsafe_load(convert(Ptr{Int64}, ptr))
ptr += 8 # skip past uncompressed length as Int64
encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
+ if len === -1
Review Comment:
I don't know the logic why the writer would do this. I encountered the issue when trying to read files generated from the official java arrow package. When stepping through the debugger during reading in java I encountered this lines which I just copied over to the Julia implementation: https://github.com/apache/arrow/blob/febd0ff144cfb8b2baffb1cb0be57ca40dc7cc77/java/vector/src/main/java/org/apache/arrow/vector/compression/AbstractCompressionCodec.java#L72-L75
In the linked issue there is an example file which also reads in pyarrow. I haven't made any effort to find the corresponding line in there though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] DrChainsaw closed pull request #436: Add handling of len = -1 in uncompress
Posted by "DrChainsaw (via GitHub)" <gi...@apache.org>.
DrChainsaw closed pull request #436: Add handling of len = -1 in uncompress
URL: https://github.com/apache/arrow-julia/pull/436
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] baumgold commented on a diff in pull request #436: Add handling of len = -1 in uncompress
Posted by "baumgold (via GitHub)" <gi...@apache.org>.
baumgold commented on code in PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#discussion_r1192988532
##########
src/table.jl:
##########
@@ -521,6 +521,11 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
len = unsafe_load(convert(Ptr{Int64}, ptr))
ptr += 8 # skip past uncompressed length as Int64
encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
+ if len === -1
Review Comment:
How do we end up in the `uncompress` method for date that is not compressed? I believe we should be checking for compression earlier than here. Is that a branch where this check is missing?
##########
src/table.jl:
##########
@@ -521,6 +521,11 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
len = unsafe_load(convert(Ptr{Int64}, ptr))
ptr += 8 # skip past uncompressed length as Int64
encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
+ if len === -1
Review Comment:
How do we end up in the `uncompress` method for data that is not compressed? I believe we should be checking for compression earlier than here. Is that a branch where this check is missing?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] baumgold commented on a diff in pull request #436: Add handling of len = -1 in uncompress
Posted by "baumgold (via GitHub)" <gi...@apache.org>.
baumgold commented on code in PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#discussion_r1192988532
##########
src/table.jl:
##########
@@ -521,6 +521,11 @@ function uncompress(ptr::Ptr{UInt8}, buffer, compression)
len = unsafe_load(convert(Ptr{Int64}, ptr))
ptr += 8 # skip past uncompressed length as Int64
encodedbytes = unsafe_wrap(Array, ptr, buffer.length - 8)
+ if len === -1
Review Comment:
How do we end up in the `uncompress` method for data that is not compressed? I believe we should be checking for compression earlier than here. Is there a branch of code where this check is missing?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] quinnj merged pull request #436: Add handling of len = -1 in uncompress
Posted by "quinnj (via GitHub)" <gi...@apache.org>.
quinnj merged PR #436:
URL: https://github.com/apache/arrow-julia/pull/436
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-julia] DrChainsaw commented on pull request #436: Add handling of len = -1 in uncompress
Posted by "DrChainsaw (via GitHub)" <gi...@apache.org>.
DrChainsaw commented on PR #436:
URL: https://github.com/apache/arrow-julia/pull/436#issuecomment-1549535374
CI timeout on macos seemed to happen during precompilation of dependencies. Trying to restart through open-close.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org