Posted to github@arrow.apache.org by "quinnj (via GitHub)" <gi...@apache.org> on 2023/06/01 04:36:01 UTC

[GitHub] [arrow-julia] quinnj opened a new pull request, #446: Return SubArrays when possible for arrow list types

quinnj opened a new pull request, #446:
URL: https://github.com/apache/arrow-julia/pull/446

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-julia] quinnj commented on a diff in pull request #446: Return SubArrays when possible for arrow list types

Posted by "quinnj (via GitHub)" <gi...@apache.org>.
quinnj commented on code in PR #446:
URL: https://github.com/apache/arrow-julia/pull/446#discussion_r1213607361


##########
src/append.jl:
##########
@@ -197,7 +197,13 @@ end
 function is_equivalent_schema(sch1::Tables.Schema, sch2::Tables.Schema)
     (sch1.names == sch2.names) || (return false)
     for (t1,t2) in zip(sch1.types, sch2.types)
-        (t1 === t2) || (return false)
+        tt1 = Base.nonmissingtype(t1)
+        tt2 = Base.nonmissingtype(t2)
+        if t1 == t2 || (tt1 <: AbstractVector && tt2 <: AbstractVector && eltype(tt1) == eltype(tt2))

Review Comment:
   This is needed to loosen what we consider "equivalent schemas"; i.e., a column with eltype `Vector{Int}` is now considered equivalent to one with eltype `SubArray{Int, 1, ...}`.
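
   As a rough illustration of the loosened check (a sketch, not the actual Arrow.jl code), the per-column condition from the diff can be tried on a pair of types in isolation. The helper name `column_types_equivalent` is hypothetical and uses only Base:

   ```julia
   # Hypothetical helper: applies the same per-column condition as the diff
   # to a single pair of column element types.
   function column_types_equivalent(t1::Type, t2::Type)
       t1 == t2 && return true
       tt1 = Base.nonmissingtype(t1)
       tt2 = Base.nonmissingtype(t2)
       # Two vector-like types count as equivalent when their eltypes match.
       return tt1 <: AbstractVector && tt2 <: AbstractVector && eltype(tt1) == eltype(tt2)
   end

   V = Vector{Int}
   S = typeof(view([1, 2, 3], 1:2))          # a SubArray{Int, 1, ...} type

   column_types_equivalent(V, S)               # true: same eltype, different container
   column_types_equivalent(V, Vector{Float64}) # false: eltypes differ
   column_types_equivalent(Int, String)        # false: neither is a vector type
   ```

   Because the condition compares the non-missing types, a pair such as `Union{Missing, Vector{Int}}` and `S` also passes this check as written.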





[GitHub] [arrow-julia] quinnj merged pull request #446: Return SubArrays when possible for arrow list types

Posted by "quinnj (via GitHub)" <gi...@apache.org>.
quinnj merged PR #446:
URL: https://github.com/apache/arrow-julia/pull/446




[GitHub] [arrow-julia] quinnj commented on pull request #446: Return SubArrays when possible for arrow list types

Posted by "quinnj (via GitHub)" <gi...@apache.org>.
quinnj commented on PR #446:
URL: https://github.com/apache/arrow-julia/pull/446#issuecomment-1573059668

   @baumgold, this is ready to review if you have a moment.




[GitHub] [arrow-julia] codecov-commenter commented on pull request #446: Return SubArrays when possible for arrow list types

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #446:
URL: https://github.com/apache/arrow-julia/pull/446#issuecomment-1571326285

   ## [Codecov](https://app.codecov.io/gh/apache/arrow-julia/pull/446?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#446](https://app.codecov.io/gh/apache/arrow-julia/pull/446?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (61c4c03) into [main](https://app.codecov.io/gh/apache/arrow-julia/commit/d1b53263cd73a4a8fb61cfb7c0392a92239fe43e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (d1b5326) will **decrease** coverage by `82.58%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##             main    #446       +/-   ##
   ==========================================
   - Coverage   87.46%   4.88%   -82.58%     
   ==========================================
     Files          26      25        -1     
     Lines        3263    3191       -72     
   ==========================================
   - Hits         2854     156     -2698     
   - Misses        409    3035     +2626     
   ```
   
   
   | [Impacted Files](https://app.codecov.io/gh/apache/arrow-julia/pull/446?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [src/table.jl](https://app.codecov.io/gh/apache/arrow-julia/pull/446?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3JjL3RhYmxlLmps) | `0.00% <0.00%> (-92.44%)` | :arrow_down: |
   
   ... and [23 files with indirect coverage changes](https://app.codecov.io/gh/apache/arrow-julia/pull/446/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   




[GitHub] [arrow-julia] baumgold commented on a diff in pull request #446: Return SubArrays when possible for arrow list types

Posted by "baumgold (via GitHub)" <gi...@apache.org>.
baumgold commented on code in PR #446:
URL: https://github.com/apache/arrow-julia/pull/446#discussion_r1213881772


##########
src/append.jl:
##########
@@ -197,7 +197,13 @@ end
 function is_equivalent_schema(sch1::Tables.Schema, sch2::Tables.Schema)
     (sch1.names == sch2.names) || (return false)
     for (t1,t2) in zip(sch1.types, sch2.types)
-        (t1 === t2) || (return false)
+        tt1 = Base.nonmissingtype(t1)
+        tt2 = Base.nonmissingtype(t2)
+        if t1 == t2 || (tt1 <: AbstractVector && tt2 <: AbstractVector && eltype(tt1) == eltype(tt2))

Review Comment:
   This looks good to me. One small nit-pick: the check can be simplified to drop the `continue`:
   
   ```julia
   (t1 == t2 || (tt1 <: AbstractVector && tt2 <: AbstractVector && eltype(tt1) == eltype(tt2))) || return false
   ```



##########
src/table.jl:
##########
@@ -637,16 +639,23 @@ function build(f::Meta.Field, L::ListTypes, batch, rb, de, nodeidx, bufferidx, c
     bufferidx += 1
     len = rb.nodes[nodeidx].length
     nodeidx += 1
+    meta = buildmetadata(f.custom_metadata)
     if L isa Meta.Utf8 || L isa Meta.LargeUtf8 || L isa Meta.Binary || L isa Meta.LargeBinary
         buffer = rb.buffers[bufferidx]
         bytes, A = reinterp(UInt8, batch, buffer, rb.compression)
         bufferidx += 1
+        T = juliaeltype(f, meta, convert)

Review Comment:
   This can move out of the if/else statement to reduce duplication since it's used in both the if and the else.


