You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/08 06:43:43 UTC

[GitHub] [arrow-datafusion] v0y4g3r opened a new pull request, #2855: Bump arrow2

v0y4g3r opened a new pull request, #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855

   # Which issue does this PR close?
   Closes #2709.
   
    # Rationale for this change
   
   Current branch `arrow2` is still using arrow2 version `v0.10` and falls far behind the latest version `v0.12`
   
   # What changes are included in this PR?
   
   Dependencies upgraded: 
   - arrow2: `v0.10` ->  `v0.12`
   - parquet2: `v0.12` -> `v0.13`
   - arrow-format: `0.4`->`0.6`
   - prost: `0.9` -> `0.10 `
   - prost-types: `0.9` -> `0.10`
   - tonic: 0.6 -> `0.7`
   - topic-build: `0.6` -> `0.7`
   
   API change: 
   
   - `arrow2::error::ArrowError` -> `arrow2::error::Error`
   - arrow2 FileWriter now accepts `Vec<Vec<Encoding>>` instead of `Vec<Encoding>` and `transverse` can be used to create encodings from schema with a customized mapping.
   - arrow2 `RowGroupMetadata` no longer as a `column(usize)` method to fetch column metadata at given index, instead, it provides a `columns()` method that returns a column slice.
   - `datafusion::dataframe::DataFrame::write_parquet` method now accepts `arrow::io::parquet::write::WriteOptions` instead of `parquet::write::WriteOptions`
   - some other minor changes.
   
   
   # Are there any user-facing changes?
   Yes, please refer to previous section for full API change list.
   But I think it's rather easy for users of `arrow2` branch to adapt to these changes.
   
   # About tests
   Some tests are failling in current arrow2 branch, so I'm not going to fix all failing tests in this PR, instead I'll make sure no more unit test fails. I'd be happy to fix all failing tests after this PR is merged so I don't have to fix them twice :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] v0y4g3r commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
v0y4g3r commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1179925580

   @andygrove  Wondering what I still need to do in order to get this merged and I'd be happy to follow up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] v0y4g3r commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
v0y4g3r commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1179468808

   > @v0y4g3r I'd like to start understanding more about the efforts here. Is the plan to put this behind a cargo `feature` so that we can switch between `arrow` and `arrow2`?
   
   No, it's simply about upgrading dependencies. 
   
   But I'm also interested in something like defining a trait to hide the API differences between arrow and arrow2 and use cargo feature to provide an option. But I have no ideas on how to implement that now.
   
   > There are some license headers missing:
   > 
   > ```
   > NOT APPROVED: ballista/rust/core/src/memory_stream.rs (./ballista/rust/core/src/memory_stream.rs): false
   > NOT APPROVED: datafusion/src/physical_plan/sort.rs (./datafusion/src/physical_plan/sort.rs): false
   > NOT APPROVED: datafusion/tests/sql_integration.rs (./datafusion/tests/sql_integration.rs): false
   > NOT APPROVED: datafusion-physical-expr/src/test_util.rs (./datafusion-physical-expr/src/test_util.rs): false
   > ```
   
   Seems like current arrow2 branch doesn't have license header. But it's ok I added headers.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] v0y4g3r commented on a diff in pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
v0y4g3r commented on code in PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#discussion_r920698761


##########
datafusion/src/lib.rs:
##########
@@ -1,3 +1,4 @@
+#![feature(assert_matches)]

Review Comment:
   `assert_macthes` was used in unit test to check error variant. I removed this nightly-only feature in latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] houqp commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
houqp commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1186711316

   @v0y4g3r looks like there are some minor build issues that need to fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] dbr commented on a diff in pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
dbr commented on code in PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#discussion_r920095963


##########
datafusion/src/lib.rs:
##########
@@ -1,3 +1,4 @@
+#![feature(assert_matches)]

Review Comment:
   It's not clear why this is added - it makes the library only compile on nightly toolchain, and seems to work fine with the line removed (at least with Rust 1.60)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #2855: Bump arrow2
URL: https://github.com/apache/arrow-datafusion/pull/2855


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
andygrove commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1179187486

   @v0y4g3r I'd like to start understanding more about the efforts here. Is the plan to put this behind a cargo `feature` so that we can switch between `arrow` and `arrow2`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] v0y4g3r commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
v0y4g3r commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1183919397

   @andygrove Seems like some workflows need a maintainer's approval to run. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1382717626

   This PR is more than 6 month old, so closing it down for now to clean up the PR list. Please reopen if this is a mistake and you plan to work on it more 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1209520407

   Marking as draft to note it is not ready for merging  --  please change it back to ready for review when it is ready. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1276550437

   Marking this PR as a draft as I am cleaning up the review queue. Please mark it as ready for review when it is


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] v0y4g3r commented on pull request #2855: Bump arrow2

Posted by GitBox <gi...@apache.org>.
v0y4g3r commented on PR #2855:
URL: https://github.com/apache/arrow-datafusion/pull/2855#issuecomment-1257543333

   > @v0y4g3r looks like there are some minor build issues that need to fixed.
   
   These tests are failing in current arrow2 branch, and is not introduced by this PR. 
   
   Anyway, I'm working on fixing these tests in another pull request, but there's something different between arrow and arrow2.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org