You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/04/29 21:57:45 UTC

[GitHub] [arrow] pauldix opened a new pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

pauldix opened a new pull request #7064:
URL: https://github.com/apache/arrow/pull/7064


   This isn't even close to being mergable, but I'm hoping to get some additional eyes on this at this point. I have an initial skeleton of the integration test. I started out only implementing part of the `arrow-json-integration-test` executor. I started by trying to get Rust to consume JSON and produce Arrow for the `generated_primitive` test/JSON.
   
   I was only able to build the Go version of things. For some reason none of the docker-compose builds worked for C++ or Java and I was completely unsuccessful getting them to build on OSX Catalina. I figured if Go works then I'd use that as my comparison for now and try to get C++ to build later when I need to hook up the Flight integration test.
   
   On running this with Archery and having Rust producing and Go consuming, it comes up with the following error output from the Go executable when trying to validate the Arrow file produced by Rust:
   
   ```
   arrow-json:  could not read record 0 from ARROW file: arrow/ipc: invalid file metadata=1580 position for record 0
   ```
   
   I checked what's being passed to the Rust FileWriter and I think it looks correct (2 record batches with 17 and 20 rows respectively). I can continue digging to see if it's something in the FileWriter, but I wanted to get some feedback on all of this before I go even deeper into this rabbit hole.
   
   So here are a few specific questions:
   * Does the structure of this look roughly correct?
   * How should I go about troubleshooting and fixing the generated Arrow file that is failing to be read by the Go reader?
   
   Apologies if there is a more preferred way to open up this kind of discussion, any pointers are appreciated. I'd love to contribute to the Rust implementation and get it over the line to be included in the integration tests for release.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-621544706


   This is great. To help us keep track of what is tested and not across the project, we have [this doc](https://docs.google.com/spreadsheets/d/1Yu68rn2XMBpAArUfCOP9LC7uHb06CQrtqKE5vQ4bQx4/edit#gid=782909347), which may be useful as you incrementally add the tests. I'm happy to give write access to it on request.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-629688732


   @pauldix Any chance you could rebase this against the new `rust-integration-testing` branch, or give arrow committers permissions to push to your branch?
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pauldix commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
pauldix commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-622136810


   @nealrichardson are there individual tests for each of the rows in that sheet? Or are many covered with each integration scenario?
   
   @andygrove I think if you can PR into my branch here(https://github.com/influxdata/arrow/tree/pd-rust-integration-test), then I can update this PR with that commit accordingly? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-629811914


   Merged in https://github.com/apache/arrow/pull/7206


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-628922079


   @pauldix ok, that other PR is merged. My next step is to get the integration tests running locally and then I can hopefully contribute more to this effort.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-621515127


   Thanks, Paul. This is an awesome start. I will jump in and help as soon as I can. 
   
   Given the importance of this integration testing and the need for Rust contributors to be able to collaborate on this, I would be in support of incrementally merging work on the integration tests and keeping them disabled until we get them passing.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on a change in pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
vertexclique commented on a change in pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#discussion_r419001113



##########
File path: rust/arrow/Cargo.toml
##########
@@ -50,6 +50,7 @@ chrono = "0.4"
 flatbuffers = "0.6"
 hex = "0.4"
 arrow-flight = { path = "../arrow-flight", optional = true }
+clap = "2.33.0"

Review comment:
       That would be super nice actually. That will keep the library lean.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-621488703


   https://issues.apache.org/jira/browse/ARROW-6945


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove closed pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove closed pull request #7064:
URL: https://github.com/apache/arrow/pull/7064


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-622139195


   Column I in the first sheet shows which test (generated) files cover which types. So many are covered in a single "test", in that the test JSON it produces includes many different types. (The second worksheet was oriented by test file, but I/we've stopped updating it. It was more important to focus on the Arrow types that were covered.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on a change in pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on a change in pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#discussion_r425486631



##########
File path: dev/archery/archery/integration/runner.py
##########
@@ -303,7 +304,7 @@ def get_static_json_files():
 
 
 def run_all_tests(with_cpp=True, with_java=True, with_js=True,
-                  with_go=True, run_flight=False,
+                  with_go=True, with_rust=True, run_flight=False,

Review comment:
       we probably don't want to add rust here until the tests are passing




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] maxburke commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
maxburke commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-628719168


   > I don't have enough time to reproduce a minimal case, but I intend on doing so and then fixing that issue. @maxburke seems to be using the IPC format, wonder if you've seen this error too?
   
   Hiya! Yup, we use the IPC format quite heavily but unfortunately, no, we haven't run into any alignment issues. That said, we've been sticking primarily to smallish record batches (1k, 4k)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on a change in pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
vertexclique commented on a change in pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#discussion_r418159133



##########
File path: rust/arrow/Cargo.toml
##########
@@ -50,6 +50,7 @@ chrono = "0.4"
 flatbuffers = "0.6"
 hex = "0.4"
 arrow-flight = { path = "../arrow-flight", optional = true }
+clap = "2.33.0"

Review comment:
       This adds the extra dependency to every build. Adding an external test harness that compiles outside the project would be nice. I think that's better to go to [dev-dependencies]. I strongly suggest using [trybuild](https://crates.io/crates/trybuild), compiler like test harness.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-629690900


   @pauldix Never mind, I pulled your branch into mine and created a new PR here https://github.com/apache/arrow/pull/7206 which preserves your commits.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pauldix commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
pauldix commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-628194857


   @nevi-me they're not compared byte for byte, the Go tester is invoked, which has its own file reader read in the file and try to marshal it and compare with the JSON representation. The custom metadata definition in the first RecordBatch is where it trips up. Although I've reproduced this on a much smaller output than the already fairly small generated primitive test so I doubt it's related to the issue you're seeing.
   
   What seems to happen is that the Go reader tries to read the length of the metadata, which fails because that length isn't divisible by 8 (because of the padding restriction in the Arrow format spec). I'll have some more time to look into this specific failure tomorrow and I might be able to puzzle it out. At this point I'm guessing that the Flatbuffer serialization isn't taking Arrow's expected padding into account, but this test doesn't have actual metadata in it so I'm still not entirely sure what's going on.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#issuecomment-621525214


   You can also merge patches into a branch -- the PR merge tool will merge patches into branches other than master if the PR is configured in the right way


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pauldix commented on a change in pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

Posted by GitBox <gi...@apache.org>.
pauldix commented on a change in pull request #7064:
URL: https://github.com/apache/arrow/pull/7064#discussion_r418732315



##########
File path: rust/arrow/Cargo.toml
##########
@@ -50,6 +50,7 @@ chrono = "0.4"
 flatbuffers = "0.6"
 hex = "0.4"
 arrow-flight = { path = "../arrow-flight", optional = true }
+clap = "2.33.0"

Review comment:
       It's not clear to me how trybuild would be used in conjunction with the Archery suite, which just seems to want three CLI executables to pipe invoke.  Maybe these would be best in a separate project on the same level as `arrow`, `arrow-flight` etd? `arrow-integration` perhaps? Then I'll just import from arrow and we keep the actual library dependencies clean.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org