You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/23 14:58:20 UTC

[GitHub] [arrow] nevi-me commented on a change in pull request #8996: ARROW-10967 [Rust]: Run tests without requiring environment variables

nevi-me commented on a change in pull request #8996:
URL: https://github.com/apache/arrow/pull/8996#discussion_r547997352



##########
File path: rust/arrow/src/util/test_util.rs
##########
@@ -60,3 +60,152 @@ pub fn get_temp_file(file_name: &str, content: &[u8]) -> fs::File {
     assert!(file.is_ok());
     file.unwrap()
 }
+
+/// Returns the arrow test data directory, which is by default stored
+/// in a git submodule rooted at `arrow/testing/data`.
+///
+/// The default can be overridden by the optional environment
+/// variable `ARROW_TEST_DATA`
+///
+/// panics when the directory can not be found.
+///
+/// Example:
+/// ```
+/// let testdata = arrow::util::test_util::arrow_test_data();
+/// let csvdata = format!("{}/csv/aggregate_test_100.csv", testdata);
+/// assert!(std::path::PathBuf::from(csvdata).exists());
+/// ```
+pub fn arrow_test_data() -> String {
+    match get_data_dir("ARROW_TEST_DATA", "../../testing/data") {
+        Ok(pb) => pb.display().to_string(),
+        Err(err) => panic!(format!("failed to get arrow data dir: {}", err)),
+    }
+}
+
+/// Returns the parquest test data directory, which is by default
+/// stored in a git submodule rooted at
+/// `arrow/cpp/submodules/parquest-testing/data`.
+///
+/// The default can be overridden by the optional environment variable
+/// `PARQUET_TEST_DATA`
+///
+/// panics when the directory can not be found.
+///
+/// Example:
+/// ```
+/// let testdata = arrow::util::test_util::parquet_test_data();
+/// let filename = format!("{}/binary.parquet", testdata);
+/// assert!(std::path::PathBuf::from(filename).exists());
+/// ```
+pub fn parquet_test_data() -> String {
+    match get_data_dir(
+        "PARQUET_TEST_DATA",

Review comment:
       I don't expect us to use this in the `arrow` crate

##########
File path: rust/README.md
##########
@@ -92,9 +92,12 @@ This populates data in two git submodules:
 - `../cpp/submodules/parquet_testing/data` (sourced from https://github.com/apache/parquet-testing.git)
 - `../testing` (sourced from https://github.com/apache/arrow-testing)
 
-The following Env vars are required to run `cargo test`, examples, etc.
+By default, `cargo test` will look for these directories at their
+standard location. The following Env vars can be used to override the
+location shoud you choose

Review comment:
       ```suggestion
   location should you choose.
   ```

##########
File path: rust/arrow-flight/src/arrow.flight.protocol.rs
##########
@@ -498,9 +498,8 @@ pub mod flight_service_server {
     #[async_trait]
     pub trait FlightService: Send + Sync + 'static {
         #[doc = "Server streaming response type for the Handshake method."]
-        type HandshakeStream: Stream<
-            Item = Result<super::HandshakeResponse, tonic::Status>,
-        > + Send
+        type HandshakeStream: Stream<Item = Result<super::HandshakeResponse, tonic::Status>>

Review comment:
       There might be a `rustfmt` version mismatch that's going to have us changing this back and forth. I've also seen rustfmt suggest changing this, but I've been ignoring it so far. What version of `rustfmt` are you using @Dandandan ?

##########
File path: rust/parquet/src/util/test_common/file_util.rs
##########
@@ -19,17 +19,8 @@ use std::{env, fs, io::Write, path::PathBuf, str::FromStr};
 
 /// Returns path to the test parquet file in 'data' directory
 pub fn get_test_path(file_name: &str) -> PathBuf {
-    let mut pathbuf = match env::var("PARQUET_TEST_DATA") {
-        Ok(path) => PathBuf::from_str(path.as_str()).unwrap(),
-        Err(_) => {
-            let mut pathbuf = env::current_dir().unwrap();
-            pathbuf.pop();
-            pathbuf.pop();
-            pathbuf
-                .push(PathBuf::from_str("cpp/submodules/parquet-testing/data").unwrap());
-            pathbuf
-        }
-    };
+    let mut pathbuf =
+        PathBuf::from_str(&arrow::util::test_util::parquet_test_data()).unwrap();

Review comment:
       Yea, better to move the parquet test data fn into this crate, because `arrow` is an optional dependency




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org