You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/13 16:51:38 UTC

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #873: Rework the python bindings using conversion traits from arrow-rs [WIP]

cpcloud commented on a change in pull request #873:
URL: https://github.com/apache/arrow-datafusion/pull/873#discussion_r688643434



##########
File path: datafusion/Cargo.toml
##########
@@ -42,12 +42,13 @@ simd = ["arrow/simd"]
 crypto_expressions = ["md-5", "sha2"]
 regex_expressions = ["regex", "lazy_static"]
 unicode_expressions = ["unicode-segmentation"]
+pyarrow = ["pyo3", "libc", "arrow/pyarrow"]
 
 [dependencies]
 ahash = "0.7"
-hashbrown = "0.11"
-arrow = { version = "5.1", features = ["prettyprint"] }
-parquet = { version = "5.1", features = ["arrow"] }
+hashbrown = { version = "0.11", features = ["raw"] }
+arrow = { path = "../../arrow-rs/arrow", features = ["prettyprint"] }
+parquet = { path = "../../arrow-rs/parquet", features = ["arrow"] }

Review comment:
       Path-only dependencies are only fine in a private monorepo. Otherwise, you need both a version and a path.

##########
File path: datafusion/src/lib.rs
##########
@@ -230,6 +230,9 @@ pub mod variable;
 pub use arrow;
 pub use parquet;
 
+#[cfg(feature = "pyarrow")]
+pub mod pyarrow;

Review comment:
       Does the whole module really need to be public?

##########
File path: python/Cargo.toml
##########
@@ -30,16 +30,16 @@ edition = "2018"
 libc = "0.2"
 tokio = { version = "1.0", features = ["macros", "rt", "rt-multi-thread", "sync"] }
 rand = "0.7"
-pyo3 = { version = "0.14.1", features = ["extension-module"] }
-datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4d61196dee8526998aee7e7bb10ea88422e5f9e1" }
+pyo3 = { version = "0.14.2", features = ["extension-module"] }
+datafusion = { path = "../datafusion", features = ["pyarrow"] }
 

Review comment:
       same here, this you should avoid a path-only dependency here, but it looks like there might be other considerations I'm unaware of given the use of git.

##########
File path: datafusion/Cargo.toml
##########
@@ -66,6 +67,8 @@ regex = { version = "^1.4.3", optional = true }
 lazy_static = { version = "^1.4.0", optional = true }
 smallvec = { version = "1.6", features = ["union"] }
 rand = "0.8"
+pyo3 = { version = "0.14", optional = true }
+libc = { version = "0.2", optional = true }

Review comment:
       Are you now using something from `libc`? I didn't catch that in the first pass.

##########
File path: python/Cargo.toml
##########
@@ -30,16 +30,16 @@ edition = "2018"
 libc = "0.2"
 tokio = { version = "1.0", features = ["macros", "rt", "rt-multi-thread", "sync"] }
 rand = "0.7"
-pyo3 = { version = "0.14.1", features = ["extension-module"] }
-datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4d61196dee8526998aee7e7bb10ea88422e5f9e1" }
+pyo3 = { version = "0.14.2", features = ["extension-module"] }
+datafusion = { path = "../datafusion", features = ["pyarrow"] }
 
 [lib]
-name = "datafusion"
+name = "internals"

Review comment:
       This isn't a public thing that anyone was referring to by name?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org