You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/04 15:40:10 UTC

[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #503: Implement missing join types for Python dataframe

jorgecarleitao commented on a change in pull request #503:
URL: https://github.com/apache/arrow-datafusion/pull/503#discussion_r645665096



##########
File path: datafusion/src/logical_plan/plan.rs
##########
@@ -50,6 +52,28 @@ pub enum JoinType {
     Anti,
 }
 
+impl FromStr for JoinType {

Review comment:
       This will tie the DataFusions' API to Python datafusion API at the naming level. I am not sure we should do that:
   
   names imo are not really datafusion's: Rust favors enums over strings. They are relevant in Python because in Python strings are prevalent. In this context, shouldn't the API for mapping a string to a join type remain specific to Pythons' implementation?

##########
File path: python/Cargo.toml
##########
@@ -31,7 +31,7 @@ libc = "0.2"
 tokio = { version = "1.0", features = ["macros", "rt", "rt-multi-thread", "sync"] }
 rand = "0.7"
 pyo3 = { version = "0.13.2", features = ["extension-module"] }
-datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "c3fc0c75af5ff2ebb99dba197d9d2ccd83eb5952" }

Review comment:
       This forces Python-datafusion to always use the latest datafusion release, which is not necessarily what we always want.
   
   We agreed to keep using frozen hashes and perform manual changes. The rational is that releases of datafusion-python should be pinned to a specific version of DataFusion, so that we can release independently.

##########
File path: datafusion/src/logical_plan/plan.rs
##########
@@ -50,6 +52,28 @@ pub enum JoinType {
     Anti,
 }
 
+impl FromStr for JoinType {
+    type Err = DataFusionError;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        match s {
+            "inner" => Ok(JoinType::Inner),
+            "left" => Ok(JoinType::Left),
+            "right" => Ok(JoinType::Right),
+            "full" => Ok(JoinType::Full),
+            "semi" => Ok(JoinType::Semi),
+            "anti" => Ok(JoinType::Anti),
+            how => {
+                return Err(DataFusionError::Internal(format!(

Review comment:
       This is not an internal error, as people would be able to write anything from Python.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org