Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/16 01:24:25 UTC

[GitHub] [arrow-datafusion] doki23 commented on a diff in pull request #4194: create table with schema

doki23 commented on code in PR #4194:
URL: https://github.com/apache/arrow-datafusion/pull/4194#discussion_r1023411537


##########
datafusion/sql/src/planner.rs:
##########
@@ -180,12 +180,36 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
                 if_not_exists,
                 or_replace,
                 ..
-            } if columns.is_empty()
-                && constraints.is_empty()
+            } if constraints.is_empty()
                 && table_properties.is_empty()
                 && with_options.is_empty() =>
             {
-                let plan = self.query_to_plan(*query, &mut HashMap::new())?;
+                let plan = self.query_to_plan(*query.clone(), &mut HashMap::new())?;
+                let input_schema = plan.schema();
+
+                let plan = if !columns.is_empty() {
+                    match *query.body {
+                        SetExpr::Values(_) => {
+                            let schema = self.build_schema(columns)?.to_dfschema_ref()?;
+                            if schema.fields().len() != input_schema.fields().len() {
+                                return Err(DataFusionError::Plan("Mismatch between schema and batches".to_string()))
+                            }
+                            let input_fields = input_schema.fields();
+                            let project_exprs = schema.fields().iter().zip(input_fields).map(|(field, input_field)| {
+                                cast(col(input_field.name()), field.data_type().clone()).alias(field.name())
+                            }).collect::<Vec<_>>();
+                            LogicalPlanBuilder::from(plan.clone())
+                                .project(project_exprs)?
+                                .build()?
+                        },
+                        _ => return Err(DataFusionError::Plan(
+                            "You can only specify schema when create table with a `values` statement"
+                                .to_string()
+                        ))

Review Comment:
   It's because `SELECT` has its own schema, but it makes sense if someone wants to cast these types.
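
To illustrate the casting idea, here is a minimal sketch of the positional pairing the diff performs, written without DataFusion. `Field`, `CastExpr`, `field`, and `project_to_declared` are hypothetical stand-ins, not the planner's real types. The point is only that the pairing depends on the query's output schema, not on whether the body is `VALUES` or `SELECT`.

```rust
// Hypothetical stand-ins for a schema field and a cast-plus-alias expression.
// These are NOT DataFusion types; they only model the shape of the logic.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CastExpr {
    pub input_column: String, // column produced by the query
    pub target_type: String,  // type declared in CREATE TABLE
    pub alias: String,        // name declared in CREATE TABLE
}

// Small helper to build a field from string slices.
pub fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

/// Pair declared columns with the query's output columns by position and
/// describe one cast-and-alias projection per pair. Fails when the column
/// counts differ, mirroring the length check in the diff.
pub fn project_to_declared(
    declared: &[Field],
    input: &[Field],
) -> Result<Vec<CastExpr>, String> {
    if declared.len() != input.len() {
        return Err(format!(
            "declared {} columns but the query produces {}",
            declared.len(),
            input.len()
        ));
    }
    Ok(declared
        .iter()
        .zip(input)
        .map(|(d, i)| CastExpr {
            input_column: i.name.clone(),
            target_type: d.data_type.clone(),
            alias: d.name.clone(),
        })
        .collect())
}
```

Calling `project_to_declared` with two declared fields and two query output fields yields two cast descriptions in declaration order; passing slices of different lengths returns an error instead.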



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org