You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/30 09:38:10 UTC

[GitHub] [arrow-datafusion] Ted-Jiang opened a new pull request, #2389: Support struct_expr generate struct in sql

Ted-Jiang opened a new pull request, #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389

   # Which issue does this PR close?
   
   Related #2043.
   
    # Rationale for this change
   Support construct a struct in SQL.
   ```
   select STRUCT(1,2,3);
   +------------------------------------+
   | struct(Int64(1),Int64(2),Int64(3)) |
   +------------------------------------+
   | {"f_0": 1, "f_1": 2, "f_2": 3}     |
   +------------------------------------+
   1 row in set. Query took 0.003 seconds.
   ```
   
   # What changes are included in this PR?
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are there any user-facing changes?
   Now the struct Field only support default name like `f_n`
   I want to use `select struct(1 as a)' as field name, but got error in parsing
   ```
    select STRUCT(1 as a);  🤔 Invalid statement: sql parser error: Expected ), found: as
   ``` 
   I think it should fix in sql-parse, if wrong plz correct me.
   I will create follow up ticket for this, if this PR is fine.
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r866910908


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,74 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            let field_name = format!("c{}", i);
+            match arg.data_type() {
+                DataType::Utf8
+                | DataType::LargeUtf8
+                | DataType::Boolean
+                | DataType::Float32
+                | DataType::Float64
+                | DataType::Int8
+                | DataType::Int16
+                | DataType::Int32
+                | DataType::Int64
+                | DataType::UInt8
+                | DataType::UInt16
+                | DataType::UInt32
+                | DataType::UInt64 => (
+                    Field::new(field_name.as_str(), arg.data_type().clone(), true),
+                    arg.clone(),
+                ),
+                data_type => unimplemented!("struct not support {} type", data_type),

Review Comment:
   We should return an `Err` rather than panic here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485794


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,109 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            match arg.data_type() {
+                DataType::Utf8 => (
+                    Field::new(&*format!("f_{}", i), DataType::Utf8, true),

Review Comment:
   `format!("f_{}", i)` is duplicated in each match. Could we create the column name once before the match statement?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r865017538


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            let field_name = format!("c_{}", i);

Review Comment:
   The underscore seems unecessary IMO.
   
   ```suggestion
               let field_name = format!("c{}", i);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang closed pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang closed pull request #2389: Support struct_expr generate struct in sql
URL: https://github.com/apache/arrow-datafusion/pull/2389


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r865008313


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            let field_name = format!("c_{}", i);
+            match arg.data_type() {
+                DataType::Utf8 => (
+                    Field::new(field_name.as_str(), DataType::Utf8, true),
+                    arg.clone(),
+                ),
+                DataType::LargeUtf8 => (
+                    Field::new(field_name.as_str(), DataType::LargeUtf8, true),
+                    arg.clone(),
+                ),
+                DataType::Boolean => (
+                    Field::new(field_name.as_str(), DataType::Boolean, true),
+                    arg.clone(),
+                ),
+                DataType::Float32 => (
+                    Field::new(field_name.as_str(), DataType::Float64, true),
+                    arg.clone(),

Review Comment:
   Is it intentional that we map `Float32` to `Float64` here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r865017123


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            let field_name = format!("c_{}", i);
+            match arg.data_type() {
+                DataType::Utf8 => (
+                    Field::new(field_name.as_str(), DataType::Utf8, true),
+                    arg.clone(),
+                ),
+                DataType::LargeUtf8 => (
+                    Field::new(field_name.as_str(), DataType::LargeUtf8, true),
+                    arg.clone(),
+                ),
+                DataType::Boolean => (
+                    Field::new(field_name.as_str(), DataType::Boolean, true),
+                    arg.clone(),
+                ),
+                DataType::Float32 => (
+                    Field::new(field_name.as_str(), DataType::Float64, true),
+                    arg.clone(),
+                ),
+                DataType::Float64 => (
+                    Field::new(field_name.as_str(), DataType::Float64, true),
+                    arg.clone(),
+                ),
+                DataType::Int8 => (
+                    Field::new(field_name.as_str(), DataType::Int8, true),
+                    arg.clone(),
+                ),
+                DataType::Int16 => (
+                    Field::new(field_name.as_str(), DataType::Int16, true),
+                    arg.clone(),
+                ),
+                DataType::Int32 => (
+                    Field::new(field_name.as_str(), DataType::Int32, true),
+                    arg.clone(),
+                ),
+                DataType::Int64 => (
+                    Field::new(field_name.as_str(), DataType::Int64, true),
+                    arg.clone(),
+                ),
+                DataType::UInt8 => (
+                    Field::new(field_name.as_str(), DataType::UInt8, true),
+                    arg.clone(),
+                ),
+                DataType::UInt16 => (
+                    Field::new(field_name.as_str(), DataType::UInt16, true),
+                    arg.clone(),
+                ),
+                DataType::UInt32 => (
+                    Field::new(field_name.as_str(), DataType::UInt32, true),
+                    arg.clone(),
+                ),
+                DataType::UInt64 => (
+                    Field::new(field_name.as_str(), DataType::UInt64, true),
+                    arg.clone(),
+                ),
+                data_type => unimplemented!("struct not support {} type", data_type),
+            }

Review Comment:
   I think this could be simplified quite a bit. I assume the mapping from `Float32` to `Float64` was unintended.
   
   ```suggestion
               match arg.data_type() {
                   DataType::Utf8
                   | DataType::LargeUtf8
                   | DataType::Boolean
                   | DataType::Float32
                   | DataType::Float64
                   | DataType::Int8
                   | DataType::Int16
                   | DataType::Int32
                   | DataType::Int64
                   | DataType::UInt8
                   | DataType::UInt16
                   | DataType::UInt32
                   | DataType::UInt64 => (
                       Field::new(field_name.as_str(), arg.data_type().clone(), true),
                       arg.clone(),
                   ),
                   data_type => unimplemented!("struct not support {} type", data_type),
               }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485635


##########
datafusion/expr/src/function.rs:
##########
@@ -222,6 +222,14 @@ pub fn return_type(
             _ => Ok(DataType::Float64),
         },
 
+        BuiltinScalarFunction::Struct => {
+            let fields = input_expr_types
+                .iter()
+                .map(|x| Field::new("item", x.clone(), true))

Review Comment:
   This looks like it is generating a struct where all of the fields have the same name. I would expect that to cause an error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r863566521


##########
datafusion/expr/src/function.rs:
##########
@@ -222,6 +222,14 @@ pub fn return_type(
             _ => Ok(DataType::Float64),
         },
 
+        BuiltinScalarFunction::Struct => {
+            let fields = input_expr_types
+                .iter()
+                .map(|x| Field::new("item", x.clone(), true))

Review Comment:
   Fix, It should only need the DataType info, no need for fields name



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove merged pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove merged PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485429


##########
datafusion/core/tests/sql/expr.rs:
##########
@@ -510,6 +510,23 @@ async fn test_array_literals() -> Result<()> {
     Ok(())
 }
 
+#[tokio::test]
+async fn test_struct_literals() -> Result<()> {
+    test_expression!(
+        "STRUCT(1,2,3,4,5)",
+        "{\"f_0\": 1, \"f_1\": 2, \"f_2\": 3, \"f_3\": 4, \"f_4\": 5}"

Review Comment:
   I would prefer the generated column names to be `c0`, `c1`, ... but that is just personal preference (and probably based on experience with Spark).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang closed pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang closed pull request #2389: Support struct_expr generate struct in sql
URL: https://github.com/apache/arrow-datafusion/pull/2389


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r866109209


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            let field_name = format!("c_{}", i);
+            match arg.data_type() {
+                DataType::Utf8 => (
+                    Field::new(field_name.as_str(), DataType::Utf8, true),
+                    arg.clone(),
+                ),
+                DataType::LargeUtf8 => (
+                    Field::new(field_name.as_str(), DataType::LargeUtf8, true),
+                    arg.clone(),
+                ),
+                DataType::Boolean => (
+                    Field::new(field_name.as_str(), DataType::Boolean, true),
+                    arg.clone(),
+                ),
+                DataType::Float32 => (
+                    Field::new(field_name.as_str(), DataType::Float64, true),
+                    arg.clone(),
+                ),
+                DataType::Float64 => (
+                    Field::new(field_name.as_str(), DataType::Float64, true),
+                    arg.clone(),
+                ),
+                DataType::Int8 => (
+                    Field::new(field_name.as_str(), DataType::Int8, true),
+                    arg.clone(),
+                ),
+                DataType::Int16 => (
+                    Field::new(field_name.as_str(), DataType::Int16, true),
+                    arg.clone(),
+                ),
+                DataType::Int32 => (
+                    Field::new(field_name.as_str(), DataType::Int32, true),
+                    arg.clone(),
+                ),
+                DataType::Int64 => (
+                    Field::new(field_name.as_str(), DataType::Int64, true),
+                    arg.clone(),
+                ),
+                DataType::UInt8 => (
+                    Field::new(field_name.as_str(), DataType::UInt8, true),
+                    arg.clone(),
+                ),
+                DataType::UInt16 => (
+                    Field::new(field_name.as_str(), DataType::UInt16, true),
+                    arg.clone(),
+                ),
+                DataType::UInt32 => (
+                    Field::new(field_name.as_str(), DataType::UInt32, true),
+                    arg.clone(),
+                ),
+                DataType::UInt64 => (
+                    Field::new(field_name.as_str(), DataType::UInt64, true),
+                    arg.clone(),
+                ),
+                data_type => unimplemented!("struct not support {} type", data_type),
+            }

Review Comment:
   @andygrove Thanks, fix in https://github.com/apache/arrow-datafusion/pull/2389/commits/b26c388aa5e6c7c727bb3a655e7a7a6cc56b492d



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r863566359


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,109 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> (Field, ArrayRef) {
+            match arg.data_type() {
+                DataType::Utf8 => (
+                    Field::new(&*format!("f_{}", i), DataType::Utf8, true),

Review Comment:
   Sure! Thanks for your advice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r863558014


##########
datafusion/core/tests/sql/expr.rs:
##########
@@ -510,6 +510,23 @@ async fn test_array_literals() -> Result<()> {
     Ok(())
 }
 
+#[tokio::test]
+async fn test_struct_literals() -> Result<()> {
+    test_expression!(
+        "STRUCT(1,2,3,4,5)",
+        "{\"f_0\": 1, \"f_1\": 2, \"f_2\": 3, \"f_3\": 4, \"f_4\": 5}"

Review Comment:
   Agree! align with Spark.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r867367171


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,78 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> Result<(Field, ArrayRef)> {
+            let field_name = format!("c{}", i);
+            match arg.data_type() {
+                DataType::Utf8
+                | DataType::LargeUtf8
+                | DataType::Boolean
+                | DataType::Float32
+                | DataType::Float64
+                | DataType::Int8
+                | DataType::Int16
+                | DataType::Int32
+                | DataType::Int64
+                | DataType::UInt8
+                | DataType::UInt16
+                | DataType::UInt32
+                | DataType::UInt64 => Ok((
+                    Field::new(field_name.as_str(), arg.data_type().clone(), true),
+                    arg.clone(),
+                )),
+                data_type => Err(DataFusionError::NotImplemented(format!(
+                    "Struct is not implemented for type '{:?}'.",
+                    data_type
+                ))),
+            }
+        })
+        .map(|x| x.unwrap())
+        .collect();

Review Comment:
   The `unwrap` here would still cause a panic. We can use `Result` instead:
   
   ```suggestion
           .collect::<Result<Vec<_>>>()?;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2389: Support struct_expr generate struct in sql

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on code in PR #2389:
URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r867426610


##########
datafusion/physical-expr/src/struct_expressions.rs:
##########
@@ -0,0 +1,78 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Struct expressions
+
+use arrow::array::*;
+use arrow::datatypes::{DataType, Field};
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use std::sync::Arc;
+
+fn array_struct(args: &[ArrayRef]) -> Result<ArrayRef> {
+    // do not accept 0 arguments.
+    if args.is_empty() {
+        return Err(DataFusionError::Internal(
+            "struct requires at least one argument".to_string(),
+        ));
+    }
+
+    let vec: Vec<_> = args
+        .iter()
+        .enumerate()
+        .map(|(i, arg)| -> Result<(Field, ArrayRef)> {
+            let field_name = format!("c{}", i);
+            match arg.data_type() {
+                DataType::Utf8
+                | DataType::LargeUtf8
+                | DataType::Boolean
+                | DataType::Float32
+                | DataType::Float64
+                | DataType::Int8
+                | DataType::Int16
+                | DataType::Int32
+                | DataType::Int64
+                | DataType::UInt8
+                | DataType::UInt16
+                | DataType::UInt32
+                | DataType::UInt64 => Ok((
+                    Field::new(field_name.as_str(), arg.data_type().clone(), true),
+                    arg.clone(),
+                )),
+                data_type => Err(DataFusionError::NotImplemented(format!(
+                    "Struct is not implemented for type '{:?}'.",
+                    data_type
+                ))),
+            }
+        })
+        .map(|x| x.unwrap())
+        .collect();

Review Comment:
   @andygrove  Thanks, I think it should return `Vec<Result<_>>` , Is the `collect` can change Vec<Result<_>>` to `Result<Vec<_>>`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org