Posted to github@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/06/06 11:56:09 UTC

[GitHub] [arrow-datafusion] izveigor opened a new issue, #6560: Support `FixedSizeList` in array methods

izveigor opened a new issue, #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560

   ### Is your feature request related to a problem or challenge?
   
   Follow on to https://github.com/apache/arrow-datafusion/pull/6384
   
   Data can come from sources that support a list data type (such as `apache-avro`, `parquet` ...), and such data can end up as `FixedSizeList` in `arrow-datafusion`. To avoid difficulties with casting between `FixedSizeList` and `List`, I suggest supporting `FixedSizeList` in all array functions.
   
   ### Describe the solution you'd like
   
   Either pre-cast `FixedSizeList` to `List` before calling array functions, or add native support for that data type.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   Similar issues:
   https://github.com/apache/arrow-datafusion/issues/6119
   https://github.com/apache/arrow-datafusion/issues/6075


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594914643

   Since `FixedSizeList` is a different Arrow type, we probably need a new `ScalarValue` variant for it. Maybe `ScalarValue::FixedSizeList` 🤔
   
   
   https://docs.rs/datafusion/latest/datafusion/scalar/enum.ScalarValue.html#variant.List


[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1593145969

   > We should do coercion in `datafusion/optimizer/src/analyzer/type_coercion.rs`, right?
   
   Yes, that is the preferred location, because it runs before any optimizer passes. Type coercion can change the semantics of the plan, but optimizer passes should not.
   
   https://docs.rs/datafusion/latest/datafusion/index.html#planning


[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594917841

   I wonder if a reasonable approach would be to start with the code to cast between `List` and `FixedSizeList`. That way there would at least be some path for working with data that came in as `FixedSizeList`.
   
   For example:
   
   ```sql
   SELECT 
     arrow_cast(my_column, 'List(Utf8)') -- would call the `cast` kernel to convert `my_column` to `ListArray`
   FROM
     my_parquet_table_with_fixed_size_lists
   ```
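
The cast suggested above can be pictured with a toy model. The sketch below uses only the Rust standard library and is not the `arrow` crate's `cast` kernel; `ToyFixedSizeList`, `ToyList`, `fixed_to_list` and `list_to_fixed` are hypothetical names made up for illustration, and nulls are ignored. It assumes the usual Arrow layouts: a fixed-size list stores every row back to back with the same length, while a variable-size list adds an offsets buffer.

```rust
/// Toy stand-in for a fixed-size list column: every row holds exactly
/// `size` values, stored back to back in one flat buffer.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyFixedSizeList {
    pub size: usize,
    pub values: Vec<i64>,
}

/// Toy stand-in for a variable-size list column: row `i` spans
/// `values[offsets[i]..offsets[i + 1]]`.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyList {
    pub offsets: Vec<usize>,
    pub values: Vec<i64>,
}

/// Convert fixed-size rows to offset-based rows. The values are copied
/// unchanged; only the offsets have to be generated.
pub fn fixed_to_list(input: &ToyFixedSizeList) -> ToyList {
    // With a row size of zero the row count cannot be derived from the
    // buffer, so this toy model treats that case as zero rows.
    let rows = if input.size == 0 {
        0
    } else {
        input.values.len() / input.size
    };
    let offsets = (0..=rows).map(|i| i * input.size).collect();
    ToyList {
        offsets,
        values: input.values.clone(),
    }
}

/// Convert back. This only succeeds when every row has length `size`.
pub fn list_to_fixed(input: &ToyList, size: usize) -> Option<ToyFixedSizeList> {
    let uniform = input
        .offsets
        .windows(2)
        .all(|w| w[1].checked_sub(w[0]) == Some(size));
    if uniform {
        Some(ToyFixedSizeList {
            size,
            values: input.values.clone(),
        })
    } else {
        None
    }
}
```

In this model the `FixedSizeList` to `List` direction always succeeds, while the reverse direction can fail when the rows are ragged, which is one reason to treat `List` as the common type for array functions.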


[GitHub] [arrow-datafusion] jayzhan211 commented on issue #6560: Support `FixedSizeList` in array methods

Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1592198246

   We should do coercion in `datafusion/optimizer/src/analyzer/type_coercion.rs`, right?


[GitHub] [arrow-datafusion] jayzhan211 commented on issue #6560: Support `FixedSizeList` in array methods

Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594040178

   My first thought is to add casting from `FixedSizeList` to `List` in `coerce_arguments_for_signature`.
   
   However, I have not been able to construct a test.
   I don't know how to create a test case with `DataType::FixedSizeList` in the sqllogictests file `array.slt`. `make_array([1,2])` gives us a `ScalarValue::List`, which is converted to `DataType::List`. Does anyone have an idea?
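
The coercion idea in this comment, wrapping `FixedSizeList` arguments in a cast to `List` before the array function runs, can be sketched with a toy expression tree. This is a standard-library-only illustration, not DataFusion's `coerce_arguments_for_signature`; `ToyType`, `ToyExpr`, `coerce` and `coerce_arg` are hypothetical names, and only direct column arguments are handled.

```rust
/// Toy data types; only the shapes needed for the illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyType {
    Int64,
    List(Box<ToyType>),
    FixedSizeList(Box<ToyType>, usize),
}

/// Toy expression tree with just enough variants to show the rewrite.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyExpr {
    Column { name: String, data_type: ToyType },
    Cast { expr: Box<ToyExpr>, to: ToyType },
    ArrayFunction { name: String, args: Vec<ToyExpr> },
}

/// Rewrite an array function call so that each fixed-size list argument
/// is wrapped in a cast to the matching variable-size list type.
pub fn coerce(expr: ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::ArrayFunction { name, args } => ToyExpr::ArrayFunction {
            name,
            args: args.into_iter().map(coerce_arg).collect(),
        },
        other => other,
    }
}

fn coerce_arg(arg: ToyExpr) -> ToyExpr {
    // Decide on the target type first, then move `arg` into the cast.
    let target = match &arg {
        ToyExpr::Column {
            data_type: ToyType::FixedSizeList(item, _),
            ..
        } => Some(ToyType::List(item.clone())),
        _ => None,
    };
    match target {
        Some(to) => ToyExpr::Cast {
            expr: Box::new(arg),
            to,
        },
        None => arg,
    }
}
```

Because the rewrite happens before the function is evaluated, the function body in this sketch only ever sees `List` inputs and needs no `FixedSizeList` branch of its own.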


Re: [I] Support `FixedSizeList` in array methods [datafusion]

Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/datafusion/issues/6560#issuecomment-2072281672

   I think we already support `FixedSizeList`. Thanks to @Weijun-H


Re: [I] Support `FixedSizeList` in array methods [datafusion]

Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 closed issue #6560: Support `FixedSizeList` in array methods
URL: https://github.com/apache/datafusion/issues/6560

