Posted to github@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/06/06 11:56:09 UTC
[GitHub] [arrow-datafusion] izveigor opened a new issue, #6560: Support `FixedSizeList` in array methods
izveigor opened a new issue, #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560
### Is your feature request related to a problem or challenge?
Follow on to https://github.com/apache/arrow-datafusion/pull/6384
Data can come from different sources (such as `apache-avro`, `parquet`, ...) that support a list data type, and that type can become `FixedSizeList` in `arrow-datafusion`. To avoid difficulties with casting between `FixedSizeList` and `List`, I suggest supporting `FixedSizeList` in all array functions.
### Describe the solution you'd like
Either pre-cast `FixedSizeList` to `List` before calling array functions, or add native support for that data type.
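To make the pre-cast option concrete, here is a minimal sketch of what the conversion amounts to at the layout level, assuming the usual Arrow representation: a fixed-size list stores `list_size` child values per row, while a variable-length list needs an explicit offsets buffer. The function names `fixed_size_to_list_offsets` and `list_row` are hypothetical and are not part of arrow or DataFusion; nulls are ignored for brevity.

```rust
/// Toy model (not an arrow API): a fixed-size list column is described only
/// by its per-row length `list_size` and its number of rows `num_rows`.
/// Returns the `num_rows + 1` offsets that a variable-length list layout
/// would need to describe the same rows.
fn fixed_size_to_list_offsets(list_size: i32, num_rows: usize) -> Vec<i32> {
    (0..=num_rows as i32).map(|row| row * list_size).collect()
}

/// Slice row `row` out of the flat child values using the offsets.
fn list_row<'a, T>(values: &'a [T], offsets: &[i32], row: usize) -> &'a [T] {
    let start = offsets[row] as usize;
    let end = offsets[row + 1] as usize;
    &values[start..end]
}
```

The child values are untouched in this sketch; only the offsets are derived, which suggests the pre-cast can be cheap relative to the array functions that follow it.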
### Describe alternatives you've considered
_No response_
### Additional context
Similar issues:
https://github.com/apache/arrow-datafusion/issues/6119
https://github.com/apache/arrow-datafusion/issues/6075
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594914643
Since `FixedSizeList` is a different Arrow type, we probably need a new `ScalarValue` variant for it. Maybe `ScalarValue::FixedSizeList` 🤔
https://docs.rs/datafusion/latest/datafusion/scalar/enum.ScalarValue.html#variant.List
[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1593145969
> We should do coercion in `datafusion/optimizer/src/analyzer/type_coercion.rs`, right?
Yes, that is the preferred location, as it runs before any optimization passes. The reason is that type coercion can change the semantics of the plan, but the optimizer passes should not.
https://docs.rs/datafusion/latest/datafusion/index.html#planning
[GitHub] [arrow-datafusion] alamb commented on issue #6560: Support `FixedSizeList` in array methods
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594917841
I wonder if a reasonable approach would be to start with the code to cast between `List` and `FixedSizeList`. That way there would at least be some path for working with data that came in as `FixedSizeList`.
So something like:
```sql
SELECT
arrow_cast(my_column, 'List(Utf8)') -- would call the `cast` kernel to convert `my_column` to `ListArray`
FROM
my_parquet_table_with_fixed_size_lists
```
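The comment mentions casting in both directions. As a hedged illustration of why the reverse direction (`List` to `FixedSizeList`) is not always possible, the sketch below checks whether every row described by a list's offsets has exactly the requested length. The function `check_fixed_size` is hypothetical and does not describe how the actual `cast` kernel reports a mismatch.

```rust
/// Toy check (not an arrow API): can the rows described by variable-length
/// list `offsets` be viewed as a fixed-size list with `list_size` elements
/// per row? Returns an error naming the first row that does not fit.
fn check_fixed_size(offsets: &[i32], list_size: i32) -> Result<(), String> {
    for (row, pair) in offsets.windows(2).enumerate() {
        let len = pair[1] - pair[0];
        if len != list_size {
            return Err(format!(
                "row {} has {} elements, expected {}",
                row, len, list_size
            ));
        }
    }
    Ok(())
}
```

Casting from `FixedSizeList` to `List` has no such precondition, which is one reason it is the natural direction to start with.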
[GitHub] [arrow-datafusion] jayzhan211 commented on issue #6560: Support `FixedSizeList` in array methods
Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1592198246
We should do coercion in `datafusion/optimizer/src/analyzer/type_coercion.rs`, right?
[GitHub] [arrow-datafusion] jayzhan211 commented on issue #6560: Support `FixedSizeList` in array methods
Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/arrow-datafusion/issues/6560#issuecomment-1594040178
My first thought is to add casting from `FixedSizeList` to `List` in `coerce_arguments_for_signature`.
However, I have not managed to construct a test.
I don't know how to create a test case with `DataType::FixedSizeList` in `sqllogictests array.slt`. `make_array([1,2])` gives us a `ScalarValue::List`, which is converted to `DataType::List`. Does anyone have an idea?
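For readers following the idea of coercing argument types before the array function runs, here is a minimal sketch over a toy type enum. It is not DataFusion's `coerce_arguments_for_signature` and does not use arrow's `DataType`; `ToyType`, `coerce_fixed_size_to_list`, and `coerce_argument_types` are hypothetical names used only to show the shape of the rewrite, including nested lists.

```rust
/// Toy stand-in for a data type enum (hypothetical, not arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    Utf8,
    List(Box<ToyType>),
    /// Element type plus the fixed number of elements per row.
    FixedSizeList(Box<ToyType>, i32),
}

/// Rewrite every `FixedSizeList` to `List`, recursing into nested lists.
fn coerce_fixed_size_to_list(ty: &ToyType) -> ToyType {
    match ty {
        ToyType::FixedSizeList(inner, _size) => {
            ToyType::List(Box::new(coerce_fixed_size_to_list(inner)))
        }
        ToyType::List(inner) => ToyType::List(Box::new(coerce_fixed_size_to_list(inner))),
        other => other.clone(),
    }
}

/// Apply the rewrite to every argument type of a function call.
fn coerce_argument_types(args: &[ToyType]) -> Vec<ToyType> {
    args.iter().map(coerce_fixed_size_to_list).collect()
}
```

With a rewrite like this applied during analysis, the array functions themselves would only ever see `List` arguments, at the cost of dropping the fixed size from the type.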
Re: [I] Support `FixedSizeList` in array methods [datafusion]
Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 commented on issue #6560:
URL: https://github.com/apache/datafusion/issues/6560#issuecomment-2072281672
I think we already support `FixedSizeList`. Thanks to @Weijun-H.
Re: [I] Support `FixedSizeList` in array methods [datafusion]
Posted by "jayzhan211 (via GitHub)" <gi...@apache.org>.
jayzhan211 closed issue #6560: Support `FixedSizeList` in array methods
URL: https://github.com/apache/datafusion/issues/6560