You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/04 16:08:57 UTC

[GitHub] [arrow-rs] alamb opened a new issue #655: Allow `collect`ing from Values as well as `Option`

alamb opened a new issue #655:
URL: https://github.com/apache/arrow-rs/issues/655


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   I would like to create integer arrays out of iterators of integers rather than an iterator of `Option<integer>`
   
   Specifically I would like to do something like this:
   ```rust
   let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int32Array>();
   ```
   To collect a vector of integers into a `Int32Array` with no null values. However, this results in a compiler error:
   ```
   error[E0277]: the trait bound `{integer}: Borrow<Option<i32>>` is not satisfied
     --> src/main.rs:53:49
      |
   53 |     let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int32Array>();
      |                                                 ^^^^^^^ the trait `Borrow<Option<i32>>` is not implemented for `{integer}`
      |
      = help: the following implementations were found:
                <&T as Borrow<T>>
                <&mut T as Borrow<T>>
                <Arc<T> as Borrow<T>>
                <Box<T, A> as Borrow<T>>
              and 16 others
      = note: required because of the requirements on the impl of `FromIterator<{integer}>` for `PrimitiveArray<datafusion::arrow::datatypes::Int32Type>`
   ```
   
   The `FromIter` implementation that is provided uses `Option<i32>` so this *does* work
   ```rust
   let column_a = vec![Some(1), Some(2), Some(3), Some(4)].into_iter().collect::<Int32Array>();
   ```
   
   **Describe the solution you'd like**
   I would like a FromIter impementation for the various types that allows the following to work:
   ```rust
       // signed
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int8Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int16Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int32Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Int64Array>();
      // unsigned
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<UInt8Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<UInt16Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<UInt32Array>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<UInt64Array>();
      // floats
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Float32>();
       let column_a = vec![1, 2, 3, 4].into_iter().collect::<Float64>();
       let column_a = vec![1.0, 2.0, 3.0, 4.0].into_iter().collect::<Float32>();
       let column_a = vec![1.0, 2.0, 3.0, 40].into_iter().collect::<Float64>();
      // strs
       let column_a = vec!["foo", "bar", "baz"].into_iter().collect::<Float32>();
   ```
   
   **Additional context**
   Add any other context or screenshots about the feature request here.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-899807472


   Note I also just found `from_iter_values()` -- https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array_primitive.rs#L112-L124 which is basically made for this usecase 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-895124611


   That is great to hear @arucil  -- I have assigned the ticket to you as requested


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] carols10cents commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
carols10cents commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-899917303


   So this doesn't work because specialization isn't done yet... the problem is that if these two trait implementations:
   
   ```
   impl<T: ArrowPrimitiveType, Ptr: Borrow<Option<<T as ArrowPrimitiveType>::Native>>>
       FromIterator<Ptr> for PrimitiveArray<T>
   { ... }
   
   impl<T: ArrowPrimitiveType, Ptr: Borrow<<T as ArrowPrimitiveType>::Native>>
       FromIterator<Ptr> for PrimitiveArray<T>
   { ... }
   ```
   
   were allowed, then someone could come along and add:
   
   ```
   impl Borrow<Option<i32>> for Integer32 { ... }
   impl Borrow<i32> for Integer32 { ... }
   ```
   
   and then Rust wouldn't know which `FromIterator` implementation to use. 
   
   @shepmaster helped with this answer; I only got as far as "I think it's specialization related, but I don't know why" ;) 
   
   He also pointed me to https://stackoverflow.com/questions/37347311/how-is-there-a-conflicting-implementation-of-from-when-using-a-generic-type, perhaps that will be of some help to someone.
   
   I'm working with @shepmaster on a way around this but we might not be able to finish it until Wednesday.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] arucil commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
arucil commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-898813445


   I couldn't figure out how to add another `impl FromIterator` for `PrimitiveArray`. At first I simply copy the impl block of `FromIterator<Option<T>>` and delete `Option`,
   ```rust
   impl<T: ArrowPrimitiveType, Ptr: Borrow<Option<<T as ArrowPrimitiveType>::Native>>>
       FromIterator<Ptr> for PrimitiveArray<T>
   { ... }
   
   impl<T: ArrowPrimitiveType, Ptr: Borrow<<T as ArrowPrimitiveType>::Native>>
       FromIterator<Ptr> for PrimitiveArray<T>
   { ... }
   ```
   
   however this yields a
   > conflicting implementations of trait `std::iter::FromIterator<_>` for type `array::array_primitive::PrimitiveArray<_>`
   
   error.
   
   Then I changed `Option<T>` to an additional generic parameter `U: Into<Option<T>>`, 
   ```rust
   impl<T: ArrowPrimitiveType, Ptr: Borrow<U>, U: Into<Option<T::Native>>>
       FromIterator<Ptr> for PrimitiveArray<T>
   { ... }
   ```
   
   This didn't work either. This yields a `the type parameter U is not constrained by the impl trait, self type, or predicates` error.
   
   Any help is appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] mbrobbel commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
mbrobbel commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-900092027


   I've been working on [an Arrow implementation](https://github.com/mbrobbel/narrow) for applications with statically known data types. The way that I've managed to express this is by adding a boolean const generic to array types that indicate the nullability of an array (`true` indicates nullable). This makes nullable and non-nullable arrays different types allowing:
   
   ```rust
   impl FromIterator<Option<Foo>> for FooArray<true> { ... }
   impl FromIterator<Foo> for FooArray<false> { ... }
   ```
   
   Type check fails when you attempt to collect the iterator in an array type with the wrong nullability generic value.
   
   ```rust
   #[derive(Array, Copy, Clone, Default)]
   struct Foo {
       a: bool,
   }
   
   // Fails to type check
   let array = [Foo::default(); 4]
       .iter()
       .copied()
       .collect::<StructArray<Foo, true>>();
   
   // Compiles
   let array = [Foo::default(); 4]
       .iter()
       .copied()
       .collect::<StructArray<Foo, false>>();
   ```
   
   ```rust
   a value of type `StructArray<Foo, true>` cannot be built from an iterator over elements of type `Foo`
   the trait `FromIterator<Foo>` is not implemented for `StructArray<Foo, true>`
   ```
   
   This would work for a nullable array:
   ```rust
   let array = [Some(Foo::default()); 4]
       .iter()
       .copied()
       .collect::<StructArray<Foo, true>>();
   ```
   
   Currently const arguments cannot be inferred by the compiler, but that might be possible in the future making this more ergonomic.
   
   ```rust
   let array = [Foo::default(); 4]
       .iter()
       .copied()
       .collect::<StructArray<Foo, _>>();
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] arucil commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
arucil commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-895076243


   I'd like to contribute to this feature. Can you assign this issue to me?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on issue #655: Allow `collect`ing from Values as well as `Option`

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #655:
URL: https://github.com/apache/arrow-rs/issues/655#issuecomment-899806005


   Hi @arucil  -- this is a tricky one for sure. My Rust Fu this afternoon is not quite up to the task
   
   I think I was making progress by defining a marker trait so we could have a single `FromIter` implementation
   
   Something like this:
   
   ```rust
   // Define a marker trait so we can implement FromIter for wrapped
   // Options variants as well as non Option variants
   // https://stackoverflow.com/questions/39159226/conflicting-implementations-of-trait-in-rust
   trait IntoPrimitiveOption : Sized {
       type Primitive;
   
       // Convert self into an Option
       fn into_primitive_opt(&self) -> Option<Self::Primitive>;
   }
   
   ...
   impl<T: ArrowPrimitiveType, P: IntoPrimitiveOption<Primitive=T>>
       FromIterator<P> for PrimitiveArray<T>
   {
   ...
   }
   
   
   ```
   
   However,  my attempts to define `IntoPrimitiveOption` correctly for `ArrowPrimitiveType` was not getting anywhere. Maybe we need to do it for individually for each type that implements ArrowPrimitiveType? I am not sure though.
   
   This didn't work 
   
   ```rust
   
   impl <T: ArrowPrimitiveType> IntoPrimitiveOption for T{
       type Primitive = T::Native;
   
       fn into_primitive_opt(&self) -> Option<Self::Primitive> {
           let v:T = *self;
           Some(v)
       }
   }
   
   impl <T: ArrowPrimitiveType> IntoPrimitiveOption for Option<T>{
       type Primitive = T::Native;
   
       fn into_primitive_opt(&self) -> Option<Self::Primitive> {
           self.cloned()
       }
   }
   
   ```
   
   FYI @jorgecarleitao 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org