You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@arrow.apache.org by Micah Kornfield <em...@gmail.com> on 2022/12/01 05:46:05 UTC

Re: Array::GetValue ?

Yeah, another base class doesn't seem great.  One option would be to use a
union type like variant, but that seems also not particularly great?



On Thu, Nov 17, 2022 at 1:15 AM Antoine Pitrou <an...@python.org> wrote:

>
> Uh, you're right.  We may want another base class, not sure how it
> should be named though (also, we may want to be careful with multiple
> inheritance?).
>
> Regards
>
> Antoine.
>
>
> Le 17/11/2022 à 06:15, Micah Kornfield a écrit :
> >>
> >> std::string_view FlatArray::GetValueBytes(int64_t index)
> >
> >
> > I think this would be problematic for Boolean?
> >
> > On Tue, Nov 15, 2022 at 11:01 AM John Muehlhausen <jg...@jgm.org> wrote:
> >
> >> If that covers primitive and binary(string) types, that would work for
> me.
> >>
> >> On Tue, Nov 15, 2022 at 13:50 Antoine Pitrou <an...@python.org>
> wrote:
> >>
> >>>
> >>> Then perhaps we can define a method:
> >>>
> >>> std::string_view FlatArray::GetValueBytes(int64_t index)
> >>>
> >>> ?
> >>>
> >>>
> >>> Le 15/11/2022 à 19:39, John Muehlhausen a écrit :
> >>>> I had a use-case where untyped access to bytes would have been
> >>> sufficient,
> >>>> vs branching depending on array type.  This is what brought the idea
> to
> >>>> mind.
> >>>>
> >>>> On Tue, Nov 15, 2022 at 02:34 Jin Shang <sh...@gmail.com>
> >> wrote:
> >>>>
> >>>>> Hi John,
> >>>>>
> >>>>> In addition to Micah’s reply, does the member method Value(int64_t
> >>>>> i)[1][2][3] satisfy your need? It is defined for all array types with
> >> a
> >>>>> primitive value representation, i.e. all primitive arrays and binary
> >>> arrays.
> >>>>>
> >>>>> [1]
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L50
> >>>>> <
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L50
> >>>>>>
> >>>>> [2]
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L109
> >>>>> <
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L109
> >>>>>>
> >>>>> [3]
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_binary.h#L87
> >>>>> <
> >>>>>
> >>>
> >>
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_binary.h#L87
> >>>>>>
> >>>>>
> >>>>>
> >>>>>> 2022年11月15日 13:06,Micah Kornfield <em...@gmail.com> 写道:
> >>>>>>
> >>>>>> Hi John,
> >>>>>>
> >>>>>> There are a couple of edge cases that need to be discussed to move
> >> the
> >>>>>> function to the base array class (which IIUC is this proposal):
> >>>>>> 1. boolean
> >>>>>> 2. struct
> >>>>>> 3.  lists/LargeList
> >>>>>> 4.  DictionaryArray
> >>>>>>
> >>>>>> FlatArray [1] seems like a better place for this method if there is
> >>>>>> consensus on adding it.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Micah
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>
> >>>
> >>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/array_base.h#L219
> >>>>>>
> >>>>>> On Mon, Nov 14, 2022 at 11:46 AM John Muehlhausen <jg...@jgm.org>
> >> wrote:
> >>>>>>
> >>>>>>> There exists:
> >>>>>>> const uint8_t* BaseBinaryArray::GetValue(int64_t i, offset_type*
> >>>>>>> out_length) const
> >>>>>>>
> >>>>>>> What about adding:
> >>>>>>> const uint8_t* Array::GetValue(int64_t i, offset_type* out_length)
> >>> const
> >>>>>>>
> >>>>>>> This would allow GetValue to get the untyped bytes/length of any
> >>> value?
> >>>>>>> E.g. out_length would be set to sizeof(T) for arrays of primitive
> >> type
> >>>>> T?
> >>>>>>>
> >>>>>>> For FixedSizeBinaryArray the existing GetValue would still be a
> >> valid
> >>>>>>> overload.
> >>>>>>>
> >>>>>>> -John
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>

Re: Array::GetValue ?

Posted by John Muehlhausen <jg...@jgm.org>.
std::pair<std::string_view,int8_t> FlatArray::GetValueBytes(int64_t index)

.second being -1 in every case other than a boolean ... in that case it
indicates the bit of importance in the single-byte string_view?

Just an idea.

On Wed, Nov 30, 2022 at 11:46 PM Micah Kornfield <em...@gmail.com>
wrote:

> Yeah, another base class doesn't seem great.  One option would be to use a
> union type like variant, but that seems also not particularly great?
>
>
>
> On Thu, Nov 17, 2022 at 1:15 AM Antoine Pitrou <an...@python.org> wrote:
>
> >
> > Uh, you're right.  We may want another base class, not sure how it
> > should be named though (also, we may want to be careful with multiple
> > inheritance?).
> >
> > Regards
> >
> > Antoine.
> >
> >
> > Le 17/11/2022 à 06:15, Micah Kornfield a écrit :
> > >>
> > >> std::string_view FlatArray::GetValueBytes(int64_t index)
> > >
> > >
> > > I think this would be problematic for Boolean?
> > >
> > > On Tue, Nov 15, 2022 at 11:01 AM John Muehlhausen <jg...@jgm.org> wrote:
> > >
> > >> If that covers primitive and binary(string) types, that would work for
> > me.
> > >>
> > >> On Tue, Nov 15, 2022 at 13:50 Antoine Pitrou <an...@python.org>
> > wrote:
> > >>
> > >>>
> > >>> Then perhaps we can define a method:
> > >>>
> > >>> std::string_view FlatArray::GetValueBytes(int64_t index)
> > >>>
> > >>> ?
> > >>>
> > >>>
> > >>> Le 15/11/2022 à 19:39, John Muehlhausen a écrit :
> > >>>> I had a use-case where untyped access to bytes would have been
> > >>> sufficient,
> > >>>> vs branching depending on array type.  This is what brought the idea
> > to
> > >>>> mind.
> > >>>>
> > >>>> On Tue, Nov 15, 2022 at 02:34 Jin Shang <sh...@gmail.com>
> > >> wrote:
> > >>>>
> > >>>>> Hi John,
> > >>>>>
> > >>>>> In addition to Micah’s reply, does the member method Value(int64_t
> > >>>>> i)[1][2][3] satisfy your need? It is defined for all array types
> with
> > >> a
> > >>>>> primitive value representation, i.e. all primitive arrays and
> binary
> > >>> arrays.
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L50
> > >>>>> <
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L50
> > >>>>>>
> > >>>>> [2]
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L109
> > >>>>> <
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_primitive.h#L109
> > >>>>>>
> > >>>>> [3]
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_binary.h#L87
> > >>>>> <
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/js8544/arrow/blob/master/cpp/src/arrow/array/array_binary.h#L87
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>> 2022年11月15日 13:06,Micah Kornfield <em...@gmail.com> 写道:
> > >>>>>>
> > >>>>>> Hi John,
> > >>>>>>
> > >>>>>> There are a couple of edge cases that need to be discussed to move
> > >> the
> > >>>>>> function to the base array class (which IIUC is this proposal):
> > >>>>>> 1. boolean
> > >>>>>> 2. struct
> > >>>>>> 3.  lists/LargeList
> > >>>>>> 4.  DictionaryArray
> > >>>>>>
> > >>>>>> FlatArray [1] seems like a better place for this method if there
> is
> > >>>>>> consensus on adding it.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Micah
> > >>>>>>
> > >>>>>> [1]
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/array_base.h#L219
> > >>>>>>
> > >>>>>> On Mon, Nov 14, 2022 at 11:46 AM John Muehlhausen <jg...@jgm.org>
> > >> wrote:
> > >>>>>>
> > >>>>>>> There exists:
> > >>>>>>> const uint8_t* BaseBinaryArray::GetValue(int64_t i, offset_type*
> > >>>>>>> out_length) const
> > >>>>>>>
> > >>>>>>> What about adding:
> > >>>>>>> const uint8_t* Array::GetValue(int64_t i, offset_type*
> out_length)
> > >>> const
> > >>>>>>>
> > >>>>>>> This would allow GetValue to get the untyped bytes/length of any
> > >>> value?
> > >>>>>>> E.g. out_length would be set to sizeof(T) for arrays of primitive
> > >> type
> > >>>>> T?
> > >>>>>>>
> > >>>>>>> For FixedSizeBinaryArray the existing GetValue would still be a
> > >> valid
> > >>>>>>> overload.
> > >>>>>>>
> > >>>>>>> -John
> > >>>>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
>