Posted to dev@arrow.apache.org by Lirong Jian <ji...@gmail.com> on 2020/05/07 07:52:01 UTC
Re: [jira] [Created] (ARROW-7165) [C++] Arrow Compute Group By Support
Any update on this task?
We are very interested in this feature. Thanks.
Lirong Jian
HashData Inc.
On Thu, Nov 14, 2019 at 2:02 PM, Chendi.Xue (Jira) <ji...@apache.org> wrote:
> Chendi.Xue created ARROW-7165:
> ---------------------------------
>
> Summary: [C++] Arrow Compute Group By Support
> Key: ARROW-7165
> URL: https://issues.apache.org/jira/browse/ARROW-7165
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++ - Compute
> Reporter: Chendi.Xue
>
>
> Is there any plan to support group-by in Arrow?
>
> Here are some to-dos I have in mind:
> # Make the current arrow/compute/kernels/hash accept a memo_table as
> input, so that multiple arrays can get dictencode and valuecount results
> based on the same hash map with a unified index.
> # Add a split-array function, instead of calling take multiple times, to
> split one array into several.
> # The output arrays can then use the existing functions under
> compute/kernels, such as sum/count/sort, to support group-by.
>
> These are only my basic ideas; I would like to know whether there is an
> ongoing plan for this.
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
>
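For readers following the thread, the three to-do items in the quoted issue can be sketched in plain Python. This is only an illustration of the idea, not Arrow code: `MemoTable`, `dict_encode`, `split_by_index` and `group_by_sum` are hypothetical names made up for this example, and a real implementation would work on Arrow arrays rather than Python lists.

```python
class MemoTable:
    """Hypothetical stand-in for a shared hash table: distinct key -> dense index."""

    def __init__(self):
        self._index = {}   # key -> index
        self.keys = []     # index -> key, in first-seen order

    def get_or_insert(self, key):
        idx = self._index.get(key)
        if idx is None:
            idx = len(self.keys)
            self._index[key] = idx
            self.keys.append(key)
        return idx


def dict_encode(array, memo):
    """To-do 1: encode an array against a memo table supplied by the caller,
    so that several arrays end up sharing one unified index."""
    return [memo.get_or_insert(v) for v in array]


def split_by_index(values, indices, num_groups):
    """To-do 2: split one array into per-group arrays in a single pass,
    instead of running one take per group."""
    groups = [[] for _ in range(num_groups)]
    for value, idx in zip(values, indices):
        groups[idx].append(value)
    return groups


def group_by_sum(indices, values, memo):
    """To-do 3: apply an ordinary aggregation (here sum) to each split array."""
    groups = split_by_index(values, indices, len(memo.keys))
    return {memo.keys[i]: sum(g) for i, g in enumerate(groups)}


# Two chunks of keys encoded against the same memo table.
memo = MemoTable()
idx1 = dict_encode(["a", "b", "a"], memo)
idx2 = dict_encode(["b", "c"], memo)
sums = group_by_sum(idx1 + idx2, [1, 2, 3, 4, 5], memo)
```

Because both chunks go through the same `memo`, the key "b" gets the same index in `idx1` and `idx2`, which is what allows the values from both chunks to be split and summed together.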
Re: [jira] [Created] (ARROW-7165) [C++] Arrow Compute Group By Support
Posted by Lirong Jian <ji...@gmail.com>.
Thanks.
Lirong Jian
HashData Inc.
On Thu, May 7, 2020 at 10:44 PM, Wes McKinney <we...@gmail.com> wrote:
> Feel free to follow ARROW-5002
Re: [jira] [Created] (ARROW-7165) [C++] Arrow Compute Group By Support
Posted by Wes McKinney <we...@gmail.com>.
Feel free to follow ARROW-5002