Posted to user@arrow.apache.org by Ryan Kuhns <rn...@gmail.com> on 2022/09/26 21:26:10 UTC

String Array Concatenation function?

Hi,

I’ve started using Apache Arrow via pyarrow.

One area I’ve struggled with is creating a new column that is a concatenation of other string columns.

The existing string concatenation compute functions don’t appear to work for the case I’m describing.

Are there any plans to create a compute function that accepts arrays of strings and returns an array that has concatenated the input arrays element-wise?

Or is there an efficient way I could use the existing functionality to accomplish this?

Thanks in advance for the help!

Ryan

Re: String Array Concatenation function?

Posted by Ryan Kuhns <rn...@gmail.com>.
Hi Ian,

I’ve been busy, but I’d be happy to add it.

I’ll review the contributing information and see about starting that.

In general I’d be happy to help add some additional documentation for user workflows.

-Ryan

> On Sep 28, 2022, at 9:21 AM, Ian Cook <ia...@ursacomputing.com> wrote:
> 
> Ryan,
> 
> Glad to hear you got it working!
> 
> If you're feeling up to it, would you be willing to add an example to
> the Python Arrow Cookbook demonstrating elementwise string
> concatenation?
> I think it would fit nicely with the other examples here in the Data
> Manipulation section: https://arrow.apache.org/cookbook/py/data.html
> There is information here showing how to contribute:
> https://github.com/apache/arrow-cookbook/blob/main/CONTRIBUTING.md
> 
> Thanks,
> Ian

Re: String Array Concatenation function?

Posted by Ian Cook <ia...@ursacomputing.com>.
Ryan,

Glad to hear you got it working!

If you're feeling up to it, would you be willing to add an example to
the Python Arrow Cookbook demonstrating elementwise string
concatenation?
I think it would fit nicely with the other examples here in the Data
Manipulation section: https://arrow.apache.org/cookbook/py/data.html
There is information here showing how to contribute:
https://github.com/apache/arrow-cookbook/blob/main/CONTRIBUTING.md

Thanks,
Ian

On Tue, Sep 27, 2022 at 11:17 AM Ryan Kuhns <rn...@gmail.com> wrote:
>
> Thanks to everyone for the help. The binary_join_element_wise compute function does what I need. I was just calling it wrong!

Re: String Array Concatenation function?

Posted by Ryan Kuhns <rn...@gmail.com>.
Thanks to everyone for the help. The binary_join_element_wise compute function does what I need. I was just calling it wrong!

> On Sep 27, 2022, at 4:55 AM, Jacek Pliszka <ja...@gmail.com> wrote:
> 
> Hi!
> 
> I think the API section is more user-friendly:
> 
> https://arrow.apache.org/docs/python/api/compute.html#api-compute
> 
> https://arrow.apache.org/docs/python/generated/pyarrow.compute.binary_join_element_wise.html#pyarrow.compute.binary_join_element_wise
> 
> BR
> 
> J

Re: String Array Concatenation function?

Posted by Jacek Pliszka <ja...@gmail.com>.
Hi!

I think the API section is more user-friendly:

https://arrow.apache.org/docs/python/api/compute.html#api-compute

https://arrow.apache.org/docs/python/generated/pyarrow.compute.binary_join_element_wise.html#pyarrow.compute.binary_join_element_wise

BR

J

On Mon, Sep 26, 2022 at 23:48, Ian Cook <ia...@ursacomputing.com> wrote:
>
> Hi Ryan,
>
> I believe the compute function "binary_join_element_wise" in the Arrow
> C++ library does just this:
> https://arrow.apache.org/docs/cpp/compute.html#string-joining
>
> I believe you can call this function in PyArrow following the same
> pattern described here:
> https://arrow.apache.org/docs/python/compute.html#standard-compute-functions
>
> Ian

Re: String Array Concatenation function?

Posted by Ian Cook <ia...@ursacomputing.com>.
Hi Ryan,

I believe the compute function "binary_join_element_wise" in the Arrow
C++ library does just this:
https://arrow.apache.org/docs/cpp/compute.html#string-joining

I believe you can call this function in PyArrow following the same
pattern described here:
https://arrow.apache.org/docs/python/compute.html#standard-compute-functions

Ian

On Mon, Sep 26, 2022 at 5:26 PM Ryan Kuhns <rn...@gmail.com> wrote:
>
> Hi,
>
> I’ve started using Apache Arrow via pyarrow.
>
> One area I’ve struggled with is creating a new column that is a concatenation of other string columns.
>
> The existing string concatenation compute functions don’t appear to work for the case I’m describing.
>
> Are there any plans to create a compute function that accepts arrays of strings and returns an array that has concatenated the input arrays element-wise?
>
> Or is there an efficient way I could use the existing functionality to accomplish this?
>
> Thanks in advance for the help!
>
> Ryan