Posted to dev@arrow.apache.org by Tomek Drabas <dr...@gmail.com> on 2022/05/09 18:28:09 UTC

Arrow C-Data and DuckDB

I am new to this board, so please let me know if any of this doesn't make
sense.

I am building a Flight SQL example with a DuckDB backend. DuckDB already has
an Arrow interface defined in duckdb.h that returns ArrowArray. However,
that definition is not guarded in any way: duckdb.h defines ArrowArray
itself, so also including arrow/c/bridge.h fails to compile with an error
that ArrowArray is defined in multiple places.

I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
avoid this. Is this the best way to do this?

Thanks,
-Tom

Re: Arrow C-Data and DuckDB

Posted by Dewey Dunnington <de...@voltrondata.com>.
I would also love to see a canonical way to do this! My personal workaround
has been to guard my own include with #ifndef ARROW_FLAG_DICTIONARY_ORDERED
(but that's clearly a hack).
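
A minimal sketch of that workaround, assuming the header that gets included
first defines ARROW_FLAG_DICTIONARY_ORDERED alongside the struct (as the
structure definitions in the C data interface spec do). The two "copies" are
inlined in one file here to stand in for two separate headers, and
example_null_count is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First copy: stands in for a header that defines the struct with no guard. */
#define ARROW_FLAG_DICTIONARY_ORDERED 1
#define ARROW_FLAG_NULLABLE 2
#define ARROW_FLAG_MAP_KEYS_SORTED 4

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Second copy: our own include, guarded by testing for a macro that the
   first copy happens to define. The preprocessor skips this whole block,
   so the struct is only defined once. */
#ifndef ARROW_FLAG_DICTIONARY_ORDERED
#define ARROW_FLAG_DICTIONARY_ORDERED 1
#define ARROW_FLAG_NULLABLE 2
#define ARROW_FLAG_MAP_KEYS_SORTED 4

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};
#endif

/* Hypothetical helper: reads a field to show the struct is usable. */
int64_t example_null_count(const struct ArrowArray* array) {
  assert(array != NULL);
  return array->null_count;
}
```

Note that this only helps when the unguarded header is included first; if the
guarded copy comes first, the unguarded one still redefines the struct.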

On Mon, May 9, 2022 at 3:28 PM Tomek Drabas <dr...@gmail.com> wrote:

> I am new to this board so please, let me know if any of this doesn't make
> sense.
>
> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
> the import is not guarded in any way, and ArrowArray is redefined in
> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
> defined in multiple places.
>
> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
> avoid this. Is this the best way to do this?
>
> Thanks,
> -Tom
>

Re: Arrow C-Data and DuckDB

Posted by Antoine Pitrou <an...@python.org>.
For the record, https://github.com/apache/arrow/pull/13115 was merged
with the proposed change.
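
As a rough sketch of what such a guard looks like: the thread itself does not
name the macro, so ARROW_C_DATA_INTERFACE below should be read as an
assumption taken from the published C data interface spec, which wraps its
structure definitions in a guard of this shape (the spec's block also covers
ArrowSchema, omitted here for brevity). The two copies are inlined in one
file to stand in for two independent headers, and example_array_length is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First copy, e.g. as a project might vendor it in its own public header. */
#ifndef ARROW_C_DATA_INTERFACE
#define ARROW_C_DATA_INTERFACE

#define ARROW_FLAG_DICTIONARY_ORDERED 1
#define ARROW_FLAG_NULLABLE 2
#define ARROW_FLAG_MAP_KEYS_SORTED 4

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

#endif /* ARROW_C_DATA_INTERFACE */

/* Second copy, as another library's header might embed it. Because both
   copies test the same macro, whichever comes second is skipped. */
#ifndef ARROW_C_DATA_INTERFACE
#define ARROW_C_DATA_INTERFACE

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

#endif /* ARROW_C_DATA_INTERFACE */

/* Hypothetical helper: reads a field to show the struct is usable. */
int64_t example_array_length(const struct ArrowArray* array) {
  assert(array != NULL);
  return array->length;
}
```

Unlike the ARROW_FLAG_DICTIONARY_ORDERED workaround, a shared guard macro is
independent of include order, provided every copy of the definitions uses the
same macro name.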

Regards

Antoine.


On Fri, 13 May 2022 17:48:21 +0200
Antoine Pitrou <an...@python.org> wrote:

> I don't think this needs a vote: there is no functional change in the
> spec. It's just an additional technical recommendation that can go
> through the regular PR process.
> 
> Regards
> 
> Antoine.
> 
> 
> Le 12/05/2022 à 22:24, David Li a écrit :
> > Thanks all for the comments. I see Tom also put up a PR to add this to DuckDB [1].
> > 
> > Do we need a vote for this? If so, unless there are further comments, I think we can start one.
> > 
> > [1]: https://github.com/duckdb/duckdb/pull/3628
> > 
> > On Tue, May 10, 2022, at 13:31, David Li wrote:  
> >> For discussion I've put up https://github.com/apache/arrow/pull/13115
> >> to add this for the C data/stream interfaces.
> >>
> >> On Mon, May 9, 2022, at 15:42, Antoine Pitrou wrote:  
> >>> Le 09/05/2022 à 20:28, Tomek Drabas a écrit :  
> >>>> I am new to this board so please, let me know if any of this doesn't make
> >>>> sense.
> >>>>
> >>>> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
> >>>> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
> >>>> the import is not guarded in any way, and ArrowArray is redefined in
> >>>> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
> >>>> defined in multiple places.
> >>>>
> >>>> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
> >>>> avoid this. Is this the best way to do this?  
> >>>
> >>> It should probably be included in the spec:
> >>> https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions
> >>>
> >>> Regards
> >>>
> >>> Antoine.
> >>>
> >>>  
> >>>>
> >>>> Thanks,
> >>>> -Tom
> >>>>  
> 




Re: Arrow C-Data and DuckDB

Posted by Antoine Pitrou <an...@python.org>.
I don't think this needs a vote: there is no functional change in the
spec. It's just an additional technical recommendation that can go
through the regular PR process.

Regards

Antoine.


Le 12/05/2022 à 22:24, David Li a écrit :
> Thanks all for the comments. I see Tom also put up a PR to add this to DuckDB [1].
> 
> Do we need a vote for this? If so, unless there are further comments, I think we can start one.
> 
> [1]: https://github.com/duckdb/duckdb/pull/3628
> 
> On Tue, May 10, 2022, at 13:31, David Li wrote:
>> For discussion I've put up https://github.com/apache/arrow/pull/13115
>> to add this for the C data/stream interfaces.
>>
>> On Mon, May 9, 2022, at 15:42, Antoine Pitrou wrote:
>>> Le 09/05/2022 à 20:28, Tomek Drabas a écrit :
>>>> I am new to this board so please, let me know if any of this doesn't make
>>>> sense.
>>>>
>>>> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
>>>> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
>>>> the import is not guarded in any way, and ArrowArray is redefined in
>>>> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
>>>> defined in multiple places.
>>>>
>>>> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
>>>> avoid this. Is this the best way to do this?
>>>
>>> It should probably be included in the spec:
>>> https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions
>>>
>>> Regards
>>>
>>> Antoine.
>>>
>>>
>>>>
>>>> Thanks,
>>>> -Tom
>>>>

Re: Arrow C-Data and DuckDB

Posted by David Li <li...@apache.org>.
Thanks all for the comments. I see Tom also put up a PR to add this to DuckDB [1].

Do we need a vote for this? If so, unless there are further comments, I think we can start one.

[1]: https://github.com/duckdb/duckdb/pull/3628

On Tue, May 10, 2022, at 13:31, David Li wrote:
> For discussion I've put up https://github.com/apache/arrow/pull/13115 
> to add this for the C data/stream interfaces.
>
> On Mon, May 9, 2022, at 15:42, Antoine Pitrou wrote:
>> Le 09/05/2022 à 20:28, Tomek Drabas a écrit :
>>> I am new to this board so please, let me know if any of this doesn't make
>>> sense.
>>> 
>>> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
>>> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
>>> the import is not guarded in any way, and ArrowArray is redefined in
>>> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
>>> defined in multiple places.
>>> 
>>> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
>>> avoid this. Is this the best way to do this?
>>
>> It should probably be included in the spec:
>> https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions
>>
>> Regards
>>
>> Antoine.
>>
>>
>>> 
>>> Thanks,
>>> -Tom
>>>

Re: Arrow C-Data and DuckDB

Posted by David Li <li...@apache.org>.
For discussion I've put up https://github.com/apache/arrow/pull/13115 to add this for the C data/stream interfaces.

On Mon, May 9, 2022, at 15:42, Antoine Pitrou wrote:
> Le 09/05/2022 à 20:28, Tomek Drabas a écrit :
>> I am new to this board so please, let me know if any of this doesn't make
>> sense.
>> 
>> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
>> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
>> the import is not guarded in any way, and ArrowArray is redefined in
>> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
>> defined in multiple places.
>> 
>> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
>> avoid this. Is this the best way to do this?
>
> It should probably be included in the spec:
> https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions
>
> Regards
>
> Antoine.
>
>
>> 
>> Thanks,
>> -Tom
>>

Re: Arrow C-Data and DuckDB

Posted by Antoine Pitrou <an...@python.org>.
Le 09/05/2022 à 20:28, Tomek Drabas a écrit :
> I am new to this board so please, let me know if any of this doesn't make
> sense.
> 
> I am building a Flight SQL example with a DuckDB backend. DuckDB already has
> an Arrow interface defined in duckdb.h that returns ArrowArray. However,
> the import is not guarded in any way, and ArrowArray is redefined in
> duckdb.h, so including arrow/c/bridge.h throws an error that ArrowArray is
> defined in multiple places.
> 
> I'd like to propose adding canonical guardrails in arrow/c/bridge.h to
> avoid this. Is this the best way to do this?

It should probably be included in the spec:
https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions

Regards

Antoine.


> 
> Thanks,
> -Tom
>