Posted to user@arrow.apache.org by Olo Sawyerr <os...@outlook.com> on 2022/09/07 20:59:45 UTC

[FLIGHT] Sending multiple record batches with different schemas

Hi,

Hope you're well.

I'm using an Arrow Flight Java client -> Go server. I'm trying to return 2 RecordBatches from the server to the client using a single ticket in a DoGet. Both RecordBatches have different schemas (length and width) but they are a "Pair" of results and belong / are generated together. What's the cleanest way to achieve this using the DoGet?

I initially thought of sending one RecordBatch as metadata and the other as the Record payload. But this didn't feel natural.

Any ideas?

Regards,

Olo

Re: [FLIGHT] Sending multiple record batches with different schemas

Posted by 1057445597 <10...@qq.com>.
1057445597
1057445597@qq.com



------------------ Original ------------------
From: "user" <lidavidm@apache.org>;
Date: Thu, Sep 8, 2022 05:12 AM
To: "dl" <user@arrow.apache.org>;
Subject: Re: [FLIGHT] Sending multiple record batches with different schemas



You can combine multiple schemas into a single schema as a union of struct types (since a struct is effectively the same as a batch); that is the closest thing you can get to having two separate schemas within one IPC stream right now. There was some discussion of various ways to relax this before [1] but the discussion died, I think because there were several distinct concepts being tied up into "schema evolution".

[1]: https://lists.apache.org/thread/zs31trnyhbofhvndtdrrvkvm82p8pscq

-David

On Wed, Sep 7, 2022, at 16:59, Olo Sawyerr wrote:

Hi,

Hope you're well.

I'm using an Arrow Flight Java client -> Go server. I'm trying to return 2 RecordBatches from the server to the client using a single ticket in a DoGet. Both RecordBatches have different schemas (length and width) but they are a "Pair" of results and belong / are generated together. What's the cleanest way to achieve this using the DoGet?

I initially thought of sending one RecordBatch as metadata and the other as the Record payload. But this didn't feel natural.

Any ideas?

Regards,

Olo

Re: [FLIGHT] Sending multiple record batches with different schemas

Posted by Andrew Lamb <al...@influxdata.com>.
Sending record batches with different schemas in the same request is
something we wanted in the IOx project as well.

The way we handled it was to resend the existing schema definition messages
after data had already arrived. Our implementation [1] is in Rust and we
controlled both sender and receiver, so your mileage may vary, but I figured
I would mention it here.

Andrew

[1] https://github.com/influxdata/influxdb_iox/pull/4853

On Wed, Sep 7, 2022 at 5:12 PM David Li <li...@apache.org> wrote:

> You can combine multiple schemas into a single schema as a union of struct
> types (since a struct is effectively the same as a batch); that is the
> closest thing you can get to having two separate schemas within one IPC
> stream right now. There was some discussion of various ways to relax this
> before [1] but the discussion died, I think because there were several
> distinct concepts being tied up into "schema evolution".
>
> [1]: https://lists.apache.org/thread/zs31trnyhbofhvndtdrrvkvm82p8pscq
>
> -David
>
> On Wed, Sep 7, 2022, at 16:59, Olo Sawyerr wrote:
>
> Hi,
>
> Hope you're well.
>
> I'm using an Arrow Flight Java client -> Go server. I'm trying to return 2
> RecordBatches from the server to the client using a single ticket in a
> DoGet. Both RecordBatches have different schemas (length and width) but they
> are a "Pair" of results and belong / are generated together. What's the
> cleanest way to achieve this using the DoGet?
>
> I initially thought of sending one RecordBatch as metadata and the other
> as the Record payload. But this didn't feel natural.
>
> Any ideas?
>
> Regards,
>
> Olo
>
>
>

Re: [FLIGHT] Sending multiple record batches with different schemas

Posted by David Li <li...@apache.org>.
You can combine multiple schemas into a single schema as a union of struct types (since a struct is effectively the same as a batch); that is the closest thing you can get to having two separate schemas within one IPC stream right now. There was some discussion of various ways to relax this before [1] but the discussion died, I think because there were several distinct concepts being tied up into "schema evolution".

[1]: https://lists.apache.org/thread/zs31trnyhbofhvndtdrrvkvm82p8pscq

-David

On Wed, Sep 7, 2022, at 16:59, Olo Sawyerr wrote:
> Hi,
> 
> Hope you're well.
> 
> I'm using an Arrow Flight Java client -> Go server. I'm trying to return 2 RecordBatches from the server to the client using a single ticket in a DoGet. Both RecordBatches have different schemas (length and width) but they are a "Pair" of results and belong / are generated together. What's the cleanest way to achieve this using the DoGet?
> 
> I initially thought of sending one RecordBatch as metadata and the other as the Record payload. But this didn't feel natural. 
> 
> Any ideas?
> 
> Regards,
> 
> Olo