Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/01/09 21:38:00 UTC

[jira] [Commented] (ARROW-4213) [Flight] C++ and Java implementations are incompatible

    [ https://issues.apache.org/jira/browse/ARROW-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738680#comment-16738680 ] 

Wes McKinney commented on ARROW-4213:
-------------------------------------

Re: schemas, I think that IPC messages should be used; those are the "public" way to transmit Flatbuffers message payloads. cc [~jnadeau] for comment

Re: DoGet; shouldn't the receiver already have the schema after calling GetFlightInfo? It would seem wasteful for each endpoint to send the schema again.

There are some complications to consider relating to dictionary encoding -- dictionaries are probably going to come over the wire first in a DoGet operation, ahead of any record batches. There are comments in the C++ codebase about this.
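
To make the ordering question concrete, here is a rough sketch of a receiver that dispatches on the message kind instead of assuming the first message is a record batch. It is not Flight or Arrow code: `Kind`, `Message`, `read_stream` and the string payloads are hypothetical stand-ins, and the only assumption taken from this thread is that a stream may open with a schema and may carry dictionaries before the batches.

```python
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class Kind(Enum):
    """Hypothetical message kinds; real Arrow messages carry a header type."""
    SCHEMA = "schema"
    DICTIONARY = "dictionary"
    RECORD_BATCH = "record_batch"


# (kind, opaque payload): a stand-in for a decoded message, not a real API.
Message = Tuple[Kind, str]


def read_stream(
    messages: Iterable[Message], known_schema: Optional[str] = None
) -> Tuple[str, List[str], List[str]]:
    """Consume a DoGet-like stream that may or may not start with a schema.

    known_schema models a schema already obtained from GetFlightInfo.
    """
    schema = known_schema
    dictionaries: List[str] = []
    batches: List[str] = []
    for kind, payload in messages:
        if kind is Kind.SCHEMA:
            # Accept an in-stream schema, but reject one that contradicts
            # what the receiver already knows.
            if schema is not None and schema != payload:
                raise ValueError("stream schema differs from known schema")
            schema = payload
        elif schema is None:
            # Neither dictionaries nor batches can be decoded without a schema.
            raise ValueError(f"{kind.value} received before any schema")
        elif kind is Kind.DICTIONARY:
            dictionaries.append(payload)
        else:
            batches.append(payload)
    if schema is None:
        raise ValueError("stream ended without a schema")
    return schema, dictionaries, batches
```

With this shape, a stream that opens with a schema and a stream that relies on a previously fetched schema both go through the same loop, and a batch that arrives with no schema at all fails with an error rather than being decoded blindly.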

cc [~pitrou]

> [Flight] C++ and Java implementations are incompatible
> ------------------------------------------------------
>
>                 Key: ARROW-4213
>                 URL: https://issues.apache.org/jira/browse/ARROW-4213
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: FlightRPC
>            Reporter: David Li
>            Priority: Major
>              Labels: flight
>
> A C++ client cannot request streams from a Java service, nor can it decode the schema from GetFlightInfo.
> Schema: in Java, GetFlightInfo encodes the schema directly via flatbuffers. C++ expects it to be encoded as an IPC message. This isn't a problem in Java as a method exists to decode such schemas, but in C++ the API for reading such a schema isn't really exposed. I'm willing to submit a patch for this, but it's not clear to me which scheme is preferred.
> Streams: in Java, DoGet starts with an ArrowMessage containing a schema. C++ does not expect this and segfaults when it tries to decode the message as a record batch. Based on the presentations I've seen, I think C++ is in the wrong here; I have a patch to fix this that I could clean up and submit.


