Posted to user@arrow.apache.org by Mihir Rege <mi...@gmail.com> on 2018/12/19 14:46:20 UTC

Arrow back-compat

Hi,

I was wondering if there are any back-compat guarantees given by Arrow and
what the behaviour would be in case of a backwards incompatible change (I
believe there haven't been any since 0.8.0). Are there checks in the
readers to detect that the format being read is incompatible or is there a
potential for incorrectness?

For example, in Spark, there is only a check for the minimum version of
pyarrow [0] and none to verify that the version of the Arrow JAR shipped
matches the version of pyarrow being used.

Thanks!
Mihir

[0]
https://github.com/apache/spark/blob/834b8609793525a5a486013732d8c98e1c6e6504/python/pyspark/sql/utils.py#L139
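
The check at [0] is roughly the shape of the sketch below (a minimal sketch; the helper name and the 0.8.0 floor are illustrative). It only enforces a minimum pyarrow version on the Python side and says nothing about the Arrow library version on the JVM side:

    from distutils.version import LooseVersion

    def require_minimum_pyarrow_version(minimum="0.8.0"):
        # Raise if pyarrow is missing or older than `minimum`.
        # This is a floor check only; it cannot tell whether the Arrow
        # JAR on the JVM side speaks a compatible protocol version.
        try:
            import pyarrow
        except ImportError:
            raise ImportError("pyarrow must be installed for Arrow support")
        if LooseVersion(pyarrow.__version__) < LooseVersion(minimum):
            raise ImportError("pyarrow >= %s must be installed; found %s"
                              % (minimum, pyarrow.__version__))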

Re: Arrow back-compat

Posted by Mihir Rege <mi...@gmail.com>.
Awesome. Thanks for the detailed response, Wes.



On Thu, 20 Dec 2018, 22:39 Wes McKinney <wesmckinn@gmail.com> wrote:

> hi Mihir -- there is a version number in the binary protocol [1].
>
> Readers are supposed to check whether they support the version of the
> protocol they receive. In C++, we validate the metadata version for
> every IPC message we see [2]. I think Java also checks.
>
> - Wes
>
> [1]: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L33
> [2]:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/message.cc#L49
>
> On Wed, Dec 19, 2018 at 8:46 AM Mihir Rege <mi...@gmail.com> wrote:
> >
> > Hi,
> >
> > I was wondering if there are any back-compat guarantees given by Arrow
> and what the behaviour would be in case of a backwards incompatible change
> (I believe there haven't been any since 0.8.0). Are there checks in the
> readers to detect that the format being read is incompatible or is there a
> potential for incorrectness?
> >
> > For example, in Spark, there is only a check for the minimum version of
> pyarrow [0] and none to verify that the version of the Arrow JAR shipped
> matches the version of pyarrow being used.
> >
> > Thanks!
> > Mihir
> >
> > [0]
> https://github.com/apache/spark/blob/834b8609793525a5a486013732d8c98e1c6e6504/python/pyspark/sql/utils.py#L139
>

Re: Arrow back-compat

Posted by Wes McKinney <we...@gmail.com>.
hi Mihir -- there is a version number in the binary protocol [1].

Readers are supposed to check whether they support the version of the
protocol they receive. In C++, we validate the metadata version for
every IPC message we see [2]. I think Java also checks.

- Wes

[1]: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L33
[2]: https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/message.cc#L49
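
To make the reader-side check concrete, here is a minimal round-trip sketch in pyarrow (assuming a build that exposes the stream writer/reader under the names below). Opening and iterating the stream goes through the message reader in [2], where the MetadataVersion of each IPC message is validated, so an unsupported protocol version fails with an error rather than being read incorrectly:

    import pyarrow as pa

    # Write one record batch with the IPC stream format.
    batch = pa.RecordBatch.from_arrays([pa.array([1, 2, 3])], names=["x"])
    sink = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(sink, batch.schema)
    writer.write_batch(batch)
    writer.close()
    buf = sink.getvalue()

    # Reading it back validates the metadata version of every IPC
    # message, per [2]; an incompatible stream raises instead of
    # silently producing incorrect data.
    reader = pa.ipc.open_stream(buf)
    for received in reader:
        assert received.equals(batch)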

On Wed, Dec 19, 2018 at 8:46 AM Mihir Rege <mi...@gmail.com> wrote:
>
> Hi,
>
> I was wondering if there are any back-compat guarantees given by Arrow and what the behaviour would be in case of a backwards incompatible change (I believe there haven't been any since 0.8.0). Are there checks in the readers to detect that the format being read is incompatible or is there a potential for incorrectness?
>
> For example, in Spark, there is only a check for the minimum version of pyarrow [0] and none to verify that the version of the Arrow JAR shipped matches the version of pyarrow being used.
>
> Thanks!
> Mihir
>
> [0] https://github.com/apache/spark/blob/834b8609793525a5a486013732d8c98e1c6e6504/python/pyspark/sql/utils.py#L139