Posted to user@arrow.apache.org by Cindy McMullen <cm...@twitter.com> on 2020/05/20 18:54:18 UTC

Avro and Thrift converters

I see that the Avro converter is planned for Arrow 1.0.0.  Any ideas about
when that release might be?

Any plans for a Thrift -> Avro converter?

Re: Avro and Thrift converters

Posted by Micah Kornfield <em...@gmail.com>.
Hi Cindy,

> Are you saying that the Avro -> Arrow converter is already available in
> release 0.17.1?


Yes, in Java [1]; it lives in a separate POM [2].  Note
that this is still in an experimental/contrib state (i.e. I'm not sure if
anyone is using it in production) and it might get some refactoring, but it
should be a good place to start experimenting, and feedback on it would be
welcome.

> As for use cases: we're trying to move away from Thrift in parts of our ML
> stack.  We need to support wide, row-based data with schema support, so
> probably need to convert Thrift to Avro.  However, we'd love to use Arrow
> *between* components (Spark, TensorFlow, scikit-learn), but it's likely
> our data will originate in Avro and/or Thrift.

Thanks.  Like I said, I hope to work a little bit on the C++/Python side
of Avro to Arrow, but I can't give an exact time frame for it.  I think
Thrift is more complicated, since it seems like there are multiple protocols
that would likely need support.  But contributions are welcome :)

Hope this helps.

Micah

[1] https://arrow.apache.org/docs/java/org/apache/arrow/AvroToArrow.html
[2] https://mvnrepository.com/artifact/org.apache.arrow/arrow-avro

Re: Avro and Thrift converters

Posted by Cindy McMullen <cm...@twitter.com>.
Hi, Micah -

I wasn't aware that the Avro converter already existed in Java, since I
couldn't find any Arrow docs on it. I was going by the Arrow/JIRA release
tag.  Are you saying that the Avro -> Arrow converter is already available
in release 0.17.1?

As for use cases: we're trying to move away from Thrift in parts of our ML
stack.  We need to support wide, row-based data with schema support, so
probably need to convert Thrift to Avro.  However, we'd love to use Arrow
*between* components (Spark, TensorFlow, scikit-learn), but it's likely our
data will originate in Avro and/or Thrift.

Thanks -

-- Cindy

Re: Avro and Thrift converters

Posted by Micah Kornfield <em...@gmail.com>.
The Avro to Arrow converter in C++/Python will not be done anytime soon
unless someone else takes it up (one exists in Java).  It has been on my
low-priority backlog for a while, but I haven't had time to get to it.  We
should remove the specific release tag from it.

As far as I know, there are no plans for Thrift or other formats at this
point.

May I ask what your use case is?

Thanks,
Micah
