Posted to dev@arrow.apache.org by Jarek Potiuk <po...@apache.org> on 2021/07/13 07:07:49 UTC

[DISCUSS] What is the Plasma status currently?

Hello Arrow Community,

We've had a very interesting talk at the Apache Airflow Summit about Airflow + Ray (which is really cool BTW, and I am looking forward to the capabilities it will give to Airflow), and some discussions followed. From what I understand (maybe I am wrong?), Plasma was initially developed in Ray, then contributed to Arrow, and then forked (?) by Ray (https://lists.apache.org/thread.html/r65b2852e4cddb1af8bff06d789bf3822d67777c5dfcd481414acd3d7%40%3Cdev.arrow.apache.org%3E). It now seems to be kind-of abandoned and no longer really maintained in Arrow, and the Ray version and the Arrow version are likely not compatible/exchangeable.

Is this a correct understanding? Any more comments, or maybe an explanation of the relation between Arrow's Plasma and Ray's Plasma?

Just to explain my interest: I am a PMC member of Apache Airflow and an independent open-source contributor and advisor. I am genuinely interested in open-source business models, the rationale of stakeholders, and how this plays out with individuals and the ASF/PMC, so I wanted to understand the current state of Plasma :)

J.
 

Re: [DISCUSS] What is the Plasma status currently?

Posted by Jarek Potiuk <ja...@potiuk.com>.
I see. To add more context - as an outsider comment - I've heard "Ray is so
good for our case because they use this awesome Plasma storage" :). This
might be an overly-narrow view of course and once we start seeing more Ray
usage with Airflow, I might revise this, but this is the impression I have
so far at least.

I do not see it as a conflict to be honest, more as a potential source of
confusion.

J.


On Tue, Jul 13, 2021 at 3:04 PM Wes McKinney <we...@gmail.com> wrote:

> hi Jarek — since Plasma isn't really promoted as a standalone
> component in Ray (rather, it's an implementation detail of how Ray
> works — their documentation is FWIW out of date, claiming that Plasma
> is still being developed in Arrow), I'm not sure there is a particular
> conflict right now. Ray should update their documentation to eliminate
> references to Apache Arrow.
>
> Thanks,
> Wes
>


-- 
+48 660 796 129

Re: [DISCUSS] What is the Plasma status currently?

Posted by Wes McKinney <we...@gmail.com>.
hi Jarek — since Plasma isn't really promoted as a standalone
component in Ray (rather, it's an implementation detail of how Ray
works — their documentation is FWIW out of date, claiming that Plasma
is still being developed in Arrow), I'm not sure there is a particular
conflict right now. Ray should update their documentation to eliminate
references to Apache Arrow.

Thanks,
Wes

On Tue, Jul 13, 2021 at 7:39 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> Thanks. Interesting story then with the back-forth moves.
>
> I wonder if there might be some confusion between then Ray "Plasma" and
> Arrow "Plasma" (as they seem to be different now). I guess neither Ray nor
> Arrow has the name "trademark" on it in any way (nice name BTW). Maybe (and
> I am sure Ray founders are listening here ;) ) - there should be some
> effort from Ray to trademark it and get some nice "Rename
> agreement/clarification" with Arrow since Arrow does not seem to care :) ?
>
> The reason I am suggesting it, is when I first heard of it, and searched,
> and asked my friend - he mentioned to me that it WILL be forked in the
> future but it's the same now. Which I understand already happened and it's
> not the same already.
>
> I only found out the story by knowing the Apache Way and digging > 8 month
> back in the devlist. So - just a suggestion - maybe worth clarifying it as
> Ray becomes more and more popular - both Arrow community and Ray might
> suffer due to people understanding the relation and state differently and
> jumping to assumptions (as my friend did).
>
> J.
>

Re: [DISCUSS] What is the Plasma status currently?

Posted by Jarek Potiuk <ja...@potiuk.com>.
Thanks. Interesting story then, with the back-and-forth moves.

I wonder if there might be some confusion between the Ray "Plasma" and
the Arrow "Plasma" (as they seem to be different now). I guess neither Ray nor
Arrow has a trademark on the name in any way (nice name BTW). Maybe (and
I am sure Ray founders are listening here ;) ) there should be some
effort from Ray to trademark it and get some nice "rename
agreement/clarification" with Arrow, since Arrow does not seem to care :) ?

The reason I am suggesting this is that when I first heard of it, searched,
and asked my friend, he told me that it WILL be forked in the
future but is the same for now. As I understand it, the fork has already
happened and the two are no longer the same.

I only found out the story by knowing the Apache Way and digging more than
8 months back in the devlist. So, just a suggestion: it may be worth
clarifying this as Ray becomes more and more popular. Both the Arrow
community and Ray might suffer from people understanding the relation and
state differently and jumping to assumptions (as my friend did).

J.

On Tue, Jul 13, 2021 at 2:18 PM Neal Richardson <ne...@gmail.com>
wrote:

> Hi Jarek,
> Your understanding sounds about right to me. That said, we are still
> building and shipping Plasma for those that have come to depend on it and
> will continue to do so unless/until it becomes a maintenance burden. But no
> one active in the Arrow community is working on Plasma.
>
> Neal
>


-- 
+48 660 796 129

Re: [DISCUSS] What is the Plasma status currently?

Posted by Jarek Potiuk <ja...@potiuk.com>.
Happy to help with the CI part of it if needed if you go that route. Been
doing a LOT of that for Airflow.

On Wed, Jul 14, 2021 at 1:16 PM Alessandro Molina <
alessandro@ursacomputing.com> wrote:

> I was wondering, for the benefit of lowering the entry barrier for users
> and especially future contributions who might find themselves confused by
> the amount of optional pieces that you can pick when building arrow, would
> it be reasonable to think of shipping plasma as a separate library? Like
> arrow-plasma with its own packaging/release cycle? That would also have the
> benefit of giving us a better understanding of how many people are actually
> depending on it based on how many people depend on that package.
>
> It's true that there would be an initial burden in separating the codebase
> and building its own CI/release scripts, but I think it would ease life for
> people willing to contribute on arrow "ignoring" plasma and it would give
> plasma the chance to maybe get maintenance outside of the arrow developers
> from people who might not care about contributing to arrow itself at the
> moment.
>


-- 
+48 660 796 129

Re: [DISCUSS] What is the Plasma status currently?

Posted by Alessandro Molina <al...@ursacomputing.com>.
I was wondering: to lower the entry barrier for users, and especially for
future contributors who might find themselves confused by the number of
optional pieces that you can pick when building Arrow, would it be
reasonable to think of shipping Plasma as a separate library? Like
arrow-plasma with its own packaging/release cycle? That would also have the
benefit of giving us a better understanding of how many people actually
depend on it, based on how many people depend on that package.

It's true that there would be an initial burden in separating the codebase
and building its own CI/release scripts, but I think it would ease life for
people willing to contribute to Arrow while "ignoring" Plasma, and it would
give Plasma the chance to maybe get maintenance from people outside of the
Arrow developers who might not care about contributing to Arrow itself at
the moment.

On Tue, Jul 13, 2021 at 2:18 PM Neal Richardson <ne...@gmail.com>
wrote:

> Hi Jarek,
> Your understanding sounds about right to me. That said, we are still
> building and shipping Plasma for those that have come to depend on it and
> will continue to do so unless/until it becomes a maintenance burden. But no
> one active in the Arrow community is working on Plasma.
>
> Neal
>

Re: [DISCUSS] What is the Plasma status currently?

Posted by Neal Richardson <ne...@gmail.com>.
Hi Jarek,
Your understanding sounds about right to me. That said, we are still
building and shipping Plasma for those that have come to depend on it and
will continue to do so unless/until it becomes a maintenance burden. But no
one active in the Arrow community is working on Plasma.

Neal

On Tue, Jul 13, 2021 at 3:07 AM Jarek Potiuk <po...@apache.org> wrote:

> Hello Arrow Community,
>
> We've had a very interesting talk at the Apache Airflow Summit about
> Airflow + Ray (which is really cool BTW and I am looking forward to
> capabilities it will give to Airflow) and we had some discussions that
> followed. From what I understand (maybe I am wrong?) the Plasma which was
> initially developed in Ray, then contributed to Arrow, and then (
> https://lists.apache.org/thread.html/r65b2852e4cddb1af8bff06d789bf3822d67777c5dfcd481414acd3d7%40%3Cdev.arrow.apache.org%3E)
> forked (?) by Ray and is kind-of abandoned in Arrow and not really
> maintained in Arrow any more (and likely Ray version and Arrow version are
> not compatible /exchangeable).
>
> Is this correct understanding ? Any more comments or maybe explanation
> what is the relation between Arrow's Plasma and Ray's Plasma?
>
> Just to explain my interest -  I am a PMC of Apache Airflow, I am an
> independent Open-Source contributor and advisor, and I am genuinely
> interested in Open-source business models and rationale of stakeholders and
> how this plays out with individuals and the ASF/PMC and I wanted to
> understand the current state of Plasma :)
>
> J.
>
>