Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2021/08/04 01:02:52 UTC

Re: [DISCUSS] Datasets API plugins?

I think if someone wants to build a plugin model for datasets / file
formats (and refactor the existing "built-in" formats to use those
plugin APIs), that sounds like a fine idea to me. I don't think the
idea was for the API to be restricted to the formats that are
implemented inside the Arrow codebase.

On Thu, Jul 29, 2021 at 4:09 PM Weston Pace <we...@gmail.com> wrote:
>
> In reviewing the RADOS PR I ran into another question.  I recently
> sent an email on the topic where the author wants their integration to
> be part of the Arrow repo (I believe this is the case for the RADOS
> PR).  However, what about the case where the author doesn't want to be
> part of the GitHub repo?  (To be clear, this email is not relevant
> to the RADOS PR.)
>
> Right now, in order to add a new file format to the dataset API the
> author has to add code to the Arrow codebase to create a new
> FileFormat or Fragment.  Do we want to give the datasets API a
> "plugin" architecture so that new formats can be added dynamically
> in the future?
>
> Of course, now that I'm writing the email, I suppose the answer is
> clear.  If someone cares enough about having an external extension
> they can always do the work to add such a plugin system.  Does this
> sound right, or is there some reason against this, or a different
> approach we'd want to take in the future?
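
To make the "plugin" idea discussed above concrete, here is a minimal sketch of a format registry. It is only an illustration under stated assumptions: every name in it (register_format, read_fragment, the "csv-like" format) is hypothetical and is not part of the Arrow datasets API. The point is that the core only looks readers up by name, so an external package could add a format without changing the core code.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical names throughout; none of this is the Arrow datasets API.
Row = Dict[str, str]
Reader = Callable[[str], Iterable[Row]]

# The "core" keeps a name -> reader mapping and knows nothing about formats.
_FORMATS: Dict[str, Reader] = {}


def register_format(name: str, reader: Reader) -> None:
    """Register a reader under a format name; refuse silent overrides."""
    if name in _FORMATS:
        raise ValueError(f"format already registered: {name}")
    _FORMATS[name] = reader


def read_fragment(name: str, payload: str) -> List[Row]:
    """Look up the reader for `name` and materialize its rows."""
    try:
        reader = _FORMATS[name]
    except KeyError:
        raise KeyError(f"no plugin registered for format: {name}") from None
    return list(reader(payload))


# An "external" plugin: this code could live outside the core package.
def read_csv_like(payload: str) -> Iterable[Row]:
    lines = payload.strip().splitlines()
    header = lines[0].split(",")
    for line in lines[1:]:
        yield dict(zip(header, line.split(",")))


register_format("csv-like", read_csv_like)
rows = read_fragment("csv-like", "a,b\n1,2\n3,4")
```

In this sketch the built-in formats would go through the same register_format call as external ones, which mirrors the suggestion of refactoring the existing formats onto the plugin API.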