Posted to dev@parquet.apache.org by Julien Le Dem <ju...@dremio.com> on 2017/02/01 23:54:31 UTC

Re: Parquet CLI next steps

I'm happy with having it in the parquet-mr repo.
I reviewed the patch and I am happy to merge it after the comment on tests
is addressed.

I don't think it is necessarily faster to have it in a separate repo. But
I'll let Ryan decide on this.

On Fri, Jan 27, 2017 at 2:25 PM, Antwnis <an...@gmail.com> wrote:

> Hi Ryan,
>
> +1 on a separate Parquet repository to give this project speed - and also
> de-couple the release of the *cli* from the release of Parquet
>
> Eventually - once it matures it could come back into Parquet - but IMHO
> quickly building up the capabilities should be the priority
>
> On Thu, Jan 26, 2017 at 8:29 PM, Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
> > Hi everyone,
> >
> > Last year, I contributed a new Parquet command-line tool in PR #384. The
> > new CLI is easier to interpret than the current parquet-tools and
> > supports conversion from JSON, CSV, and Avro. I'd really like to get a
> > version of it released, but it is still waiting because it's a really
> > big patch.
> >
> > The initial plan was to add it as a module of parquet-mr, but with the
> > review stalled, I'm wondering if it may be easier to get it in by making
> > it a separate Parquet repository with separate releases. Then we could
> > get it released more quickly.
> >
> > What does everyone think about a separate repo and release? Any other
> > ideas to get this tool out to Parquet users?
> >
> > rb
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Netflix
> >
>



-- 
Julien

Re: Parquet CLI next steps

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
Julien made a good point in the review about moving libraries from the CLI
into parquet-mr. Things like the CSV or JSON conversion code should be
exposed in libraries, so I agree that for the moment it makes sense to put
it in parquet-mr so we can do those moves easily. We can always separate it
later if it isn't changing much or needs its own release schedule.

rb

On Wed, Feb 1, 2017 at 3:54 PM, Julien Le Dem <ju...@dremio.com> wrote:

> I'm happy with having it in the parquet-mr repo.
> I reviewed the patch and I am happy to merge it after the comment on tests
> is addressed.
>
> I don't think it is necessarily faster to have it in a separate repo. But
> I'll let Ryan decide on this.
>
> On Fri, Jan 27, 2017 at 2:25 PM, Antwnis <an...@gmail.com> wrote:
>
> > Hi Ryan,
> >
> > +1 on a separate Parquet repository to give this project speed - and also
> > de-couple the release of the *cli* from the release of Parquet
> >
> > Eventually - once it matures it could come back into Parquet - but IMHO
> > quickly building up the capabilities should be the priority
> >
> > On Thu, Jan 26, 2017 at 8:29 PM, Ryan Blue <rb...@netflix.com.invalid>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Last year, I contributed a new Parquet command-line tool in PR #384.
> > > The new CLI is easier to interpret than the current parquet-tools and
> > > supports conversion from JSON, CSV, and Avro. I'd really like to get a
> > > version of it released, but it is still waiting because it's a really
> > > big patch.
> > >
> > > The initial plan was to add it as a module of parquet-mr, but with the
> > > review stalled, I'm wondering if it may be easier to get it in by
> > > making it a separate Parquet repository with separate releases. Then
> > > we could get it released more quickly.
> > >
> > > What does everyone think about a separate repo and release? Any other
> > > ideas to get this tool out to Parquet users?
> > >
> > > rb
> > >
> > > --
> > > Ryan Blue
> > > Software Engineer
> > > Netflix
> > >
> >
>
>
>
> --
> Julien
>



-- 
Ryan Blue
Software Engineer
Netflix