Posted to dev@parquet.apache.org by Micah Kornfield <em...@gmail.com> on 2023/07/17 04:16:00 UTC

Re: [DISCUSS] Time to release parquet format 2.10.0?

I'm sorry I've had less time to dedicate to this than I expected.  Gang, do
you have bandwidth to work on it?  I can help review.  Otherwise, I will see
if I can make time this month.

On Sat, May 13, 2023 at 10:53 AM Xinli shang <sh...@uber.com.invalid>
wrote:

> Thanks, Gang, for taking the lead on this! I agree we should have a new
> release. In addition to PARQUET-2261, there was also a discussion in Feb
> with PMCs about PARQUET-758. We may want to check with Antoine Pitrou
> <https://github.com/pitrou> whether PARQUET-758 should be included as well.
>
>
>
> On Sat, May 13, 2023 at 9:51 AM Micah Kornfield <em...@gmail.com>
> wrote:
>
> > >
> > > BTW, I'd like to see the implementation from Micah to fully
> > > understand the use case. If he is too busy to do that, I can do it
> > > based on my understanding.
> >
> >
> > I can allocate some time to try to make a PoC in C++ next month if we are
> > willing to wait until then.
> >
> > On Fri, May 12, 2023 at 5:04 AM Gang Wu <us...@gmail.com> wrote:
> >
> > > I think we can wait for a complete PoC implementation of PARQUET-2261
> > > before release. BTW, I'd like to see the implementation from Micah to
> > > fully understand the use case. If he is too busy to do that, I can do
> > > it based on my understanding.
> > >
> > > Best,
> > > Gang
> > >
> > > On Fri, May 12, 2023 at 4:34 PM Gábor Szádovszky <ga...@apache.org>
> > > wrote:
> > >
> > > > Thanks a lot for volunteering, Gang!
> > > >
> > > > Although it has indeed been more than 2 years since the last release,
> > > > I think the actual changes since then are more important. There are
> > > > lots of additions/corrections in the spec docs and the thrift file
> > > > comments which are very important but not tightly attached to a format
> > > > release. I can only see PARQUET-2257 that contains an actual change in
> > > > the thrift structure.
> > > >
> > > > Related to the ongoing effort of PARQUET-2261: I think we are waiting
> > > > for a PoC implementation. @emkornfield: Do you plan to work on this?
> > > >
> > > > The question is whether we think PARQUET-2257 is urgent enough not to
> > > > wait for PARQUET-2261 and to have an additional release after the
> > > > latter is ready, or whether we shall wait for the PoC implementation
> > > > and release the format after it.
> > > >
> > > > On 2023/05/02 03:33:05 Gang Wu wrote:
> > > > > Thanks Fokko!
> > > > >
> > > > > Let us just wait for more inputs to see if it is good to proceed.
> > > > >
> > > > > Best,
> > > > > Gang
> > > > >
> > > > > On Fri, Apr 28, 2023 at 4:05 PM Fokko Driesprong <fokko@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hey Gang,
> > > > > >
> > > > > > Great bringing this up, I think that would be a great idea!
> > > > > >
> > > > > > Kind regards,
> > > > > > Fokko
> > > > > >
> > > > > > On Thu, Apr 27, 2023 at 09:52, Gang Wu <us...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > The latest parquet format is v2.9.0 [1], which was released two
> > > > > > > years ago. Is it a good time to release the next version? If
> > > > > > > there is no objection, I can volunteer to be the release manager.
> > > > > > >
> > > > > > > [1] https://github.com/apache/parquet-format/blob/master/CHANGES.md
> > > > > > >
> > > > > > > Best,
> > > > > > > Gang
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> Xinli Shang
>

Re: [DISCUSS] Time to release parquet format 2.10.0?

Posted by Gang Wu <us...@gmail.com>.
Hi all,

Now that we have merged PARQUET-758 [1] and PARQUET-2261 [2], I
think it is a good time to move forward with the v2.10 release process. I
do notice that there is an ongoing effort with PARQUET-2249 [3]. Given its
current status, I do not think it will be closed any time soon. If there is
no objection, I volunteer to be the release manager and go ahead.

[1] https://issues.apache.org/jira/browse/PARQUET-758
[2] https://issues.apache.org/jira/browse/PARQUET-2261
[3] https://issues.apache.org/jira/browse/PARQUET-2249

Thanks,
Gang

On Mon, Jul 17, 2023 at 2:10 PM Gang Wu <us...@gmail.com> wrote:

> I probably don't have much bandwidth to work on it this month. BTW,
> a PoC implementation in parquet-mr is also required, right? I can
> start to work on this but cannot provide a precise ETA yet.
>
> Best,
> Gang

Re: [DISCUSS] Time to release parquet format 2.10.0?

Posted by Gang Wu <us...@gmail.com>.
I probably don't have much bandwidth to work on it this month. BTW,
a PoC implementation in parquet-mr is also required, right? I can
start to work on this but cannot provide a precise ETA yet.

Best,
Gang

On Mon, Jul 17, 2023 at 12:17 PM Micah Kornfield <em...@gmail.com>
wrote:

> I'm sorry I've had less time to dedicate to this than I expected.  Gang, do
> you have bandwidth to work on it?  I can help review.  Otherwise, I will see
> if I can make time this month.