You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@parquet.apache.org by "Driesprong, Fokko" <fo...@driesprong.frl> on 2020/01/08 15:13:10 UTC

Spotless

Y'all,

Recently Chen Junjie brought up the removal of trailing spaces within the
code and the headers:
https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392

I've been looking into this and looked into if we can apply something like
checkstyle to let it fail on trailing whitespace. However, it comes up with
a LOT of warnings on improper formatting, short variables, wrong import
orders, etc.
For Apache Avro we've added Spotless as a maven plugin:
https://github.com/diffplug/spotless. Unlike checkstyle, spotless will also
fix the formatting. Would this be something that others find useful?
The main problem is that we need to apply this to the codebase, and this
will break a lot of PR's, and it will mess up a bit of the version control,
because a lot of lines will be changed:
https://github.com/apache/parquet-mr/pull/730/

WDYT?

Cheers, Fokko

Re: Spotless

Posted by David Mollitor <da...@gmail.com>.
I think you want this in place before bloom filters are released.  Since
it's the newest code, it is most at risk of receiving fixes and
improvements.  You're not going to want to use spotless after the feature
is introduced and make back ports more difficult.

On Wed, Jan 22, 2020, 10:02 AM Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> I've rebased the PR: https://github.com/apache/parquet-mr/pull/730
>
> I did some searching and as far as I can tell, spotless does not allow to
> only apply it to VCS changed lines. If the forked repo also applies
> spotless, then it would be possible to do a diff.
>
> For me, I'm still interested in applying this, so we can keep our code
> clean and consistent. For example, I would like to enforce the use of
> braces, as it makes the code much more readable in my opinion.
>
> Cheers, Fokko
>
>
>
> Op do 9 jan. 2020 om 10:23 schreef Gabor Szadovszky <ga...@apache.org>:
>
> > Personally, I don't like formatting the whole code during minor version
> > development. These changes make really hard cherry-picking changes to
> > forked repos. It also makes hard to blame the code.
> > It is great to have a common code style and formatting configuration but
> I
> > would only apply them to the new lines. Let's do such changes that
> impacts
> > the whole code base at the beginning of a new major version development
> > where compatibility will break anyway.
> >
> > I am hesitating to give a -1, though. If everyone agrees on this is a
> good
> > idea, I'm fine with that. So, let me give a -0.
> >
> > Cheers,
> > Gabor
> >
> > On Wed, Jan 8, 2020 at 7:36 PM Ryan Blue <rb...@netflix.com.invalid>
> > wrote:
> >
> > > +1 for spotless checks.
> > >
> > > On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fokko@driesprong.frl
> >
> > > wrote:
> > >
> > > > Y'all,
> > > >
> > > > Recently Chen Junjie brought up the removal of trailing spaces within
> > the
> > > > code and the headers:
> > > > https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392
> > > >
> > > > I've been looking into this and looked into if we can apply something
> > > like
> > > > checkstyle to let it fail on trailing whitespace. However, it comes
> up
> > > with
> > > > a LOT of warnings on improper formatting, short variables, wrong
> import
> > > > orders, etc.
> > > > For Apache Avro we've added Spotless as a maven plugin:
> > > > https://github.com/diffplug/spotless. Unlike checkstyle, spotless
> will
> > > > also
> > > > fix the formatting. Would this be something that others find useful?
> > > > The main problem is that we need to apply this to the codebase, and
> > this
> > > > will break a lot of PR's, and it will mess up a bit of the version
> > > control,
> > > > because a lot of lines will be changed:
> > > > https://github.com/apache/parquet-mr/pull/730/
> > > >
> > > > WDYT?
> > > >
> > > > Cheers, Fokko
> > > >
> > >
> > >
> > > --
> > > Ryan Blue
> > > Software Engineer
> > > Netflix
> > >
> >
>

Re: Spotless

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
I've rebased the PR: https://github.com/apache/parquet-mr/pull/730

I did some searching and as far as I can tell, spotless does not allow to
only apply it to VCS changed lines. If the forked repo also applies
spotless, then it would be possible to do a diff.

For me, I'm still interested in applying this, so we can keep our code
clean and consistent. For example, I would like to enforce the use of
braces, as it makes the code much more readable in my opinion.

Cheers, Fokko



Op do 9 jan. 2020 om 10:23 schreef Gabor Szadovszky <ga...@apache.org>:

> Personally, I don't like formatting the whole code during minor version
> development. These changes make really hard cherry-picking changes to
> forked repos. It also makes hard to blame the code.
> It is great to have a common code style and formatting configuration but I
> would only apply them to the new lines. Let's do such changes that impacts
> the whole code base at the beginning of a new major version development
> where compatibility will break anyway.
>
> I am hesitating to give a -1, though. If everyone agrees on this is a good
> idea, I'm fine with that. So, let me give a -0.
>
> Cheers,
> Gabor
>
> On Wed, Jan 8, 2020 at 7:36 PM Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
> > +1 for spotless checks.
> >
> > On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fo...@driesprong.frl>
> > wrote:
> >
> > > Y'all,
> > >
> > > Recently Chen Junjie brought up the removal of trailing spaces within
> the
> > > code and the headers:
> > > https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392
> > >
> > > I've been looking into this and looked into if we can apply something
> > like
> > > checkstyle to let it fail on trailing whitespace. However, it comes up
> > with
> > > a LOT of warnings on improper formatting, short variables, wrong import
> > > orders, etc.
> > > For Apache Avro we've added Spotless as a maven plugin:
> > > https://github.com/diffplug/spotless. Unlike checkstyle, spotless will
> > > also
> > > fix the formatting. Would this be something that others find useful?
> > > The main problem is that we need to apply this to the codebase, and
> this
> > > will break a lot of PR's, and it will mess up a bit of the version
> > control,
> > > because a lot of lines will be changed:
> > > https://github.com/apache/parquet-mr/pull/730/
> > >
> > > WDYT?
> > >
> > > Cheers, Fokko
> > >
> >
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Netflix
> >
>

Re: Spotless

Posted by Gabor Szadovszky <ga...@apache.org>.
Personally, I don't like formatting the whole code during minor version
development. These changes make really hard cherry-picking changes to
forked repos. It also makes hard to blame the code.
It is great to have a common code style and formatting configuration but I
would only apply them to the new lines. Let's do such changes that impacts
the whole code base at the beginning of a new major version development
where compatibility will break anyway.

I am hesitating to give a -1, though. If everyone agrees on this is a good
idea, I'm fine with that. So, let me give a -0.

Cheers,
Gabor

On Wed, Jan 8, 2020 at 7:36 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> +1 for spotless checks.
>
> On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
> > Y'all,
> >
> > Recently Chen Junjie brought up the removal of trailing spaces within the
> > code and the headers:
> > https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392
> >
> > I've been looking into this and looked into if we can apply something
> like
> > checkstyle to let it fail on trailing whitespace. However, it comes up
> with
> > a LOT of warnings on improper formatting, short variables, wrong import
> > orders, etc.
> > For Apache Avro we've added Spotless as a maven plugin:
> > https://github.com/diffplug/spotless. Unlike checkstyle, spotless will
> > also
> > fix the formatting. Would this be something that others find useful?
> > The main problem is that we need to apply this to the codebase, and this
> > will break a lot of PR's, and it will mess up a bit of the version
> control,
> > because a lot of lines will be changed:
> > https://github.com/apache/parquet-mr/pull/730/
> >
> > WDYT?
> >
> > Cheers, Fokko
> >
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

Re: Spotless

Posted by Xinli shang <sh...@uber.com.INVALID>.
+1 for spotless. Maybe we can drain the PRs to the minimum and then apply
it.

On Wed, Jan 8, 2020 at 6:06 PM Junjie Chen <ch...@gmail.com> wrote:

>  I see spotless could handle import order and remove unused import as well,
> we can make use of them step by step.
>
> +1 to use spotless.
>
>
> On Thu, Jan 9, 2020 at 2:36 AM Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
> > +1 for spotless checks.
> >
> > On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fo...@driesprong.frl>
> > wrote:
> >
> > > Y'all,
> > >
> > > Recently Chen Junjie brought up the removal of trailing spaces within
> the
> > > code and the headers:
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_parquet-2Dmr_pull_727-23issuecomment-2D571562392&d=DwIBaQ&c=r2dcLCtU9q6n0vrtnDw9vg&r=FQ88AmOZ4TMjDdqNBGu-ag&m=67nJPy6EMPJ0n6vvmvA69XKi8sc_o0JDgsVKKm54yh0&s=YlANPBgPH3FsI1EpSIvCrUld47yX3wRqb8SMlfwsdVM&e=
> > >
> > > I've been looking into this and looked into if we can apply something
> > like
> > > checkstyle to let it fail on trailing whitespace. However, it comes up
> > with
> > > a LOT of warnings on improper formatting, short variables, wrong import
> > > orders, etc.
> > > For Apache Avro we've added Spotless as a maven plugin:
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_diffplug_spotless&d=DwIBaQ&c=r2dcLCtU9q6n0vrtnDw9vg&r=FQ88AmOZ4TMjDdqNBGu-ag&m=67nJPy6EMPJ0n6vvmvA69XKi8sc_o0JDgsVKKm54yh0&s=cA6hUKdkxvyCXkRiu_WoPWuhJVmJqlg0_CkJ9604IPY&e=
> . Unlike checkstyle, spotless will
> > > also
> > > fix the formatting. Would this be something that others find useful?
> > > The main problem is that we need to apply this to the codebase, and
> this
> > > will break a lot of PR's, and it will mess up a bit of the version
> > control,
> > > because a lot of lines will be changed:
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_parquet-2Dmr_pull_730_&d=DwIBaQ&c=r2dcLCtU9q6n0vrtnDw9vg&r=FQ88AmOZ4TMjDdqNBGu-ag&m=67nJPy6EMPJ0n6vvmvA69XKi8sc_o0JDgsVKKm54yh0&s=_FCuNEszM8Q4yJh3U-uzwvUywfvYSJtrxv4fXcLtcuc&e=
> > >
> > > WDYT?
> > >
> > > Cheers, Fokko
> > >
> >
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Netflix
> >
>


-- 
Xinli Shang

Re: Spotless

Posted by Junjie Chen <ch...@gmail.com>.
 I see spotless could handle import order and remove unused import as well,
we can make use of them step by step.

+1 to use spotless.


On Thu, Jan 9, 2020 at 2:36 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> +1 for spotless checks.
>
> On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
> > Y'all,
> >
> > Recently Chen Junjie brought up the removal of trailing spaces within the
> > code and the headers:
> > https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392
> >
> > I've been looking into this and looked into if we can apply something
> like
> > checkstyle to let it fail on trailing whitespace. However, it comes up
> with
> > a LOT of warnings on improper formatting, short variables, wrong import
> > orders, etc.
> > For Apache Avro we've added Spotless as a maven plugin:
> > https://github.com/diffplug/spotless. Unlike checkstyle, spotless will
> > also
> > fix the formatting. Would this be something that others find useful?
> > The main problem is that we need to apply this to the codebase, and this
> > will break a lot of PR's, and it will mess up a bit of the version
> control,
> > because a lot of lines will be changed:
> > https://github.com/apache/parquet-mr/pull/730/
> >
> > WDYT?
> >
> > Cheers, Fokko
> >
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

Re: Spotless

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
+1 for spotless checks.

On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> Y'all,
>
> Recently Chen Junjie brought up the removal of trailing spaces within the
> code and the headers:
> https://github.com/apache/parquet-mr/pull/727#issuecomment-571562392
>
> I've been looking into this and looked into if we can apply something like
> checkstyle to let it fail on trailing whitespace. However, it comes up with
> a LOT of warnings on improper formatting, short variables, wrong import
> orders, etc.
> For Apache Avro we've added Spotless as a maven plugin:
> https://github.com/diffplug/spotless. Unlike checkstyle, spotless will
> also
> fix the formatting. Would this be something that others find useful?
> The main problem is that we need to apply this to the codebase, and this
> will break a lot of PR's, and it will mess up a bit of the version control,
> because a lot of lines will be changed:
> https://github.com/apache/parquet-mr/pull/730/
>
> WDYT?
>
> Cheers, Fokko
>


-- 
Ryan Blue
Software Engineer
Netflix