Posted to dev@commons.apache.org by sebb <se...@gmail.com> on 2013/05/15 03:19:13 UTC

[ALL] RAT 0.9 slowness

I've just done a test with IO, and the speed problem seems to be
related to the SVN files under site-content.
These seem to cause 0.9 to hang - a thread dump shows the code is
mainly at the line:

at org.apache.rat.analysis.license.FullTextMatchingLicense.match(FullTextMatchingLicense.java:79)

This is calling Pattern.matcher() so I assume there must be some kind
of content that causes the regex engine to churn.
Perhaps the pattern is causing excess backtracking.

I've no idea why 0.9 has a problem with these particular files, but
they should not be included in the RAT check anyway.

If you want to test RAT on its own:

mvn apache-rat:rat [-Dcommons.rat.version=0.9]

Obviously there is a bug in RAT 0.9, but maybe we can avoid it by
excluding site-content.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [ALL] RAT 0.9 slowness

Posted by Gary Gregory <ga...@gmail.com>.
On Wed, May 15, 2013 at 8:47 PM, sebb <se...@gmail.com> wrote:

> On 15 May 2013 17:09, sebb <se...@gmail.com> wrote:
> > On 15 May 2013 13:36, Gary Gregory <ga...@gmail.com> wrote:
> >> Can we do this in the parent POM?
> >
> > Try it and see; update your local CP snapshot and install it locally.
> >
> > Should be easy to test whether includes/excludes can be overridden at
> > component level - the apache-rat:rat goal lists all the files it
> > matches.
>
> Turns out it's easy to define the default set of excludes in the
> parent POM and configure it so that any child POM excludes are
> appended to the config.
>

Cool, maybe we can push 30 after 29 is done.

Gary




-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory

Re: [ALL] RAT 0.9 slowness

Posted by sebb <se...@gmail.com>.
On 15 May 2013 17:09, sebb <se...@gmail.com> wrote:
> On 15 May 2013 13:36, Gary Gregory <ga...@gmail.com> wrote:
>> Can we do this in the parent POM?
>
> Try it and see; update your local CP snapshot and install it locally.
>
> Should be easy to test whether includes/excludes can be overridden at
> component level - the apache-rat:rat goal lists all the files it
> matches.

Turns out it's easy to define the default set of excludes in the
parent POM and configure it so that any child POM excludes are
appended to the config.
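
A minimal sketch of how this can look, assuming Maven's
combine.children="append" merge attribute is the mechanism used. The
thread does not show the configuration that was actually committed, and
the component-level pattern below is hypothetical. Maven documents the
attribute on the child's configuration elements; placing it in the
parent relies on attributes being inherited during the merge, so it is
worth checking the result with `mvn help:effective-pom`.

```xml
<!-- Parent POM fragment: default excludes shared by all components. -->
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <!-- combine.children="append" asks Maven to append the child's
         entries to this list rather than replace it. -->
    <excludes combine.children="append">
      <exclude>site-content/**</exclude>
    </excludes>
  </configuration>
</plugin>

<!-- Component (child) POM fragment: only the extra excludes.
     The pattern here is a made-up example. -->
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>src/test/resources/example-data/**</exclude>
    </excludes>
  </configuration>
</plugin>
```

With this arrangement the effective configuration for the component
should list both site-content/** and the component's own pattern.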



Re: [ALL] RAT 0.9 slowness

Posted by sebb <se...@gmail.com>.
On 15 May 2013 13:36, Gary Gregory <ga...@gmail.com> wrote:
> Can we do this in the parent POM?

Try it and see; update your local CP snapshot and install it locally.

Should be easy to test whether includes/excludes can be overridden at
component level - the apache-rat:rat goal lists all the files it
matches.



Re: [ALL] RAT 0.9 slowness

Posted by Gary Gregory <ga...@gmail.com>.
Can we do this in the parent POM?

Gary

On May 15, 2013, at 4:03, Thomas Neidhart <th...@gmail.com> wrote:

> in collections I already filter it like this:
>
> <reporting>
>  <plugins>
>      <plugin>
>        <groupId>org.apache.rat</groupId>
>        <artifactId>apache-rat-plugin</artifactId>
>        <configuration>
>          <excludes>
>            <exclude>site-content/**/*</exclude>
>          </excludes>
>        </configuration>
>      </plugin>
>  ...
> </reporting>


Re: [ALL] RAT 0.9 slowness

Posted by Thomas Neidhart <th...@gmail.com>.
In Collections I already filter it like this:

<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.rat</groupId>
      <artifactId>apache-rat-plugin</artifactId>
      <configuration>
        <excludes>
          <exclude>site-content/**/*</exclude>
        </excludes>
      </configuration>
    </plugin>
    ...
  </plugins>
</reporting>
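
The snippet above sits under <reporting>, so it governs the report
produced during site generation. As far as I know, a direct
`mvn apache-rat:rat` run from the command line takes its plugin
configuration from the <build> section instead. A sketch of the same
exclude placed there, assuming the identical pattern is wanted for
command-line runs:

```xml
<!-- Sketch: the same exclude under <build>, where a direct
     command-line run of the plugin is assumed to read its configuration. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.rat</groupId>
      <artifactId>apache-rat-plugin</artifactId>
      <configuration>
        <excludes>
          <exclude>site-content/**/*</exclude>
        </excludes>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Since the apache-rat:rat goal lists the files it matches, running it
before and after the change is a quick way to confirm that the
site-content files are no longer scanned.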




Re: [ALL] RAT 0.9 slowness

Posted by Gary Gregory <ga...@gmail.com>.
On Tue, May 14, 2013 at 10:21 PM, sebb <se...@gmail.com> wrote:

> On 15 May 2013 02:58, Gary Gregory <ga...@gmail.com> wrote:
> > For me 'mvn clean site' took 56 minutes for IO.
>
> So?
>

It's just a data point.

Gary



Re: [ALL] RAT 0.9 slowness

Posted by sebb <se...@gmail.com>.
On 15 May 2013 02:58, Gary Gregory <ga...@gmail.com> wrote:
> For me 'mvn clean site' took 56 minutes for IO.

So?

What if you exclude site-content/** ?



Re: [ALL] RAT 0.9 slowness

Posted by Gary Gregory <ga...@gmail.com>.
For me 'mvn clean site' took 56 minutes for IO.

Gary

