Posted to dev@openwhisk.apache.org by Rodric Rabbah <ro...@gmail.com> on 2019/03/27 01:25:33 UTC

enhancing scanCode to respect gitignore files

I found some gaps in scanCode --- the tool we use for checking repository
conformance for things like headers, whitespace-related formatting, etc.
--- and in how exclusions are implemented. I posit there's a desire to allow
scanCode to process existing .gitignore files, and moreover to treat the
exclusion section of the scanCode config in the same way as gitignore
files (wildcards don't currently work and file-based matching is too loose).

This page provides a detailed description of how gitignore rules work [1],
and I found a Python library that appears to implement matching a directory
tree against .gitignore [2].
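
For readers unfamiliar with the library, here is a minimal sketch of how
gitignore-style patterns can be applied to a directory tree with pathspec.
It is not the scanCode implementation: the helper names (files_to_scan,
demo), the sample tree, and the two patterns are hypothetical, and it
assumes pathspec's "gitwildmatch" pattern style is available in the
installed version.

```python
import os
import tempfile

try:
    import pathspec  # third-party dependency: pip install pathspec
except ImportError:  # keep the sketch loadable when pathspec is absent
    pathspec = None


def files_to_scan(root, ignore_lines):
    """Return sorted relative paths under root that no ignore pattern matches.

    Hypothetical helper for illustration; requires pathspec to be installed.
    """
    spec = pathspec.PathSpec.from_lines("gitwildmatch", ignore_lines)
    kept = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # gitignore patterns use "/"
            if not spec.match_file(rel):
                kept.append(rel)
    return sorted(kept)


def demo():
    """Build a tiny throwaway tree and filter it with two sample patterns."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("src/main.py", "src/main.pyc", "build/out.txt"):
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        # "*.pyc" is a wildcard pattern; "build/" excludes a whole directory.
        return files_to_scan(root, ["*.pyc", "build/"])


if pathspec is not None:
    print(demo())
```

The same list of pattern lines could come from a .gitignore file or from an
exclusion section of a config file, which is the appeal of treating both
the same way.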

I've incorporated the enhancements into scanCode on my fork here
https://github.com/apache/incubator-openwhisk-utilities/compare/master...rabbah:gitignore?expand=1,
which closes an issue I opened some time ago [3].

The implementation would require an added dependency on the matching
library (pip install pathspec). I can look into compiling scanCode into a
self-contained binary, which would mean we should also create a release for
scanCode itself.

Thoughts?

-r

[1] https://git-scm.com/docs/gitignore
[2] https://github.com/cpburnz/python-path-specification
[3] https://github.com/apache/incubator-openwhisk-utilities/issues/39

Re: enhancing scanCode to respect gitignore files

Posted by Matt Rutkowski <mr...@apache.org>.
Thanks for this work, Rodric; this is a welcome enhancement.


Re: enhancing scanCode to respect gitignore files

Posted by Rodric Rabbah <ro...@gmail.com>.
The PR is now merged. I’ve checked some of the repos and they still pass but found some that need tweaks to their config file. I’ll address what I catch. 

-r

> On Mar 29, 2019, at 9:45 PM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
> I opened a PR https://github.com/apache/incubator-openwhisk-utilities/pull/57 to bundle the relevant parts of the "pathspec" library. This avoids pip install in all downstream clients. 
> 
> The pathspec library is Mozilla Public License 2.0 [1]. My reading of the Mozilla License, their FAQ [2] and the Apache 3rd Party License Policy [3] leads me to conclude the bundling is acceptable. 
> 
> [1] https://www.mozilla.org/en-US/MPL/2.0/
> [2] https://www.mozilla.org/en-US/MPL/2.0/FAQ
> [3] https://apache.org/legal/resolved.html
> 
> -r
> 
>> On Fri, Mar 29, 2019 at 8:55 PM Rodric Rabbah <ro...@gmail.com> wrote:
>> >  Am I understanding correctly that we'd need to go change the .travis.yaml files for pretty much every openwhisk repo to do the `pip install pathspec` as part of its install phase?
>> 
>> That's right. I could mitigate this by having scancode itself install the module (via pip) if necessary.
>> Alternatively, I could bundle the relevant parts of the library into scancode. It's Mozilla Licensed and not very big (https://github.com/cpburnz/python-path-specification).
>> 
>> -r
>> 
>> 

Re: enhancing scanCode to respect gitignore files

Posted by Rodric Rabbah <ro...@gmail.com>.
I opened a PR
https://github.com/apache/incubator-openwhisk-utilities/pull/57 to bundle
the relevant parts of the "pathspec" library. This avoids pip install in
all downstream clients.

The pathspec library is Mozilla Public License 2.0 [1]. My reading of the
Mozilla License, their FAQ [2] and the Apache 3rd Party License Policy [3]
leads me to conclude the bundling is acceptable.

[1] https://www.mozilla.org/en-US/MPL/2.0/
[2] https://www.mozilla.org/en-US/MPL/2.0/FAQ
[3] https://apache.org/legal/resolved.html

-r

On Fri, Mar 29, 2019 at 8:55 PM Rodric Rabbah <ro...@gmail.com> wrote:

> > Am I understanding correctly that we'd need to go change the
> > .travis.yaml files for pretty much every openwhisk repo to do the `pip
> > install pathspec` as part of its install phase?
>
> That's right. I could mitigate this by having scancode itself install the
> module (via pip) if necessary.
> Alternatively, I could bundle the relevant parts of the library into scancode.
> It's Mozilla Licensed and not very big (
> https://github.com/cpburnz/python-path-specification).
>
> -r
>
>
>

Re: enhancing scanCode to respect gitignore files

Posted by Rodric Rabbah <ro...@gmail.com>.
> Am I understanding correctly that we'd need to go change the
> .travis.yaml files for pretty much every openwhisk repo to do the `pip
> install pathspec` as part of its install phase?

That's right. I could mitigate this by having scancode itself install the
module (via pip) if necessary.
Alternatively, I could bundle the relevant parts of the library into scancode.
It's Mozilla Licensed and not very big (
https://github.com/cpburnz/python-path-specification).
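
The first option (install on demand) could look roughly like the sketch
below. This is only an illustration under assumptions, not code from
scanCode: the function names (pip_install, ensure_module) are hypothetical,
and it assumes the pip package name equals the importable module name, as
is the case for pathspec.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install(module_name):
    """Install a package using the pip of the running interpreter."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", module_name])


def ensure_module(module_name, installer=pip_install):
    """Install module_name only if it cannot already be imported.

    Returns True when an install was attempted, False when the module was
    already available. The installer is a parameter so it can be replaced,
    for example in tests or in offline environments.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    installer(module_name)
    importlib.invalidate_caches()  # let the import system see new packages
    return True
```

A script would call ensure_module("pathspec") before its first import of
the library. The trade-off is that every CI run may touch the network,
which bundling avoids.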

-r

Re: enhancing scanCode to respect gitignore files

Posted by David P Grove <gr...@us.ibm.com>.
Rodric Rabbah <ro...@gmail.com> wrote on 03/26/2019 09:25:33 PM:
>
> I found some gaps in scanCode --- the tool we use for checking repository
> conformance for things like headers, whitespace-related formatting, etc.
> --- and in how exclusions are implemented. I posit there's a desire to allow
> scanCode to process existing .gitignore files, and moreover to treat the
> exclusion section of the scanCode config in the same way as gitignore
> files (wildcards don't currently work and file-based matching is too
> loose).

+1

>
> The implementation would require an added dependency on the matching
> library (pip install pathspec). I can look into compiling scanCode into a
> self-contained binary, which would mean we should also create a release
> for scanCode itself.
>

No strong opinion here.  Am I understanding correctly that we'd need to go
change the .travis.yaml files for pretty much every openwhisk repo to do
the `pip install pathspec` as part of its install phase?

--dave