Posted to commits@nuttx.apache.org by GitBox <gi...@apache.org> on 2020/10/09 03:19:06 UTC

[GitHub] [incubator-nuttx] v01d opened a new issue #1954: IP clearance checks via CI

v01d opened a new issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954


   The idea would be to run the CI clearance scripts (#1834) as part of the "check" step of CI in a similar vein to how nxstyle is used:
   
   - analyze all files touched by the PR
   - if the file does not have an Apache header yet, use the script to verify if it can be converted to Apache safely
   - if so, fail the check
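
A minimal sketch of what such a check step could look like, in Python. It assumes the touched files are already available as a path-to-text mapping. `can_be_converted` is a hypothetical callback standing in for the clearance script from #1834, whose real interface is not shown here. The header test only looks for the standard ASF header phrase near the top of the file.

```python
import re
from typing import Callable, Dict, List

# Phrase from the standard ASF source header.
APACHE_HEADER_RE = re.compile(
    r"Licensed to the Apache Software Foundation \(ASF\)"
)


def has_apache_header(text: str, max_lines: int = 40) -> bool:
    """Return True if the ASF header phrase appears in the first lines."""
    head = "\n".join(text.splitlines()[:max_lines])
    return APACHE_HEADER_RE.search(head) is not None


def files_failing_check(
    touched: Dict[str, str],
    can_be_converted: Callable[[str], bool],
) -> List[str]:
    """List touched files that lack the header but could be converted.

    `can_be_converted` is a hypothetical stand-in for the clearance
    script; it receives a path and says whether conversion is safe.
    """
    failing = []
    for path, text in sorted(touched.items()):
        if has_apache_header(text):
            continue  # already Apache licensed, nothing to enforce here
        if can_be_converted(path):
            failing.append(path)  # convertible but not converted: fail
    return failing
```

A CI job could exit with a non-zero status whenever the returned list is not empty and print the paths, much like nxstyle reports style violations.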
   
   The PR author would then have to convert the files to Apache by altering the header appropriately in a separate commit, and include in that commit all relevant tracing data (there could be a script for this as well, to avoid any copy-paste). 
   Following this approach, all cleared files would end up having an easy to find commit in their history which can help clearance progress. 
   
   Apache licensed files which do not have this commit in their history could also be processed in order to validate that they were correctly converted (i.e., that no author without an ICLA has ever touched the file).
   
   Another thing to consider is to check whether the authors of the changes included in the PR have signed an ICLA (the same scripts can be used for that purpose). I think that this is only important for non-cleared or non-Apache-licensed files: once the files are Apache licensed, I understand that a non-committer does not need to sign an ICLA, since they would already be accepting the license terms when contributing.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-nuttx] PeterBee97 commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
PeterBee97 commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-719159412


   Hi @v01d, I ran your scripts on all the non-Apache licensed files ([non-apache-list.txt](https://github.com/apache/incubator-nuttx/files/5462617/non-apache-list.txt)) and put the output of the "all green authors" items in [stage1.txt](https://github.com/apache/incubator-nuttx/files/5462616/stage1.txt), assuming this means "can be apachized". The log for the remaining files is in stage2.txt, which is 78MB and omitted here. 
   
   Then I parsed stage2.txt, sorted and deduplicated the author names and emails by script and by hand, and got [authors.txt](https://github.com/apache/incubator-nuttx/files/5462923/authors.txt). A line starting with '-' indicates someone who is not directly an author of any file, but who has been given credit in file headers. The list of companies with announced copyrights is attached at the end of the file. 
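
The script part of that sort-and-deduplicate step could look roughly like the sketch below. It assumes the author entries have already been extracted as `Name <email>` strings. That is an assumption, since the real stage2.txt layout is not shown here, and `dedupe_authors` is a hypothetical helper name. Entries are grouped by lower-cased email, and anything that does not parse is returned separately for the by-hand pass.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Assumed entry format: "Name <email>".
AUTHOR_RE = re.compile(r"^\s*(?P<name>[^<]*?)\s*<(?P<email>[^>]+)>\s*$")


def dedupe_authors(
    lines: Iterable[str],
) -> Tuple[List[Tuple[str, List[str]]], List[str]]:
    """Group entries by email and collect unparsed lines for manual review."""
    by_email: Dict[str, Set[str]] = defaultdict(set)
    unparsed: List[str] = []
    for line in lines:
        match = AUTHOR_RE.match(line)
        if match is None:
            unparsed.append(line)  # handled by hand later
            continue
        email = match.group("email").strip().lower()
        by_email[email].add(match.group("name").strip())
    # Sort by email, and sort the name variants seen for each email.
    authors = [(email, sorted(names)) for email, names in sorted(by_email.items())]
    return authors, unparsed
```

Grouping by email merges different spellings of the same name. The same person using several emails still needs a remapping table or a manual pass.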
   
   Running the process over apache licensed files also produced two big files, but the author names here are mostly covered in the list above. Let's
   > 6. try to contact authors/companies to request ICLAs
   
   over this list ([authors.txt](https://github.com/apache/incubator-nuttx/files/5462923/authors.txt)) first.
   





[GitHub] [incubator-nuttx] v01d commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
v01d commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-707708672


   hi @yy-gu , great that you and Peter can take on this task. Regarding the workflow, I have a few comments:
   
   * the checks you mention are already in place in the script I wrote and are mostly: check the git author, check the header author and check any attributions in the git commit message. The checks are done by name and alternatively by email, using a set of remappings (to account for alternative emails of the same author).
   * the "without ambiguity" part will be difficult, since the script may fail to identify some authors (mostly in the attributions, since this is really a heuristic). Header authors should all be detected, but this is regex based, and if I didn't consider a particular case it may also miss an author
   * I wonder if assuming Apache licensed files are fine is really safe: if we didn't go through this process before, it is possible someone may have missed an attribution in a git message, for example. So, maybe (in a later step) we should distinguish between "apache licensed" and "cleared". We could first clear non-apache files to get most of the work done and then clear (validate) the remaining Apache files
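
To illustrate the name-then-email lookup with remappings described in the first bullet, here is a minimal Python sketch. It is not the actual script: the function names, the shape of the remapping table and the ICLA sets are all hypothetical. It assumes the identities were already collected from the git author, the file header and the commit-message attributions.

```python
from typing import Dict, Iterable, List, Set, Tuple

Identity = Tuple[str, str]  # (name, email)


def canonical_email(email: str, remap: Dict[str, str]) -> str:
    """Map an alternative email to the author's canonical one."""
    email = email.strip().lower()
    return remap.get(email, email)


def is_cleared(
    identity: Identity,
    icla_names: Set[str],
    icla_emails: Set[str],
    remap: Dict[str, str],
) -> bool:
    """Check by name first, then fall back to the (remapped) email."""
    name, email = identity
    if name.strip().lower() in icla_names:
        return True
    return canonical_email(email, remap) in icla_emails


def uncleared(
    identities: Iterable[Identity],
    icla_names: Set[str],
    icla_emails: Set[str],
    remap: Dict[str, str],
) -> List[Identity]:
    """Return the identities that match no known ICLA signer."""
    return [
        identity
        for identity in identities
        if not is_cleared(identity, icla_names, icla_emails, remap)
    ]
```

Under this sketch, a file would only be a conversion candidate when `uncleared` returns an empty list. Even then, an attribution the heuristic missed would not show up, which is why the manual verification step still matters.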
   
   Finally, note that besides authors there may be companies involved, from which you will need SGAs in addition to the authors' ICLAs. The script also tries to identify this, but there may be edge cases.
   
   In conclusion, I think that the approach could be:
   1. Stage 1
     1. look for non-apache licensed files where the script gives the "can be apachized" result
     2. manually verify these by going through the script output (to verify that no authors or companies were missed)
   2. Stage 2
     1. look for non-apache licensed files where the script does not give the "can be apachized" result
     2. try to contact authors/companies to request ICLAs
   3. Stage 3 (optional?)
     1. look for apache licensed files (not resulting from stage 1 and 2) which have not been cleared
     2. run script and validate the "can be apachized"
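
The three stages above can be written down as a small decision function. This is only a sketch of the proposal: the enum, the function name and the boolean inputs are hypothetical, and the `can_be_apachized` flag is assumed to come from the clearance script.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    STAGE_1 = "verify script output manually, then convert"
    STAGE_2 = "contact authors/companies to request ICLAs"
    STAGE_3 = "run script and validate the existing Apache header"


def assign_stage(
    is_apache_licensed: bool,
    can_be_apachized: bool,
    already_cleared: bool = False,
) -> Optional[Stage]:
    """Return the stage a file belongs to, or None if it is already cleared."""
    if already_cleared:
        return None
    if not is_apache_licensed:
        # Stage 1 if the script gives the "can be apachized" result,
        # Stage 2 otherwise.
        return Stage.STAGE_1 if can_be_apachized else Stage.STAGE_2
    # Apache licensed but never cleared: the optional validation stage.
    return Stage.STAGE_3
```

Files coming out of stage 1 or 2 would be marked as cleared once converted, so they never reach stage 3.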
   
   To perform the conversion itself, I added a script to change the header, which you could use. But the following issue is not yet addressed: the script does not work for files which were part of submodules in the past. This means boards, arch and maybe others which I don't remember. The problem is that the script tries to access the file content at a commit from the submodule era, and this is not possible in the current repo. Greg made the submodules available, which could be used to retrieve contents from this part of their history.
   
   Finally, I would always do these header changes in separate commits and use the output of the script as part of the commit message. This way, there's traceability for these changes.
   
     
   
   





[GitHub] [incubator-nuttx] yy-gu commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
yy-gu commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-719378382


   Great work, @PeterBee97 
   
   So for the next step, we can write a script to send emails to the authors in the list. Can someone advise on the content of the email and the ICLA license file? 
   
   And for companies, what would be the advised approach to contact them?
   @v01d @Apache9 @gregory-nutt 
   
   
   





[GitHub] [incubator-nuttx] yy-gu commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
yy-gu commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-708092071


   @v01d Agree with all your comments.  We will give it a first try and report back our progress.





[GitHub] [incubator-nuttx] Apache9 commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
Apache9 commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-706919511


   The approach is great @yy-gu . Let's move forward and get some progress first.
   
   In general, the only risk is that in the end we have too many files with ambiguity and need to add a bunch of files to the notice doc. Anyway, we can only know this after we have done the above work on all files.
   
   Let's do it!





[GitHub] [incubator-nuttx] Apache9 commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
Apache9 commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-719394605


   For the companies, we should contact them to sign an SGA; then we could change the license header. If this is not possible, we should add a statement about this in the NOTICE file.





[GitHub] [incubator-nuttx] yy-gu commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
yy-gu commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-706908163


   @v01d Hello!  I work with @xiaoxiang781216 and just recently joined the NuttX project.  PeterBee and I will help with clearing the license issue and pushing NuttX to graduation from the incubator.  Great to see that you, @adamfeuer and @patacongo have already taken some big steps. 
   
   From all the previous conversation, I summarized the workflow in the attached graph.  
   
   ![image](https://user-images.githubusercontent.com/72725017/95712289-1c572400-0c97-11eb-8c8c-e861ec9d22aa.png)
   
   Let me know what you guys think. My general intuition is to do as much of the work as possible with scripts and automation.
   
   If everybody agrees with the plan, @v01d, PeterBee can work with you on creating the necessary tools for the automated work. Then we can work with everyone else on the project and divide up the manual work. 
   





[GitHub] [incubator-nuttx] yy-gu commented on issue #1954: IP clearance checks via CI

Posted by GitBox <gi...@apache.org>.
yy-gu commented on issue #1954:
URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-720297925


   > > Great work, @PeterBee97
   > > So for next step, we can write a script to send emails to the authors in the list. Can someone advise on the content of the email and ICLA license file?
   > > And for companies, what would be the advised approach to contact them?
   > > @v01d @Apache9 @gregory-nutt
   > 
   > Per @justinmclean 's recent comment on the mailing list I think you can just ask. Should this be done by a PPMC? Or can it come from any committer?
   > 
   > Getting the right contact for companies will probably be hard, but note that it is likely that if a header lists a company some of the above authors may be the employees directly. So I would include in the ICLA requesting email that the author identifies if this contribution was under employment of any of the companies holding copyright to their files (maybe just send the whole list to each one). If so, we can ask this person for the procedure to get an SGA.
   
   @v01d Wonderful suggestions! We will do.

