Posted to dev@drill.apache.org by James Turton <dz...@apache.org> on 2024/01/26 07:21:57 UTC
License headers inside Javadoc comments
Good morning!
I'd like to ask about a feature to prevent RAT from allowing license
headers to appear inside Javadoc comments (/**) while still requiring
them in Java comments (/*) in .java files. Currently the Drill project
makes use of com.mycila.license-maven-plugin to reject licenses in
Javadoc comments because the developers at the time didn't want license
headers cluttering the Javadoc website that is generated from the
source. Are you aware of a general view on Apache license headers
appearing in Javadoc pages? If preventing them from doing so is a good
idea, could this become a (configurable) feature in RAT?
Thanks
James Turton
Fwd: License headers inside Javadoc comments
Posted by James Turton <dz...@apache.org>.
Just a forward to complete the records.
-------- Forwarded Message --------
Subject: Re: License headers inside Javadoc comments
Date: Fri, 26 Jan 2024 15:38:29 +0200
From: James Turton <dz...@apache.org>
To: P. Ottlinger <po...@apache.org>, dev@creadur.apache.org
CC: dev <de...@drill.apache.org>
Thanks Phil.
Here's some background [1] which comes from before I was involved with
Drill. What they wanted was for the license header checker to accept, in
.java files,
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
etc.
but reject
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
etc.
Notice the two asterisks that open the Java comment block in the second
form thereby making it a Javadoc comment that will appear in generated
Javadoc. There are no longer any examples of the latter in Drill but
this has been enforced by the addition of the license-maven-plugin.
I got here because I want to remove that plugin, which essentially
duplicates RAT, in favour of another (with exactly the same name :()
that can generate license and notice information for our third party
code. This last task is what I'm really doing; the Javadoc license
header rejection matter is yak shaving that came up on the road.
So my yak shaving question is: if I make RAT Drill's only license header
checker then could I make it reject license headers of the second form?
Even if I can't I'm inclined to make it the only header checker since I
think that it's in any case mandatory and authoritative. But in an
effort to retain the work of the previous Drill developers I'm trying to
preserve what they implemented.
1. https://issues.apache.org/jira/browse/DRILL-6320
On 2024/01/26 14:06, P. Ottlinger wrote:
> Hi James,
>
> thanks for reaching out!
>
> Am 26.01.24 um 08:21 schrieb James Turton:
>> I'd like to ask about a feature to prevent RAT from allowing license
>> headers to appear inside Javadoc comments (/**) while still
>> requiring them in Java comments (/*) in .java files. Currently the
>> Drill project makes use of com.mycila.license-maven-plugin to reject
>> licenses in Javadoc comments because the developers at the time
>> didn't want license headers cluttering the Javadoc website that is
>> generated from the source. Are you aware of a general view on Apache
>> license headers appearing in Javadoc pages? If preventing them from
>> doing so is a good idea, could this become a (configurable) feature
>> in RAT?
>
> could you be so kind to provide an example of what you want to achieve
> and how your use case looks like?
>
> I'm afraid I do not really understand what you mean with
> javadoc-specific licenses?
>
> At the moment we don't have a file specific parsing to exclude
> comments - is that what you want to achieve?
>
> On the other hand if a license header is needed per file, it has to be
> somewhere in the sources ;)
>
> Thanks,
> Phil
Re: License headers inside Javadoc comments
Posted by James Turton <dz...@apache.org>.
Yes, well spotted.
On 2024/01/29 22:35, Paul Rogers wrote:
> James,
>
> If the extra check is costly, you might also observe that all (most?)
> existing files have the proper header format. It is only new or changed
> files that must be checked. So, you can use Git to determine the change set
> on each PR and do the extra format check only on those files.
>
> - Paul
>
> On Mon, Jan 29, 2024 at 7:37 AM James Turton <dz...@apache.org> wrote:
>
>> Thank you for these explanations Claude.
>>
>> Looking at your second paragraph about the proposal to enhance the code
>> that inserts headers, a comment start definition for Java files of
>> '/*\n' (newline after the '/*') should work to accept the Apache license
>> header in a Java comment but reject it if it's in a Javadoc comment.
>> That seems promising, and I'll take a look at RAT-330, but I'm also able
>> to move forward in Drill using alternatives in the interim.
>>
>> Regards
>> James
>>
>>
>> On 2024/01/29 09:33, Claude Warren wrote:
>>> James,
>>>
>>> The in general processing for matching licenses strips out all non
>>> essential text (e.g. '/' and '*') so the current implementation can not
>>> determine if the license text is within a javadoc block or not. Some
>>> matchers (e.g. Copyright, SPDX, and regex) do use the unmodified text but
>>> they are generally much slower. Infact, the original SPDX and Copyright
>>> implementations caused a significant (2 order of magnitude or more)
>>> increase in processing time. It would be possible to create a custom
>>> matcher to do what you want. But there is no mechanism currently
>> available
>>> in the code base to only call a matcher on specific file types.
>>>
>>> There is a section of code that understands file types, but this is the
>>> code that inserts headers into files that don't have them. It may be
>>> possible to build on that to create a custom matcher to ensure that
>> license
>>> comments are not within java docs. There is a ticket open to modify how
>>> this code works so that new file types with comment start stop
>> definitions
>>> and restrictions on first lines and such can be defined outside of the
>>> codebase, making it possible to insert headers in as yet unrecognized
>> file
>>> formats.[1] This might be extended and provide input to the process you
>>> are requesting.
>>>
>>> There is also a section of code that removes the non essential text. The
>>> 'prune' method could be modified to remove blocks of code between the
>>> opening javadoc '/**' and the closing '*/'. But this may lead to
>> problems
>>> with non java files. Speaking of non java files have you thought about
>>> ensuring that the license does not appear in other javadoc like systems?
>>> [2] Once this can of worms is opened we will need a way to manage all
>> the
>>> requests that will follow for other file types.
>>>
>>> If you have any ideas for implementing the change I would be interested
>> to
>>> hear them.
>>>
>>> Claude
>>>
>>> [1] https://issues.apache.org/jira/browse/RAT-330?
>>> [2]
>>>
>> https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation
>>> On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:
>>>
>>>> Thanks Phil.
>>>>
>>>> Here's some background [1] which comes from before I was involved with
>>>> Drill. What they wanted was for the license header checker to accept, in
>>>> .java files,
>>>>
>>>> /*
>>>> * Licensed to the Apache Software Foundation (ASF) under one
>>>> * or more contributor license agreements. See the NOTICE file
>>>> * distributed with this work for additional information
>>>> etc.
>>>>
>>>> but reject
>>>>
>>>> /**
>>>> * Licensed to the Apache Software Foundation (ASF) under one
>>>> * or more contributor license agreements. See the NOTICE file
>>>> * distributed with this work for additional information
>>>> etc.
>>>>
>>>> Notice the two asterisks that open the Java comment block in the second
>>>> form thereby making it a Javadoc comment that will appear in generated
>>>> Javadoc. There are no longer any examples of the latter in Drill but
>>>> this has been enforced by the addition of the license-maven-plugin.
>>>>
>>>> I got here because I want to remove that plugin, which essentially
>>>> duplicates RAT, in favour of another (with exactly the same name :()
>>>> that can generate license and notice information for our third party
>>>> code. This last task is what I'm really doing, the Javadoc license
>>>> header rejection matter is yak shaving that came up on the road.
>>>>
>>>> So my yak shaving question is: if I make RAT Drill's only license header
>>>> checker then could I make it reject license headers of the second form?
>>>> Even if I can't I'm inclined to make it the only header checker since I
>>>> think that it's in any case mandatory and authoritative. But in an
>>>> effort to retain the work of the previous Drill developers I'm trying to
>>>> preserve what they implemented.
>>>>
>>>> 1. https://issues.apache.org/jira/browse/DRILL-6320
>>>>
>>>> On 2024/01/26 14:06, P. Ottlinger wrote:
>>>>> Hi James,
>>>>>
>>>>> thanks for reaching out!
>>>>>
>>>>> Am 26.01.24 um 08:21 schrieb James Turton:
>>>>>> I'd like to ask about a feature to prevent RAT from allowing license
>>>>>> headers to appear inside Javadoc comments (/**) while still requiring
>>>>>> them in Java comments (/*) in .java files. Currently the Drill project
>>>>>> makes use of com.mycila.license-maven-plugin to reject licenses in
>>>>>> Javadoc comments because the developers at the time didn't want
>>>>>> license headers cluttering the Javadoc website that is generated from
>>>>>> the source. Are you aware of a general view on Apache license headers
>>>>>> appearing in Javadoc pages? If preventing them from doing so is a good
>>>>>> idea, could this become a (configurable) feature in RAT?
>>>>> could you be so kind to provide an example of what you want to achieve
>>>>> and how your use case looks like?
>>>>>
>>>>> I'm afraid I do not really understand what you mean with
>>>>> javadoc-specific licenses?
>>>>>
>>>>> At the moment we don't have a file specific parsing to exclude comments
>>>>> - is that what you want to achieve?
>>>>>
>>>>> On the other hand if a license header is needed per file, it has to be
>>>>> somewhere in the sources ;)
>>>>>
>>>>> Thanks,
>>>>> Phil
>>
Re: License headers inside Javadoc comments
Posted by Paul Rogers <pa...@gmail.com>.
James,
If the extra check is costly, you might also observe that all (most?)
existing files have the proper header format. It is only new or changed
files that must be checked. So, you can use Git to determine the change set
on each PR and do the extra format check only on those files.
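A minimal sketch of that approach in Java, assuming the change set comes
from `git diff --name-only` against the PR's base ref. The class name, the
method names and the sample paths are illustrative only and are not part of
Drill or RAT:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChangedJavaFiles {

    // Keeps only the .java paths from the one-path-per-line output of git diff.
    static List<String> javaFilesFrom(List<String> diffLines) {
        List<String> result = new ArrayList<>();
        for (String line : diffLines) {
            String path = line.trim();
            if (path.endsWith(".java")) {
                result.add(path);
            }
        }
        return result;
    }

    // Runs git in the current directory and lists files added, copied,
    // modified or renamed between baseRef and HEAD.
    static List<String> changedPaths(String baseRef) throws IOException, InterruptedException {
        Process git = new ProcessBuilder(
                "git", "diff", "--name-only", "--diff-filter=ACMR", baseRef + "...HEAD")
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(git.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        if (git.waitFor() != 0) {
            throw new IOException("git diff failed: " + lines);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Pass the base ref (the PR's target branch) as the first argument;
        // without one, a small hypothetical sample stands in for git's output.
        List<String> paths = args.length > 0
                ? changedPaths(args[0])
                : Arrays.asList("exec/Foo.java", "README.md", "pom.xml");
        for (String path : javaFilesFrom(paths)) {
            System.out.println(path); // candidates for the extra header-format check
        }
    }
}
```

Only the files printed here would then need the slower check that looks at
the raw comment text.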
- Paul
On Mon, Jan 29, 2024 at 7:37 AM James Turton <dz...@apache.org> wrote:
> Thank you for these explanations Claude.
>
> Looking at your second paragraph about the proposal to enhance the code
> that inserts headers, a comment start definition for Java files of
> '/*\n' (newline after the '/*') should work to accept the Apache license
> header in a Java comment but reject it if it's in a Javadoc comment.
> That seems promising, and I'll take a look at RAT-330, but I'm also able
> to move forward in Drill using alternatives in the interim.
>
> Regards
> James
>
>
> On 2024/01/29 09:33, Claude Warren wrote:
> > James,
> >
> > The in general processing for matching licenses strips out all non
> > essential text (e.g. '/' and '*') so the current implementation can not
> > determine if the license text is within a javadoc block or not. Some
> > matchers (e.g. Copyright, SPDX, and regex) do use the unmodified text but
> > they are generally much slower. Infact, the original SPDX and Copyright
> > implementations caused a significant (2 order of magnitude or more)
> > increase in processing time. It would be possible to create a custom
> > matcher to do what you want. But there is no mechanism currently
> available
> > in the code base to only call a matcher on specific file types.
> >
> > There is a section of code that understands file types, but this is the
> > code that inserts headers into files that don't have them. It may be
> > possible to build on that to create a custom matcher to ensure that
> license
> > comments are not within java docs. There is a ticket open to modify how
> > this code works so that new file types with comment start stop
> definitions
> > and restrictions on first lines and such can be defined outside of the
> > codebase, making it possible to insert headers in as yet unrecognized
> file
> > formats.[1] This might be extended and provide input to the process you
> > are requesting.
> >
> > There is also a section of code that removes the non essential text. The
> > 'prune' method could be modified to remove blocks of code between the
> > opening javadoc '/**' and the closing '*/'. But this may lead to
> problems
> > with non java files. Speaking of non java files have you thought about
> > ensuring that the license does not appear in other javadoc like systems?
> > [2] Once this can of worms is opened we will need a way to manage all
> the
> > requests that will follow for other file types.
> >
> > If you have any ideas for implementing the change I would be interested
> to
> > hear them.
> >
> > Claude
> >
> > [1] https://issues.apache.org/jira/browse/RAT-330?
> > [2]
> >
> https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation
> >
> > On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:
> >
> >> Thanks Phil.
> >>
> >> Here's some background [1] which comes from before I was involved with
> >> Drill. What they wanted was for the license header checker to accept, in
> >> .java files,
> >>
> >> /*
> >> * Licensed to the Apache Software Foundation (ASF) under one
> >> * or more contributor license agreements. See the NOTICE file
> >> * distributed with this work for additional information
> >> etc.
> >>
> >> but reject
> >>
> >> /**
> >> * Licensed to the Apache Software Foundation (ASF) under one
> >> * or more contributor license agreements. See the NOTICE file
> >> * distributed with this work for additional information
> >> etc.
> >>
> >> Notice the two asterisks that open the Java comment block in the second
> >> form thereby making it a Javadoc comment that will appear in generated
> >> Javadoc. There are no longer any examples of the latter in Drill but
> >> this has been enforced by the addition of the license-maven-plugin.
> >>
> >> I got here because I want to remove that plugin, which essentially
> >> duplicates RAT, in favour of another (with exactly the same name :()
> >> that can generate license and notice information for our third party
> >> code. This last task is what I'm really doing, the Javadoc license
> >> header rejection matter is yak shaving that came up on the road.
> >>
> >> So my yak shaving question is: if I make RAT Drill's only license header
> >> checker then could I make it reject license headers of the second form?
> >> Even if I can't I'm inclined to make it the only header checker since I
> >> think that it's in any case mandatory and authoritative. But in an
> >> effort to retain the work of the previous Drill developers I'm trying to
> >> preserve what they implemented.
> >>
> >> 1. https://issues.apache.org/jira/browse/DRILL-6320
> >>
> >> On 2024/01/26 14:06, P. Ottlinger wrote:
> >>> Hi James,
> >>>
> >>> thanks for reaching out!
> >>>
> >>> Am 26.01.24 um 08:21 schrieb James Turton:
> >>>> I'd like to ask about a feature to prevent RAT from allowing license
> >>>> headers to appear inside Javadoc comments (/**) while still requiring
> >>>> them in Java comments (/*) in .java files. Currently the Drill project
> >>>> makes use of com.mycila.license-maven-plugin to reject licenses in
> >>>> Javadoc comments because the developers at the time didn't want
> >>>> license headers cluttering the Javadoc website that is generated from
> >>>> the source. Are you aware of a general view on Apache license headers
> >>>> appearing in Javadoc pages? If preventing them from doing so is a good
> >>>> idea, could this become a (configurable) feature in RAT?
> >>> could you be so kind to provide an example of what you want to achieve
> >>> and how your use case looks like?
> >>>
> >>> I'm afraid I do not really understand what you mean with
> >>> javadoc-specific licenses?
> >>>
> >>> At the moment we don't have a file specific parsing to exclude comments
> >>> - is that what you want to achieve?
> >>>
> >>> On the other hand if a license header is needed per file, it has to be
> >>> somewhere in the sources ;)
> >>>
> >>> Thanks,
> >>> Phil
> >
>
>
Re: License headers inside Javadoc comments
Posted by James Turton <dz...@apache.org>.
Thank you for these explanations Claude.
Looking at your second paragraph about the proposal to enhance the code
that inserts headers, a comment start definition for Java files of
'/*\n' (newline after the '/*') should work to accept the Apache license
header in a Java comment but reject it if it's in a Javadoc comment.
That seems promising, and I'll take a look at RAT-330, but I'm also able
to move forward in Drill using alternatives in the interim.
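As a rough illustration of the '/*\n' idea, here is a minimal Java sketch.
It assumes the license header is the very first thing in the file and it is
not RAT code; the class and method names are made up for the example:

```java
class HeaderCommentStart {

    // Hypothetical comment-start definition for .java files:
    // "/*" followed directly by a line break (LF or CRLF).
    static boolean opensWithPlainBlockComment(String source) {
        return source.startsWith("/*\n") || source.startsWith("/*\r\n");
    }

    public static void main(String[] args) {
        String body = "\n * Licensed to the Apache Software Foundation (ASF) under one\n */\n";
        String plain = "/*" + body;    // ordinary Java comment
        String javadoc = "/**" + body; // Javadoc comment

        System.out.println(opensWithPlainBlockComment(plain));   // true
        System.out.println(opensWithPlainBlockComment(javadoc)); // false
    }
}
```

The Javadoc form fails the test because its third character is a second
asterisk rather than the line break.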
Regards
James
On 2024/01/29 09:33, Claude Warren wrote:
> James,
>
> The in general processing for matching licenses strips out all non
> essential text (e.g. '/' and '*') so the current implementation can not
> determine if the license text is within a javadoc block or not. Some
> matchers (e.g. Copyright, SPDX, and regex) do use the unmodified text but
> they are generally much slower. Infact, the original SPDX and Copyright
> implementations caused a significant (2 order of magnitude or more)
> increase in processing time. It would be possible to create a custom
> matcher to do what you want. But there is no mechanism currently available
> in the code base to only call a matcher on specific file types.
>
> There is a section of code that understands file types, but this is the
> code that inserts headers into files that don't have them. It may be
> possible to build on that to create a custom matcher to ensure that license
> comments are not within java docs. There is a ticket open to modify how
> this code works so that new file types with comment start stop definitions
> and restrictions on first lines and such can be defined outside of the
> codebase, making it possible to insert headers in as yet unrecognized file
> formats.[1] This might be extended and provide input to the process you
> are requesting.
>
> There is also a section of code that removes the non essential text. The
> 'prune' method could be modified to remove blocks of code between the
> opening javadoc '/**' and the closing '*/'. But this may lead to problems
> with non java files. Speaking of non java files have you thought about
> ensuring that the license does not appear in other javadoc like systems?
> [2] Once this can of worms is opened we will need a way to manage all the
> requests that will follow for other file types.
>
> If you have any ideas for implementing the change I would be interested to
> hear them.
>
> Claude
>
> [1] https://issues.apache.org/jira/browse/RAT-330?
> [2]
> https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation
>
> On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:
>
>> Thanks Phil.
>>
>> Here's some background [1] which comes from before I was involved with
>> Drill. What they wanted was for the license header checker to accept, in
>> .java files,
>>
>> /*
>> * Licensed to the Apache Software Foundation (ASF) under one
>> * or more contributor license agreements. See the NOTICE file
>> * distributed with this work for additional information
>> etc.
>>
>> but reject
>>
>> /**
>> * Licensed to the Apache Software Foundation (ASF) under one
>> * or more contributor license agreements. See the NOTICE file
>> * distributed with this work for additional information
>> etc.
>>
>> Notice the two asterisks that open the Java comment block in the second
>> form thereby making it a Javadoc comment that will appear in generated
>> Javadoc. There are no longer any examples of the latter in Drill but
>> this has been enforced by the addition of the license-maven-plugin.
>>
>> I got here because I want to remove that plugin, which essentially
>> duplicates RAT, in favour of another (with exactly the same name :()
>> that can generate license and notice information for our third party
>> code. This last task is what I'm really doing, the Javadoc license
>> header rejection matter is yak shaving that came up on the road.
>>
>> So my yak shaving question is: if I make RAT Drill's only license header
>> checker then could I make it reject license headers of the second form?
>> Even if I can't I'm inclined to make it the only header checker since I
>> think that it's in any case mandatory and authoritative. But in an
>> effort to retain the work of the previous Drill developers I'm trying to
>> preserve what they implemented.
>>
>> 1. https://issues.apache.org/jira/browse/DRILL-6320
>>
>> On 2024/01/26 14:06, P. Ottlinger wrote:
>>> Hi James,
>>>
>>> thanks for reaching out!
>>>
>>> Am 26.01.24 um 08:21 schrieb James Turton:
>>>> I'd like to ask about a feature to prevent RAT from allowing license
>>>> headers to appear inside Javadoc comments (/**) while still requiring
>>>> them in Java comments (/*) in .java files. Currently the Drill project
>>>> makes use of com.mycila.license-maven-plugin to reject licenses in
>>>> Javadoc comments because the developers at the time didn't want
>>>> license headers cluttering the Javadoc website that is generated from
>>>> the source. Are you aware of a general view on Apache license headers
>>>> appearing in Javadoc pages? If preventing them from doing so is a good
>>>> idea, could this become a (configurable) feature in RAT?
>>> could you be so kind to provide an example of what you want to achieve
>>> and how your use case looks like?
>>>
>>> I'm afraid I do not really understand what you mean with
>>> javadoc-specific licenses?
>>>
>>> At the moment we don't have a file specific parsing to exclude comments
>>> - is that what you want to achieve?
>>>
>>> On the other hand if a license header is needed per file, it has to be
>>> somewhere in the sources ;)
>>>
>>> Thanks,
>>> Phil
>
Re: License headers inside Javadoc comments
Posted by Claude Warren <cl...@xenei.com>.
James,
In general, the processing for matching licenses strips out all
non-essential text (e.g. '/' and '*'), so the current implementation cannot
determine whether the license text is within a Javadoc block or not. Some
matchers (e.g. Copyright, SPDX, and regex) do use the unmodified text, but
they are generally much slower. In fact, the original SPDX and Copyright
implementations caused a significant (two orders of magnitude or more)
increase in processing time. It would be possible to create a custom
matcher to do what you want, but there is currently no mechanism in the
code base to call a matcher only on specific file types.
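To see why the distinction is lost, here is a simplified Java stand-in for
that normalisation step. It is not RAT's actual code; the `prune` name is
borrowed for illustration and the rule (keep letters and digits only) is an
assumption:

```java
class PruneDemo {

    // Simplified normalisation: keep letters and digits only, lower-cased.
    static String prune(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = "\n * Licensed to the Apache Software Foundation (ASF) under one\n */\n";
        String plain = "/*" + body;    // ordinary Java comment
        String javadoc = "/**" + body; // Javadoc comment

        // Both forms reduce to the same string, so a matcher working on
        // pruned text cannot tell them apart.
        System.out.println(prune(plain).equals(prune(javadoc))); // true
    }
}
```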
There is a section of code that understands file types, but this is the
code that inserts headers into files that don't have them. It may be
possible to build on that to create a custom matcher to ensure that license
comments are not within Javadoc comments. There is a ticket open to modify
how this code works so that new file types, with comment start/stop
definitions, restrictions on first lines and such, can be defined outside
of the codebase, making it possible to insert headers in as yet
unrecognized file formats.[1] This might be extended to provide input to
the process you are requesting.
There is also a section of code that removes the non-essential text. The
'prune' method could be modified to remove blocks of code between the
opening Javadoc '/**' and the closing '*/'. But this may lead to problems
with non-Java files. Speaking of non-Java files, have you thought about
ensuring that the license does not appear in other Javadoc-like systems?
[2] Once this can of worms is opened we will need a way to manage all the
requests that will follow for other file types.
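A sketch of what such a pre-pruning step might look like, written in Java.
It is an assumption about one possible approach rather than a patch against
RAT: the class, the method and the file-name test are all hypothetical, and
the regex does not understand string literals that happen to contain '/**'.

```java
import java.util.regex.Pattern;

class JavadocStripper {

    // "/**" (but not the empty comment "/**/") up to the nearest "*/".
    // DOTALL lets '.' span line breaks; ".*?" keeps each match minimal.
    private static final Pattern JAVADOC_BLOCK =
            Pattern.compile("/\\*\\*(?!/).*?\\*/", Pattern.DOTALL);

    // Removes Javadoc blocks from .java sources and leaves other files alone,
    // so a license that appears only inside Javadoc is no longer matched.
    static String stripJavadoc(String fileName, String source) {
        if (!fileName.endsWith(".java")) {
            return source;
        }
        return JAVADOC_BLOCK.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String javadoc = "/**\n * Licensed to the ASF\n */\nclass A {}";
        String plain = "/*\n * Licensed to the ASF\n */\nclass A {}";

        System.out.println(stripJavadoc("A.java", javadoc).contains("Licensed")); // false
        System.out.println(stripJavadoc("A.java", plain).contains("Licensed"));   // true
    }
}
```

Restricting the step to .java files sidesteps the non-Java problem for this
one case, but each further Javadoc-like format would need its own rule.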
If you have any ideas for implementing the change I would be interested to
hear them.
Claude
[1] https://issues.apache.org/jira/browse/RAT-330
[2] https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation
On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:
> Thanks Phil.
>
> Here's some background [1] which comes from before I was involved with
> Drill. What they wanted was for the license header checker to accept, in
> .java files,
>
> /*
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> etc.
>
> but reject
>
> /**
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> etc.
>
> Notice the two asterisks that open the Java comment block in the second
> form thereby making it a Javadoc comment that will appear in generated
> Javadoc. There are no longer any examples of the latter in Drill but
> this has been enforced by the addition of the license-maven-plugin.
>
> I got here because I want to remove that plugin, which essentially
> duplicates RAT, in favour of another (with exactly the same name :()
> that can generate license and notice information for our third party
> code. This last task is what I'm really doing, the Javadoc license
> header rejection matter is yak shaving that came up on the road.
>
> So my yak shaving question is: if I make RAT Drill's only license header
> checker then could I make it reject license headers of the second form?
> Even if I can't I'm inclined to make it the only header checker since I
> think that it's in any case mandatory and authoritative. But in an
> effort to retain the work of the previous Drill developers I'm trying to
> preserve what they implemented.
>
> 1. https://issues.apache.org/jira/browse/DRILL-6320
>
> On 2024/01/26 14:06, P. Ottlinger wrote:
> > Hi James,
> >
> > thanks for reaching out!
> >
> > Am 26.01.24 um 08:21 schrieb James Turton:
> >> I'd like to ask about a feature to prevent RAT from allowing license
> >> headers to appear inside Javadoc comments (/**) while still requiring
> >> them in Java comments (/*) in .java files. Currently the Drill project
> >> makes use of com.mycila.license-maven-plugin to reject licenses in
> >> Javadoc comments because the developers at the time didn't want
> >> license headers cluttering the Javadoc website that is generated from
> >> the source. Are you aware of a general view on Apache license headers
> >> appearing in Javadoc pages? If preventing them from doing so is a good
> >> idea, could this become a (configurable) feature in RAT?
> >
> > could you be so kind to provide an example of what you want to achieve
> > and how your use case looks like?
> >
> > I'm afraid I do not really understand what you mean with
> > javadoc-specific licenses?
> >
> > At the moment we don't have a file specific parsing to exclude comments
> > - is that what you want to achieve?
> >
> > On the other hand if a license header is needed per file, it has to be
> > somewhere in the sources ;)
> >
> > Thanks,
> > Phil
>
--
LinkedIn: http://www.linkedin.com/in/claudewarren
Re: License headers inside Javadoc comments
Posted by Claude Warren <cl...@xenei.com>.
James,
In general, the processing for matching licenses strips out all non-essential
text (e.g. '/' and '*'), so the current implementation cannot determine
whether the license text is within a Javadoc block or not. Some matchers
(e.g. Copyright, SPDX, and regex) do use the unmodified text, but they are
generally much slower. In fact, the original SPDX and Copyright
implementations caused a significant (two orders of magnitude or more)
increase in processing time. It would be possible to create a custom
matcher to do what you want, but there is no mechanism currently available
in the code base to call a matcher only on specific file types.
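To illustrate the point about stripped text, here is a minimal Python sketch. The `prune` function below is a hypothetical stand-in written for this example, not RAT's actual implementation, and `starts_with_javadoc` is likewise an assumed helper name. It only shows that once punctuation is discarded the two header forms become identical, so any Javadoc check would have to look at the unmodified text.

```python
import re

# Two header openings that differ only in the comment delimiter.
PLAIN = "/*\n * Licensed to the Apache Software Foundation (ASF) under one\n */"
JAVADOC = "/**\n * Licensed to the Apache Software Foundation (ASF) under one\n */"


def prune(text: str) -> str:
    """Hypothetical normalizer (NOT RAT's code): keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def starts_with_javadoc(raw: str) -> bool:
    """Check the unmodified text: does it open with '/**' (but not '/**/')?"""
    return re.match(r"\s*/\*\*(?!/)", raw) is not None


# After normalization the two forms cannot be told apart ...
print(prune(PLAIN) == prune(JAVADOC))
# ... while the raw text still carries the distinction.
print(starts_with_javadoc(PLAIN), starts_with_javadoc(JAVADOC))
```

The trade-off described above still applies: a check like `starts_with_javadoc` needs the raw text, which is the slower path.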
There is a section of code that understands file types, but this is the
code that inserts headers into files that don't have them. It may be
possible to build on that to create a custom matcher to ensure that license
comments are not within Javadoc comments. There is a ticket open to modify
how this code works so that new file types, with comment start/stop
definitions and restrictions on first lines and such, can be defined outside
of the codebase, making it possible to insert headers in as yet unrecognized
file formats [1]. This might be extended to provide input to the process you
are requesting.
There is also a section of code that removes the non-essential text. The
'prune' method could be modified to remove blocks of code between the
opening Javadoc '/**' and the closing '*/', but this may lead to problems
with non-Java files. Speaking of non-Java files, have you thought about
ensuring that the license does not appear in other Javadoc-like systems
[2]? Once this can of worms is opened we will need a way to manage all the
requests that will follow for other file types.
If you have any ideas for implementing the change I would be interested to
hear them.
Claude
[1] https://issues.apache.org/jira/browse/RAT-330?
[2] https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation
On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:
> Thanks Phil.
>
> Here's some background [1] which comes from before I was involved with
> Drill. What they wanted was for the license header checker to accept, in
> .java files,
>
> /*
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> etc.
>
> but reject
>
> /**
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> etc.
>
> Notice the two asterisks that open the Java comment block in the second
> form thereby making it a Javadoc comment that will appear in generated
> Javadoc. There are no longer any examples of the latter in Drill but
> this has been enforced by the addition of the license-maven-plugin.
>
> I got here because I want to remove that plugin, which essentially
> duplicates RAT, in favour of another (with exactly the same name :()
> that can generate license and notice information for our third party
> code. This last task is what I'm really doing, the Javadoc license
> header rejection matter is yak shaving that came up on the road.
>
> So my yak shaving question is: if I make RAT Drill's only license header
> checker then could I make it reject license headers of the second form?
> Even if I can't I'm inclined to make it the only header checker since I
> think that it's in any case mandatory and authoritative. But in an
> effort to retain the work of the previous Drill developers I'm trying to
> preserve what they implemented.
>
> 1. https://issues.apache.org/jira/browse/DRILL-6320
>
> On 2024/01/26 14:06, P. Ottlinger wrote:
> > Hi James,
> >
> > thanks for reaching out!
> >
> > Am 26.01.24 um 08:21 schrieb James Turton:
> >> I'd like to ask about a feature to prevent RAT from allowing license
> >> headers to appear inside Javadoc comments (/**) while still requiring
> >> them in Java comments (/*) in .java files. Currently the Drill project
> >> makes use of com.mycila.license-maven-plugin to reject licenses in
> >> Javadoc comments because the developers at the time didn't want
> >> license headers cluttering the Javadoc website that is generated from
> >> the source. Are you aware of a general view on Apache license headers
> >> appearing in Javadoc pages? If preventing them from doing so is a good
> >> idea, could this become a (configurable) feature in RAT?
> >
> > could you be so kind to provide an example of what you want to achieve
> > and how your use case looks like?
> >
> > I'm afraid I do not really understand what you mean with
> > javadoc-specific licenses?
> >
> > At the moment we don't have a file specific parsing to exclude comments
> > - is that what you want to achieve?
> >
> > On the other hand if a license header is needed per file, it has to be
> > somewhere in the sources ;)
> >
> > Thanks,
> > Phil
>
--
LinkedIn: http://www.linkedin.com/in/claudewarren
Re: License headers inside Javadoc comments
Posted by James Turton <dz...@apache.org>.
Thanks Phil.
Here's some background [1] which comes from before I was involved with
Drill. What they wanted was for the license header checker to accept, in
.java files,
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
etc.
but reject
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
etc.
Notice the two asterisks that open the Java comment block in the second
form thereby making it a Javadoc comment that will appear in generated
Javadoc. There are no longer any examples of the latter in Drill but
this has been enforced by the addition of the license-maven-plugin.
I got here because I want to remove that plugin, which essentially
duplicates RAT, in favour of another (with exactly the same name :()
that can generate license and notice information for our third party
code. This last task is what I'm really doing, the Javadoc license
header rejection matter is yak shaving that came up on the road.
So my yak shaving question is: if I make RAT Drill's only license header
checker then could I make it reject license headers of the second form?
Even if I can't I'm inclined to make it the only header checker since I
think that it's in any case mandatory and authoritative. But in an
effort to retain the work of the previous Drill developers I'm trying to
preserve what they implemented.
1. https://issues.apache.org/jira/browse/DRILL-6320
On 2024/01/26 14:06, P. Ottlinger wrote:
> Hi James,
>
> thanks for reaching out!
>
> Am 26.01.24 um 08:21 schrieb James Turton:
>> I'd like to ask about a feature to prevent RAT from allowing license
>> headers to appear inside Javadoc comments (/**) while still requiring
>> them in Java comments (/*) in .java files. Currently the Drill project
>> makes use of com.mycila.license-maven-plugin to reject licenses in
>> Javadoc comments because the developers at the time didn't want
>> license headers cluttering the Javadoc website that is generated from
>> the source. Are you aware of a general view on Apache license headers
>> appearing in Javadoc pages? If preventing them from doing so is a good
>> idea, could this become a (configurable) feature in RAT?
>
> could you be so kind to provide an example of what you want to achieve
> and how your use case looks like?
>
> I'm afraid I do not really understand what you mean with
> javadoc-specific licenses?
>
> At the moment we don't have a file specific parsing to exclude comments
> - is that what you want to achieve?
>
> On the other hand if a license header is needed per file, it has to be
> somewhere in the sources ;)
>
> Thanks,
> Phil
Re: License headers inside Javadoc comments
Posted by "P. Ottlinger" <po...@apache.org>.
Hi James,
thanks for reaching out!
Am 26.01.24 um 08:21 schrieb James Turton:
> I'd like to ask about a feature to prevent RAT from allowing license
> headers to appear inside Javadoc comments (/**) while still requiring
> them in Java comments (/*) in .java files. Currently the Drill project
> makes use of com.mycila.license-maven-plugin to reject licenses in
> Javadoc comments because the developers at the time didn't want license
> headers cluttering the Javadoc website that is generated from the
> source. Are you aware of a general view on Apache license headers
> appearing in Javadoc pages? If preventing them from doing so is a good
> idea, could this become a (configurable) feature in RAT?
Could you be so kind as to provide an example of what you want to achieve
and what your use case looks like?
I'm afraid I do not really understand what you mean by
javadoc-specific licenses?
At the moment we don't have file-specific parsing to exclude comments
- is that what you want to achieve?
On the other hand, if a license header is needed per file, it has to be
somewhere in the sources ;)
Thanks,
Phil
Re: License headers inside Javadoc comments
Posted by James Turton <ja...@somecomputer.xyz.INVALID>.
Thanks Paul
> I don't know how to configure the license plugin. But, I do suspect a
> Python file (or shell script) could make a one-time pass over the files to
> standardize headers into whatever format the team chooses. Only the first
> line of each file would change.
I put more information in my reply to the RAT devs but exactly this.
Adding a grep or a sed command to the build or to the release instructions
would surely be sufficient. I will likely replace the license-maven-plugin
that was brought in with something that does other needed things, like
report our third party licenses.
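As a rough sketch of what such a build-time check might look like, here is the grep idea written in Python. The function names and the marker string are assumptions made for this example; nothing here is part of RAT or the license-maven-plugin. The regex is deliberately simple and does not account for comment delimiters inside string literals.

```python
import re
from pathlib import Path

# Assumed marker: the first words of the ASF header quoted in this thread.
LICENSE_MARKER = "Licensed to the Apache Software Foundation"
# A '/** ... */' block; the lookahead skips the empty comment '/**/'.
JAVADOC_BLOCK = re.compile(r"/\*\*(?!/).*?\*/", re.DOTALL)


def has_javadoc_license(source: str) -> bool:
    """True if any Javadoc block in the source contains the license marker."""
    return any(LICENSE_MARKER in m.group(0) for m in JAVADOC_BLOCK.finditer(source))


def find_javadoc_license_headers(root) -> list:
    """Return the .java files under root whose license sits in a Javadoc block."""
    return sorted(
        path
        for path in Path(root).rglob("*.java")
        if has_javadoc_license(path.read_text(encoding="utf-8", errors="replace"))
    )


if __name__ == "__main__":
    offenders = find_javadoc_license_headers(".")
    for path in offenders:
        print(f"license header inside Javadoc comment: {path}")
    # A non-zero exit status lets a build or release step fail on offenders.
    raise SystemExit(1 if offenders else 0)
```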
On 2024/01/26 09:59, Paul Rogers wrote:
> Hi James,
>
> For some reason, Drill started with the license headers in Javadoc
> comments. The (weak) explanation I got was that we never generate Javadoc,
> so it didn't really matter. Later, we started converting the headers to
> regular comments when convenient.
>
> If we were to generate Javadoc, having the license at the top of each page
> as the summary for each class would probably not be something that anyone
> finds useful.
>
> I don't know how to configure the license plugin. But, I do suspect a
> Python file (or shell script) could make a one-time pass over the files to
> standardize headers into whatever format the team chooses. Only the first
> line of each file would change.
>
> - Paul
>
> On Thu, Jan 25, 2024 at 11:22 PM James Turton <dz...@apache.org> wrote:
>
>> Good morning!
>>
>> I'd like to ask about a feature to prevent RAT from allowing license
>> headers to appear inside Javadoc comments (/**) while still requiring
>> them in Java comments (/*) in .java files. Currently the Drill project
>> makes use of com.mycila.license-maven-plugin to reject licenses in
>> Javadoc comments because the developers at the time didn't want license
>> headers cluttering the Javadoc website that is generated from the
>> source. Are you aware of a general view on Apache license headers
>> appearing in Javadoc pages? If preventing them from doing so is a good
>> idea, could this become a (configurable) feature in RAT?
>>
>> Thanks
>> James Turton
>>
Re: License headers inside Javadoc comments
Posted by Ted Dunning <te...@gmail.com>.
The right way to get a copyright on every page is to tweak the javadoc
command to use a different template (I would think).
On Fri, Jan 26, 2024 at 12:00 AM Paul Rogers <pa...@gmail.com> wrote:
> Hi James,
>
> For some reason, Drill started with the license headers in Javadoc
> comments. The (weak) explanation I got was that we never generate Javadoc,
> so it didn't really matter. Later, we started converting the headers to
> regular comments when convenient.
>
> If we were to generate Javadoc, having the license at the top of each page
> as the summary for each class would probably not be something that anyone
> finds useful.
>
> I don't know how to configure the license plugin. But, I do suspect a
> Python file (or shell script) could make a one-time pass over the files to
> standardize headers into whatever format the team chooses. Only the first
> line of each file would change.
>
> - Paul
>
> On Thu, Jan 25, 2024 at 11:22 PM James Turton <dz...@apache.org> wrote:
>
> > Good morning!
> >
> > I'd like to ask about a feature to prevent RAT from allowing license
> > headers to appear inside Javadoc comments (/**) while still requiring
> > them in Java comments (/*) in .java files. Currently the Drill project
> > makes use of com.mycila.license-maven-plugin to reject licenses in
> > Javadoc comments because the developers at the time didn't want license
> > headers cluttering the Javadoc website that is generated from the
> > source. Are you aware of a general view on Apache license headers
> > appearing in Javadoc pages? If preventing them from doing so is a good
> > idea, could this become a (configurable) feature in RAT?
> >
> > Thanks
> > James Turton
> >
>
Re: License headers inside Javadoc comments
Posted by Paul Rogers <pa...@gmail.com>.
Hi James,
For some reason, Drill started with the license headers in Javadoc
comments. The (weak) explanation I got was that we never generate Javadoc,
so it didn't really matter. Later, we started converting the headers to
regular comments when convenient.
If we were to generate Javadoc, having the license at the top of each page
as the summary for each class would probably not be something that anyone
finds useful.
I don't know how to configure the license plugin. But, I do suspect a
Python file (or shell script) could make a one-time pass over the files to
standardize headers into whatever format the team chooses. Only the first
line of each file would change.
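A minimal sketch of such a one-time pass, in Python, might look like the following. The function name and the marker string are assumptions for this example, and it presumes the header is the very first thing in the file with '/**' alone on the first line. As noted above, only the first line changes.

```python
# Assumed marker: the first words of the ASF header quoted in this thread.
LICENSE_MARKER = "Licensed to the Apache Software Foundation"


def demote_license_header(source: str) -> str:
    """Turn a leading '/**' license comment into a plain '/*' comment.

    Files that do not start with a Javadoc comment, or whose first
    comment does not contain the license marker, are returned unchanged.
    """
    lines = source.splitlines(keepends=True)
    if not lines or lines[0].strip() != "/**":
        return source
    end = source.find("*/")
    first_comment = source if end == -1 else source[:end]
    if LICENSE_MARKER not in first_comment:
        return source  # a genuine Javadoc comment, leave it alone
    lines[0] = lines[0].replace("/**", "/*", 1)
    return "".join(lines)
```

To apply it across a source tree, one could read each .java file, call `demote_license_header`, and write the result back only when it differs from the input.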
- Paul
On Thu, Jan 25, 2024 at 11:22 PM James Turton <dz...@apache.org> wrote:
> Good morning!
>
> I'd like to ask about a feature to prevent RAT from allowing license
> headers to appear inside Javadoc comments (/**) while still requiring
> them in Java comments (/*) in .java files. Currently the Drill project
> makes use of com.mycila.license-maven-plugin to reject licenses in
> Javadoc comments because the developers at the time didn't want license
> headers cluttering the Javadoc website that is generated from the
> source. Are you aware of a general view on Apache license headers
> appearing in Javadoc pages? If preventing them from doing so is a good
> idea, could this become a (configurable) feature in RAT?
>
> Thanks
> James Turton
>