Posted to dev@tika.apache.org by Nicholas DiPiazza <ni...@gmail.com> on 2024/04/19 21:43:23 UTC

Copilot license for open source?

Can I get an open source license for GitHub Copilot?

Re: Copilot license for open source?

Posted by Nicholas DiPiazza <ni...@gmail.com>.
Cool! Great info here!!

Good refresher to keep us from continuing to provide non-compliant commits.

It is tricky now that we are required to use AI tools at work, because I
have to get used to not having them for ASF projects. That is why I was
asking whether I can get them for ASF projects.

I will continue to author my own commits and just use AI like Google at
this point: something that teaches and gives examples and documentation
but is not authoring my code.




On Mon, Apr 22, 2024, 3:36 AM Nick Burch <ap...@gagravarr.org> wrote:

> On Sun, 21 Apr 2024, Michael Wechner wrote:
> > Thanks for the pointer to the Generative Tooling rules, which I was not
> > aware of so far.
> >
> > At the bottom it says that the ASF does not tell developers what tools
> > to use, but I think it would be useful to have some concrete examples,
> > which would make the rules clearer.
>
> (Not a lawyer, not an official ASF response)
>
> There's nothing special about LLMs and this, other than perhaps the speed
> with which you can make mistakes... When including other people's code,
> it's all about license compatibility and attribution.
>
> The ASF started when a bunch of people started sharing patches for a web
> server, with attribution and code under a compatible license. The
> foundation grew during a period where it got easier to find code + code
> snippets online, including much that wasn't under a compatible license.
> Rules didn't change, other than clarifying processes for checking licenses
> and what was/wasn't compatible.
>
> You weren't, and still aren't, allowed to copy + paste large chunks of
> someone else's code without a compatible license and suitable attribution.
> Using an LLM to read all the internet and suggest the code to copy doesn't
> change that. Well, other than the well-documented issues with getting LLMs
> to cite their sources...
>
> LLMs have loads of great uses, including helping you learn new things,
> decoding error messages, finding common patterns, rubber-ducking etc.
> However, they're even worse than many internet forums for suggesting
> large chunks of code of unclear provenance to copy+paste.
>
> It doesn't matter if it's ChatGPT, GitHub Copilot, a local LLM, someone
> on StackOverflow, or a YouTube video that's giving you some code you want
> to copy. 3 characters are almost certainly fine, 3 pages are almost
> certainly not, a general idea is often fine, and you absolutely need to
> engage your brain before committing to ASF repos!
>
>
> Otherwise, if you do still think more rules / examples / etc are needed,
> you'll be wanting legal-discuss@
> https://lists.apache.org/list.html?legal-discuss@apache.org
>
> Cheers
> Nick
>

Re: Copilot license for open source?

Posted by Michael Wechner <mi...@wyona.com>.
Thanks for your feedback!

I think your statement "There's nothing special about LLMs and this,
other than perhaps the speed with which you can make mistakes" hits the
nail on the head. To me that means there should actually be no special
rule for GenAI in this sense, but rather a "warning" that with GenAI you
risk breaking existing copyright laws more easily or unknowingly.

Regarding examples, I could imagine that some GenAI tools make it more
obvious where the content comes from, or provide better references, than
other tools. That of course does not mean that you should be any less
aware of possibly breaking copyright laws.

Under the EU AI Act, LLMs should actually have to declare what data they
were trained on, etc., which should also make things more transparent in
the future.

Thanks

Michael

On 22.04.24 at 10:35, Nick Burch wrote:
> On Sun, 21 Apr 2024, Michael Wechner wrote:
>> Thanks for the pointer to the Generative Tooling rules, which I was 
>> not aware of so far.
>>
>> At the bottom it says that the ASF does not tell developers what
>> tools to use, but I think it would be useful to have some concrete
>> examples, which would make the rules clearer.
>
> (Not a lawyer, not an official ASF response)
>
> There's nothing special about LLMs and this, other than perhaps the
> speed with which you can make mistakes... When including other
> people's code, it's all about license compatibility and attribution.
>
> The ASF started when a bunch of people started sharing patches for a 
> web server, with attribution and code under a compatible license. The 
> foundation grew during a period where it got easier to find code + 
> code snippets online, including much that wasn't under a compatible 
> license. Rules didn't change, other than clarifying processes for 
> checking licenses and what was/wasn't compatible.
>
> You weren't, and still aren't, allowed to copy + paste large chunks of 
> someone else's code without a compatible license and suitable 
> attribution. Using an LLM to read all the internet and suggest the code
> to copy doesn't change that. Well, other than the well-documented 
> issues with getting LLMs to cite their sources...
>
> LLMs have loads of great uses, including helping you learn new things, 
> decoding error messages, finding common patterns, rubber-ducking etc. 
> However, they're even worse than many internet forums for suggesting
> large chunks of code of unclear provenance to copy+paste.
>
> It doesn't matter if it's ChatGPT, GitHub Copilot, a local LLM,
> someone on StackOverflow, or a YouTube video that's giving you some 
> code you want to copy. 3 characters are almost certainly fine, 3 pages 
> are almost certainly not, a general idea is often fine, and you 
> absolutely need to engage your brain before committing to ASF repos!
>
>
> Otherwise, if you do still think more rules / examples / etc are 
> needed, you'll be wanting legal-discuss@
> https://lists.apache.org/list.html?legal-discuss@apache.org
>
> Cheers
> Nick


Re: Copilot license for open source?

Posted by Nick Burch <ap...@gagravarr.org>.
On Sun, 21 Apr 2024, Michael Wechner wrote:
> Thanks for the pointer to the Generative Tooling rules, which I was not 
> aware of so far.
>
> At the bottom it says that the ASF does not tell developers what tools
> to use, but I think it would be useful to have some concrete examples,
> which would make the rules clearer.

(Not a lawyer, not an official ASF response)

There's nothing special about LLMs and this, other than perhaps the speed
with which you can make mistakes... When including other people's code,
it's all about license compatibility and attribution.

The ASF started when a bunch of people started sharing patches for a web 
server, with attribution and code under a compatible license. The 
foundation grew during a period where it got easier to find code + code 
snippets online, including much that wasn't under a compatible license. 
Rules didn't change, other than clarifying processes for checking licenses 
and what was/wasn't compatible.

You weren't, and still aren't, allowed to copy + paste large chunks of 
someone else's code without a compatible license and suitable attribution. 
Using an LLM to read all the internet and suggest the code to copy doesn't
change that. Well, other than the well-documented issues with getting LLMs 
to cite their sources...

LLMs have loads of great uses, including helping you learn new things, 
decoding error messages, finding common patterns, rubber-ducking etc. 
However, they're even worse than many internet forums for suggesting
large chunks of code of unclear provenance to copy+paste.

It doesn't matter if it's ChatGPT, GitHub Copilot, a local LLM, someone
on StackOverflow, or a YouTube video that's giving you some code you want 
to copy. 3 characters are almost certainly fine, 3 pages are almost 
certainly not, a general idea is often fine, and you absolutely need to 
engage your brain before committing to ASF repos!


Otherwise, if you do still think more rules / examples / etc are needed, 
you'll be wanting legal-discuss@
https://lists.apache.org/list.html?legal-discuss@apache.org

Cheers
Nick

Re: Copilot license for open source?

Posted by Michael Wechner <mi...@wyona.com>.
Thanks for the pointer to the Generative Tooling rules, which I was not aware of so far.

At the bottom it says that the ASF does not tell developers what tools to use, but I think it would be useful to have some concrete examples, which would make the rules clearer.

WDYT?

Thanks

Michael


> On 21.04.2024 at 17:48, Nick Burch <ni...@apache.org> wrote:
> 
> On Fri, 19 Apr 2024, Nicholas DiPiazza wrote:
>> Can I get an open source license for GitHub Copilot?
> 
> I've not heard of anyone offering that. Some of the open and open-ish models are quite good on coding tasks, though you'd need to hop to a different interface to ask for help (unlike the in-line way with GitHub Copilot).
> 
> Whatever you opt for, make sure you read + understand + follow the ASF Generative Tooling rules though!
> https://www.apache.org/legal/generative-tooling.html
> 
> Nick


Re: Copilot license for open source?

Posted by Nick Burch <ni...@apache.org>.
On Fri, 19 Apr 2024, Nicholas DiPiazza wrote:
> Can I get an open source license for GitHub Copilot?

I've not heard of anyone offering that. Some of the open and open-ish 
models are quite good on coding tasks, though you'd need to hop to a 
different interface to ask for help (unlike the in-line way with GitHub
Copilot).

Whatever you opt for, make sure you read + understand + follow the ASF 
Generative Tooling rules though!
https://www.apache.org/legal/generative-tooling.html

Nick