Posted to dev@uima.apache.org by Peter Klügl <pk...@uni-wuerzburg.de> on 2013/04/22 14:45:02 UTC

[DISCUSS] New extensions/components for UIMA TextMarker

Hi,

I have been planning for some time to contribute a few extensions/components
for UIMA TextMarker, which have been developed by my students. As I will
rename all projects soon, I want to bring the contributions forward.

With this mail, I want to start a discussion on whether the contributions
are reasonable and welcome, and whether the proposed procedure is OK.

textmarker-ep-textruler-kep:
A new rule learning algorithm based on the idea that humans use different
engineering patterns to create rule files. The implementation contains
simple learning algorithms for a few patterns and tries to combine the
different rules in order to take advantage of their synergy.
Essentially, the resulting rules should more closely resemble the rules a
human would write.

textmarker-ep-textruler-trabal:
A new rule learning algorithm that is able to induce
transformation-based error-driven rules. The basic idea is similar to
the Brill tagger, but it is completely generic (no rule templates) and
can also handle arbitrary annotations instead of tags of tokens.
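
To illustrate the general transformation-based, error-driven idea, here is a
minimal Python sketch. It is not the TraBaL implementation: the rule shape
(word, from_label, to_label), the function name learn_rules, and the toy data
are assumptions made purely for illustration, and unlike TraBaL the sketch
uses one fixed rule shape and plain token labels.

```python
def learn_rules(tokens, current, gold, max_rules=10):
    """Greedy transformation-based error-driven learning (Brill-style sketch).

    A rule is (word, from_label, to_label): relabel every token equal to
    `word` that currently carries `from_label` with `to_label`.
    This rule shape is a hypothetical simplification.
    """
    current = list(current)
    rules = []
    for _ in range(max_rules):
        # Propose one candidate rule for each remaining error.
        candidates = {
            (w, c, g) for w, c, g in zip(tokens, current, gold) if c != g
        }
        best_rule, best_gain = None, 0
        for word, src, dst in candidates:
            gain = 0
            for w, c, g in zip(tokens, current, gold):
                if w == word and c == src:
                    # +1 for each error fixed, -1 for each correct label broken.
                    gain += (dst == g) - (c == g)
            if gain > best_gain:
                best_rule, best_gain = (word, src, dst), gain
        if best_rule is None:
            break  # no rule reduces the error count any further
        word, src, dst = best_rule
        current = [dst if (w == word and c == src) else c
                   for w, c in zip(tokens, current)]
        rules.append(best_rule)
    return rules, current


# Toy data: the baseline labels every "can" as VERB; the gold data says NOUN.
tokens = ["the", "can", "rusts", "the", "can"]
baseline = ["DET", "VERB", "VERB", "DET", "VERB"]
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
rules, corrected = learn_rules(tokens, baseline, gold)
```

Each round keeps only the single rule with the largest net error reduction,
which is the "error-driven" part; learning stops as soon as no candidate
improves on the current labeling.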

textmarker-ep-augur:
This project is essentially about evaluating information extraction
models (textmarker rules) without labeled data. It is a new framework
similar to the testing views of the TextMarker Workbench, which are used
for back-testing and test-driven development. In contrast to the testing
views, the new framework is able to evaluate documents without a gold
dataset. Here, the user can specify background knowledge (constraints),
which are applied to estimate the accuracy.
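
As a rough sketch of what evaluating with constraints instead of a gold
dataset could look like, the Python snippet below scores extracted
annotations by the fraction of user-specified constraints they satisfy. The
data layout, the function constraint_score, and both example constraints are
hypothetical; how the actual framework estimates accuracy is not described
here.

```python
from typing import Callable, Dict, List

# Assumed layout: annotation type -> list of covered texts for one document.
Annotations = Dict[str, List[str]]
Constraint = Callable[[Annotations], bool]


def constraint_score(docs: List[Annotations],
                     constraints: List[Constraint]) -> float:
    """Return the fraction of (document, constraint) checks that hold."""
    checks = [constraint(doc) for doc in docs for constraint in constraints]
    return sum(checks) / len(checks) if checks else 0.0


# Hypothetical background knowledge for a reference-extraction task.
def has_exactly_one_title(doc: Annotations) -> bool:
    return len(doc.get("Title", [])) == 1


def has_an_author(doc: Annotations) -> bool:
    return len(doc.get("Author", [])) >= 1


docs = [
    {"Title": ["A"], "Author": ["X", "Y"]},
    {"Title": ["B", "C"], "Author": ["Z"]},  # violates the title constraint
]
score = constraint_score(docs, [has_exactly_one_title, has_an_author])
```

No labeled documents are needed: the score only reflects how well the
extraction output agrees with the stated expectations, so it is an estimate
rather than a measured accuracy.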

Procedure of contribution:
- Create a Jira issue for each contribution
- Let each student attach their project (I do not know whether there is
still a check box for the license)
- Commit the projects to sandbox/trunk
- Integrate projects in existing textmarker projects

Is there a way to avoid an ICLA for each student?

Before the projects can be part of a future UIMA TextMarker release,
some additional work needs to be done. I would take care of that.

Best,

Peter

Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
Hi,

On 22.04.2013 17:20, Peter Klügl wrote:
> On 22.04.2013 16:45, Marshall Schor wrote:
>> An important idea for student work:
>>
>> Some students don't pay (too) much attention to whether or not they are
>> incorporating other people's code into their projects.  But for code which Apache
>> is distributing, it is important that when people "contribute" code under the
>> ASL 2.0, they realize they are certifying it's their original code and that
>> they have the rights to contribute it.  See sections 5 and 7 of the CLA
>> http://www.apache.org/licenses/icla.txt
>>
>> That's one of the reasons to ask for people to sign the CLA - to make sure they
>> understand this :-).
> As far as I know the code, there shouldn't be a problem. There will be
> some refactoring and renaming where I can take a closer look at it.
> However, I prefer to do that after they have attached the projects.
>
> So, I take it that the procedure is OK if I send the ICLAs to sec@ before
> submitting anything to our svn?

I created the issues for the three contributions and have already sent
two of the three ICLAs to sec@.

Peter





Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
On 22.04.2013 16:45, Marshall Schor wrote:
> An important idea for student work:
>
> Some students don't pay (too) much attention to whether or not they are
> incorporating other people's code into their projects.  But for code which Apache
> is distributing, it is important that when people "contribute" code under the
> ASL 2.0, they realize they are certifying it's their original code and that
> they have the rights to contribute it.  See sections 5 and 7 of the CLA
> http://www.apache.org/licenses/icla.txt
>
> That's one of the reasons to ask for people to sign the CLA - to make sure they
> understand this :-).

As far as I know the code, there shouldn't be a problem. There will be
some refactoring and renaming where I can take a closer look at it.
However, I prefer to do that after they have attached the projects.

So, I take it that the procedure is OK if I send the ICLAs to sec@ before
submitting anything to our svn?

Best,

Peter



Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Marshall Schor <ms...@schor.com>.
An important idea for student work:

Some students don't pay (too) much attention to whether or not they are
incorporating other people's code into their projects.  But for code which Apache
is distributing, it is important that when people "contribute" code under the
ASL 2.0, they realize they are certifying it's their original code and that
they have the rights to contribute it.  See sections 5 and 7 of the CLA
http://www.apache.org/licenses/icla.txt

That's one of the reasons to ask for people to sign the CLA - to make sure they
understand this :-).

-Marshall



Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
On 22.04.2013 15:20, Richard Eckart de Castilho wrote:
> Am 22.04.2013 um 15:03 schrieb Peter Klügl <pk...@uni-wuerzburg.de>:
>> On 22.04.2013 14:50, Richard Eckart de Castilho wrote:
>> Is something like a bachelor thesis an employment agreement? Is it
>> contract work if I set a topic and the students have signed up?
>>
>> The students are of course aware of the planned contribution and have
>> approved it.
>
> So far, I have only treated remunerated work, e.g. as a student assistant,
> as contract work. In that case, the ownership of the work is with the 
> employer, e.g. the university.
>
> As far as I know, student work that is done as part of the studies is
> theirs [1]. If they put their work under the ASL themselves, you can
> probably integrate their work via the third-party policy (see Marshall's
> comment on my recent post about adding a new class to uimaFIT). If you have
> written approval from the students that you can use their work under
> the ASL or similar conditions, maybe this can substitute for the ICLA. But
> it may just be simpler to get them to file an ICLA.

The thesis registration form here has a sub-item stating
that the result (code) needs to be licensed under an appropriate open
source license (ASL 2.0). So, I have a document signed by the students,
but I will nevertheless try to get a signed ICLA.

Peter



Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Richard Eckart de Castilho <ri...@gmail.com>.
Am 22.04.2013 um 15:03 schrieb Peter Klügl <pk...@uni-wuerzburg.de>:
> On 22.04.2013 14:50, Richard Eckart de Castilho wrote:
> Is something like a bachelor thesis an employment agreement? Is it
> contract work if I set a topic and the students have signed up?
> 
> The students are of course aware of the planned contribution and have
> approved it.


So far, I have only treated remunerated work, e.g. as a student assistant,
as contract work. In that case, the ownership of the work is with the 
employer, e.g. the university.

As far as I know, student work that is done as part of the studies is
theirs [1]. If they put their work under the ASL themselves, you can
probably integrate their work via the third-party policy (see Marshall's
comment on my recent post about adding a new class to uimaFIT). If you have
written approval from the students that you can use their work under
the ASL or similar conditions, maybe this can substitute for the ICLA. But
it may just be simpler to get them to file an ICLA.

I'm curious how this is going to work out for you. We are considering
setting up a contributor agreement which licenses such code to the university
and offering every interested potential contributor the option to file one.
The idea is that we have them on file for the day that the university
may want to contribute something to an open source organization, so that we
can act on the basis of these agreements then. Otherwise, we figure, it'll
be hard to impossible to get the documents signed when those people are
long gone.

I'd be interested to know how others handle that.

Cheers,

-- Richard

[1] http://www.diplom.de/urheberrecht-nutzungsrecht-autor.html

Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
On 22.04.2013 15:13, Jörn Kottmann wrote:
> On 04/22/2013 03:03 PM, Peter Klügl wrote:
>> Is something like a bachelor thesis an employment agreement? Is it
>> contract work if I set a topic and the students have signed up?
>
> If the work is done as part of a bachelor thesis, it is definitely
> significant enough to count as "substantial"; it's probably at least
> 2-3 man-months of full-time work.
> Do they get paid for it by someone? If they do their thesis at a
> company and signed a contract with them, it might say that the company
> has the rights to the produced source code.
>

No payment and no companies involved. However, one of them continued his
work as a student assistant and was paid for that.

Peter

> Jörn


Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Jörn Kottmann <ko...@gmail.com>.
On 04/22/2013 03:03 PM, Peter Klügl wrote:
> Is something like a bachelor thesis an employment agreement? Is it
> contract work if I set a topic and the students have signed up?

If the work is done as part of a bachelor thesis, it is definitely
significant enough to count as "substantial"; it's probably at least
2-3 man-months of full-time work.
Do they get paid for it by someone? If they do their thesis at a
company and signed a contract with them, it might say that the company
has the rights to the produced source code.

Jörn

Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
On 22.04.2013 14:50, Richard Eckart de Castilho wrote:
> Am 22.04.2013 um 14:45 schrieb Peter Klügl <pk...@uni-wuerzburg.de>:
>
>> Hi,
>>
>> I have been planning for some time to contribute a few extensions/components
>> for UIMA TextMarker, which have been developed by my students. As I will
>> rename all projects soon, I want to bring the contributions forward.
>>
>> With this mail, I want to start a discussion on whether the contributions
>> are reasonable and welcome, and whether the proposed procedure is OK.
>>
>> …
>>
>> Is there a way to avoid an ICLA for each student?
> AFAIK, it depends on the mode in which the students implemented the contributions.
> E.g. the uimaFIT grant contained some code written by people who did not file an
> ICLA, but they were under contract with our university group, so the SGA/CCLA applied
> to their work. If the work was not contract work and the contributions are
> "substantial", I understand that an ICLA is required. I'm still not certain what
> "substantial" means, though.

http://uima.apache.org/get-involved.html says "for contributions beyond
simple typo fixes or short patches ... -> ICLA". So, I think, almost
everything is "substantial".

I am not sure how to interpret http://www.apache.org/licenses/#clas

Is something like a bachelor thesis an employment agreement? Is it
contract work if I set a topic and the students have signed up?

The students are of course aware of the planned contribution and have
approved it.

Best,

Peter

> Cheers,
>
> -- Richard


Re: [DISCUSS] New extensions/components for UIMA TextMarker

Posted by Richard Eckart de Castilho <ri...@gmail.com>.
Am 22.04.2013 um 14:45 schrieb Peter Klügl <pk...@uni-wuerzburg.de>:

> Hi,
> 
> I have been planning for some time to contribute a few extensions/components
> for UIMA TextMarker, which have been developed by my students. As I will
> rename all projects soon, I want to bring the contributions forward.
>
> With this mail, I want to start a discussion on whether the contributions
> are reasonable and welcome, and whether the proposed procedure is OK.
> 
> …
> 
> Is there a way to avoid an ICLA for each student?

AFAIK, it depends on the mode in which the students implemented the contributions.
E.g. the uimaFIT grant contained some code written by people who did not file an
ICLA, but they were under contract with our university group, so the SGA/CCLA applied
to their work. If the work was not contract work and the contributions are
"substantial", I understand that an ICLA is required. I'm still not certain what
"substantial" means, though.

Cheers,

-- Richard