Posted to dev@ctakes.apache.org by "Chen, Pei" <Pe...@childrens.harvard.edu> on 2012/12/04 18:04:31 UTC

Create a ctakes-3.0.0-incubating RC4 soon?

Shall we create an RC4 based on the findings so far from RC3? As far as I know there were no code changes, just minor updates to configuration XML, script files, and the README:

https://issues.apache.org/jira/browse/CTAKES-114 (Fixed - unexpected end of file in runctakesCPE.sh)
https://issues.apache.org/jira/browse/CTAKES-113 (Fixed - add example sentence to readme using words from toy dictionary)
https://issues.apache.org/jira/browse/CTAKES-101 (Pending - term spotting pipeline. Sean: could you confirm whether only the descriptor XML files need to be updated?)

Perhaps we should give our mentors a few more days in case there are any ASF-process-specific issues that we also need to incorporate?

--Pei

RE: Create a ctakes-3.0.0-incubating RC4 soon?

Posted by "Masanz, James J." <Ma...@mayo.edu>.
I'm a fan of frequent incremental releases too.
If that means leaving even the source package as a big download for now, that sounds fine to me.

That said, it looks like the full LVG is not included yet, and adding it will increase the size.
I see less than 2 MB taken up by
\org\apache\ctakes\lvg\data\HSqlDb
Am I missing something, or should I open a JIRA ticket?

-- James


> -----Original Message-----
> From: ctakes-dev-return-974-Masanz.James=mayo.edu@incubator.apache.org
> [mailto:ctakes-dev-return-974-Masanz.James=mayo.edu@incubator.apache.org]
> On Behalf Of Chen, Pei
> Sent: Thursday, December 13, 2012 9:44 AM
> To: ctakes-dev@incubator.apache.org
> Subject: RE: Create a ctakes-3.0.0-incubating RC4 soon?
> 
> I just wanted to drop a note to see how best to proceed with
> 3.0.0-incubating.
> I'm personally a big fan of frequent incremental releases rather than
> one mammoth release.
> That said, should we proceed (as long as there are no show-stoppers) and
> make any fixes in a subsequent release?
> 
> --Pei
> 

RE: Create a ctakes-3.0.0-incubating RC4 soon?

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
I just wanted to drop a note to see how best to proceed with 3.0.0-incubating.
I'm personally a big fan of frequent incremental releases rather than one mammoth release.
That said, should we proceed (as long as there are no show-stoppers) and make any fixes in a subsequent release?

--Pei

RE: Create a ctakes-3.0.0-incubating RC4 soon?

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Hi Jukka,
Welcome, it's awesome to have the chair (or former chair now?) of the Incubator project join!!

> * The source archive is pretty big at ~600MB. Surely that's not all source code
> written by you? External or precompiled binaries like the various lib jars
> should not be included in the source release as their contents can't
> reasonably be reviewed. Instead they can be made available as a separate
> -libs (or -deps) archive, or (better yet) downloaded automatically from
> something like the central Maven repository as a part of the build.

These files are system-generated statistical models built by the respective authors/institutions who contributed the source code, and they should be licensed under ASL 2.0 along with the code.
For the ones that had a license or potential license conflict, we moved them to Maven Central to be downloaded automatically. For example:
<!-- cTAKES Resources -->
<dependency>
  <groupId>net.sourceforge.ctakesresources</groupId>
  <artifactId>ctakes-resources-umls2011ab</artifactId>
  <version>3.0.0</version>
</dependency>

Do you think this is a show-stopper for releasing this candidate out of the Incubator?
Or will we need to move the remaining resources out of the project and have them pulled in automatically by Maven, like the UMLS resources? That is, could we potentially do this for the next release?

> * The top-level LICENSE file only mentions ALv2. Are there no non-ALv2 code
> or dependencies included? All the licenses covering code or other bits
> included in a release archive should be included or at least referenced in the
> LICENSE file.
Do we need separate LICENSE/NOTICE files? There is one inside ctakes-distribution (it gets copied when packaging all of the binaries) that contains references to the third-party libraries.

> * How does the licensing of models like the ones in
> ctakes-dependency-parser/src/main/resources/org/apache/ctakes/dependency/parser/models/dependency/
> work?
Yes, these models are system-generated statistical models built by the respective authors/institutions who contributed the source code, and they are licensed under ASL along with the code.

Thanks,
Pei


> 
> BR,
> 
> Jukka Zitting

Re: Create a ctakes-3.0.0-incubating RC4 soon?

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Dec 4, 2012 at 6:04 PM, Chen, Pei
<Pe...@childrens.harvard.edu> wrote:
> Perhaps - we should give our mentors a few more days in case there are any
> ASF process specific issues that we may also need to incorporate?

I'm not a formal mentor (yet), but I'm taking a look at the latest
release candidate now as an interested IPMC member. I haven't yet
completed a full review, but here are a few initial things that should
be addressed:

* The source archive is pretty big at ~600MB. Surely that's not all
source code written by you? External or precompiled binaries like the
various lib jars should not be included in the source release as their
contents can't reasonably be reviewed. Instead they can be made
available as a separate -libs (or -deps) archive, or (better yet)
downloaded automatically from something like the central Maven
repository as a part of the build.

* The top-level LICENSE file only mentions ALv2. Are there no non-ALv2
code or dependencies included? All the licenses covering code or other
bits included in a release archive should be included or at least
referenced in the LICENSE file.

* How does the licensing of models like the ones in
ctakes-dependency-parser/src/main/resources/org/apache/ctakes/dependency/parser/models/dependency/
work?
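
On the first point, one rough way to see what is inflating the archive is to total file sizes by extension in an unpacked source tree. The following is only a sketch: the function names and the command-line usage are illustrative assumptions and are not part of cTAKES or its build.

```python
import os
import sys
from collections import defaultdict


def size_by_extension(root):
    """Return {extension: total_bytes} for all regular files under root."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def report(root, top=10):
    """Print the extensions that account for the most bytes, largest first."""
    ranked = sorted(size_by_extension(root).items(),
                    key=lambda kv: kv[1], reverse=True)
    for ext, nbytes in ranked[:top]:
        print(f"{ext:10s} {nbytes / 1e6:10.1f} MB")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python sizes.py <path-to-unpacked-source-archive>
    report(sys.argv[1])
```

If jars or model files dominate the output, those are the candidates for a separate -libs archive or for download during the build.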

BR,

Jukka Zitting