Posted to dev@ctakes.apache.org by Karthik Sarma <ks...@ksarma.com> on 2013/04/03 13:53:07 UTC

Re: [DISCUSS] Where should cTAKES models live?

I like b as well





--
Karthik Sarma
UCLA Medical Scientist Training Program Class of 20??
Member, UCLA Medical Imaging & Informatics Lab
Member, CA Delegation to the House of Delegates of the American Medical
Association
ksarma@ksarma.com
gchat: ksarma@gmail.com
linkedin: www.linkedin.com/in/ksarma


On Fri, Mar 29, 2013 at 8:58 AM, Masanz, James J. <Ma...@mayo.edu> wrote:

> I agree with (b)
>
>
> > -----Original Message-----
> > From: dev-return-1411-Masanz.James=mayo.edu@ctakes.apache.org
> > [mailto:dev-return-1411-Masanz.James=mayo.edu@ctakes.apache.org]
> > On Behalf Of Steven Bethard
> > Sent: Friday, March 29, 2013 8:27 AM
> > To: dev@ctakes.apache.org
> > Subject: Re: [DISCUSS] Where should cTAKES models live?
> >
> > On Mar 29, 2013, at 7:09 AM, "Chen, Pei" <Pei.Chen@childrens.harvard.edu> wrote:
> > > It looks like the general consensus is for #2) Leave them in the ASF
> > > repo, but as separate modules/project(s).
> > > Which means we (the community) will take on the risk (security, ip,
> > > license, etc.) and responsibility for the models that we commit.
> > > I'll take a stab at this today...
> > > Does anyone think it's worthwhile to (a) lump them all together and
> > > call it a ctakes-resources project/model for pragmatic reasons?
> > > Otherwise (b), we'll have a resource module for each such as
> > > ctakes-core-res, ctakes-pos-tagger-res, etc.?
> >
> > I prefer (b). I know that means a lot more projects, but if I only want
> > to, say, run the ctakes-temporal models, it would be a pity if I had to
> > pull in the whole UMLS distribution at the same time.
> >
> > Steve
> >
> > >
> > > --Pei
> > >
> > >
> > >> -----Original Message-----
> > >> From: ksarma@gmail.com [mailto:ksarma@gmail.com] On Behalf Of Karthik
> > >> Sarma
> > >> Sent: Tuesday, March 19, 2013 1:35 PM
> > >> To: cTAKES Developer List
> > >> Subject: Re: [DISCUSS] Where should cTAKES models live?
> > >>
> > >> I concur. +1 for option 2 -- I do not really see any advantages that
> > >> option 3 could have over option 2, as the difference should be largely
> > >> transparent to users (and even developers)
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >> Karthik Sarma
> > >> UCLA Medical Scientist Training Program Class of 20??
> > >> Member, UCLA Medical Imaging & Informatics Lab Member, CA Delegation
> > >> to the House of Delegates of the American Medical Association
> > >> ksarma@ksarma.com
> > >> gchat: ksarma@gmail.com
> > >> linkedin: www.linkedin.com/in/ksarma
> > >>
> > >>
> > >> On Tue, Mar 19, 2013 at 9:49 AM, Masanz, James J. <Ma...@mayo.edu> wrote:
> > >>
> > >>> I also am +1 for option 2.
> > >>>
> > >>> #3 is my least favorite, because of the download time for some of
> > >>> the models, both for cases like Steve mentioned but also for cases
> > >>> of wanting to check out a fresh copy of the code and not wanting to
> > >>> wait to check out the models again
> > >>>
> > >>> -- James
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: ctakes-dev-return-1378-Masanz.James=mayo.edu@incubator.apache.org
> > >>>> [mailto:ctakes-dev-return-1378-Masanz.James=mayo.edu@incubator.apache.org]
> > >>>> On Behalf Of Steven Bethard
> > >>>> Sent: Friday, March 15, 2013 1:06 PM
> > >>>> To: ctakes-dev@incubator.apache.org
> > >>>> Subject: Re: [DISCUSS] Where should cTAKES models live?
> > >>>>
> > >>>> On Mar 15, 2013, at 4:39 PM, "Chen, Pei" <Pei.Chen@childrens.harvard.edu> wrote:
> > >>>>> So the question is: What should we do with the model files?  Some
> > >>>>> options include:
> > >>>>>
> > >>>>> 1)      Leave them in SourceForge/Maven Central.  Maven can download
> > >>>>> and include them in the convenience binaries in the
> > >>>>> ctakes-distribution project. Something we did quickly for 3.0, but
> > >>>>> needs to be improved if we go with this approach.  For example: [2]
> > >>>>>
> > >>>>> 2)      Leave them in the ASF repo, but separate modules/projects.
> > >>>>>
> > >>>>> 3)      Keep them in the same respective ASF modules under
> > >>>>> /src/main/resources
> > >>>>>
> > >>>>> I think it's nice to keep these fairly large (~1GB) and static
> > >>>>> resource files separate from the source code (Either option 1 or 2).
> > >>>>> Also, option 1 will require a little more work by the
> > >>>>> committers/release managers but will definitely avoid any licensing
> > >>>>> issues/concerns.
> > >>>>
> > >>>> I'd definitely vote for (2). That makes releases much easier than
> > >>>> if you have to coordinate between the ASF and Sourceforge
> > >>>> repositories, but also allows people to depend on the code in a
> > >>>> module without also pulling in all the models as well. (This would
> > >>>> make a lot of sense even now, for example, in ctakes-temporal which
> > >>>> depends on ctakes-relation-extractor only for the relation
> > >>>> extraction framework and not for the location_of and degree_of
> > >>>> models.)
> > >>>>
> > >>>> Steve
> > >>>
>
>

RE: [DISCUSS] Where should cTAKES models live?

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
This has been done in trunk in  r1463641.
Feel free to give it a whirl...
