Posted to dev@ctakes.apache.org by "Finan, Sean" <Se...@childrens.harvard.edu> on 2020/06/29 15:02:26 UTC

ApacheCon 2020 [Bulk]

Hi all,


General admission to ApacheCon 2020 is free:  https://hopin.to/events/apachecon-home


I think that price of admission and travel costs have held back ctakes users from attending past conferences, and lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners.  Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.


The predetermined tracks are still an ill fit when it comes to the nature of ctakes.  https://apachecon.com/acah2020/cfp.html

However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project application.  Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.


There are a scant two weeks to come up with presentations, and less time to propose a track/topic.  The call for presentations ends July 13th.  That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.


Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes.   I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.


I have written some ideas for presentations below.  If you want to take one (modify as you like) then please write me and post to the devlist.  If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself, perhaps somebody else will.    Again ... two weeks.


Thank you,

Sean



*  The following talk ideas are by and large directed toward training.  That does not mean that topics should stay within that scope.


=================================================================


Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible.  Why is it frequently treated as a black box?  Is it lack of need, sparsity of resources, or simply fear of the unknown?

This is a quick start tutorial on adding custom elements to cTAKES.  We illustrate creating simple classes to input, process and output data.  This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.


=================================================================


Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts.  Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads.

We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, Rest Services and Dockerfiles.


==================================================================


Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community.  What is past that standard?  While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules.

We present and discuss the top 10 annotation engines you never knew you had.


====================================================


Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools.  Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms.  But it also comes with a custom dictionary creator.

If you think that your clinical research is directed, then you should probably have a directed dictionary.  UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES.

This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.


====================================================

Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources.  Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss).

More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.


================================================================


Diet cTAKES

One reason cTAKES is a popular framework in clinical natural language processing tools is its use of Apache Maven for project management.  Navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project.  Much of what ends up in your system may lead to unnecessary bloat.

Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.


================================================================


cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice.  However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time /  accurately / ?].

We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.


================================================================


Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system.

We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output.  We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end.

We will also share the novel [techniques / code / integration] that we used for the success of our installation.


================================================================


My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____.   This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other].

We will present [numbers , usage , capabilities / i/o ] of the engine and its [model / data ].
We will also commit the code and documentation to Apache cTAKES.


================================================================


cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______.  The model uses the third party ______ for [newer / faster / more comprehensive] results.

We will present the essentials of model creation as well as [numbers , usage , capabilities / i/o ] of our model.   We will also advocate for the third party _____ and how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.









RE: ApacheCon 2020 [EXTERNAL] [SUSPICIOUS]

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
A fantastic set of presentations, will be of broad interest to the Apache community!
Amazing work, cTAKES community!
Stay safe and healthy all,
--
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Computational Health Informatics Program (CHIP)
Boston Children's Hospital and Harvard Medical School
401 Park, 5th floor East, 5523.3
Boston, MA 02215
Tel: (617) 919-2972
Fax: (617) 730-0817


-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu] 
Sent: Monday, July 6, 2020 9:21 AM
To: dev@ctakes.apache.org; user@ctakes.apache.org
Subject: Fw: ApacheCon 2020 [EXTERNAL] [SUSPICIOUS]



I can't believe that I forgot to mention ...


There will also be a presentation (maybe two?) by a group that has adapted ctakes to work with two other languages.  They have also integrated ctakes with other tools such as FreeLing and HeidelTime.  So cool ...


Cheers,

Sean


________________________________
From: Finan, Sean
Sent: Monday, July 6, 2020 9:08 AM
To: dev@ctakes.apache.org; user@ctakes.apache.org
Subject: ApacheCon 2020


Hi all,


The ctakes representation at ApacheCon 2020 is looking good!


ApacheCon 2020 runs September 29 through October 1.

Submission runs through Sunday, July 12.  Technically it is 8:00 a.m. Eastern time Monday, but please don't procrastinate.

Registration is free.


I am excited to announce that we have three groups interested in giving presentations on their configuration and use of ctakes at a large scale!

We also have a presentation on the installation of the ctakes Rest service using the ctakes-rest module!


Knowledge on these topics is always extremely valuable to our users, and I for one really want to see how sites use ctakes when given different resources, requirements and restrictions.  Because of that, I am trying to put together (technology allowing) a roundtable discussion with those presenters.  That should be of value to every user no matter what your situation.


We still need more presentations!  To encourage you, here is a little information:


1.  What you do is interesting!  If you think that nobody out there cares about what you've done and how, then you probably aren't fully aware of how large and diverse our user base really is.  People want to know about things like integration, customization, clinical specialty application, augmentation and favorite capability fascination.

2.  Submission is very simple.  This is not like a scientific conference that requires a complete paper describing your work.  You only need to submit a blurb that loosely covers your topic and major talking point(s).  Half a dozen sentences will suffice.  In fact, what I sent last week (far below) could pass muster for a submission.  Go for something that will be on a brochure / schedule.

3.  The audience is made up of people just like you.  Developers, Bioinformaticians, IT Specialists, Students, Medical Researchers, AI Explorers and far more Hackers than Rock Stars.

4.  Slick presentation skills are not necessary.  Don't worry if you have never spoken to a room full of listeners.  Don't worry if English isn't your first language.  Don't worry if your slides are "sloppy".  Your presentation will not be graded.

5.  You don't need to prepare your whole talk before submitting.    Idea now, details later.

6.  Registration is FREE.


Right now the speaking time is anything up to 50 minutes.  If you don't want to present a full 50 minutes then that is ok ... The rest can be filled with extra question/answer or somebody else may fill the remaining time with a presentation on a similar topic.


I am going to put together a lightning round.  If you think that you can cover some material in five to fifteen minutes then this is for you!  Lightning rounds can be fun as you can make an impact with two or three slides and barely enough speaking to run out of breath.  This is really a free-for-all.  You can pack the time with data, give a short demonstration, compare using ctakes to breaking a mustang, or even do some on-topic (ctakes, nlp, AI, bioinformatics) stand up.  Anything goes.  This was an interesting (full) talk last year: https://aceu19.apachecon.com/session/confessions-middle-aged-coder-turned-gravel-grinder .   If you want to be in the lightning round, just write me a couple of sentences on your strike and I will put together the full submission for ApacheCon.  Does it get any easier?


I will present one or two things, but to maximize impact I would like to know what most interests / would help all of you.  So, please write me a topic or two that would best apply to your work.


Some links ...


ApacheCon Home Page:  https://www.apachecon.com/

ApacheCon Registration: https://hopin.to/events/apachecon-home

ApacheCon Submission:  https://acna2020.jamhosted.net/cfp.html


Lastly, so that we don't crash a server, I would like to have a rough head count for attendance estimation.  If you think that you will watch any presentation of ctakes then please send me ( seanfinan@apache.org ) an email with the subject "Attend" and "+1" in the body.


Cheers,

Sean


________________________________
From: Finan, Sean
Sent: Monday, June 29, 2020 11:02 AM
To: dev@ctakes.apache.org
Subject: ApacheCon 2020


Hi all,


General admission to ApacheCon 2020 is free:  https://hopin.to/events/apachecon-home


I think that price of admission and travel costs have held back ctakes users from attending past conferences, and lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners.  Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.


The predetermined tracks are still an ill fit when it comes to the nature of ctakes.  https://apachecon.com/acah2020/cfp.html

However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project application.  Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.


There are a scant two weeks to come up with presentations, and less time to propose a track/topic.  The call for presentations ends July 13th.  That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.


Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes.   I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.


I have written some ideas for presentations below.  If you want to take one (modify as you like) then please write me and post to the devlist.  If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself, perhaps somebody else will.    Again ... two weeks.


Thank you,

Sean



*  The following talk ideas are by and large directed toward training.  That does not mean that topics should stay within that scope.


=================================================================


Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible.  Why is it frequently treated as a black box?  Is it lack of need, sparsity of resources, or simply fear of the unknown?

This is a quick start tutorial on adding custom elements to cTAKES.  We illustrate creating simple classes to input, process and output data.  This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.


=================================================================


Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts.  Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads.

We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, Rest Services and Dockerfiles.


==================================================================


Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community.  What is past that standard?  While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules.

We present and discuss the top 10 annotation engines you never knew you had.


====================================================


Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools.  Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms.  But it also comes with a custom dictionary creator.

If you think that your clinical research is directed, then you should probably have a directed dictionary.  UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES.

This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.


====================================================

Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources.  Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss).

More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.


================================================================


Diet cTAKES

One reason cTAKES is a popular framework in clinical natural language processing tools is its use of Apache Maven for project management.  Navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project.  Much of what ends up in your system may lead to unnecessary bloat.

Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.


================================================================


cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice.  However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time /  accurately / ?].

We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.


================================================================


Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system.

We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output.  We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end.

We will also share the novel [techniques / code / integration] that we used for the success of our installation.


================================================================


My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____.   This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other].

We will present [numbers , usage , capabilities / i/o ] of the engine and its [model / data ].
We will also commit the code and documentation to Apache cTAKES.


================================================================


cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______.  The model uses the third party ______ for [newer / faster / more comprehensive] results.

We will present the essentials of model creation as well as [numbers , usage , capabilities / i/o ] of our model.   We will also advocate for the third party _____ and how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.









Re: Fw: ApacheCon 2020

Posted by Peter Abramowitsch <pa...@gmail.com>.
Hi Sean

I'm asking my team's manager to see if we can present.  I work as Architect
/ cTakes Implementer with a team at the UCSF Bakar Institute for
computational health sciences.

Peter

>
> We will present [numbers , usage , capabilities / i/o ] of the engine and
> its [model / data ].
> We will also commit the code and documentation to Apache cTAKES.
>
>
> ================================================================
>
>
> cTAKES on the Catwalk
>
> We have created a Machine Learning model that can be used in cTAKES for
> ______.  The model uses the third party ______ for [newer / faster / more
> comprehensive] results.
>
> We will present the essentials of model creation as well as [numbers ,
> usage , capabilities / i/o ] of our model.   We will also advocate for the
> third party _____ and how we integrated it with cTAKES.
> We will also commit the code [model] and documentation to Apache cTAKES.
>
>
>
>
>
>
>
>
>

RE: ApacheCon 2020

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
A fantastic set of presentations that will be of broad interest to the Apache community!
Amazing work, cTAKES community!
Stay safe and healthy all,
--
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Computational Health Informatics Program (CHIP)
Boston Children's Hospital and Harvard Medical School
401 Park, 5th floor East, 5523.3
Boston, MA 02215
Tel: (617) 919-2972
Fax: (617) 730-0817


-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu] 
Sent: Monday, July 6, 2020 9:21 AM
To: dev@ctakes.apache.org; user@ctakes.apache.org
Subject: Fw: ApacheCon 2020


I can't believe that I forgot to mention ...


There will also be a presentation (maybe two?) by a group that has adapted ctakes to work with two other languages.  They have also integrated ctakes with other tools such as FreeLing and HeidelTime.  So cool ...


Cheers,

Sean


Fw: ApacheCon 2020

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
I can't believe that I forgot to mention ...


There will also be a presentation (maybe two?) by a group that has adapted ctakes to work with two other languages.  They have also integrated ctakes with other tools such as FreeLing and HeidelTime.  So cool ...


Cheers,

Sean


ApacheCon 2020

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi all,


The ctakes representation at ApacheCon 2020 is looking good!


ApacheCon 2020 runs September 29 through October 1.

Submission runs through Sunday, July 12.  Technically it is 8:00 a.m. Eastern time Monday, but please don't procrastinate.

Registration is free.


I am excited to announce that we have three groups interested in giving presentations on their configuration and use of ctakes at a large scale!

We also have a presentation on the installation of the ctakes Rest service using the ctakes-rest module!


Knowledge on these topics is always extremely valuable to our users, and I for one really want to see how sites use ctakes when given different resources, requirements and restrictions.  Because of that, I am trying to put together (technology allowing) a roundtable discussion with those presenters.  That should be of value to every user no matter what your situation.


We still need more presentations!  To encourage you, here is a little information:


1.  What you do is interesting!  If you think that nobody out there cares about what you've done and how, then you probably aren't fully aware of how large and diverse our user base really is.  People want to know about things like integration, customization, clinical specialty application, augmentation and favorite capability fascination.

2.  Submission is very simple.  This is not like a scientific conference that requires a complete paper describing your work.  You only need to submit a blurb that loosely covers your topic and major talking point(s).  Half a dozen sentences will suffice.  In fact, what I sent last week (far below) could pass muster for a submission.  Go for something that will be on a brochure / schedule.

3.  The audience is made up of people just like you: Developers, Bioinformaticians, IT Specialists, Students, Medical Researchers, AI Explorers and far more Hackers than Rock Stars.

4.  Slick presentation skills are not necessary.  Don't worry if you have never spoken to a room full of listeners.  Don't worry if English isn't your first language.  Don't worry if your slides are "sloppy".  Your presentation will not be graded.

5.  You don't need to prepare your whole talk before submitting.    Idea now, details later.

6.  Registration is FREE.


Right now the speaking time is anything up to 50 minutes.  If you don't want to present for a full 50 minutes then that is OK.  The rest can be filled with extra question/answer, or somebody else may fill the remaining time with a presentation on a similar topic.


I am going to put together a lightning round.  If you think that you can cover some material in five to fifteen minutes then this is for you!  Lightning rounds can be fun as you can make an impact with two or three slides and barely enough speaking to run out of breath.  This is really a free-for-all.  You can pack the time with data, give a short demonstration, compare using ctakes to breaking a mustang, or even do some on-topic (ctakes, NLP, AI, bioinformatics) stand-up.  Anything goes.  This was an interesting (full) talk last year: https://aceu19.apachecon.com/session/confessions-middle-aged-coder-turned-gravel-grinder .   If you want to be in the lightning round, just write me a couple of sentences on your strike and I will put together the full submission for ApacheCon.  Does it get any easier?


I will present one or two things, but to maximize impact I would like to know what most interests / would help all of you.  So, please write me a topic or two that would best apply to your work.


Some links ...


ApacheCon Home Page:  https://www.apachecon.com/

ApacheCon Registration: https://hopin.to/events/apachecon-home

ApacheCon Submission:  https://acna2020.jamhosted.net/cfp.html


Lastly, so that we don't crash a server, I would like to have a rough head count for attendance estimation.  If you think that you will watch any presentation of ctakes then please send me ( seanfinan@apache.org ) an email with the subject "Attend" and "+1" in the body.


Cheers,

Sean


________________________________
From: Finan, Sean
Sent: Monday, June 29, 2020 11:02 AM
To: dev@ctakes.apache.org
Subject: ApacheCon 2020


Hi all,


General admission to ApacheCon 2020 is free:  https://hopin.to/events/apachecon-home


I think that the price of admission and travel costs have held back ctakes users from attending past conferences, and the lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners.  Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.


The predetermined tracks are still an ill fit when it comes to the nature of ctakes.  https://apachecon.com/acah2020/cfp.html

However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project applications.  Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.


There are a scant two weeks to come up with presentations, and less time to propose a track/topic.  The call for presentations ends July 13th.  That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.


Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes.   I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.


I have written some ideas for presentations below.  If you want to take one (modify as you like) then please write me and post to the devlist.  If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself, perhaps somebody else will.  Again ... two weeks.


Thank you,

Sean



*  The following talk ideas are by and large directed toward training.  That does not mean that topics should stay within that scope.


=================================================================


Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible.  Why is it frequently treated as a black box?  Is it lack of need, sparsity of resources, or simply fear of the unknown?

This is a quick start tutorial on adding custom elements to cTAKES.  We illustrate creating simple classes to input, process and output data.  This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.


=================================================================


Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts.  Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads.

We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, Rest Services and Dockerfiles.


==================================================================


Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community.  What is past that standard?  While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules.

We present and discuss the top 10 annotation engines you never knew you had.


====================================================


Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools.  Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms.  But it also comes with a custom dictionary creator.

If you think that your clinical research is directed, then you should probably have a directed dictionary.  UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES.

This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.


====================================================

Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources.  Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss).

More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.


================================================================


Diet cTAKES

One reason cTAKES is a popular framework among clinical natural language processing tools is its use of Apache Maven for project management.  Even so, navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project.  Much of what ends up in your system may be unnecessary bloat.

Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.


================================================================


cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice.  However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time /  accurately / ?].

We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.


================================================================


Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system.

We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output.  We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end.

We will also share the novel [techniques / code / integration] that we used for the success of our installation.


================================================================


My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____.   This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other].

We will present [numbers , usage , capabilities / i/o ] of the engine and its [model / data ].
We will also commit the code and documentation to Apache cTAKES.


================================================================


cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______.  The model uses the third party ______ for [newer / faster / more comprehensive] results.

We will present the essentials of model creation as well as [numbers , usage , capabilities / i/o ] of our model.   We will also advocate for the third party _____ and how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.









Re: ApacheCon 2020 [Bulk]

Posted by gandhi rajan <ga...@gmail.com>.
Hi Sean,

Would love to help you out if you think I can be handy in any area. Do keep
me posted.


-- 
Regards,
Gandhi

"The best way to find urself is to lose urself in the service of others !!!"

Re: ApacheCon 2020

Posted by Peter Klügl <pe...@averbis.com>.
Hi,


On 02.07.2020 at 00:37, Jeffrey Hill wrote:
> Sean,
> I have an experimental integration of cTAKES and RUTA.  I feel like it would be a good “poster” presentation — not really at the “paper” level.  Would something like that be appropriate?


I would be very interested in that. Is this integration already
available somewhere?


Best,


Peter


> P.S. Are you related to Tim Finan by any chance?
>
-- 
Dr. Peter Klügl
Head of Text Mining/Machine Learning

Averbis GmbH
Salzstr. 15
79098 Freiburg
Germany

Fon: +49 761 708 394 0
Fax: +49 761 708 394 10
Email: peter.kluegl@averbis.com
Web: https://averbis.com

Headquarters: Freiburg im Breisgau
Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó


Re: ApacheCon 2020 [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Jeff,

As far as I know ApacheCon 2020 does not have a venue for posters.  However  ...

If you can put together 2-3 slides and 5-10 minutes (or more) of material then you and I can submit for a 50 minute slot and split the time.  I can easily come up with something that will only take part of the time.  If you are talking about UIMA Ruta then I can come up with something similar but on a different tack - that way the two presentations will have some continuity.

What do you think?

To all ctakes users:  If you are in a similar situation and can put together a presentation on anything that is "poster size" then we can try to bundle them and make one or two of our own "poster session" presentations.  That would cover a lot of ground and be extremely beneficial to a large audience.  There are so many ways to use ctakes, so many ways to build upon it, so many success (and horror) stories ... I think that this is a great opportunity for people to share their work, add to their speaking experience and gain fame!

As far as my being related to any Finan ... I have no idea.

Thanks,
Sean
________________________________________
From: Jeffrey Hill <je...@droicelabs.com>
Sent: Wednesday, July 1, 2020 6:37 PM
To: dev@ctakes.apache.org
Subject: Re: ApacheCon 2020  [EXTERNAL]

Sean,
I have an experimental integration of cTAKES and RUTA.  I feel like it would be a good “poster” presentation — not really at the “paper” level.  Would something like that be appropriate?

P.S. Are you related to Tim Finan by any chance?


________________________________
From: Finan, Sean <Se...@childrens.harvard.edu>
Sent: Monday, June 29, 2020 11:02:26 AM
To: dev@ctakes.apache.org <de...@ctakes.apache.org>
Subject: ApacheCon 2020 [Bulk]

Hi all,


General admission to ApacheCon 2020 is free:  https://hopin.to/events/apachecon-home


I think that the price of admission and travel costs have held back ctakes users from attending past conferences, and the lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners.  Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.


The predetermined tracks are still an ill fit when it comes to the nature of ctakes.  https://apachecon.com/acah2020/cfp.html

However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project applications.  Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.


There are a scant two weeks to come up with presentations, and less time to propose a track/topic.  The call for presentations ends July 13th.  That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.


Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes.   I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.


I have written some ideas for presentations below.  If you want to take one (modify as you like) then please write me and post to the devlist.  If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself, perhaps somebody else will.  Again ... two weeks.


Thank you,

Sean



*  The following talk ideas are by and large directed toward training.  That does not mean that topics should stay within that scope.


=================================================================


Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible.  Why is it frequently treated as a black box?  Is it lack of need, sparsity of resources, or simply fear of the unknown?

This is a quick start tutorial on adding custom elements to cTAKES.  We illustrate creating simple classes to input, process and output data.  This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.


=================================================================


Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts.  Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads.

We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, Rest Services and Dockerfiles.


==================================================================


Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community.  What is past that standard?  While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules.

We present and discuss the top 10 annotation engines you never knew you had.


====================================================


Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools.  Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms.  But it also comes with a custom dictionary creator.

If you think that your clinical research is directed, then you should probably have a directed dictionary.  UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES.

This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.


====================================================

Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources.  Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss).

More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.


================================================================


Diet cTAKES

One reason cTAKES is a popular framework among clinical natural language processing tools is its use of Apache Maven for project management.  Even so, navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project.  Much of what ends up in your system may be unnecessary bloat.

Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.


================================================================


cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice.  However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time /  accurately / ?].

We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.


================================================================


Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system.

We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output.  We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end.

We will also share the novel [techniques / code / integration] that we used for the success of our installation.


================================================================


My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____.   This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other].

We will present [numbers / usage / capabilities / i/o] of the engine and its [model / data].
We will also commit the code and documentation to Apache cTAKES.


================================================================


cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______.  The model uses the third party ______ for [newer / faster / more comprehensive] results.

We will present the essentials of model creation as well as [numbers / usage / capabilities / i/o] of our model.   We will also advocate for the third party _____ and explain how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.









Re: ApacheCon 2020

Posted by Jeffrey Hill <je...@droicelabs.com>.
Sean,
I have an experimental integration of cTAKES and RUTA.  I feel like it would be a good “poster” presentation — not really at the “paper” level.  Would something like that be appropriate?

P.S. Are you related to Tim Finan by any chance?


________________________________
From: Finan, Sean <Se...@childrens.harvard.edu>
Sent: Monday, June 29, 2020 11:02:26 AM
To: dev@ctakes.apache.org <de...@ctakes.apache.org>
Subject: ApacheCon 2020 [Bulk]

Hi all,


General admission to ApacheCon 2020 is free:  https://hopin.to/events/apachecon-home


I think that price of admission and travel costs have held back ctakes users from attending past conferences, and lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners.  Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.


The predetermined tracks are still an ill fit when it comes to the nature of ctakes.  https://apachecon.com/acah2020/cfp.html

However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project application.  Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.


There are a scant two weeks to come up with presentations, and less time to propose a track/topic.  The call for presentations ends July 13th.  That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.


Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes.   I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.


I have written some ideas for presentations below.  If you want to take one (modify as you like) then please write me and post to the devlist.  If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself perhaps somebody else will.    Again ... two weeks.


Thank you,

Sean



*  The following talk ideas are by and large directed toward training.  That does not mean that topics should stay within that scope.


=================================================================


Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible.  Why is it frequently treated as a black box?  Is it lack of need, sparsity of resources, or simply fear of the unknown?

This is a quick start tutorial on adding custom elements to cTAKES.  We illustrate creating simple classes to input, process and output data.  This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.


=================================================================


Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts.  Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads.

We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, REST services and Dockerfiles.


==================================================================


Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community.  What is past that standard?  While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules.

We present and discuss the top 10 annotation engines you never knew you had.


====================================================


Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools.  Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms.  But it also comes with a custom dictionary creator.

If you think that your clinical research is directed, then you should probably have a directed dictionary.  UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES.

This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.


====================================================

Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources.  Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss).

More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.


================================================================


Diet cTAKES

One reason cTAKES is a popular framework in clinical natural language processing tools is its use of Apache Maven for project management.  Navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project.  Much of what ends up in your system may lead to unnecessary bloat.

Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.


================================================================


cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice.  However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time /  accurately / ?].

We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.


================================================================


Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system.

We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output.  We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end.

We will also share the novel [techniques / code / integration] that we used for the success of our installation.


================================================================


My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____.   This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other].

We will present [numbers / usage / capabilities / i/o] of the engine and its [model / data].
We will also commit the code and documentation to Apache cTAKES.


================================================================


cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______.  The model uses the third party ______ for [newer / faster / more comprehensive] results.

We will present the essentials of model creation as well as [numbers / usage / capabilities / i/o] of our model.   We will also advocate for the third party _____ and explain how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.