Posted to dev@ctakes.apache.org by Manuel Lamy <mm...@gmail.com> on 2018/01/24 22:10:23 UTC

Problem using CPE and XMI Writer CAS Consumer

Hello guys,

I'm having problems running the CPE with an XMI Writer CAS Consumer.
The CPE works with other consumers, however.

*Problem*

In the figure below, you can see my setup and the error I'm obtaining:

[image: inline image 2]

*Logs*

As for logs, this is what I get from IntelliJ:

org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
    at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
    at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
    at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
    at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
    at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
Caused by: org.apache.uima.resource.ResourceConfigurationException
    at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
    at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
    at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
    at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
    at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
    ... 5 more
Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
    ... 10 more
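
The trace above nests three exceptions, and only the innermost one carries a message. If the CPE is started from code rather than from the CPE GUI, a small helper like the one below could print every link of the cause chain so the deepest message is not lost. This is only a sketch using the plain JDK: the class name `CauseChain` and the sample exceptions in `main` are made up for illustration and are not part of UIMA or cTAKES.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of UIMA or cTAKES.
final class CauseChain {

    /** Returns "ClassName: message" for the throwable and each of its causes, outermost first. */
    static List<String> describe(Throwable top) {
        List<String> lines = new ArrayList<>();
        // getCause() returns null at the end of the chain, which ends the loop.
        for (Throwable current = top; current != null; current = current.getCause()) {
            lines.add(current.getClass().getName() + ": " + current.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample exceptions that only imitate the shape of the trace above.
        Exception root = new Exception("The component XMI Writer CAS Consumer cannot be created.");
        Exception wrapped = new IllegalStateException("configuration failed", root);
        describe(wrapped).forEach(System.out::println);
    }
}
```

In this particular trace the innermost exception has no further cause, so the helper would not reveal more than the log already shows; it is mainly useful when a deeper cause exists but is cut off by the logger.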

*Attempted Solutions*

I only found one guy with the same problem as me. The solution proposed in
the thread, by Sean Finan, was to change the xml of my consumer
(__XmiWriterCasConsumer.xml), particularly the content of the tag
<implementationName>, from

 <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>

to

<implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>


However, this didn't work. The error is exactly the same. I'm out of
ideas about what to do. I would like to have the report of CPE in XMI,
in order to read it with CVD. You can see the thread here:

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
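
One possible reason for a "cannot be created" error is that the class named in <implementationName> is not visible on the classpath the CPE runs with. That is an assumption; the trace itself does not say so. A quick way to test it is to try loading the class names by hand. The sketch below uses only the JDK, and `ClasspathCheck` is a made-up name; it has to be run with the same classpath as the CPE for the result to mean anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of UIMA or cTAKES.
final class ClasspathCheck {

    /** Returns those of the given class names that cannot be loaded from the current classpath. */
    static List<String> missing(String... classNames) {
        List<String> notFound = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                // LinkageError also covers NoClassDefFoundError for missing dependencies.
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // The two implementation names mentioned in this message.
        List<String> notFound = missing(
                "org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes",
                "org.apache.uima.tools.components.XmiWriterCasConsumer");
        System.out.println("Not on classpath: " + notFound);
    }
}
```

If both names load, the classpath is probably not the issue and the descriptor parameters are the next thing to look at.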


*Result Expected*

Running the CPE process and have outputs as XMI files.


*Result Obtained*

Running the CPE results in an error, specifically for the consumer
__XMIWriterCasConsumer.


*Conclusion*

Have any of you had this problem before? Do you have a suggestion
about how it can be solved? Thanks a lot


Best regards,

Manuel

Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by Manuel Lamy <mm...@gmail.com>.
Yes, MeSH surely has some support for the Portuguese language.

However, I guess MeSH can't help me reach my goal, since it is focused
on metadata about biomedical articles and is not a dictionary of terms as
SNOMED-CT is.

Or perhaps I don't quite understand the purpose of MeSH.

I will have to investigate OpenEMR, for sure.

Well, a first look at the Excel sheet you sent me shows 8691 entries. Just to put
things in proportion, SNOMED-CT has at least 300,000 clinical terms. And a
lot of the Portuguese terms in that sheet aren't filled in and/or are
incorrect, from what I can see.

I'm not trying to be condescending at all; it just shows how bad the Portuguese
situation is in this domain right now, which makes my goal a little bit harder
:)

I'll see what I can do with OpenEMR. My last resort will be the direct
translation of my EMRs to English, but I'm afraid the performance of the
system will be compromised too much.
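
For the translate-first fallback, the translation step could be kept outside cTAKES as a plain pre-processing pass over the directory of notes. The following is only a sketch under assumptions: `TranslateDirectory` is a made-up name, the translator is passed in as a generic function because no concrete offline translation tool has been chosen, and the placeholder in `main` simply returns the text unchanged.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical pre-processing step, not part of cTAKES.
final class TranslateDirectory {

    /**
     * Reads every .txt file directly under inputDir, applies the translator,
     * and writes the result under the same file name in outputDir.
     * Returns the number of files written.
     */
    static int translateAll(Path inputDir, Path outputDir, UnaryOperator<String> translator)
            throws IOException {
        Files.createDirectories(outputDir);
        List<Path> notes;
        try (Stream<Path> files = Files.list(inputDir)) {
            notes = files.filter(Files::isRegularFile)
                         .filter(p -> p.getFileName().toString().endsWith(".txt"))
                         .sorted()
                         .collect(Collectors.toList());
        }
        for (Path note : notes) {
            String original = new String(Files.readAllBytes(note), StandardCharsets.UTF_8);
            String translated = translator.apply(original);
            Files.write(outputDir.resolve(note.getFileName()),
                        translated.getBytes(StandardCharsets.UTF_8));
        }
        return notes.size();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder translator: returns the text unchanged.
        // A real offline translation tool would be plugged in here.
        UnaryOperator<String> placeholder = text -> text;
        int count = translateAll(Paths.get(args[0]), Paths.get(args[1]), placeholder);
        System.out.println("Wrote " + count + " file(s)");
    }
}
```

The output directory could then be given to the Default Clinical Pipeline as its input directory (the -i argument of bin/runClinicalPipeline quoted further down in this thread).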

Thanks a lot for the references Sean!

Best regards,

Manuel

2018-01-25 23:15 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:

> Yeah, the translation is going to require a bit of effort.
>
> I didn't know that there is no Portuguese in the snomed international.
> However, there should be other parts of the umls with Portuguese.  It looks
> like MeSH has at least a bit of Portuguese:
> https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/MSHPOR/
>
> You can probably find other sources.  OpenEMR is a cool project with
> available medical term translations.  Some info here:
> http://www.open-emr.org/wiki/index.php/OpenEMR_Internationalization_Configuration
> A spreadsheet with terms:
> https://docs.google.com/spreadsheets/d/1i2_WsjBX9cwa9mx0gIv3psMzQ28VsUZ-MqlAyZcmbX0/edit?hl=en&hl=en
>
>
> Cheers,
> Sean
>
>
> -----Original Message-----
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 6:02 PM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> Thanks for the awesome inputs as always!
>
> *SNOMED*
>
> Afaik SNOMED doesn't exist in the Portuguese language yet. As per my
> research and this reference[1], SNOMED-CT is only translated for Australian
> English, Danish, Dutch, Spanish, Swedish, and USA/UK English. Did you hear
> about SNOMED-CT translated to Portuguese somewhere?
>
> I may have to come up with a solution for this. It's possible for me to build
> a mechanism that tries to translate all of SNOMED-CT from English to
> Portuguese, or from Spanish, since it's a language close to
> Portuguese. I can use many sources, such as ICD-9/10, DBpedia and/or a
> direct translation tool. However, this path will not be easy to take on my
> own. It's a possibility, though. I have to think about it.
>
>
> *Bat files in Development Version*
>
> I was trying to run the bat files that were inside the module
> ctakes-distribution of my Dev Version. I guess that was my problem after
> all.
>
>
> *Translation*
>
> Yes, I know cTAKES won't translate for me. I was thinking of using an
> offline translator and adding it to my pipeline. I have yet to find a
> translator that is half as good as the Google Translator, though. I don't
> want to rely on an online translator.
>
>
> *Wiki Documentation*
>
> Thanks for the compliment. Sure, I would love to help. I like to express
> myself as clearly as possible.
>
> However, my knowledge about the system is still limited. I only started
> using cTAKES a couple of months ago.
>
> But if I can help with something just ask me, I would be glad to help.
>
>
> Thanks a lot Sean.
>
> Best regards,
>
> Manuel Lamy
>
>
> [1] -
> https://www.snomed.org/snomed-ct/snomed-ct-worldwide/translations-of-snomed-ct
>
>
> 2018-01-25 20:43 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:
>
> > Hi Manuel,
> >
> > Thank you for the information.  I have a couple of response lines …
> >
> >
> > > I need to do it because cTAKES seems to not work with the Portuguese
> > > language at all
> >                 - Yes and no … You can create a dictionary of terms in
> > the Portuguese language.  This would allow ctakes to at least
> > recognize these terms and save them for posterity.  However, the more
> > advanced processing available for English (negation, uncertainty
> > detection, etc.) will not be available.  If you can find other nlp
> > projects that work with Portuguese it may be possible to insert them
> > into a ctakes pipeline.  The instructions for creating a custom
> > dictionary are here (language selection is not documented but it is on
> > the gui, download the umls with Portuguese snomed if you can):
> > https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
> >
> > > What I have in mind is to create a pipeline system that first
> > > translates the texts from Portuguese to English
> >                 - Probably a good way to go if you have a decent
> > translation tool.
> >
> > > From my research, I couldn't find anything relevant in this topic.
> >                 - We definitely could use more documentation.
> >
> > > Well, since this is the user version, I don't have the
> > > runPiperSubmitter.bat available
> >                 - Correct.  It is a tool that was created after the
> > 4.0 release.
> >
> > > When I try to run the bat files inside the bin of the Dev Version, I
> > > have the results shown in the image attached to this e-mail.
> >                 -  Your attachments were scrubbed so I can’t see them.
> > However, I have a guess: did you run a “maven package”, unzip the
> > created installation file and run from the bin/ directory there?  Or
> > are you running with the bin/ inside your development sandbox?  The
> > second method won’t work and will give you the “class not found”
> > errors that you are seeing.  If you want to run using Intellij, turn
> > on the profile “runPiperGui” and compile.  Maven should launch the gui
> > after compilation.
> >
> > > Well, first of all, my objective is to share my experiences with
> > > cTAKES, in order to share with the community what I'm going through.
> > > This way I can contribute to the community and probably help others
> > > who are going through the same as me.
> >                 -  Excellent.  Would you be willing to write
> > documentation for the ctakes wiki?  Your emails are clear and extremely
> > well formatted!
> >
> >
> >   1.  Is this feasible? Am I aiming for something that I simply can't
> > rely in cTAKES only to do, because I have to translate the texts first?
> >
> > -          Ctakes won’t translate for you, but if you can find a tool
> > that will, then processing with ctakes should be possible.
> >
> >   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> > Development Version? I can only find it in the User Version, under
> > /resources.
> >
> > -          The typesystem.xml file is in the ctakes-type-system project
> > until you “maven package” and create an “installation”.  If you just
> > run from your developer environment you can point to the
> > TypeSystem.xml in ctakes-type-system/src/main/resources/…
> >
> >   3.  Why do we have options in CVD for other languages, but it
> > clearly only works for the English language?
> >
> > -          The cvd is a tool that is part of Apache UIMA.  It is more
> > generic than ctakes and can read xmi files created by other systems.
> > I have no idea what the details are concerning its language support.
> >
> >   4.  Any other hint you can give me, concerning the big picture of
> > what I'm trying to build here?
> >
> > -          Not really, sorry.  The multi-lingual goes outside my area of
> > knowledge.
> >
> > Sean
> >
> >
> > From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> > Sent: Thursday, January 25, 2018 2:28 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
> >
> > Hello Sean,
> >
> > First of all, thanks a lot for the quick and detailed answer. Awesome
> > support from you.
> >
> > I'll give you a structured answer, to be as objective and concise as
> > possible. I guess it's important to tell you what I'm trying to
> > achieve in order for you to help me.
> >
> > My Project
> >
> > I'm actually making a project with cTAKES in a partnership with a
> > Portuguese hospital.
> >
> > My goal is to create reports of the narrative parts of the EMRs of
> > this hospital, in order to report the symptoms, diseases and clinical
> > procedures found in each EMR.
> >
> > What I have in mind is to create a pipeline system that first
> > translates the texts from Portuguese to English, and then creates
> > these reports based on the translated texts.
> >
> > I'm not even sure yet I can create a pipeline system of this style
> > with cTAKES. I need to do it because cTAKES seems to not work with the
> > Portuguese language at all (despite that option being shown in the
> > languages list when using CVD and that's confusing). So, well, I will
> > translate it, I guess it's my best bet.
> >
> > But just a note: I think there should be more support and
> > documentation about how to work with cTAKES in languages other
> > than English. From my research, I couldn't find anything relevant on
> > this topic. Not even one reference stating clearly that cTAKES only
> > works with the English language and not with others.
> >
> > Version of cTAKES
> >
> > Naturally, I'm running the development version of cTAKES. I'm using
> > Intellij. I'm using the latest version of cTAKES, trunk, that
> > corresponds to version 4.0.1-SNAPSHOT.
> >
> > So, I guess so far so good, just as you said, I'm using trunk.
> >
> > I did everything as per the guide "Developer Install Guide",
> > concerning the Intellij instructions. The guide I used can be found here:
> > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide
> >
> >
> > Behavior of cTAKES when running pipelines
> >
> > Well, I did what you told me. I ran the Default Clinical Pipeline and
> > the Piper File Submitter as per the wiki pages. I have both the User and
> > Development versions on my machine.
> >
> > Now, I tried to run those pipelines in the User and Development versions.
> > I ran the respective bat files:
> >
> >
> >   *   For the Default Clinical Pipeline, I ran 'bin/runClinicalPipeline
> > -i inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass
> > umlsPassword'
> >   *   For the Piper File Submitter, I ran 'bin/runPiperSubmitter'
> >
> > Well, the results of running these two bat files were quite different
> > for the User and Development versions.
> >
> > User Version
> >
> > Default Clinical Pipeline
> >
> > In this version, I went to the bin directory and just ran the line
> > 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory
> > --user umlsUsername  --pass umlsPassword' with my parameters.
> >
> > It worked well and created the XMI output files where they were
> > supposed to be. And I could open them in CVD, first opening a
> > TypeSystem.xml file and then the generated XMI files I wanted.
> >
> > Piper File Submitter
> >
> > Well, since this is the user version, I don't have the
> > runPiperSubmitter.bat available. Is this normal? That's understandable
> > and I guess normal, from what I understand of this quote: "If you are
> > running from a development environment (checked out trunk from SVN)
> > they can also be run using the Piper File Submitter GUI." But you tell
> > me.
> >
> > Well, I can say the User Version did what I wanted in this step, but I
> > thought that it would be nice to replicate it in the Development version,
> > since I guess I'll have to use it in the future in order to implement
> > all I want for my project described at the beginning of this e-mail.
> > And the problems arose in the Development version....
> >
> > Development Version
> >
> > Well, in this version, I tried to replicate what I did in the User
> > version, thinking to myself it would output the same result. I was wrong.
> >
> >
> > Default Clinical Pipeline and Piper File Submitter
> >
> > When I try to run the bat files inside the bin of the Dev Version, I
> > have the results shown in the image attached to this e-mail.
> >
> > Yes, it could not find or load PiperFileRunner and PiperRunnerGui. Is that
> > supposed to happen in the Development Version? Am I doing something
> > wrong here? I just followed the guides you have available. My whole
> > Development Version installation was done per the guide.
> >
> >
> > My objective with this e-mail
> >
> > Well, first of all, my objective is to share my experiences with
> > cTAKES, in order to share with the community what I'm going through.
> > This way I can contribute to the community and probably help others
> > who are going through the same as me.
> >
> > Secondly, I would like to know your opinion about the
> > feasibility of what I'm trying to make here. My goal is to build a
> > pipeline system like:
> >
> >
> >   *   EMRs in Portuguese already in txt files in a directory ->
> > Translation to English -> Process all of the texts with Clinical
> > Pipeline -> Output XMI in order to open them in CVD
> >
> > This is what I aim for with cTAKES. So I have the following questions:
> >
> >
> >   1.  Is this feasible? Am I aiming for something that I simply can't
> > rely on cTAKES alone to do, because I have to translate the texts first?
> >   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> > Development Version? I can only find it in the User Version, under
> > /resources.
> >   3.  Why do we have options in CVD for other languages, but it
> > clearly only works for the English language?
> >   4.  Any other hint you can give me, concerning the big picture of
> > what I'm trying to build here?
> >
> > If you need any additional information from my side, just tell me.
> >
> > Thanks once more for the quick answers and support, Sean.
> >
> > Best regards,
> >
> > Manuel
> >
> >
> > 2018-01-25 15:35 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> >
> > Hi Manuel,
> >
> > My first comment is that you are running ctakes in a somewhat “ancient”
> > manner, or better put, the xml descriptor workflow has been pretty
> > much deprecated.
> >
> > You should try to run ctakes 4.0.  If you are software savvy then I
> > advise that you try the development version that is in trunk.  You’ve
> > probably been on the ctakes download page, but just a reminder :
> > http://ctakes.apache.org/
> >
> > The ctakes wiki has some useful information, and the 4.0 entry is here:
> > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
> >
> > To start playing with ctakes I suggest that you try to run the default
> > clinical pipeline, following the instructions here:
> > https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
> >
> > Those instructions will start the default clinical pipeline from a
> > command line.  If you have the development version from trunk then
> > there is a gui available to run pipelines:
> > https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
> >
> > There are also many other pipeline configurations available in trunk
> > to run more advanced / involved pipelines.  They are not in the 4.0
> > release.
> > The pipelines (including 4.0 default) are all defined using the
> > replacement for those xml descriptor files.  The replacements are called
> > “piper files”.
> > https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> >
> > I hope that you find the pipers easier to understand and use than the
> > old xml descriptors.
> >
> > Anyway, if you run the ctakes 4.0 default clinical pipeline as
> > outlined in the wiki page it will use the new FileTreeReader and
> > FileTreeXmiWriter combination.
> >
> > Give it a whirl and let me know how things go.
> >
> > Sean
> >
> >
> > From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> > Sent: Thursday, January 25, 2018 9:09 AM
> > To: dev@ctakes.apache.org
> > Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
> >
> > Hello Sean,
> >
> > First of all, thanks for your quick answer.
> >
> > I'm probably confused about something here, so I have the following
> > questions.
> >
> >
> >   1.  A CAS Consumer is defined by an XML file. What you are implying
> > is that I should go to my consumer XML (__XmiWriterCasConsumer.xml)
> > and change its <implementationName> tag to
> > 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of
> > 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funnily
> > enough, it gives me a ClassNotFoundException if I do this. I would like
> > to have your confirmation that I'm doing the right thing, please. The
> > class is well defined in that path though.
> >   2.  Concerning the reader, I make the same analogy. Should I go to
> > my descriptor and change its <implementationName> tag from
> > 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
> > 'org.apache.ctakes.core.cr.FileTreeReader'?
> >
> > I did these two things and the error is the same concerning the new
> > consumer 'FileTreeXmiWriter', as you can see in the first image
> > attached to this e-mail.
> >
> > I would also like to ask you another question:
> >
> >
> >        3. Why does my class 'FileTreeXmiWriter' have a lot of
> > unresolved classes? You can see it in the second image attached to
> > this e-mail. I can't seem to import them correctly. I tried to import the
> > extension of this class only to check the result, and look how it
> > resolved the import for me.
> > 'apache' is not recognized. I'm just kinda baffled by the hierarchy
> > defined for this project. If you could give me a little bit of
> > clarification on this topic and how to solve it, I would appreciate it.
> >
> > Thanks for your attention! I'm really looking forward to putting this to
> > work.
> > cTAKES seems awesome. It just needs these little tweaks.
> >
> > Best regards,
> >
> > Manuel
> >
> >
> >
> >
> >
> > 2018-01-24 22:26 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> >
> > Hi Manuel,
> >
> > Your image got scrubbed by a server, but the problem may have been
> > fixed in a recent xmi writer.  The latest xmi writer is in ctakes core
> > and is named FileTreeXmiWriter.  One possible cause for a problem in
> > the writer is if the document has some unexpected character or
> > character combination.  A document reader should be massaging
> > documents before they are processed and sent to the writer.  The most
> > recent file reader is named FileTreeReader and is also in ctakes core.
> >
> > Sean
> >
> >
> >
> > From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> > Sent: Wednesday, January 24, 2018 5:10 PM
> > To: dev@ctakes.apache.org
> > Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
> >
> > Hello guys,
> >
> > I'm having problems running the CPE using a XMI Writer CAS Consumer.
> > However, it works with other consumers.
> >
> > Problem
> >
> > In the figure below, you can see my setup and the error I'm obtaining:
> >
> > [inline image 2]
> >
> > Logs
> >
> > Concerning logs, I'm obtaining this from Intellij:
> >
> > org.apache.uima.resource.ResourceInitializationException
> >             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
> >             at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
> >             at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
> >             at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
> >             at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
> >             at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> > Caused by: org.apache.uima.resource.ResourceConfigurationException
> >             at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
> >             at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
> >             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
> >             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
> >             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
> >             ... 5 more
> > Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
> >             ... 10 more
> >
> > Attempted Solutions
> >
> > I only found one guy with the same problem as me. The solution
> > proposed in the thread, by Sean Finan, was to change the xml of my
> > consumer (__XmiWriterCasConsumer.xml), particularly the content of the
> > tag <implementationName>, from
> >
> > <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>
> >
> > to
> >
> > <implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>
> >
> >
> >
> > However, this didn't work. The error is exactly the same. I'm out of
> > ideas about what to do. I would like to have the report of CPE in XMI,
> > in order to read it with CVD. You can see the thread here:
> >
> > http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
> >
> >
> >
> > Result Expected
> >
> > Running the CPE process and have outputs as XMI files.
> >
> >
> >
> > Result Obtained
> >
> > Running the CPE results in an error, specifically for the consumer
> > __XMIWriterCasConsumer.
> >
> >
> >
> > Conclusion
> >
> > Have any of you guys had this problem before? Do you have a suggestion
> > about how it can be solved? Thanks a lot
> >
> >
> >
> > Best regards,
> >
> > Manuel
> >
> >
>

RE: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Yeah, the translation is going to require a bit of effort.

I didn't know that there is no Portuguese in the snomed international.  However, there should be other parts of the umls with Portuguese.  It looks like MeSH has at least a bit of Portuguese:
https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/MSHPOR/

You can probably find other sources.  OpenEMR is a cool project with available medical term translations.  Some info here:
http://www.open-emr.org/wiki/index.php/OpenEMR_Internationalization_Configuration
A spreadsheet with terms:
https://docs.google.com/spreadsheets/d/1i2_WsjBX9cwa9mx0gIv3psMzQ28VsUZ-MqlAyZcmbX0/edit?hl=en&hl=en


Cheers,
Sean


-----Original Message-----
From: Manuel Lamy [mailto:mmvpdml@gmail.com] 
Sent: Thursday, January 25, 2018 6:02 PM
To: dev@ctakes.apache.org
Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello Sean,

Thanks for the awesome inputs as always!

*SNOMED*

Afaik SNOMED doesn't exist in the Portuguese language yet. As per my research and this reference[1], SNOMED-CT is only translated for Australian English, Danish, Dutch, Spanish, Swedish, and USA/UK English. Did you hear about SNOMED-CT translated to Portuguese somewhere?

I may have to come up with a solution for this. It's possible for me to build a mechanism that tries to translate all of SNOMED-CT from English to Portuguese, or from Spanish, since it is a language close to Portuguese. I can use many sources, such as ICD-9/10, DBpedia and/or a direct translation tool. However, this path will not be easy to take on my own. It's a possibility, though. I have to think about it.


*Bat files in Development Version*

I was trying to run the bat files that were inside the module ctakes-distribution of my Dev Version. I guess that was my problem after all.


*Translation*

Yes, I know cTAKES won't translate for me. I was thinking of using an offline translator and adding it to my pipeline. I have yet to find a translator that is half as good as Google Translate, though. I don't want to rely on an online translator.
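
As a rough illustration of the translate-then-process idea, here is a minimal Python sketch. It only prepares a directory of translated txt files and builds the runClinicalPipeline command line quoted elsewhere in this thread. The function names and the placeholder translator are assumptions made for illustration; no particular offline translator is implied, and nothing here is part of cTAKES itself.

```python
from pathlib import Path
from typing import Callable, List


def placeholder_translate(text: str) -> str:
    # Hypothetical stand-in: replace with a call to whatever offline
    # Portuguese-to-English translator ends up being chosen.
    return text


def translate_directory(src_dir: Path, dst_dir: Path,
                        translate: Callable[[str], str]) -> int:
    """Translate every .txt file under src_dir into dst_dir, keeping the
    same relative layout. Returns the number of files written."""
    count = 0
    for src in sorted(src_dir.rglob("*.txt")):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        text = src.read_text(encoding="utf-8")
        dst.write_text(translate(text), encoding="utf-8")
        count += 1
    return count


def pipeline_command(input_dir: Path, xmi_dir: Path,
                     user: str, password: str) -> List[str]:
    """Build the clinical pipeline command line as quoted in this thread.
    The command is only assembled here, not executed."""
    return ["bin/runClinicalPipeline",
            "-i", str(input_dir),
            "--xmiOut", str(xmi_dir),
            "--user", user,
            "--pass", password]
```

The translated directory would then be passed as the input directory of the clinical pipeline, and the resulting XMI files opened in CVD as before.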


*Wiki Documentation*

Thanks for your compliment. Sure, I would love to help. I like to express myself as clearly as possible.

However, my knowledge about the system is still limited. I only started using cTAKES a couple of months ago.

But if I can help with something just ask me, I would be glad to help.


Thanks a lot Sean.

Best regards,

Manuel Lamy


[1] -
https://www.snomed.org/snomed-ct/snomed-ct-worldwide/translations-of-snomed-ct


2018-01-25 20:43 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:

> Hi Manuel,
>
> Thank you for the information.  I have a couple of response lines …
>
>
> > I need to do it because cTAKES seems to not work with the Portuguese
> language at all
>                 - Yes and no … You can create a dictionary of terms in 
> the Portuguese language.  This would allow ctakes to at least 
> recognize these terms and save them for posterity.  However, the more 
> advanced processing available for English (negation, uncertainty 
> detection, etc.) will not be available.  If you can find other nlp 
> projects that work with Portuguese it may be possible to insert them 
> into a ctakes pipeline.  The instructions for creating a custom 
> dictionary are here (language selection is not documented but it is on 
> the gui, download the umls with Portuguese snomed if you can):
> https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
>
> > What I have in mind is to create a pipeline system that first 
> > translates
> the texts from Portuguese to English
>                 - Probably a good way to go if you have a decent 
> translation tool.
>
> > From my research, I couldn't find anything relevant in this topic.
>                 - We definitely could use more documentation.
>
> > Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available
>                 - Correct.  It is a tool that was created after the 
> 4.0 release.
>
> > When I try to run the bat files inside the bin of the Dev Version, I
> have the results shown in the image attached to this e-mail.
>                 -  Your attachments were scrubbed so I can’t see them.
> However, I have a guess: did you run a “maven package”, unzip the 
> created installation file and run from the bin/ directory there?  Or 
> are you running with the bin/ inside your development sandbox?  The 
> second method won’t work and will give you the “class not found” 
> errors that you are seeing.  If you want to run using Intellij, turn 
> on the profile “runPiperGui” and compile.  Maven should launch the gui after compilation.
>
> > Well, first of all, my objective is to share my experiences with 
> > cTAKES,
> in order to share with the community what I'm going through. This way 
> I can contribute to the community and probably help others who are 
> going through the same as me.
>                 -  Excellent.  Would you be willing to write 
> documentation for the ctakes wiki?  Your emails are clear and extremely well formatted!
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't 
> rely on cTAKES alone to do, because I have to translate the texts first?
>
> -          Ctakes won’t translate for you, but if you can find a tool that
> will then processing with ctakes should be possible.
>
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the 
> Development Version? I can only find it in the User Version, under 
> /resources.
>
> -          The typesystem.xml file is in the ctakes-type-system project
> until you “maven package” and create an “installation”.  If you just 
> run from your developer environment you can point to the 
> TypeSystem.xml in ctakes-type-system/src/main/resources/…
>
>   3.  Why do we have options in CVD for other languages, but it 
> clearly only works for the English language?
>
> -          The cvd is a tool that is part of Apache UIMA.  It is more
> generic than ctakes and can read xmi files created by other systems.  
> I have no idea what the details are concerning its language support.
>
>   4.  Any other hint you can give me, concerning the big picture of 
> what I'm trying to build here?
>
> -          Not really, sorry.  The multi-lingual goes outside my area of
> knowledge.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 2:28 PM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks a lot for the quick and detailed answer. Awesome 
> support from you.
>
> I'll give you a structured answer to be as objective and concise as 
> possible. I guess it's important to tell you what I'm trying to 
> achieve in order for you to help me.
>
> My Project
>
> I'm actually making a project with cTAKES in a partnership with a 
> Portuguese hospital.
>
> My goal is to create reports of the narrative parts of the EMRs of 
> this hospital, in order to report the symptoms, diseases and clinical 
> procedures found in each EMR.
>
> What I have in mind is to create a pipeline system that first 
> translates the texts from Portuguese to English, and then creates 
> these reports based on the translated texts.
>
> I'm not even sure yet I can create a pipeline system of this style 
> with cTAKES. I need to do it because cTAKES seems to not work with the 
> Portuguese language at all (despite that option being shown in the 
> languages list when using CVD and that's confusing). So, well, I will 
> translate it, I guess it's my best bet.
>
> But just a note: I think there should be more support and 
> documentation about how to work with cTAKES in languages other 
> than English. From my research, I couldn't find anything relevant on 
> this topic. Not even one reference stating clearly that cTAKES only 
> works with the English language and not with the others.
>
> Version of cTAKES
>
> Naturally, I'm running the development version of cTAKES. I'm using 
> Intellij. I'm using the latest version of cTAKES, trunk, that 
> corresponds to version 4.0.1-SNAPSHOT.
>
> So, I guess so far so good, just as you said, I'm using trunk.
>
> I did everything as per the guide "Developer Install Guide", 
> concerning the Intellij instructions. The guide I used can be found here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide
>
>
> Behavior of cTAKES when running pipelines
>
> Well, I did what you told me. I ran the Default Clinical Pipeline and 
> the Piper File Submitter as per the wiki's. I have the User and 
> Development versions both in my machine.
>
> Now, I tried to run those pipelines in the User and Development versions.
> I ran the respective bat files:
>
>
>   *   For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline  -i
> inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass 
> umlsPassword'
>   *   For the Piper File Submitter, I ran the 'bin/runPiperSubmitter'
> Well, the results of running these two bat files were quite different 
> for the User and Development versions.
>
> User Version
>
> Default Clinical Pipeline
>
> In this version, I went to bin directory and just ran the line 
> 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory 
> --user umlsUsername  --pass umlsPassword' with my parameters.
>
> It worked well and created the XMI output files where it was supposed. 
> And I could open them in CVD, first opening a TypeSystem.xml file and 
> then the generated XMI files I wanted.
>
> Piper File Submitter
>
> Well, since this is the user version, I don't have the 
> runPiperSubmitter.bat available. Is this normal? That's understandable 
> and I guess normal, from what I understand of this quote: "If you are 
> running from a development environment (checked out trunk from SVN) 
> they can also be run using the Piper File Submitter GUI." But you tell me.
>
> Well, I can say the User Version did what I wanted in this step, but I 
> thought it would be nice to replicate it in the Development version, 
> since I guess I'll have to use it in the future in order to implement 
> everything I want for my project, described at the beginning of this e-mail. 
> And the problems arose in the Development version....
>
> Development Version
>
> Well, in this version, I tried to replicate what I did in the User 
> version, thinking to myself it would output the same result. I was wrong.
>
>
> Default Clinical Pipeline and Piper File Submitter
>
> When I try to run the bat files inside the bin of the Dev Version, I 
> have the results shown in the image attached to this e-mail.
>
> Yes, it could not find or load PiperFileRunner and PiperRunnerGui. Is this 
> supposed to happen in the Development Version? Am I doing something 
> wrong here? I just followed the guides you have available. All of my 
> Development Version installation was done per the guide.
>
>
> My objective with this e-mail
>
> Well, first of all, my objective is to share my experiences with 
> cTAKES, in order to share with the community what I'm going through. 
> This way I can contribute to the community and probably help others 
> who are going through the same as me.
>
> Second, I would like to know your opinion about the 
> feasibility of what I'm trying to make here. My goal is to build a pipeline system like:
>
>
>   *   EMRs in Portuguese already in txt files in a directory ->
> Translation to English -> Process all of the texts with Clinical 
> Pipeline -> Output XMI in order to open them in CVD
>
> This is what I aim for with cTAKES. So I have the following questions:
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't 
> rely on cTAKES alone to do, because I have to translate the texts first?
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the 
> Development Version? I can only find it in the User Version, under 
> /resources.
>   3.  Why do we have options in CVD for other languages, but it 
> clearly only works for the English language?
>   4.  Any other hint you can give me, concerning the big picture of 
> what I'm trying to build here?
> Any additional information you need from my side, just tell me.
>
> Thanks one more time for the quick answers and support Sean.
>
> Best regards,
>
> Manuel
>
>
> 2018-01-25 15:35 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> My first comment is that you are running ctakes in a somewhat “ancient”
> manner, or better put, the xml descriptor workflow has been pretty 
> much deprecated.
>
> You should try to run ctakes 4.0.  If you are software savvy then I 
> advise that you try the development version that is in trunk.  You’ve 
> probably been on the ctakes download page, but just a reminder :
> http://ctakes.apache.org/
>
> The ctakes wiki has some useful information, and the 4.0 entry is here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
>
> To start playing with ctakes I suggest that you try to run the default 
> clinical pipeline, following the instructions here:
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Those instructions will start the default clinical pipeline from a 
> command line.  If you have the development version from trunk then 
> there is a gui available to run pipelines:
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
>
> There are also many other pipeline configurations available in trunk 
> to run more advanced / involved pipelines.  They are not in the 4.0 release.
> The pipelines (including 4.0 default) are all defined using the 
> replacement for those xml descriptor files.  The replacements are called “piper files”.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> I hope that you find the pipers easier to understand and use than the 
> old xml descriptors.
>
> Anyway, if you run the ctakes 4.0 default clinical pipeline as 
> outlined in the wiki page it will use the new FileTreeReader and 
> FileTreeXmiWriter combination.
>
> Give it a whirl and let me know how things go.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 9:09 AM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks for your quick answer.
>
> I'm probably making some confusion over here, so I have the following 
> questions.
>
>
>   1.  A CAS Consumer is defined by an XML file. What you are implying 
> is that I should go to my consumer XML (__XmiWriterCasConsumer.xml) 
> and change its <implementationName> tag to 
> 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of 
> 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny 
> enough, it gives me a ClassNotFoundException if I do this. I would like 
> to have your confirmation that I'm doing the right thing, please. The 
> class is well defined in that path though.
>   2.  Concerning the reader, I make the same analogy. Should I go to 
> my descriptor and change its <implementationName> tag from 
> 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to 
> 'org.apache.ctakes.core.cr.FileTreeReader'?
> I did these two things and the error is the same concerning the new 
> consumer 'FileTreeXmiWriter', as you can see in the first image 
> attached to this e-mail.
>
> I would also like to ask you another question:
>
>
>        3. Why does my class 'FileTreeXmiWriter' have a lot of 
> unresolved classes? You can see it in the second image attached to 
> this e-mail. I can't seem to import them right. I tried to import the 
> extension of this class only to check the result, and look how it resolved the import for me.
> 'apache' is not recognized. I'm just kinda baffled by the hierarchy 
> defined for this project. If you could give me a little bit of 
> clarification on this topic and how to solve it, I would appreciate it.
>
> Thanks for your attention! I'm really looking forward to put this to work.
> cTAKES seems awesome. It just needs these little tweaks.
>
> Best regards,
>
> Manuel
>
>
>
>
>
> 2018-01-24 22:26 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> Your image got scrubbed by a server, but the problem may have been 
> fixed in a recent xmi writer.  The latest xmi writer is in ctakes core 
> and is named FileTreeXmiWriter.  One possible cause for a problem in 
> the writer is if the document has some unexpected character or 
> character combination.  A document reader should be massaging 
> documents before they are processed and sent to the writer.  The most 
> recent file reader is named FileTreeReader and is also in ctakes core.
>
> Sean
>
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Wednesday, January 24, 2018 5:10 PM
> To: dev@ctakes.apache.org
> Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello guys,
>
> I'm having problems running the CPE using a XMI Writer CAS Consumer.
> However, it works with other consumers.
>
> Problem
>
> In the figure below, you can see my setup and the error I'm obtaining:
>
> [Imagem inline 2]
>
> Logs
>
> Concerning logs, I'm obtaining this from Intellij:
>
> org.apache.uima.resource.ResourceInitializationException
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
>             at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
>             at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
>             at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
>             at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
>             at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> Caused by: org.apache.uima.resource.ResourceConfigurationException
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
>             ... 5 more
> Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
>             ... 10 more
>
> Attempted Solutions
>
> I only found one guy with the same problem as me. The solution 
> proposed in the thread, by Sean Finan, was to change the xml of my 
> consumer (__XmiWriterCasConsumer.xml), particularly the content of the 
> tag <implementationName>, from
> <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>
>
> to
>
> <implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>
>
>
>
> However, this didn't work. The error is exactly the same. I'm out of 
> ideas about what to do. I would like to have the report of CPE in XMI, 
> in order to read it with CVD. You can see the thread here:
>
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
>
>
>
> Result Expected
>
> Running the CPE process and have outputs as XMI files.
>
>
>
> Result Obtained
>
> Running the CPE results in an error, specifically for the consumer 
> __XMIWriterCasConsumer.
>
>
>
> Conclusion
>
> Have any of you guys had this problem before? Do you have a suggestion 
> about how it can be solved? Thanks a lot
>
>
>
> Best regards,
>
> Manuel
>
>

RE: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi again,

Another possible assist ...

Ctakes uses Apache OpenNLP.  It looks like they have put some effort into Portuguese systems:
https://sourceforge.net/projects/opennlp/files/models-1.5/
The files with the prefix pt- should be for Portuguese.  I think.  So you could probably get tokenization and part of speech into ctakes.  However, that won't be as good as an English translation.

Sean 







-----Original Message-----
From: Manuel Lamy [mailto:mmvpdml@gmail.com] 
Sent: Thursday, January 25, 2018 6:02 PM
To: dev@ctakes.apache.org
Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello Sean,

Thanks for the awesome inputs as always!

*SNOMED*

Afaik SNOMED doesn't exist in the Portuguese language yet. As per my
research and this reference[1], SNOMED-CT is only translated for Australian
English, Danish, Dutch, Spanish, Swedish, and USA/UK English. Did you hear
about SNOMED-CT translated to Portuguese somewhere?

I may have to come up with a solution for this. It's possible for me to build
a mechanism that tries to translate all of SNOMED-CT from English to
Portuguese, or from Spanish, since it's a language close to
Portuguese. I can use many sources, such as ICD-9/10, DBpedia and/or a
direct translation tool. However, this path will not be easy to take on my
own. It's a possibility, though. I have to think about it.


*Bat files in Development Version*

I was trying to run the bat files that were inside the module
ctakes-distribution of my Dev Version. I guess that was my problem after
all.


*Translation*

Yes, I know cTAKES won't translate for me. I was thinking of using an
offline translator and adding it to my pipeline. I have yet to find a
translator that is half as good as Google Translate, though. I don't
want to rely on an online translator.


*Wiki Documentation*

Thanks for your compliment. Sure, I would love to help. I like to express
myself as clearly as possible.

However, my knowledge about the system is still limited. I only started
using cTAKES a couple of months ago.

But if I can help with something just ask me, I would be glad to help.


Thanks a lot Sean.

Best regards,

Manuel Lamy


[1] -
https://www.snomed.org/snomed-ct/snomed-ct-worldwide/translations-of-snomed-ct


2018-01-25 20:43 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:

> Hi Manuel,
>
> Thank you for the information.  I have a couple of response lines …
>
>
> > I need to do it because cTAKES seems to not work with the Portuguese
> language at all
>                 - Yes and no … You can create a dictionary of terms in the
> Portuguese language.  This would allow ctakes to at least recognize these
> terms and save them for posterity.  However, the more advanced processing
> available for English (negation, uncertainty detection, etc.) will not be
> available.  If you can find other nlp projects that work with Portuguese it
> may be possible to insert them into a ctakes pipeline.  The instructions
> for creating a custom dictionary are here (language selection is not
> documented but it is on the gui; download the umls with Portuguese snomed if
> you can):
> https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
>
> > What I have in mind is to create a pipeline system that first translates
> the texts from Portuguese to English
>                 - Probably a good way to go if you have a decent
> translation tool.
>
> > From my research, I couldn't find anything relevant in this topic.
>                 - We definitely could use more documentation.
>
> > Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available
>                 - Correct.  It is a tool that was created after the 4.0
> release.
>
> > When I try to run the bat files inside the bin of the Dev Version, I
> have the results shown in the image attached to this e-mail.
>                 -  Your attachments were scrubbed so I can’t see them.
> However, I have a guess: did you run a “maven package”, unzip the created
> installation file and run from the bin/ directory there?  Or are you
> running with the bin/ inside your development sandbox?  The second method
> won’t work and will give you the “class not found” errors that you are
> seeing.  If you want to run using Intellij, turn on the profile
> “runPiperGui” and compile.  Maven should launch the gui after compilation.
>
> > Well, first of all, my objective is to share my experiences with cTAKES,
> in order to share with the community what I'm going through. This way I can
> contribute to the community and probably help others who are going through
> the same as me.
>                 -  Excellent.  Would you be willing to write documentation
> for the ctakes wiki?  Your emails are clear and extremely well formatted!
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't rely
> on cTAKES alone to do, because I have to translate the texts first?
>
> -          Ctakes won’t translate for you, but if you can find a tool that
> will, then processing with ctakes should be possible.
>
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> Development Version? I can only find it in the User Version, under
> /resources.
>
> -          The typesystem.xml file is in the ctakes-type-system project
> until you “maven package” and create an “installation”.  If you just run
> from your developer environment you can point to the TypeSystem.xml in
> ctakes-type-system/src/main/resources/…
>
>   3.  Why do we have options in CVD for other languages, but it clearly
> only works for the English language?
>
> -          The cvd is a tool that is part of Apache UIMA.  It is more
> generic than ctakes and can read xmi files created by other systems.  I
> have no idea what the details are concerning its language support.
>
>   4.  Any other hint you can give me, concerning the big picture of what
> I'm trying to build here?
>
> -          Not really, sorry.  The multi-lingual side is outside my area of
> knowledge.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 2:28 PM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks a lot for the quick and detailed answer. Awesome
> support from you.
>
> I'll give you a structured answer to be as objective and concise as
> possible. I guess it's important to tell you what I'm trying to achieve in
> order for you to help me.
>
> My Project
>
> I'm actually making a project with cTAKES in a partnership with a
> Portuguese hospital.
>
> My goal is to create reports of the narrative parts of the EMRs of this
> hospital, in order to report the symptoms, diseases and clinical procedures
> found in each EMR.
>
> What I have in mind is to create a pipeline system that first translates
> the texts from Portuguese to English, and then creates these reports based
> on the translated texts.
>
> I'm not even sure yet that I can create a pipeline system of this style with
> cTAKES. I need to do it because cTAKES seems to not work with the
> Portuguese language at all (despite that option being shown in the
> languages list when using CVD, which is confusing). So, well, I will
> translate it; I guess it's my best bet.
>
> But just a note: I think there should be more support and documentation
> about how to work with cTAKES in languages other than English. From my
> research, I couldn't find anything relevant in this topic. Not even one
> reference stating clearly that cTAKES only works with the English language
> and not with the others.
>
> Version of cTAKES
>
> Naturally, I'm running the development version of cTAKES. I'm using
> Intellij. I'm using the latest version of cTAKES, trunk, that corresponds
> to version 4.0.1-SNAPSHOT.
>
> So, I guess so far so good, just as you said, I'm using trunk.
>
> I did everything as per the guide "Developer Install Guide", concerning
> the Intellij instructions. The guide I used can be found here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide
>
>
> Behavior of cTAKES when running pipelines
>
> Well, I did what you told me. I ran the Default Clinical Pipeline and the
> Piper File Submitter as per the wiki. I have the User and Development
> versions both on my machine.
>
> Now, I tried to run those pipelines in the User and Development versions.
> I ran the respective bat files:
>
>
>   *   For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline  -i
> inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass
> umlsPassword'
>   *   For the Piper File Submitter, I ran 'bin/runPiperSubmitter'
>
> Well, the results of running these two bat files were quite different for
> the User and Development versions.
>
> User Version
>
> Default Clinical Pipeline
>
> In this version, I went to bin directory and just ran the line
> 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory
> --user umlsUsername  --pass umlsPassword' with my parameters.
>
> It worked well and created the XMI output files where it was supposed. And
> I could open them in CVD, first opening a TypeSystem.xml file and then the
> generated XMI files I wanted.
>
> Piper File Submitter
>
> Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available. Is this normal? That's understandable and
> I guess normal, from what I understand of this quote: "If you are running
> from a development environment (checked out trunk from SVN) they can also
> be run using the Piper File Submitter GUI." But you tell me.
>
> Well, I can say the User Version did what I wanted in this step, but I
> thought it would be nice to replicate it in the Development version,
> since I guess I'll have to use it in the future in order to implement all I
> want for my project described at the beginning of this e-mail. And the
> problems arose in the Development version....
>
> Development Version
>
> Well, in this version, I tried to replicate what I did in the User
> version, thinking to myself it would output the same result. I was wrong.
>
>
> Default Clinical Pipeline and Piper File Submitter
>
> When I try to run the bat files inside the bin of the Dev Version, I have
> the results shown in the image attached to this e-mail.
>
> Yes, it could not find or load PiperFileRunner and PiperRunnerGui. Is this
> supposed to happen in the Development Version? Am I doing something wrong
> here? I just followed the guides you have available. All my Development
> Version installation was done per the guide.
>
>
> My objective with this e-mail
>
> Well, first of all, my objective is to share my experiences with cTAKES,
> in order to share with the community what I'm going through. This way I can
> contribute to the community and probably help others who are going through
> the same as me.
>
> Secondly, I would like to know your opinion about the feasibility
> of what I'm trying to make here. My goal is to build a pipeline system like:
>
>
>   *   EMRs in Portuguese already in txt files in a directory ->
> Translation to English -> Process all of the texts with Clinical Pipeline
> -> Output XMI in order to open them in CVD
>
> This is what I aim for with cTAKES. So I have the following questions:
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't rely
> on cTAKES alone to do, because I have to translate the texts first?
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> Development Version? I can only find it in the User Version, under
> /resources.
>   3.  Why do we have options in CVD for other languages, but it clearly
> only works for the English language?
>   4.  Any other hint you can give me, concerning the big picture of what
> I'm trying to build here?
>
> If you need any additional information from my side, just tell me.
>
> Thanks one more time for the quick answers and support Sean.
>
> Best regards,
>
> Manuel
>
>
> 2018-01-25 15:35 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> My first comment is that you are running ctakes in a somewhat “ancient”
> manner, or better put, the xml descriptor workflow has been pretty much
> deprecated.
>
> You should try to run ctakes 4.0.  If you are software savvy then I advise
> that you try the development version that is in trunk.  You’ve probably
> been on the ctakes download page, but just a reminder :
> http://ctakes.apache.org/
>
> The ctakes wiki has some useful information, and the 4.0 entry is here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
>
> To start playing with ctakes I suggest that you try to run the default
> clinical pipeline, following the instructions here:
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Those instructions will start the default clinical pipeline from a command
> line.  If you have the development version from trunk then there is a gui
> available to run pipelines:
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
>
> There are also many other pipeline configurations available in trunk to
> run more advanced / involved pipelines.  They are not in the 4.0 release.
> The pipelines (including 4.0 default) are all defined using the replacement
> for those xml descriptor files.  The replacements are called “piper files”.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> I hope that you find the pipers easier to understand and use than the old
> xml descriptors.
>
> Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in
> the wiki page it will use the new FileTreeReader and FileTreeXmiWriter
> combination.
>
> Give it a whirl and let me know how things go.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 9:09 AM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks for your quick answer.
>
> I'm probably confused about something here, so I have the following
> questions.
>
>
>   1.  A CAS Consumer is defined by an XML file. What you are implying is
> that I should go to my consumer XML (__XmiWriterCasConsumer.xml) and change
> its <implementationName> tag to
> 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of
> 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny
> enough, it gives me a ClassNotFoundException if I do this. I would like to
> have your confirmation that I'm doing the right thing, please. The class is
> well defined in that path though.
>   2.  Concerning the reader, I make the same analogy. Should I go to my
> descriptor and change its <implementationName> tag from
> 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
> 'org.apache.ctakes.core.cr.FileTreeReader'?
>
> I did these two things and the error is the same concerning the new
> consumer 'FileTreeXmiWriter', as you can see in the first image attached to
> this e-mail.
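
One way to narrow down a ClassNotFoundException like the one mentioned above is to check whether the class is visible on the classpath that the CPE is actually launched with. The following is only a minimal sketch under that assumption: it must be compiled and run with the same classpath as the CPE GUI to say anything useful, and the helper class name ClasspathCheck is made up for this illustration (it is not part of cTAKES or UIMA). The fully qualified class name is the one quoted in the question.

```java
public class ClasspathCheck {

    // Returns true if the named class can be loaded from the current classpath.
    // The class is looked up without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the writer class named in this thread; any other name
        // can be passed as the first argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.ctakes.core.cc.FileTreeXmiWriter";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found on classpath"));
    }
}
```

If the class is reported as not found, the descriptor is probably fine and the module containing the class is simply missing from the run configuration's classpath.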
>
> I would also like to ask you another question:
>
>
>        3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved
> classes? You can see it in the second image attached to this e-mail. I
> can't seem to import them right. I tried to import the extension of this
> class only to check the result, and look how it solved the import for me.
> 'apache' is not recognized. I'm just kinda baffled by the hierarchy
> defined for this project. If you could give me a little bit of
> clarification on this topic and how to solve it, I would appreciate it.
>
> Thanks for your attention! I'm really looking forward to putting this to work.
> cTAKES seems awesome. It just needs these little tweaks.
>
> Best regards,
>
> Manuel
>
>
>
>
>
> 2018-01-24 22:26 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> Your image got scrubbed by a server, but the problem may have been fixed
> in a recent xmi writer.  The latest xmi writer is in ctakes core and is
> named FileTreeXmiWriter.  One possible cause for a problem in the writer is
> if the document has some unexpected character or character combination.  A
> document reader should be massaging documents before they are processed and
> sent to the writer.  The most recent file reader is named FileTreeReader
> and is also in ctakes core.
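
To illustrate the kind of clean-up a reader could do before text reaches an XMI writer, here is a minimal sketch that replaces characters not allowed in XML 1.0 with spaces. This is only an assumption about what "unexpected characters" might mean here; it is not the code used by FileTreeReader, and XmlTextCleaner is a hypothetical name.

```java
public class XmlTextCleaner {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every code point that XML 1.0 forbids with a single space,
    // so the length of the text (and therefore annotation offsets) is kept.
    static String clean(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isValidXmlChar(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(' ');
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "BP 120/80\u0000 stable";
        System.out.println(clean(dirty));
    }
}
```

Replacing rather than deleting the offending characters is a deliberate choice in this sketch: annotation begin/end offsets computed on the cleaned text still line up with the original document.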
>
> Sean
>
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Wednesday, January 24, 2018 5:10 PM
> To: dev@ctakes.apache.org
> Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello guys,
>
> I'm having problems running the CPE using a XMI Writer CAS Consumer.
> However, it works with other consumers.
>
> Problem
>
> In the figure below, you can see my setup and the error I'm obtaining:
>
> [Imagem inline 2]
>
> Logs
>
> Concerning logs, I'm obtaining this from Intellij:
>
> org.apache.uima.resource.ResourceInitializationException
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
>             at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
>             at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
>             at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
>             at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
>             at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> Caused by: org.apache.uima.resource.ResourceConfigurationException
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
>             ... 5 more
> Caused by: java.lang.Exception: The component XMI Writer CAS Consumer
> cannot be created. (Thread Name: Thread-5)
>             ... 10 more
>
> Attempted Solutions
>
> I only found one guy with the same problem as me. The solution proposed in
> the thread, by Sean Finan, was to change the xml of my consumer
> (__XmiWriterCasConsumer.xml), particularly the content of the tag
> <implementationName>, from
>
> <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>
>
> to
>
> <implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>
>
>
>
> However, this didn't work. The error is exactly the same. I'm out of ideas
> about what to do. I would like to have the report of CPE in XMI, in order
> to read it with CVD. You can see the thread here:
>
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
>
>
>
> Result Expected
>
> Running the CPE process and have outputs as XMI files.
>
>
>
> Result Obtained
>
> Running the CPE results in an error, specifically for the consumer
> __XMIWriterCasConsumer.
>
>
>
> Conclusion
>
> Have any of you had this problem before? Do you have a suggestion about
> how it can be solved? Thanks a lot
>
>
>
> Best regards,
>
> Manuel
>
>

Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by Manuel Lamy <mm...@gmail.com>.
Hello Sean,

Thanks for the awesome inputs as always!

*SNOMED*

Afaik SNOMED doesn't exist in the Portuguese language yet. As per my
research and this reference[1], SNOMED-CT is only translated for Australian
English, Danish, Dutch, Spanish, Swedish, and USA/UK English. Did you hear
about SNOMED-CT translated to Portuguese somewhere?

I may have to come up with a solution for this. It's possible for me to build
a mechanism that tries to translate all of SNOMED-CT from English to
Portuguese, or from Spanish, since it's a language close to
Portuguese. I can use many sources, such as ICD-9/10, DBpedia and/or a
direct translation tool. However, this path will not be easy to take on my
own. It's a possibility, though. I have to think about it.


*Bat files in Development Version*

I was trying to run the bat files that were inside the module
ctakes-distribution of my Dev Version. I guess that was my problem after
all.


*Translation*

Yes, I know cTAKES won't translate for me. I was thinking of using an
offline translator and adding it to my pipeline. I have yet to find a
translator that is half as good as Google Translate, though. I don't
want to rely on an online translator.
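
A minimal sketch of what such a pre-translation step could look like, assuming the EMRs are plain .txt files in one directory and the translator can be wrapped as a plain function. The translate function below is a hypothetical placeholder that does not call any real translation tool, and the class name TranslateStep is made up for illustration. The output directory would then be passed to the clinical pipeline as its input directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

public class TranslateStep {

    // Reads every .txt file in inputDir, applies the translator, and writes the
    // result under the same file name in outputDir. Returns the number of files.
    static int translateAll(Path inputDir, Path outputDir, UnaryOperator<String> translator)
            throws IOException {
        Files.createDirectories(outputDir);
        int count = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir, "*.txt")) {
            for (Path file : files) {
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                String translated = translator.apply(text);
                Files.write(outputDir.resolve(file.getFileName()),
                        translated.getBytes(StandardCharsets.UTF_8));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical placeholder: a real offline translator would be plugged in here.
        UnaryOperator<String> translate = text -> text;
        int n = translateAll(Paths.get(args[0]), Paths.get(args[1]), translate);
        System.out.println("Translated " + n + " file(s)");
    }
}
```

Keeping the translator behind a simple function makes it easy to swap tools later without touching the file handling.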


*Wiki Documentation*

Thanks for your compliment. Sure, I would love to help. I like to express
myself as clearly as possible.

However, my knowledge about the system is still limited. I only started
using cTAKES a couple of months ago.

But if I can help with something just ask me, I would be glad to help.


Thanks a lot Sean.

Best regards,

Manuel Lamy


[1] -
https://www.snomed.org/snomed-ct/snomed-ct-worldwide/translations-of-snomed-ct


2018-01-25 20:43 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:

> Hi Manuel,
>
> Thank you for the information.  I have a couple of response lines …
>
>
> > I need to do it because cTAKES seems to not work with the Portuguese
> language at all
>                 - Yes and no … You can create a dictionary of terms in the
> Portuguese language.  This would allow ctakes to at least recognize these
> terms and save them for posterity.  However, the more advanced processing
> available for English (negation, uncertainty detection, etc.) will not be
> available.  If you can find other nlp projects that work with Portuguese it
> may be possible to insert them into a ctakes pipeline.  The instructions
> for creating a custom dictionary are here (language selection is not
> documented but it is on the gui; download the umls with Portuguese snomed if
> you can):
> https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
>
> > What I have in mind is to create a pipeline system that first translates
> the texts from Portuguese to English
>                 - Probably a good way to go if you have a decent
> translation tool.
>
> > From my research, I couldn't find anything relevant in this topic.
>                 - We definitely could use more documentation.
>
> > Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available
>                 - Correct.  It is a tool that was created after the 4.0
> release.
>
> > When I try to run the bat files inside the bin of the Dev Version, I
> have the results shown in the image attached to this e-mail.
>                 -  Your attachments were scrubbed so I can’t see them.
> However, I have a guess: did you run a “maven package”, unzip the created
> installation file and run from the bin/ directory there?  Or are you
> running with the bin/ inside your development sandbox?  The second method
> won’t work and will give you the “class not found” errors that you are
> seeing.  If you want to run using Intellij, turn on the profile
> “runPiperGui” and compile.  Maven should launch the gui after compilation.
>
> > Well, first of all, my objective is to share my experiences with cTAKES,
> in order to share with the community what I'm going through. This way I can
> contribute to the community and probably help others who are going through
> the same as me.
>                 -  Excellent.  Would you be willing to write documentation
> for the ctakes wiki?  Your emails are clear and extremely well formatted!
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't rely
> on cTAKES alone to do, because I have to translate the texts first?
>
> -          Ctakes won’t translate for you, but if you can find a tool that
> will then processing with ctakes should be possible.
>
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> Development Version? I can only find it in the User Version, under
> /resources.
>
> -          The typesystem.xml file is in the ctakes-type-system project
> until you “maven package” and create an “installation”.  If you just run
> from your developer environment you can point to the TypeSystem.xml in
> ctakes-type-system/src/main/resources/…
>
>   3.  Why do we have options in CVD for other languages, but it clearly
> only works for the English language?
>
> -          The cvd is a tool that is part of Apache UIMA.  It is more
> generic than ctakes and can read xmi files created by other systems.  I
> have no idea what the details are concerning its language support.
>
>   4.  Any other hint you can give me, concerning the big picture of what
> I'm trying to build here?
>
> -          Not really, sorry.  The multi-lingual goes outside my area of
> knowledge.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 2:28 PM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks a lot for the quick and detailed answer. Awesome
> support from you.
>
> I'll give you a structured answer to be as objective and concise as
> possible. I guess it's important to tell you what I'm trying to achieve in
> order for you to help me.
>
> My Project
>
> I'm actually making a project with cTAKES in a partnership with a
> Portuguese hospital.
>
> My goal is to create reports of the narrative parts of the EMRs of this
> hospital, in order to report the symptoms, diseases and clinical procedures
> found in each EMR.
>
> What I have in mind is to create a pipeline system that first translates
> the texts from Portuguese to English, and then creates these reports based
> on the translated texts.
>
> I'm not even sure yet I can create a pipeline system of this style with
> cTAKES. I need to do it because cTAKES seems to not work with the
> Portuguese language at all (despite that option being shown in the
> languages list when using CVD and that's confusing). So, well, I will
> translate it, I guess it's my best bet.
>
> But just a note, I think there should be more support and documentation
> about how to work with cTAKES in languages other than English. From my
> research, I couldn't find anything relevant in this topic. Not even one
> reference stating clearly that cTAKES only works with the English language
> and not with others.
>
> Version of cTAKES
>
> Naturally, I'm running the development version of cTAKES. I'm using
> Intellij. I'm using the latest version of cTAKES, trunk, that corresponds
> to version 4.0.1-SNAPSHOT.
>
> So, I guess so far so good, just as you said, I'm using trunk.
>
> I did everything as per the guide "Developer Install Guide", concerning
> the Intellij instructions. The guide I used can be found here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide
>
>
> Behavior of cTAKES when running pipelines
>
> Well, I did what you told me. I ran the Default Clinical Pipeline and the
> Piper File Submitter as per the wiki pages. I have both the User and
> Development versions on my machine.
>
> Now, I tried to run those pipelines in the User and Development versions.
> I ran the respective bat files:
>
>
>   *   For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline  -i
> inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass
> umlsPassword'
>   *   For the Piper File Submitter, I ran the 'bin/runPiperSubmitter'
>
> Well, the results of running these two bat files were quite different for
> the User and Development versions.
>
> User Version
>
> Default Clinical Pipeline
>
> In this version, I went to bin directory and just ran the line
> 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory
> --user umlsUsername  --pass umlsPassword' with my parameters.
>
> It worked well and created the XMI output files where it was supposed to.
> And I could open them in CVD, first opening a TypeSystem.xml file and then
> the generated XMI files I wanted.
>
> Piper File Submitter
>
> Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available. Is this normal? That's understandable and
> I guess normal, from what I understand of this quote: "If you are running
> from a development environment (checked out trunk from SVN) they can also
> be run using the Piper File Submitter GUI." But you tell me.
>
> Well, I can say the User Version did what I wanted in this step, but I
> thought it would be nice to replicate it in the Development version,
> since I guess I'll have to use it in the future in order to implement all I
> want for my project described at the beginning of this e-mail. And the
> problems arose in the Development version....
>
> Development Version
>
> Well, in this version, I tried to replicate what I did in the User
> version, thinking to myself it would output the same result. I was wrong.
>
>
> Default Clinical Pipeline and Piper File Submitter
>
> When I try to run the bat files inside the bin of the Dev Version, I have
> the results shown in the image attached to this e-mail.
>
> Yes, could not find or load PiperFileRunner and PiperRunnerGui. Is it
> supposed to happen in the Development Version? Am I doing something wrong
> here? I just followed the guides you have available. All my Development
> Version installation was per the guide.
>
>
> My objective with this e-mail
>
> Well, first of all, my objective is to share my experiences with cTAKES,
> in order to share with the community what I'm going through. This way I can
> contribute to the community and probably help others who are going through
> the same as me.
>
> Secondly, I would like to know your opinion about the feasibility
> of what I'm trying to make here. My goal is to build a pipeline system like:
>
>
>   *   EMRs in Portuguese already in txt files in a directory ->
> Translation to English -> Process all of the texts with Clinical Pipeline
> -> Output XMI in order to open them in CVD
>
> This is what I am aiming for with cTAKES. So I have the following questions:
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't rely
> on cTAKES alone to do, because I have to translate the texts first?
>   2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> Development Version? I can only find it in the User Version, under
> /resources.
>   3.  Why do we have options in CVD for other languages, but it clearly
> only works for the English language?
>   4.  Any other hint you can give me, concerning the big picture of what
> I'm trying to build here?
>
> If you need any additional information from my side, just tell me.
>
> Thanks one more time for the quick answers and support Sean.
>
> Best regards,
>
> Manuel
>
>
> 2018-01-25 15:35 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> My first comment is that you are running ctakes in a somewhat “ancient”
> manner, or better put, the xml descriptor workflow has been pretty much
> deprecated.
>
> You should try to run ctakes 4.0.  If you are software savvy then I advise
> that you try the development version that is in trunk.  You’ve probably
> been on the ctakes download page, but just a reminder :
> http://ctakes.apache.org/
>
> The ctakes wiki has some useful information, and the 4.0 entry is here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
>
> To start playing with ctakes I suggest that you try to run the default
> clinical pipeline, following the instructions here:
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Those instructions will start the default clinical pipeline from a command
> line.  If you have the development version from trunk then there is a gui
> available to run pipelines:
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
>
> There are also many other pipeline configurations available in trunk to
> run more advanced / involved pipelines.  They are not in the 4.0 release.
> The pipelines (including 4.0 default) are all defined using the replacement
> for those xml descriptor files.  The replacements are called “piper files”.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> I hope that you find the pipers easier to understand and use than the old
> xml descriptors.
>
> Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in
> the wiki page it will use the new FileTreeReader and FileTreeXmiWriter
> combination.
>
> Give it a whirl and let me know how things go.
>
> Sean
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 9:09 AM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello Sean,
>
> First of all, thanks for your quick answer.
>
> I'm probably making some confusion over here, so I have the following
> questions.
>
>
>   1.  A CAS Consumer is defined by an XML file. What you are implying is
> that I should go to my consumer XML (__XmiWriterCasConsumer.xml) and change
> its <implementationName> tag to
> 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of
> 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny
> enough, it gives me a ClassNotFoundException if I do this. Would like to
> have your confirmation if I'm doing the right thing please. The class is
> well defined in that path though.
>   2.  Concerning the reader, I make the same analogy. Should I go to my
> descriptor and change its <implementationName> tag from
> 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
> 'org.apache.ctakes.core.cr.FileTreeReader'?
>
> I did these two things and the error is the same concerning the new
> consumer 'FileTreeXmiWriter', as you can see in the first image attached to
> this e-mail.
>
> I would also like to ask you another question:
>
>
>        3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved
> classes? You can see it in the second image attached to this e-mail. I
> can't seem to import them right. I tried to import the extension of this
> class only to check the result, and look how it resolved the import for me.
> 'apache' is not recognized. I'm just kinda baffled with the hierarchy
> defined for this project. If you could give me a little bit of
> clarification on this topic and how to solve it, I would appreciate it.
>
> Thanks for your attention! I'm really looking forward to getting this to work.
> cTAKES seems awesome. It just needs these little tweaks.
>
> Best regards,
>
> Manuel
>
>
>
>
>
> 2018-01-24 22:26 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
>
> Your image got scrubbed by a server, but the problem may have been fixed
> in a recent xmi writer.  The latest xmi writer is in ctakes core and is
> named FileTreeXmiWriter.  One possible cause for a problem in the writer is
> if the document has some unexpected character or character combination.  A
> document reader should be massaging documents before they are processed and
> sent to the writer.  The most recent file reader is named FileTreeReader
> and is also in ctakes core.
>
> Sean
>
>
>
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Wednesday, January 24, 2018 5:10 PM
> To: dev@ctakes.apache.org
> Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
>
> Hello guys,
>
> I'm having problems running the CPE using a XMI Writer CAS Consumer.
> However, it works with other consumers.
>
> Problem
>
> In the figure below, you can see my setup and the error I'm obtaining:
>
> [Imagem inline 2]
>
> Logs
>
> Concerning logs, I'm obtaining this from Intellij:
>
> org.apache.uima.resource.ResourceInitializationException
>     at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
>     at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
>     at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
>     at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
>     at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
>     at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> Caused by: org.apache.uima.resource.ResourceConfigurationException
>     at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
>     at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
>     at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
>     at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
>     at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
>     ... 5 more
> Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
>     ... 10 more
>
> Attempted Solutions
>
> I only found one guy with the same problem as me. The solution proposed in
> the thread, by Sean Finan, was to change the xml of my consumer
> (__XmiWriterCasConsumer.xml), particularly the content of the tag
> <implementationName>, from
>  <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>
>
> to
>
> <implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>
>
>
>
> However, this didn't work. The error is exactly the same. I'm out of ideas
> about what to do. I would like to have the report of CPE in XMI, in order
> to read it with CVD. You can see the thread here:
>
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
>
>
>
> Result Expected
>
> Running the CPE process and have outputs as XMI files.
>
>
>
> Result Obtained
>
> Running the CPE results in an error, specifically for the consumer
> __XMIWriterCasConsumer.
>
>
>
> Conclusion
>
> Have any of you had this problem before? Do you have a suggestion about
> how it can be solved? Thanks a lot.
>
>
>
> Best regards,
>
> Manuel
>
>

RE: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Manuel,

Thank you for the information.  I have a couple of response lines …


> I need to do it because cTAKES seems to not work with the Portuguese language at all
                - Yes and no … You can create a dictionary of terms in the Portuguese language.  This would allow ctakes to at least recognize these terms and save them for posterity.  However, the more advanced processing available for English (negation, uncertainty detection, etc.) will not be available.  If you can find other nlp projects that work with Portuguese it may be possible to insert them into a ctakes pipeline.  The instructions for creating a custom dictionary are here (language selection is not documented but it is on the gui, download the umls with Portuguese snomed if you can):
https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI

> What I have in mind is to create a pipeline system that first translates the texts from Portuguese to English
                - Probably a good way to go if you have a decent translation tool.
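
A minimal sketch of such a translate-then-process flow, in Python, under stated assumptions. `translate_pt_to_en` is a hypothetical placeholder for whatever translation tool is chosen (here it just returns the text unchanged), and the directory layout is illustrative. The only cTAKES-specific part is the `bin/runClinicalPipeline` command line with the `-i`, `--xmiOut`, `--user` and `--pass` options quoted elsewhere in this thread; the sketch builds that command but does not run it.

```python
from pathlib import Path
from typing import Callable, List


def translate_pt_to_en(text: str) -> str:
    """Hypothetical placeholder: plug a real translation tool in here."""
    return text  # identity, so the sketch runs without any translator


def translate_directory(src: Path, dst: Path,
                        translate: Callable[[str], str] = translate_pt_to_en) -> List[Path]:
    """Write a translated copy of every .txt file found in src into dst."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for note in sorted(src.glob("*.txt")):
        out = dst / note.name
        out.write_text(translate(note.read_text(encoding="utf-8")), encoding="utf-8")
        written.append(out)
    return written


def clinical_pipeline_command(ctakes_home: Path, input_dir: Path, xmi_dir: Path,
                              umls_user: str, umls_pass: str) -> List[str]:
    """Build the command line quoted in this thread; it is not executed here."""
    return [
        str(ctakes_home / "bin" / "runClinicalPipeline"),
        "-i", str(input_dir),
        "--xmiOut", str(xmi_dir),
        "--user", umls_user,
        "--pass", umls_pass,
    ]
```

Once the translated files are in place, the returned list could be handed to `subprocess.run(cmd, check=True)`. On Windows the script name may need its `.bat` suffix.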

> From my research, I couldn't find anything relevant in this topic.
                - We definitely could use more documentation.

> Well, since this is the user version, I don't have the runPiperSubmitter.bat available
                - Correct.  It is a tool that was created after the 4.0 release.

> When I try to run the bat files inside the bin of the Dev Version, I have the results shown in the image attached to this e-mail.
                -  Your attachments were scrubbed so I can’t see them.  However, I have a guess: did you run a “maven package”, unzip the created installation file and run from the bin/ directory there?  Or are you running with the bin/ inside your development sandbox?  The second method won’t work and will give you the “class not found” errors that you are seeing.  If you want to run using Intellij, turn on the profile “runPiperGui” and compile.  Maven should launch the gui after compilation.

> Well, first of all, my objective is to share my experiences with cTAKES, in order to share with the community what I'm going through. This way I can contribute to the community and probably help others who are going through the same as me.
                -  Excellent.  Would you be willing to write documentation for the ctakes wiki?  Your emails are clear and extremely well formatted!


  1.  Is this feasible? Am I aiming for something that I simply can't rely on cTAKES alone to do, because I have to translate the texts first?

-          Ctakes won’t translate for you, but if you can find a tool that will then processing with ctakes should be possible.

  2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the Development Version? I can only find it in the User Version, under /resources.

-          The typesystem.xml file is in the ctakes-type-system project until you “maven package” and create an “installation”.  If you just run from your developer environment you can point to the TypeSystem.xml in ctakes-type-system/src/main/resources/…
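
Since the rest of that path is elided above, one way to find the file in a checkout is to search the resources directory recursively. This is only a sketch in Python: `find_type_system` and the `checkout_root` argument are hypothetical names, and the only layout it assumes is the `ctakes-type-system/src/main/resources` prefix mentioned in the answer.

```python
from pathlib import Path
from typing import List


def find_type_system(checkout_root: Path, name: str = "TypeSystem.xml") -> List[Path]:
    """Return every file called `name` under ctakes-type-system/src/main/resources."""
    resources = checkout_root / "ctakes-type-system" / "src" / "main" / "resources"
    if not resources.is_dir():
        return []  # not a checkout root, or the module is missing
    # Search all subdirectories, because the exact location is not given above.
    return sorted(p for p in resources.rglob(name) if p.is_file())
```

Any path it returns can then be opened in CVD as the type system before loading the XMI files.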

  3.  Why do we have options in CVD for other languages, but it clearly only works for the English language?

-          The cvd is a tool that is part of Apache UIMA.  It is more generic than ctakes and can read xmi files created by other systems.  I have no idea what the details are concerning its language support.

  4.  Any other hint you can give me, concerning the big picture of what I'm trying to build here?

-          Not really, sorry.  The multi-lingual goes outside my area of knowledge.

Sean


From: Manuel Lamy [mailto:mmvpdml@gmail.com]
Sent: Thursday, January 25, 2018 2:28 PM
To: dev@ctakes.apache.org
Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello Sean,

First of all, thanks a lot for the quick and detailed answer. Awesome support from you.

I'll give you a structured answer to be as objective and concise as possible. I guess it's important to tell you what I'm trying to achieve in order for you to help me.

My Project

I'm actually making a project with cTAKES in a partnership with a Portuguese hospital.

My goal is to create reports of the narrative parts of the EMRs of this hospital, in order to report the symptoms, diseases and clinical procedures found in each EMR.

What I have in mind is to create a pipeline system that first translates the texts from Portuguese to English, and then creates these reports based on the translated texts.

I'm not even sure yet I can create a pipeline system of this style with cTAKES. I need to do it because cTAKES seems to not work with the Portuguese language at all (despite that option being shown in the languages list when using CVD and that's confusing). So, well, I will translate it, I guess it's my best bet.

But just a note, I think there should be more support and documentation about how to work with cTAKES in languages other than English. From my research, I couldn't find anything relevant in this topic. Not even one reference stating clearly that cTAKES only works with the English language and not with others.

Version of cTAKES

Naturally, I'm running the development version of cTAKES. I'm using Intellij. I'm using the latest version of cTAKES, trunk, that corresponds to version 4.0.1-SNAPSHOT.

So, I guess so far so good, just as you said, I'm using trunk.

I did everything as per the guide "Developer Install Guide", concerning the Intellij instructions. The guide I used can be found here: https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide


Behavior of cTAKES when running pipelines

Well, I did what you told me. I ran the Default Clinical Pipeline and the Piper File Submitter as per the wiki pages. I have both the User and Development versions on my machine.

Now, I tried to run those pipelines in the User and Development versions. I ran the respective bat files:


  *   For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass umlsPassword'
  *   For the Piper File Submitter, I ran the 'bin/runPiperSubmitter'

Well, the results of running these two bat files were quite different for the User and Development versions.

User Version

Default Clinical Pipeline

In this version, I went to bin directory and just ran the line 'bin/runClinicalPipeline  -i inputDirectory  --xmiOut outputDirectory  --user umlsUsername  --pass umlsPassword' with my parameters.

It worked well and created the XMI output files where it was supposed to. And I could open them in CVD, first opening a TypeSystem.xml file and then the generated XMI files I wanted.

Piper File Submitter

Well, since this is the user version, I don't have the runPiperSubmitter.bat available. Is this normal? That's understandable and I guess normal, from what I understand of this quote: "If you are running from a development environment (checked out trunk from SVN) they can also be run using the Piper File Submitter GUI." But you tell me.

Well, I can say the User Version did what I wanted in this step, but I thought it would be nice to replicate it in the Development version, since I guess I'll have to use it in the future in order to implement all I want for my project described at the beginning of this e-mail. And the problems arose in the Development version....

Development Version

Well, in this version, I tried to replicate what I did in the User version, thinking to myself it would output the same result. I was wrong.


Default Clinical Pipeline and Piper File Submitter

When I try to run the bat files inside the bin of the Dev Version, I have the results shown in the image attached to this e-mail.

Yes, could not find or load PiperFileRunner and PiperRunnerGui. Is it supposed to happen in the Development Version? Am I doing something wrong here? I just followed the guides you have available. All my Development Version installation was per the guide.


My objective with this e-mail

Well, first of all, my objective is to share my experiences with cTAKES, in order to share with the community what I'm going through. This way I can contribute to the community and probably help others who are going through the same as me.

Secondly, I would like to know your opinion about the feasibility of what I'm trying to make here. My goal is to build a pipeline system like:


  *   EMRs in Portuguese already in txt files in a directory -> Translation to English -> Process all of the texts with Clinical Pipeline -> Output XMI in order to open them in CVD

This is what I am aiming for with cTAKES. So I have the following questions:


  1.  Is this feasible? Am I aiming for something that I simply can't rely on cTAKES alone to do, because I have to translate the texts first?
  2.  Why don't I have a TypeSystem.xml file to feed CVD first, in the Development Version? I can only find it in the User Version, under /resources.
  3.  Why do we have options in CVD for other languages, but it clearly only works for the English language?
  4.  Any other hint you can give me, concerning the big picture of what I'm trying to build here?

If you need any additional information from my side, just tell me.

Thanks one more time for the quick answers and support Sean.

Best regards,

Manuel


2018-01-25 15:35 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:
Hi Manuel,

My first comment is that you are running ctakes in a somewhat “ancient” manner, or better put, the xml descriptor workflow has been pretty much deprecated.

You should try to run ctakes 4.0.  If you are software savvy then I advise that you try the development version that is in trunk.  You’ve probably been on the ctakes download page, but just a reminder :
http://ctakes.apache.org/

The ctakes wiki has some useful information, and the 4.0 entry is here:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0

To start playing with ctakes I suggest that you try to run the default clinical pipeline, following the instructions here:
https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline

Those instructions will start the default clinical pipeline from a command line.  If you have the development version from trunk then there is a gui available to run pipelines:
https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI

There are also many other pipeline configurations available in trunk to run more advanced / involved pipelines.  They are not in the 4.0 release.  The pipelines (including 4.0 default) are all defined using the replacement for those xml descriptor files.  The replacements are called “piper files”.
https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files

I hope that you find the pipers easier to understand and use than the old xml descriptors.

Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in the wiki page it will use the new FileTreeReader and FileTreeXmiWriter combination.

Give it a whirl and let me know how things go.

Sean


From: Manuel Lamy [mailto:mmvpdml@gmail.com]
Sent: Thursday, January 25, 2018 9:09 AM
To: dev@ctakes.apache.org
Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello Sean,

First of all, thanks for your quick answer.

I'm probably making some confusion over here, so I have the following questions.


  1.  A CAS Consumer is defined by an XML file. What you are implying is that I should go to my consumer XML (__XmiWriterCasConsumer.xml) and change its <implementationName> tag to 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny enough, it gives me a ClassNotFoundException if I do this. I would like your confirmation that I'm doing the right thing, please. The class is well defined in that path, though.
  2.  Concerning the reader, I make the same analogy. Should I go to my descriptor and change its <implementationName> tag from 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to 'org.apache.ctakes.core.cr.FileTreeReader'?
I did these two things and the error is the same concerning the new consumer 'FileTreeXmiWriter', as you can see in the first image attached to this e-mail.

I would also like to ask you another question:


       3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved classes? You can see it in the second image attached to this e-mail. I can't seem to import them correctly. I tried to import the extension of this class just to check the result, and look how it resolved the import for me: 'apache' is not recognized. I'm just kinda baffled by the hierarchy defined for this project. If you could give me a bit of clarification on this topic and how to solve it, I would appreciate it.

Thanks for your attention! I'm really looking forward to put this to work. cTAKES seems awesome. It just needs these little tweaks.

Best regards,

Manuel





2018-01-24 22:26 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:
Hi Manuel,

Your image got scrubbed by a server, but the problem may have been fixed in a recent xmi writer.  The latest xmi writer is in ctakes core and is named FileTreeXmiWriter.  One possible cause for a problem in the writer is if the document has some unexpected character or character combination.  A document reader should be massaging documents before they are processed and sent to the writer.  The most recent file reader is named FileTreeReader and is also in ctakes core.
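
As an illustration of the kind of "massaging" a reader might do before text reaches an XMI writer, here is a minimal Java sketch that replaces characters forbidden by XML 1.0 with spaces (XMI is an XML format, so such characters cannot be serialized as-is). This is not the actual FileTreeReader code, and it is not established that such characters caused the error in this thread; the names XmlTextSanitizer, isValidXmlChar and sanitize are hypothetical.

```java
final class XmlTextSanitizer {
    private XmlTextSanitizer() {}

    /** True if the code point is allowed by the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces each code point that XML 1.0 forbids with a single space. */
    static String sanitize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Every forbidden code point occupies one UTF-16 unit,
                // so a one-char replacement keeps the text length unchanged.
                out.append(' ');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Replacing rather than deleting keeps the text length the same, so any character offsets computed later still line up with the original document.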

Sean



From: Manuel Lamy [mailto:mmvpdml@gmail.com]
Sent: Wednesday, January 24, 2018 5:10 PM
To: dev@ctakes.apache.org
Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello guys,

I'm having problems running the CPE using a XMI Writer CAS Consumer. However, it works with other consumers.

Problem

In the figure below, you can see my setup and the error I'm obtaining:

[Imagem inline 2]

Logs

Concerning logs, I'm obtaining this from Intellij:

org.apache.uima.resource.ResourceInitializationException
            at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
            at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
            at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
            at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
            at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
            at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
Caused by: org.apache.uima.resource.ResourceConfigurationException
            at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
            at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
            at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
            at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
            at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
            ... 5 more
Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
            ... 10 more
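
The trace above nests three exceptions, and the last "Caused by" is usually the most informative one. When a CPE is started from code rather than from the GUI, a small helper can surface that innermost cause directly. The following is a minimal sketch using only the standard Throwable API; the class name RootCause is hypothetical and nothing here is part of UIMA or cTAKES.

```java
final class RootCause {
    private RootCause() {}

    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain; also guard against a self-referencing cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

Calling RootCause.of(e).getMessage() on the outer ResourceInitializationException would return the message of the innermost exception, here the "cannot be created" line.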

Attempted Solutions

I only found one guy with the same problem as me. The solution proposed in the thread, by Sean Finan, was to change the xml of my consumer (__XmiWriterCasConsumer.xml), particularly the content of the tag <implementationName>, from
 <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>

to

<implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>



However, this didn't work. The error is exactly the same. I'm out of ideas about what to do. I would like to have the report of CPE in XMI, in order to read it with CVD. You can see the thread here:

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E



Result Expected

Running the CPE process and have outputs as XMI files.



Result Obtained

Running the CPE results in an error, specifically for the consumer __XMIWriterCasConsumer.



Conclusion

Have any of you guys had this problem before? Do you have a suggestion about how it can be solved? Thanks a lot.



Best regards,

Manuel


Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by Manuel Lamy <mm...@gmail.com>.
Hello Sean,

Before anything else, thanks a lot for the quick and detailed answer. Awesome support
from you.

I'll give you a structured answer to be as objective and concise as
possible. I guess it's important to tell you what I'm trying to achieve in
order for you to help me.

*My Project*

I'm actually making a project with cTAKES in a partnership with a
Portuguese hospital.

My goal is to create reports of the narrative parts of the EMRs of this
hospital, in order to report the symptoms, diseases and clinical procedures
found in each EMR.

What I have in mind is to create a pipeline system that first translates
the texts from Portuguese to English, and then creates these reports based
on the translated texts.

I'm not even sure yet I can create a pipeline system of this style with
cTAKES. I need to do it because cTAKES seems to not work with the
Portuguese language at all (despite that option being shown in the
languages list when using CVD and that's confusing). So, well, I will
translate it, I guess it's my best bet.

But just a note: I think there should be more support and documentation
about how to work with cTAKES in languages other than English. From my
research, I couldn't find anything relevant on this topic, not even one
reference stating clearly that cTAKES only works with the English language
and not with others.

*Version of cTAKES*

Naturally, I'm running the development version of cTAKES. I'm using
Intellij. I'm using the latest version of cTAKES, trunk, that corresponds
to version 4.0.1-SNAPSHOT.

So, I guess so far so good, just as you said, I'm using trunk.

I did everything as per the guide "Developer Install Guide", concerning the
IntelliJ instructions. The guide I used can be found here:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+Developer+Install+Guide


*Behavior of cTAKES when running pipelines*

Well, I did what you told me. I ran the Default Clinical Pipeline and the
Piper File Submitter as per the wiki pages. I have both the User and
Development versions on my machine.

Now, I tried to run those pipelines in the User and Development versions. I
ran the respective bat files:


   - For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline -i
   inputDirectory --xmiOut outputDirectory --user umlsUsername
   --pass umlsPassword'
   - For the Piper File Submitter, I ran 'bin/runPiperSubmitter'

Well, the results of running these two bat files were quite different for
the User and Development versions.

*User Version*

*Default Clinical Pipeline*

In this version, I went to the bin directory and just ran the line
'bin/runClinicalPipeline -i inputDirectory --xmiOut outputDirectory
--user umlsUsername --pass umlsPassword' with my parameters.

It worked well and created the XMI output files where they were supposed to
be. I could then open them in CVD, first opening a TypeSystem.xml file and
then the generated XMI files I wanted.

*Piper File Submitter*

Well, since this is the user version, I don't have
runPiperSubmitter.bat available. Is this normal? It seems understandable
and, I guess, normal, from what I understand of this quote: "If you are
running from a development environment (checked out trunk from SVN) they
can also be run using the Piper File Submitter GUI." But you tell me.

Well, I can say the User Version did what I wanted in this step, but I
thought it would be nice to replicate it in the Development version,
since I guess I'll have to use it in the future in order to implement all I
want for my project described at the beginning of this e-mail. And the
problems arose in the Development version....

*Development Version*

Well, in this version, I tried to replicate what I did in the User version,
thinking to myself it would output the same result. I was wrong.


*Default Clinical Pipeline and Piper File Submitter*

When I try to run the bat files inside the bin of the Dev Version, I get
the results shown in the image attached to this e-mail.

Yes, it could not find or load PiperFileRunner and PiperRunnerGui. Is this
supposed to happen in the Development Version? Am I doing something wrong
here? I just followed the guides you have available. All of my Development
Version installation was done per the guide.


*My objective with this e-mail*

Well, first of all, my objective is to share my experiences with cTAKES, in
order to share with the community what I'm going through. This way I can
contribute to the community and probably help others who are going through
the same as me.

Second, I would like to know your opinion about the feasibility of
what I'm trying to do here. My goal is to build a pipeline system like:


   - EMRs in Portuguese already in txt files in a directory -> Translation
   to English -> Process all of the texts with Clinical Pipeline -> Output XMI
   in order to open them in CVD

This is what I am aiming for with cTAKES. So I have the following questions:


   1. Is this feasible? Or am I aiming for something that cTAKES alone
   can't do, because I have to translate the texts first?
   2. Why don't I have a TypeSystem.xml file to feed CVD first, in the
   Development Version? I can only find it in the User Version, under
   /resources.
   3. Why do we have options in CVD for other languages, but it clearly
   only works for the English language?
   4. Any other hint you can give me, concerning the big picture of what
   I'm trying to build here?
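
As a rough illustration of the file-handling side of the pipeline outlined above, the translation step could be a separate stage that writes English text files into a staging directory, which is then passed as the -i input directory to bin/runClinicalPipeline. The sketch below uses only the Java standard library. The class TranslationStage and its translateAll method are hypothetical, and the translator argument is a placeholder: no particular translation service is assumed, and nothing here is part of cTAKES.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class TranslationStage {
    private TranslationStage() {}

    /**
     * Applies the translator to every .txt file under inputDir and writes the
     * result to the same relative path under outputDir. Returns the written files.
     */
    static List<Path> translateAll(Path inputDir, Path outputDir,
                                   UnaryOperator<String> translator) throws IOException {
        List<Path> sources;
        // Collect the source files first so the walk is closed before writing.
        try (Stream<Path> walk = Files.walk(inputDir)) {
            sources = walk.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> written = new ArrayList<>();
        for (Path source : sources) {
            Path target = outputDir.resolve(inputDir.relativize(source));
            Files.createDirectories(target.getParent());
            // Assumption: the EMR text files are UTF-8 encoded.
            String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
            Files.write(target, translator.apply(text).getBytes(StandardCharsets.UTF_8));
            written.add(target);
        }
        return written;
    }
}
```

Keeping translation as its own stage means the cTAKES side stays a plain English-text run, exactly like the User Version command that already produced XMI output.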

Any additional information you need from my side, just tell me.

Thanks one more time for the quick answers and support Sean.

Best regards,

Manuel



RE: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Manuel,

My first comment is that you are running ctakes in a somewhat “ancient” manner, or better put, the xml descriptor workflow has been pretty much deprecated.

You should try to run ctakes 4.0.  If you are software savvy then I advise that you try the development version that is in trunk.  You’ve probably been on the ctakes download page, but just a reminder :
http://ctakes.apache.org/

The ctakes wiki has some useful information, and the 4.0 entry is here:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0

To start playing with ctakes I suggest that you try to run the default clinical pipeline, following the instructions here:
https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline

Those instructions will start the default clinical pipeline from a command line.  If you have the development version from trunk then there is a gui available to run pipelines:
https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI

There are also many other pipeline configurations available in trunk to run more advanced / involved pipelines.  They are not in the 4.0 release.  The pipelines (including 4.0 default) are all defined using the replacement for those xml descriptor files.  The replacements are called “piper files”.
https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files

I hope that you find the pipers easier to understand and use than the old xml descriptors.

Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in the wiki page it will use the new FileTreeReader and FileTreeXmiWriter combination.

Give it a whirl and let me know how things go.

Sean


From: Manuel Lamy [mailto:mmvpdml@gmail.com]
Sent: Thursday, January 25, 2018 9:09 AM
To: dev@ctakes.apache.org
Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello Sean,

First of all, thanks for your quick answer.

I'm probably making some confusion over here, so I have the following questions.


  1.  A CAS Consumer is defined by an XML file. What you are implying is that I should go to my consumer XML (__XmiWriterCasConsumer.xml) and change its <implementationName> tag to 'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny enough, it gives me a ClassNotFoundException if I do this. I would like your confirmation that I'm doing the right thing, please. The class is well defined in that path, though.
  2.  Concerning the reader, I make the same analogy. Should I go to my descriptor and change its <implementationName> tag from 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to 'org.apache.ctakes.core.cr.FileTreeReader'?
I did these two things and the error is the same concerning the new consumer 'FileTreeXmiWriter', as you can see in the first image attached to this e-mail.

I would also like to ask you another question:


       3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved classes? You can see it in the second image attached to this e-mail. I can't seem to import them correctly. I tried to import the extension of this class just to check the result, and look how it resolved the import for me: 'apache' is not recognized. I'm just kinda baffled by the hierarchy defined for this project. If you could give me a bit of clarification on this topic and how to solve it, I would appreciate it.
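
A ClassNotFoundException for a class that exists in the source tree usually means the compiled class is not on the classpath of the process that loads the descriptor. One way to check this from inside the same JVM is a small probe like the sketch below. It uses only the standard Class.forName(String, boolean, ClassLoader) call; the class name ClasspathCheck and the method isLoadable are hypothetical and not part of cTAKES or UIMA.

```java
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** True if the loader can locate the named class; static initializers are not run. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError from missing dependencies.
            return false;
        }
    }

    /** Convenience overload that uses the loader of this class. */
    static boolean isLoadable(String className) {
        return isLoadable(className, ClasspathCheck.class.getClassLoader());
    }
}
```

For example, ClasspathCheck.isLoadable("org.apache.ctakes.core.cc.FileTreeXmiWriter") returning false from the run configuration that launches the CPE would suggest the ctakes-core module output is missing from that configuration's classpath.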

Thanks for your attention! I'm really looking forward to put this to work. cTAKES seems awesome. It just needs these little tweaks.

Best regards,

Manuel





2018-01-24 22:26 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:
Hi Manuel,

Your image got scrubbed by a server, but the problem may have been fixed in a recent xmi writer.  The latest xmi writer is in ctakes core and is named FileTreeXmiWriter.  One possible cause for a problem in the writer is if the document has some unexpected character or character combination.  A document reader should be massaging documents before they are processed and sent to the writer.  The most recent file reader is named FileTreeReader and is also in ctakes core.

Sean



From: Manuel Lamy [mailto:mmvpdml@gmail.com]
Sent: Wednesday, January 24, 2018 5:10 PM
To: dev@ctakes.apache.org
Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Hello guys,

I'm having problems running the CPE using a XMI Writer CAS Consumer. However, it works with other consumers.

Problem

In the figure below, you can see my setup and the error I'm obtaining:

[Imagem inline 2]

Logs

Concerning logs, I'm obtaining this from Intellij:

org.apache.uima.resource.ResourceInitializationException
            at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
            at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
            at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
            at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
            at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
            at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
Caused by: org.apache.uima.resource.ResourceConfigurationException
            at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
            at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
            at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
            at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
            at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
            ... 5 more
Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
            ... 10 more

Attempted Solutions

I only found one guy with the same problem as me. The solution proposed in the thread, by Sean Finan, was to change the xml of my consumer (__XmiWriterCasConsumer.xml), particularly the content of the tag <implementationName>, from

 <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>

to

<implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>



However, this didn't work: the error is exactly the same, and I'm out of ideas. I would like the CPE output in XMI so that I can read it with CVD. You can see the thread here:

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
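
When an edit to the descriptor changes nothing, it can help to confirm which class name the file being loaded really contains, for example in case a different copy of the descriptor is picked up. The following is a minimal sketch using only the JDK's XML parser. The class `DescriptorCheck` is hypothetical, and the XML string is a simplified stand-in for __XmiWriterCasConsumer.xml, not the real descriptor, which has more elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class DescriptorCheck {

    /** Returns the text of the first <implementationName> element, or null if absent. */
    static String implementationName(String descriptorXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            descriptorXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("implementationName");
            return nodes.getLength() == 0
                    ? null
                    : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            // Wrap parser errors so callers do not need checked-exception handling.
            throw new IllegalArgumentException("Descriptor is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for the consumer descriptor (assumption).
        String xml = "<casConsumerDescription>"
                + "<implementationName>"
                + "org.apache.uima.tools.components.XmiWriterCasConsumer"
                + "</implementationName>"
                + "</casConsumerDescription>";
        System.out.println(implementationName(xml));
    }
}
```

To check a file on disk, the same method could be fed the file's contents, read with `java.nio.file.Files`.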



Result Expected

The CPE runs to completion and writes its output as XMI files.



Result Obtained

Running the CPE results in an error, specifically for the consumer __XMIWriterCasConsumer.



Conclusion

Have any of you had this problem before? Do you have a suggestion for how it can be solved? Thanks a lot.



Best regards,

Manuel


Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by Manuel Lamy <mm...@gmail.com>.
Hello Sean,

First of all, thanks for your quick answer.

I'm probably confused about something here, so I have the following
questions.


   1. A CAS Consumer is defined by an XML file. Are you suggesting that I
   should open my consumer XML (__XmiWriterCasConsumer.xml) and change its
   <implementationName> tag to
   'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of
   'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Oddly, this gives me a
   ClassNotFoundException, even though the class does exist at that path.
   Please confirm whether this is the right thing to do.
   2. For the reader, I assume the same applies. Should I open my
   descriptor and change its <implementationName> tag from
   'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
   'org.apache.ctakes.core.cr.FileTreeReader'?

I did these two things and the error is the same concerning the new
consumer 'FileTreeXmiWriter', as you can see in the first image attached to
this e-mail.
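
A ClassNotFoundException for a class that exists in the source tree usually means the class is missing from the runtime classpath of the process that creates the component. The following is a minimal sketch of a check for that, using only the JDK. The class `ClasspathProbe` is hypothetical, and the result is only meaningful if it is run with the same classpath as the CPE launch configuration.

```java
import java.security.CodeSource;

final class ClasspathProbe {

    /** Reports whether className can be loaded and, if known, from where. */
    static String probe(String className) {
        try {
            // initialize=false: locate the class without running static initializers.
            Class<?> type = Class.forName(
                    className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            String location = (source == null || source.getLocation() == null)
                    ? "unknown location (for example a JDK core class)"
                    : source.getLocation().toString();
            return "FOUND " + className + " in " + location;
        } catch (ClassNotFoundException | LinkageError e) {
            return "MISSING " + className + " (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from this thread.
        String[] names = {
            "org.apache.ctakes.core.cc.FileTreeXmiWriter",
            "org.apache.ctakes.core.cr.FileTreeReader",
            "org.apache.uima.tools.components.XmiWriterCasConsumer"
        };
        for (String name : names) {
            System.out.println(probe(name));
        }
    }
}
```

A MISSING line for a class that is visible in the IDE suggests that the module or jar containing it is not a dependency of the module being launched.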

I would also like to ask you another question:


   3. Why does my class 'FileTreeXmiWriter' have so many unresolved
   classes? You can see this in the second image attached to this e-mail. I
   can't seem to import them correctly. I tried importing just the class
   that this one extends to check the result, and look how it resolved the
   import: 'apache' is not recognized. I'm a bit baffled by the hierarchy
   defined for this project. If you could clarify this and explain how to
   solve it, I would appreciate it.

Thanks for your attention! I'm really looking forward to getting this
working. cTAKES seems awesome. It just needs these little tweaks.

Best regards,

Manuel





2018-01-24 22:26 GMT+00:00 Finan, Sean <Se...@childrens.harvard.edu>:

> Hi Manuel,
>
> Your image got scrubbed by a server, but the problem may have been fixed
> in a recent xmi writer.  The latest xmi writer is in ctakes core and is
> named FileTreeXmiWriter.  One possible cause for a problem in the writer is
> if the document has some unexpected character or character combination.  A
> document reader should be massaging documents before they are processed and
> sent to the writer.  The most recent file reader is named FileTreeReader
> and is also in ctakes core.
>
> Sean

RE: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

Posted by "Finan, Sean" <Se...@childrens.harvard.edu>.
Hi Manuel,

Your image got scrubbed by a server, but the problem may have been fixed in a recent xmi writer.  The latest xmi writer is in ctakes core and is named FileTreeXmiWriter.  One possible cause for a problem in the writer is if the document has some unexpected character or character combination.  A document reader should be massaging documents before they are processed and sent to the writer.  The most recent file reader is named FileTreeReader and is also in ctakes core.

Sean
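
The reply above does not say which characters cause trouble. One plausible case, stated here as an assumption, is text containing characters that are not legal in XML 1.0, since XMI output is XML. The following is a minimal sketch of a cleaning step that a reader could apply before processing. The class `XmlTextCleaner` is hypothetical and is not how FileTreeReader is known to work.

```java
final class XmlTextCleaner {

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Replaces every XML-1.0-invalid character with a single space.
     * Invalid characters always occupy one Java char, so the string length,
     * and therefore any annotation offsets, stay unchanged.
     */
    static String clean(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints()
                .forEach(cp -> out.appendCodePoint(isValidXml10(cp) ? cp : ' '));
        return out.toString();
    }

    public static void main(String[] args) {
        // A form feed (0x0C) is a control character that XML 1.0 forbids.
        String raw = "Page one" + (char) 0x0C + "Page two";
        System.out.println(clean(raw));
    }
}
```

Replacing rather than deleting is a deliberate choice here: removing characters would shift the begin and end offsets of anything annotated later in the text.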


