Posted to user@uima.apache.org by Peter Klügl <pk...@uni-wuerzburg.de> on 2009/01/08 19:37:59 UTC

CAS Viewer

Hello,

For those who are interested:
I have released a fairly stable version of the CEV plugin at
https://sourceforge.net/projects/textmarker/ (note the PDF with
installation instructions). An English manual and the source code will
hopefully follow before spring. Like the CAS Viewer, the CEV plugin
visualizes CAS files, but it was built especially for HTML artifacts and
provides extension points for viewers and editors.

To the CAS Viewer developers:
HTML support and the extension points are two essential features for
my projects; e.g., TextMarker provides several CEV views for
debugging and explaining its rule inference. I would be happy to integrate
both features into the CAS Viewer, but I cannot see how. I assume there are
no plans on your side to support extensions or to use the complete editor
for the artifact?

Peter

-- 
Peter Klügl
University of Würzburg
pkluegl@uni-wuerzburg.de


Re: CAS Viewer

Posted by Anuj Kumar Gupta <vi...@gmail.com>.
Oops, sorry Thilo...

On Mon, Jan 19, 2009 at 7:13 PM, Thilo Goetz <tw...@gmx.de> wrote:

> Please stick to the topic.  If you have a new topic,
> open a new thread.  Thanks.
>
> Anuj Kumar Gupta wrote:
> > Hello user-
> >
> > Can we use MS SQL and Oracle databases with UIMA?
> > Can we extract information from a DB and also insert extracted data
> > into a DB?
> > An example would help a lot.
> >
> > Thanks.
> > -Anuj
>

Re: CAS Viewer

Posted by Thilo Goetz <tw...@gmx.de>.
Please stick to the topic.  If you have a new topic,
open a new thread.  Thanks.

Anuj Kumar Gupta wrote:
> Hello user-
> 
> Can we use MS SQL and Oracle databases with UIMA?
> Can we extract information from a DB and also insert extracted data
> into a DB?
> An example would help a lot.
> 
> Thanks.
> -Anuj

UIMA ConceptMapper in Pear Package

Posted by Joachim Wermter <Jo...@uni-jena.de>.
Hi,

I checked out the Sandbox ConceptMapper from the SVN repository and, after
adjusting a couple of file paths (in ConceptMapperOffsetTokenizer.xml),
created a PEAR package out of the project, with OffsetTokenizerMatcher.xml
as the PEAR component descriptor. When trying to install the PEAR (on a
Windows machine), I get the ClassNotFoundException below. I have
double- and triple-checked all classpath settings inside the PEAR setenv.txt and
elsewhere, and I have no clue as to why this happens.
Any help would be greatly appreciated!

Joachim


Verification of ConceptMapper failed =>
org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.uima.conceptMapper.ConceptMapper" failed. (Descriptor: file:/C:/Users/jwermter/Desktop/PearInstalled/ConceptMapper/desc/analysis_engine/primitive/ConceptMapperOffsetTokenizer.xml)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:253)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:157)
	at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
	at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:258)
	at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:352)
	at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:243)
	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:413)
	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:361)
	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:183)
	at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
	at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:258)
	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:303)
	at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:383)
	at org.apache.uima.pear.tools.InstallationTester.testAnalysisEngine(InstallationTester.java:219)
	at org.apache.uima.pear.tools.InstallationTester.doTest(InstallationTester.java:114)
	at org.apache.uima.pear.tools.InstallationController.verifyComponentInstallation(InstallationController.java:1110)
	at org.apache.uima.pear.tools.InstallationController.verifyComponent(InstallationController.java:1993)
	at org.apache.uima.tools.pear.install.InstallPear.installPear(InstallPear.java:388)
	at org.apache.uima.tools.pear.install.InstallPear.access$000(InstallPear.java:79)
	at org.apache.uima.tools.pear.install.InstallPear$RunInstallation.run(InstallPear.java:108)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.uima.resource.ResourceInitializationException
	at org.apache.uima.analysis_engine.impl.compatibility.AnnotatorAdapter.initialize(AnnotatorAdapter.java:113)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:251)
	... 22 more
Caused by: org.apache.uima.analysis_engine.annotator.AnnotatorConfigurationException
	at org.apache.uima.conceptMapper.ConceptMapper.initialize(ConceptMapper.java:343)
	at org.apache.uima.analysis_engine.impl.compatibility.AnnotatorAdapter.initialize(AnnotatorAdapter.java:109)
	... 23 more
Caused by: org.apache.uima.resource.ResourceInitializationException
	at org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl.loadDictionaryContents(DictionaryResource_impl.java:278)
	at org.apache.uima.conceptMapper.ConceptMapper.initialize(ConceptMapper.java:335)
	... 24 more
Caused by: org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryLoaderException: org.apache.uima.resource.ResourceInitializationException: Annotator class "org.apache.uima.conceptMapper.support.tokenizer.OffsetTokenizer" was not found. (Descriptor: file:/C:/Users/jwermter/Desktop/PearInstalled/ConceptMapper/desc/analysis_engine/primitive/OffsetTokenizer.xml)
	at org.apache.uima.conceptMapper.support.dictionaryResource.annotatorAdaptor.AnnotatorAdaptor.initCPM(AnnotatorAdaptor.java:93)
	at org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl$DictLoader.setDictionary(DictionaryResource_impl.java:939)
	at org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl.loadDictionaryContents(DictionaryResource_impl.java:263)
	... 25 more
Caused by: org.apache.uima.resource.ResourceInitializationException: Annotator class "org.apache.uima.conceptMapper.support.tokenizer.OffsetTokenizer" was not found. (Descriptor: file:/C:/Users/jwermter/Desktop/PearInstalled/ConceptMapper/desc/analysis_engine/primitive/OffsetTokenizer.xml)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:208)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:157)
	at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
	at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:258)
	at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:326)
	at org.apache.uima.conceptMapper.support.dictionaryResource.annotatorAdaptor.AnnotatorAdaptor.initCPM(AnnotatorAdaptor.java:90)
	... 27 more
Caused by: java.lang.ClassNotFoundException: org.apache.uima.conceptMapper.support.tokenizer.OffsetTokenizer
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:169)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:205)
	... 33 more
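
The innermost cause in the trace above is a plain java.lang.ClassNotFoundException, raised through Class.forName and the application class loader (sun.misc.Launcher$AppClassLoader). One way to narrow such a problem down is to check, outside of UIMA, whether the class can be loaded from exactly the entries that are believed to be on the classpath. The following is only a minimal diagnostic sketch: the class ClasspathProbe and its method canLoad are hypothetical names, not part of UIMA or the PEAR tooling, and the classpath string is assumed to use the platform path separator.

```java
import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of UIMA or the PEAR installer.
class ClasspathProbe {

    // Returns true if className can be loaded from the given classpath entries
    // (directories or jar files, separated by File.pathSeparator).
    static boolean canLoad(String className, String classpath) {
        List<URL> urls = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            try {
                urls.add(new File(entry).toURI().toURL());
            } catch (MalformedURLException e) {
                System.err.println("Skipping unusable entry: " + entry);
            }
        }
        // A null parent means that only the JDK bootstrap classes plus the given
        // entries are visible, so the application classpath cannot mask a missing jar.
        try (URLClassLoader loader = new URLClassLoader(urls.toArray(new URL[0]), null)) {
            Class.forName(className, false, loader); // false: do not run static initializers
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        } catch (IOException e) {
            // Reached only if closing the loader fails after a successful lookup.
            return true;
        }
    }
}
```

Calling canLoad with the annotator class name from the trace and the classpath assembled from the installed PEAR would show whether the class is reachable from those entries alone. Because the probe has no parent class loader, the entries must also include the jars that the class itself depends on (for an annotator, the UIMA core jar); otherwise the result is false for that reason.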



Re: Using databases with UIMA

Posted by Marshall Schor <ms...@schor.com>.
An example of putting information into a database is in the
uimaj-examples project; see
src/main/java/org/apache/uima/examples/cpe/PersonTitleDBWriterCasConsumer.

This example uses the Java standard API for working with databases,
called JDBC. 

In this example, the actual database being connected to is Apache Derby;
other databases can be used with JDBC.  Please see any good book on JDBC
for more details on this.

As you can see from the examples, UIMA doesn't have any direct
connection to databases; you need to use components (a CAS
Consumer, in this case) to read and write information between
the database and the CAS.

Hope this is of some help.  -Marshall
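
To make the JDBC part concrete, here is a minimal sketch of how extracted annotations could be written to a table with a batched PreparedStatement. It is not the code of PersonTitleDBWriterCasConsumer: the class AnnotationDbWriter, the Row holder, and the column names covered_text, span_begin, and span_end are assumptions made up for this illustration. Only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper for illustration; not taken from the uimaj-examples project.
class AnnotationDbWriter {

    // One extracted annotation: its covered text and its character offsets.
    static final class Row {
        final String coveredText;
        final int begin;
        final int end;

        Row(String coveredText, int begin, int end) {
            this.coveredText = coveredText;
            this.begin = begin;
            this.end = end;
        }
    }

    // Builds a parameterized INSERT. The table name is concatenated into the SQL,
    // so it must be a trusted constant; the column names are assumptions.
    static String insertSql(String table) {
        return "INSERT INTO " + table
                + " (covered_text, span_begin, span_end) VALUES (?, ?, ?)";
    }

    // Adds every row to one batch, executes it, and returns the number of rows submitted.
    static int write(Connection conn, String table, List<Row> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table))) {
            for (Row row : rows) {
                ps.setString(1, row.coveredText);
                ps.setInt(2, row.begin);
                ps.setInt(3, row.end);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        return rows.size();
    }
}
```

A CAS Consumer would collect one Row per annotation of interest and pass the list to write together with a Connection, for example one obtained from java.sql.DriverManager.getConnection(url, user, password). For plain inserts like this, switching between Derby, MS SQL Server, and Oracle should largely come down to the JDBC URL and the driver jar on the classpath.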

Anuj Kumar Gupta wrote:
> Hello user-
>
> Can we use MS SQL and Oracle databases with UIMA?
> Can we extract information from a DB and also insert extracted data
> into a DB?
> An example would help a lot.
>
> Thanks.
> -Anuj

Re: CAS Viewer

Posted by Anuj Kumar Gupta <vi...@gmail.com>.
Hello user-

Can we use MS SQL and Oracle databases with UIMA?
Can we extract information from a DB and also insert extracted data
into a DB?
An example would help a lot.

Thanks.
-Anuj



On Fri, Jan 16, 2009 at 6:20 PM, Peter Klügl <
pkluegl@ki.informatik.uni-wuerzburg.de> wrote:

> Hi Tong,
>
> I added a simple (trivial) example xmiCAS with a type system to the CEV
> file package on SourceForge. The text is in German, but I think you can at
> least test the CEV functionality. The content is fake anyway.
>
> Peter
>
> Peter Klügl schrieb:
>
> Hi Tong,
>>
>>> When processing input files that contain HTML tags, most annotators
>>> will "clean up" the HTML tags before doing any further processing. As a
>>> result, the xmiCAS doesn't contain the original HTML text anymore.
>>>
>>>
>> Ah ok. Visual and layout information is quite important for my extraction
>> tasks. My rule language has the capability to dynamically filter all kinds
>> and combinations of markup and annotation types. Therefore the original
>> HTML text stays the main artifact in the xmiCAS even if the tags contain no
>> valuable information. I plan to integrate "external" annotators with
>> restrictions also in that manner.
>>
>>> I think the most useful feature of your plug-in is its capability to
>>> allow users to edit the xmiCAS in the browser window, similar to editing
>>> the HTML page with an HTML editor (please correct me if I am wrong).
>>>
>>>
>> I am not sure I understand you. The structure or text of the HTML
>> cannot be modified by the CEV plugin (the rule language does such things). I
>> think the only real advantage over the CAS Viewer and the CAS Editor is that
>> the CEV can display annotations of an HTML artifact in a kind of browser
>> and the user can create new annotations in this browser. It is really
>> painful to review or edit annotations in the HTML source. There is
>> probably no reason (except maybe the extension point) to use the CEV plugin
>> instead of the CAS Viewer if you are just processing plain text.
>>
>>> Having some xmiCAS samples will help us to understand the plug-in's
>>> capability.
>>>
>>>
>> Yes, I will provide a simple example next week.
>>
>> Have a nice weekend!
>>
>> Peter
>>
>
>
> --
>  Peter Klügl
> University of Würzburg
> pkluegl@uni-wuerzburg.de
>
>

Re: CAS Viewer

Posted by Peter Klügl <pk...@ki.informatik.uni-wuerzburg.de>.
Hi Tong,

I added a simple (trivial) example xmiCAS with a type system to the CEV
file package on SourceForge. The text is in German, but I think you can
at least test the CEV functionality. The content is fake anyway.

Peter

Peter Klügl schrieb:
> Hi Tong,
>> When processing input files that contain HTML tags, most annotators
>> will "clean up" the HTML tags before doing any further processing. As a
>> result, the xmiCAS doesn't contain the original HTML text anymore.
>>   
> Ah ok. Visual and layout information is quite important for my 
> extraction tasks. My rule language has the capability to dynamically 
> filter all kinds and combinations of markup and annotation types.
> Therefore the original HTML text stays the main artifact in the xmiCAS 
> even if the tags contain no valuable information. I plan to integrate 
> "external" annotators with restrictions also in that manner.
>> I think the most useful feature of your plug-in is its capability to
>> allow users to edit the xmiCAS in the browser window, similar to editing
>> the HTML page with an HTML editor (please correct me if I am wrong).
>>   
> I am not sure I understand you. The structure or text of the HTML
> cannot be modified by the CEV plugin (the rule language does such
> things). I think the only real advantage over the CAS Viewer and the CAS
> Editor is that the CEV can display annotations of an HTML artifact in
> a kind of browser and the user can create new annotations in this
> browser. It is really painful to review or edit annotations in the
> HTML source. There is probably no reason (except maybe the extension
> point) to use the CEV plugin instead of the CAS Viewer if you are just
> processing plain text.
>> Having some xmiCAS samples will help us to understand the plug-in's
>> capability.
>>   
> Yes, I will provide a simple example next week.
>
> Have a nice weekend!
>
> Peter


-- 
Peter Klügl
University of Würzburg
pkluegl@uni-wuerzburg.de


Re: CAS Viewer

Posted by Peter Klügl <pk...@ki.informatik.uni-wuerzburg.de>.
Hi Tong,
> When processing input files that contain HTML tags, most annotators will
> "clean up" the HTML tags before doing any further processing. As a result,
> the xmiCAS doesn't contain the original HTML text anymore.
>   
Ah, OK. Visual and layout information is quite important for my
extraction tasks. My rule language can dynamically filter all kinds
and combinations of markup and annotation types. Therefore the original
HTML text stays the main artifact in the xmiCAS, even if the tags contain
no valuable information. I plan to integrate "external" annotators with
restrictions in that manner as well.
> I think the most useful feature of your plug-in is its capability to allow
> users to edit the xmiCAS in the browser window, similar to editing the HTML
> page with an HTML editor (please correct me if I am wrong).
>   
I am not sure I understand you. The structure or text of the HTML
cannot be modified by the CEV plugin (the rule language does such
things). I think the only real advantage over the CAS Viewer and the CAS
Editor is that the CEV can display annotations of an HTML artifact in
a kind of browser and the user can create new annotations in this
browser. It is really painful to review or edit annotations in the
HTML source. There is probably no reason (except maybe the extension
point) to use the CEV plugin instead of the CAS Viewer if you are just
processing plain text.
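
Keeping the original HTML as the main artifact, as described in this message, implies that annotation offsets refer to the unmodified HTML and that markup is filtered rather than deleted. The following is only an illustrative sketch of that idea under simplifying assumptions: tags are located with a naive regular expression (a real HTML parser would be needed for comments, scripts, or a '>' inside attribute values), and the names MarkupSpans, tagSpans, and visibleText are hypothetical, not TextMarker or CEV API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not TextMarker or CEV code.
class MarkupSpans {

    // Naive tag pattern: '<', then anything except '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Returns {begin, end} offsets of every tag, measured in the unmodified HTML.
    static List<int[]> tagSpans(String html) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    // Text that remains visible when the tag spans are filtered out.
    // The html string itself is never modified, so all offsets stay valid.
    static String visibleText(String html) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (int[] span : tagSpans(html)) {
            sb.append(html, pos, span[0]); // copy the text before this tag
            pos = span[1];                 // continue after the tag
        }
        sb.append(html, pos, html.length());
        return sb.toString();
    }
}
```

For the input `<p>Hello <b>UIMA</b></p>`, tagSpans reports four spans whose offsets point into the original string, and visibleText returns `Hello UIMA`. An annotation over "UIMA" would therefore carry the offsets 12 to 16 of the HTML source, not offsets into the stripped text.
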
> Having some xmiCAS samples will help us to understand the plug-in's
> capability.
>   
Yes, I will provide a simple example next week.

Have a nice weekend!

Peter


Re: CAS Viewer

Posted by Tong Fin <to...@gmail.com>.
Hi Peter,

> Besides that, the plugin should work with any kind of textual artifact in
> the XMI, since a plain text page is also provided (a type system in the
> same directory is required).


The installation instructions are very helpful. Following them, the plug-in
worked on the first try, and it can open my xmiCAS without any problem.


> You can also simply create an XMI with an HTML artifact (the HTML should be
> as well-formed as possible).

When processing input files that contain HTML tags, most annotators will
"clean up" the HTML tags before doing any further processing. As a result,
the xmiCAS doesn't contain the original HTML text anymore.
I think the most useful feature of your plug-in is its capability to allow
users to edit the xmiCAS in the browser window, similar to editing the HTML
page with an HTML editor (please correct me if I am wrong).

Having some xmiCAS samples will help us to understand the plug-in's
capability.

Regards,
Tong

Re: CAS Viewer

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
Hi Tong,

I must admit that I don't have any (non-project) examples at hand, but I
will provide some in the future on the TextMarker homepage (the CEV is just
a by-product of the TextMarker system). Besides that, the plugin should
work with any kind of textual artifact in the XMI, since a plain text
page is also provided (a type system in the same directory is required).
You can also simply create an XMI with an HTML artifact (the HTML should
be as well-formed as possible).

Peter

Tong Fin schrieb:
> Hi Peter,
> This is good progress. I will look at your plugin.
>
> I would like to have a discussion on how we can use your work to expand the
> tooling for UIMA.
>
> Is there any chance that you can add some xmi CAS examples to your download?
>
> Regards,
> Tong
>
> On Thu, Jan 8, 2009 at 1:37 PM, Peter Klügl <pk...@uni-wuerzburg.de> wrote:
>
>   
>> Hello,
>>
>> For those who are interested:
>> I have released a fairly stable version of the CEV plugin at
>> https://sourceforge.net/projects/textmarker/ (note the PDF with
>> installation instructions). An English manual and the source code will
>> hopefully follow before spring. Like the CAS Viewer, the CEV plugin
>> visualizes CAS files, but it was built especially for HTML artifacts and
>> provides extension points for viewers and editors.
>>
>> To the CAS Viewer developers:
>> HTML support and the extension points are two essential features for my
>> projects; e.g., TextMarker provides several CEV views for debugging and
>> explaining its rule inference. I would be happy to integrate both features
>> into the CAS Viewer, but I cannot see how. I assume there are no plans on
>> your side to support extensions or to use the complete editor for the
>> artifact?
>>
>> Peter
>>
>> --
>> Peter Klügl
>> University of Würzburg
>> pkluegl@uni-wuerzburg.de
>>
>>
>>     
>
>
>   


-- 
Peter Klügl
University of Würzburg
pkluegl@uni-wuerzburg.de


Re: CAS Viewer

Posted by Tong Fin <to...@gmail.com>.
Hi Peter,
This is good progress. I will look at your plugin.

I would like to have a discussion on how we can use your work to expand the
tooling for UIMA.

Is there any chance that you can add some xmi CAS examples to your download?

Regards,
Tong

On Thu, Jan 8, 2009 at 1:37 PM, Peter Klügl <pk...@uni-wuerzburg.de> wrote:

> Hello,
>
> For those who are interested:
> I have released a fairly stable version of the CEV plugin at
> https://sourceforge.net/projects/textmarker/ (note the PDF with
> installation instructions). An English manual and the source code will
> hopefully follow before spring. Like the CAS Viewer, the CEV plugin
> visualizes CAS files, but it was built especially for HTML artifacts and
> provides extension points for viewers and editors.
>
> To the CAS Viewer developers:
> HTML support and the extension points are two essential features for my
> projects; e.g., TextMarker provides several CEV views for debugging and
> explaining its rule inference. I would be happy to integrate both features
> into the CAS Viewer, but I cannot see how. I assume there are no plans on
> your side to support extensions or to use the complete editor for the
> artifact?
>
> Peter
>
> --
> Peter Klügl
> University of Würzburg
> pkluegl@uni-wuerzburg.de
>
>


-- 
 Tong