Posted to user@uima.apache.org by Piyush Paliwal <pi...@gmail.com> on 2014/10/22 13:35:55 UTC

UIMA Ruta into jar?

Hi,

we are developing a Ruta project and want to use it from a Java project.
Currently, we add the descriptor (generated from the Ruta script) to a UIMA
pipeline that lives in the Java project.

The pipeline can only be run in the workspace. We are not able to build a
single jar of the Java project and run it on the command line, because it
cannot access the Ruta project as a dependency.

There is also a direct way to read a Ruta script from Java, but then the
script cannot import annotations from type systems once it is placed in the
Java project (i.e., it needs the Ruta editor).

Is there any way to add the Ruta project as a dependency of the Java project?

Thanks.

Piyush


-- 
Piyush Paliwal

Re: UIMA Ruta into jar?

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.
Hi,

just to summarize possible pitfalls when using a ruta project developed
in the workbench in a normal UIMA/Java environment:

There are two parts:

1. Does the CAS contain everything needed?
The CAS needs to contain all types. If the CAS is created using the
analysis engine (descriptor) generated by the workbench and it is still
located in the Ruta workbench, then everything should just work.
If the CAS was created using the generated type system descriptor, then
the Ruta type priorities need to be included. If the descriptors were
copied to the Java project, one has to take care that relative paths are
still valid. The workbench normally uses import by location with
relative paths. There should be no problems when the Ruta engine is
included in a larger aggregated analysis engine. If the CAS is created
with uimaFIT by automatically collecting the type systems, then one has
to take care that the type systems of the script files are included and
that the type priorities are not missed. If the type priorities become
too annoying, we could maybe remove them completely in the future.
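
As an illustration of the relative-path pitfall mentioned above, here is a
minimal Java sketch that lists the import-by-location entries of a descriptor
and resolves them against the descriptor's own folder, so that one can check
whether they still point to existing files after copying. It uses only the
JDK. The class name, the file names and the sample descriptor are made up for
this example, and it treats every location as a plain relative file path,
which is a simplification.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of UIMA or Ruta.
final class ImportLocationCheck {

    /** Returns the "location" attribute of every <import> element in the descriptor XML. */
    static List<String> importLocations(String descriptorXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));
            NodeList imports = doc.getElementsByTagNameNS("*", "import");
            List<String> locations = new ArrayList<>();
            for (int i = 0; i < imports.getLength(); i++) {
                Element element = (Element) imports.item(i);
                // Imports by name carry a "name" attribute instead and are skipped here.
                if (element.hasAttribute("location")) {
                    locations.add(element.getAttribute("location"));
                }
            }
            return locations;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse descriptor", e);
        }
    }

    /** Resolves a relative import location against the folder that contains the descriptor. */
    static Path resolve(Path descriptorFile, String location) {
        Path folder = descriptorFile.toAbsolutePath().getParent();
        return folder.resolve(location).normalize();
    }

    public static void main(String[] args) {
        // Made-up descriptor that imports another type system by relative location.
        String xml = "<typeSystemDescription xmlns=\"http://uima.apache.org/resourceSpecifier\">"
                + "<imports><import location=\"../TypeSystem.xml\"/></imports>"
                + "</typeSystemDescription>";
        Path descriptor = Paths.get("descriptor", "example", "MainTypeSystem.xml");
        for (String location : importLocations(xml)) {
            Path target = resolve(descriptor, location);
            System.out.println(location + " -> " + target
                    + " (exists: " + Files.exists(target) + ")");
        }
    }
}
```

If a resolved path does not exist after the descriptors were copied, the
relative import is broken and has to be adapted to the new layout.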

2. Is Ruta able to find all resources?
The layout of Ruta projects and the use of absolute paths in the
descriptors have historical reasons. The problem is that if a Java
project includes a Ruta project in its classpath, then the Ruta engine
is not able to find imported resources. The reason is that only the
root of the Ruta project is part of the classpath, not the folders
script/descriptor/resources. Hence, if the absolute paths are
not valid anymore, e.g., because the resources have been copied or
packed into a jar, then the engine tries to find the resources on the
classpath. If, however, the folder structure was copied, then the
imports are not valid anymore, e.g., the engine searches for
"uima.ruta.example.X", but it is located in "descriptor/...". What we do
is to copy the contents of script/descriptor/resources to the root of
the jar. If this jar is included in the classpath of the Java project,
then the resources should be found.
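
To make the mismatch concrete, here is a small Java sketch using only the
JDK. It assumes that a dotted name is looked up on the classpath as a
slash-separated path plus a file extension, the way Java does for packages.
The class and method names are hypothetical and not part of Ruta, and the
".xml" extension and the example file are only there for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of UIMA or Ruta.
final class RutaClasspathNames {

    // Top-level folders of a workbench project, as named in the text above.
    static final List<String> PROJECT_FOLDERS =
            Arrays.asList("script", "descriptor", "resources");

    /** Assumed mapping: "uima.ruta.example.X" plus ".xml" becomes "uima/ruta/example/X.xml". */
    static String toResourcePath(String dottedName, String extension) {
        return dottedName.replace('.', '/') + extension;
    }

    /** Drops a leading script/, descriptor/ or resources/ segment so the file sits at the jar root. */
    static String toJarEntryName(String projectRelativePath) {
        String path = projectRelativePath.replace('\\', '/');
        int slash = path.indexOf('/');
        if (slash > 0 && PROJECT_FOLDERS.contains(path.substring(0, slash))) {
            return path.substring(slash + 1);
        }
        return path;
    }

    /** True if the class loader can see the resource, i.e. a classpath lookup would succeed. */
    static boolean isOnClasspath(String resourcePath, ClassLoader loader) {
        return loader.getResource(resourcePath) != null;
    }

    public static void main(String[] args) {
        String lookedUp = toResourcePath("uima.ruta.example.TypeSystem", ".xml");
        String copiedWithFolder = "descriptor/uima/ruta/example/TypeSystem.xml";
        System.out.println("looked up:          " + lookedUp);
        System.out.println("copied with folder: " + copiedWithFolder);
        System.out.println("flattened for jar:  " + toJarEntryName(copiedWithFolder));
        System.out.println("visible right now:  "
                + isOnClasspath(lookedUp, ClassLoader.getSystemClassLoader()));
    }
}
```

The flattened entry name equals the path that is looked up, while the path
that still carries the "descriptor/" prefix does not, which is why copying
the folder contents to the jar root helps.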

There are already open issues related to these things, and we will
improve the handling in the future. I also plan to add a section to the
documentation about the pitfalls after the upcoming restructuring. If I
find the time, I will implement the ruta-maven-plugin, which should
facilitate the development of Ruta scripts in a Maven context.

Best,

Peter


On 23.10.2014 19:36, Alexandre Patry wrote:
> On 14-10-23 09:40 AM, Piyush Paliwal wrote:
>> Hi Richard,
>>
>> it seems to work now. Thanks. As I was only at the testing stage, I
>> forgot to add the other descriptors (OpenNlpTagger, etc.) before the
>> Ruta descriptor in the pipeline. Those were needed so that the CAS can
>> find all types.
>>
>> Though it is a somewhat hectic solution (copy and paste), it is
>> workable and therefore great.
> I am glad that you made it work! If you want to reduce XML
> boilerplate, you can look at uimaFIT [1], a library offering a very
> nice Java API to replace XML descriptors.
>
> Alexandre
>
> [1] http://uima.apache.org/uimafit.html
>



Re: UIMA Ruta into jar?

Posted by Alexandre Patry <al...@nlpfu.com>.
On 14-10-23 09:40 AM, Piyush Paliwal wrote:
> Hi Richard,
>
> it seems to work now. Thanks. As I was only at the testing stage, I forgot
> to add the other descriptors (OpenNlpTagger, etc.) before the Ruta
> descriptor in the pipeline. Those were needed so that the CAS can find all
> types.
>
> Though it is a somewhat hectic solution (copy and paste), it is workable
> and therefore great.
I am glad that you made it work! If you want to reduce XML boilerplate, 
you can look at uimaFIT [1], a library offering a very nice Java API to 
replace XML descriptors.

Alexandre

[1] http://uima.apache.org/uimafit.html


Re: UIMA Ruta into jar?

Posted by Piyush Paliwal <pi...@gmail.com>.
Hi Richard,

it seems to work now. Thanks. As I was only at the testing stage, I forgot
to add the other descriptors (OpenNlpTagger, etc.) before the Ruta
descriptor in the pipeline. Those were needed so that the CAS can find all
types.

Though it is a somewhat hectic solution (copy and paste), it is workable
and therefore great.

Piyush


-- 
Piyush Paliwal

Re: UIMA Ruta into jar?

Posted by Alexandre Patry <al...@nlpfu.com>.
On 14-10-23 02:10 AM, Richard Eckart de Castilho wrote:
> On 23.10.2014, at 00:39, Piyush Paliwal <pi...@gmail.com> wrote:
>
>> As an example, I wish to import the following types from TypeSystem.xml
>> descriptor which also resides in same folder as script (both files now in
>> Java project).
>>
>> //import the additional annotations types and alias in short name
>>
>> IMPORT de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.NN FROM
>> uima.ruta.example.TypeSystem  AS _NN;
>>
>> IMPORT de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.PP FROM
>> uima.ruta.example.TypeSystem AS _PP;
> I do not know how the IMPORT (type) AS (alias) is implemented. If the alias
> is set up at execution time and not at CAS initialization time, it should
> work.
>
> Alexandre?
IMPORT instructions and aliases are resolved at the same time as 
TYPESYSTEM instructions, when the first CAS is processed.

Best,

Alexandre



Re: UIMA Ruta into jar?

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 23.10.2014, at 00:39, Piyush Paliwal <pi...@gmail.com> wrote:

> As an example, I wish to import the following types from TypeSystem.xml
> descriptor which also resides in same folder as script (both files now in
> Java project).
> 
> //import the additional annotations types and alias in short name
> 
> IMPORT de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.NN FROM
> uima.ruta.example.TypeSystem  AS _NN;
> 
> IMPORT de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.PP FROM
> uima.ruta.example.TypeSystem AS _PP;

I assume you are invoking Ruta via uimaFIT? If yes, then you should make
sure that uimaFIT can find all necessary type systems via the type detection
mechanism [1].

If you are not using uimaFIT, or if you have some special way to create your
CASes, make sure that all types your scripts need are already loaded when
the CAS is created.

UIMA does not allow changing the type system while a pipeline is running.
Thus the IMPORT declarations will normally not be interpreted when the script
is executed.

I do not know how the IMPORT (type) AS (alias) is implemented. If the alias
is set up at execution time and not at CAS initialization time, it should
work.

Alexandre?

Cheers,

-- Richard

[1] http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e531

Re: UIMA Ruta into jar?

Posted by Piyush Paliwal <pi...@gmail.com>.
Hi Alexandre,

This is more or less clear. However, the main issue, as I pointed out
earlier, remains: even with the script now in the Java project, it fails to
import additional annotations from the external type system. This import
worked when the script was inside the Ruta project.


As an example, I wish to import the following types from TypeSystem.xml
descriptor which also resides in same folder as script (both files now in
Java project).

//import the additional annotations types and alias in short name

IMPORT de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.NN FROM
uima.ruta.example.TypeSystem  AS _NN;

IMPORT de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.PP FROM
uima.ruta.example.TypeSystem AS _PP;


If I declare an annotation that does not depend on imported types, there
are no errors even if the script is in the Java project.

DECLARE Annotation A(String y,String x);


But how can the script import the above annotations from the external type
system when it is placed in the Java project?

- Piyush

On Wed, Oct 22, 2014 at 4:50 PM, Alexandre Patry <al...@nlpfu.com> wrote:

> Hi Piyush,
>
> A while ago, I wrote a blog post on how to package a RUTA script with
> maven:
>
> http://textjuicer.com/blog/2013/09/08/using-ruta-in-a-maven-project/
>
> Even if you do not use maven, it should give you an idea on the files to
> distribute in your jars.
>
> Hope this helps,
>
> Alexandre
>


-- 
Piyush Paliwal

Re: UIMA Ruta into jar?

Posted by Alexandre Patry <al...@nlpfu.com>.
Hi Piyush,

A while ago, I wrote a blog post on how to package a RUTA script with 
maven:

http://textjuicer.com/blog/2013/09/08/using-ruta-in-a-maven-project/

Even if you do not use maven, it should give you an idea on the files to 
distribute in your jars.

Hope this helps,

Alexandre
