Posted to dev@uima.apache.org by Yi-Wen Liu <yi...@usc.edu> on 2015/10/09 23:54:59 UTC

Running UIMA Apps on Hadoop

Hi all,

I am a USC student working on Professor Mattmann's project, "Integration of
cTAKES/UIMA and Apache Hadoop for the Shangridocs system"; the proposal is
attached.
I have searched through many resources on running UIMA on Hadoop; this is
one of them:
https://cwiki.apache.org/confluence/display/UIMA/Running+UIMA+Apps+on+Hadoop

but it only provides a very general explanation.

Has anybody gone through all the steps and succeeded? I hope you could
provide some examples. Thanks!

Also, in the *Important Considerations* section, it says:
"1. The jar file created should *shave* all the classes, descriptors of the
UIMA app along with the map/reduce and job main class"

I think you mean "The jar file created should *have* all the classes...".
Is that correct?
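
To check my understanding of that first consideration, the job main class I
have in mind looks roughly like the sketch below; the class names, the
descriptor, and the paths are all placeholders, and I haven't actually run
it yet:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UimaOnHadoopJob {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "uima-on-hadoop");

    // The jar built around this class must also contain the UIMA app's
    // classes and XML descriptors, plus the mapper/reducer classes.
    job.setJarByClass(UimaOnHadoopJob.class);

    // UimaMapper is a placeholder for a mapper that wraps the UIMA
    // analysis engine.
    job.setMapperClass(UimaMapper.class);
    job.setNumReduceTasks(0);            // map-only in this sketch
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}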


Thanks,
Yi-Wen

Re: Running UIMA Apps on Hadoop

Posted by Yi-Wen Liu <yi...@usc.edu>.
Thanks for your reply; I'll take a look at the sample.
It seems like a mixture of both: cTAKES uses UIMA as the framework, and
cTAKES runs inside an MR job.
I found something that might help:
https://github.com/DigitalPebble/behemoth, but
I'm not sure if it is what I need.
Have you used Behemoth before?


Thanks,
Yi-Wen

Re: Running UIMA Apps on Hadoop

Posted by Tommaso Teofili <to...@gmail.com>.
In the past I developed a small sample of running UIMA on Apache Hama
(which in turn can run on HDFS) [1].
Other than that, I don't know the mentioned doc and have never tried
reproducing the steps; however, it sounds like "shaving the classes" is a
typo :-)

First and foremost, I think you should decide which tool does what: do you
use UIMA as a library inside an MR job (like in the mentioned doc)? Or do
you use UIMA as the framework and intend to read/write job data from HDFS?
Or a mixture of both?
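
If it helps with the first option, here is a rough, untested sketch of a
mapper that uses UIMA as a library inside a Hadoop map task; the descriptor
name and the emitted output are placeholders, and it assumes the descriptor
and any type system classes are packaged in the job jar:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.XMLInputSource;

public class UimaMapper extends Mapper<LongWritable, Text, Text, Text> {

  private AnalysisEngine ae;
  private JCas jcas;

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    try {
      // Load the AE descriptor from the job jar; the name is a placeholder.
      XMLInputSource in = new XMLInputSource(
          getClass().getResource("/MyAnalysisEngineDescriptor.xml"));
      ResourceSpecifier spec =
          UIMAFramework.getXMLParser().parseResourceSpecifier(in);
      ae = UIMAFramework.produceAnalysisEngine(spec);
      jcas = ae.newJCas();
    } catch (Exception e) {
      throw new IOException("Could not initialize the UIMA analysis engine", e);
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    try {
      jcas.reset();
      jcas.setDocumentText(value.toString());
      ae.process(jcas);
      // Emit one record per annotation; adapt this to whatever output you need.
      FSIterator<Annotation> it = jcas.getAnnotationIndex().iterator();
      while (it.hasNext()) {
        Annotation a = it.next();
        context.write(new Text(a.getType().getName()),
            new Text(a.getCoveredText()));
      }
    } catch (Exception e) {
      throw new IOException("UIMA processing failed", e);
    }
  }

  @Override
  protected void cleanup(Context context) {
    if (ae != null) {
      ae.destroy();
    }
  }
}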

My 2 cents,
Tommaso

[1] :
https://github.com/apache/uima-sandbox/blob/trunk/uima-bsp/src/main/java/org/apache/uima/bsp/BasicAEProcessingBSPJob.java
[2] : http://hama.apache.org


Re: Running UIMA Apps on Hadoop

Posted by Yi-Wen Liu <yi...@usc.edu>.
Thanks, but if we still have to use Hadoop, does anybody know whom I can
ask questions about this document?
https://cwiki.apache.org/confluence/display/UIMA/Running+UIMA+Apps+on+Hadoop

Thanks,
Yi-Wen

Re: Running UIMA Apps on Hadoop

Posted by Lou DeGenaro <lo...@gmail.com>.
Perhaps you have no choice, but if you do then consider DUCC
https://uima.apache.org/doc-uimaducc-whatitam.html for UIMA scale-out.

Lou.
