Posted to dev@uima.apache.org by Robert Spurrier <sp...@gmail.com> on 2015/08/24 20:04:58 UTC
Overriding PEAR Installation Metadata For Moving it Using Hadoop's DistributedCache
Hello,
I'm trying to use PEAR files with Hadoop's DistributedCache mechanism.
The cache provides all of the distribution and cleanup mechanisms involved
with metadata on a cluster of Hadoop datanodes, and the PEARs provide a
convenient delivery for NLP pipelines. My problem is that the
DistributedCache is read-only, and the PEAR installation procedures require
overwriting macros and creating files in the directory in which it will be
used. So for now I install locally, compress the installed PEAR directory,
and ship it off to the grid.
Then I use an override mechanism to load an AE from the relocated PEAR:
I've modified the uimaj-core source, specifically ASB_impl.java and
PearAnalysisEngineWrapper.java, to check for install directory override
parameters. If given, the ConfigurationParameterSettings and
ExternalResourceSpecifiers in the ResourceCreationSpecifier are modified
by replacing the local install directory with the current datanode's
DistributedCache directory, where the PEAR now lives. It works great, but
I'd rather not deal with maintaining the tainted source, since to me right
now it seems like something that was not intended for PEARs.
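
The patch itself is not included in this message. To make the idea concrete, here is a minimal sketch of the path substitution only, written against a plain Map that stands in for the parameter settings. The class InstallDirOverride and its methods are hypothetical names for illustration and are not part of uimaj-core; the real change would operate on the ConfigurationParameterSettings and ExternalResourceSpecifiers objects instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: NOT part of uimaj-core and not the actual patch.
final class InstallDirOverride {

    // Swap the old install root for the new one wherever it appears in a value.
    // Plain substring replacement also covers values such as "file:/old/root/x".
    static String relocate(String value, String oldRoot, String newRoot) {
        return value.replace(oldRoot, newRoot);
    }

    // Return a copy of the settings with String and String[] values relocated.
    // Values of any other type (numbers, booleans, ...) are copied unchanged.
    static Map<String, Object> relocateAll(Map<String, Object> settings,
                                           String oldRoot, String newRoot) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : settings.entrySet()) {
            Object v = e.getValue();
            if (v instanceof String) {
                out.put(e.getKey(), relocate((String) v, oldRoot, newRoot));
            } else if (v instanceof String[]) {
                String[] src = (String[]) v;
                String[] dst = new String[src.length];
                for (int i = 0; i < src.length; i++) {
                    dst[i] = relocate(src[i], oldRoot, newRoot);
                }
                out.put(e.getKey(), dst);
            } else {
                out.put(e.getKey(), v);
            }
        }
        return out;
    }
}
```

Here oldRoot would be the directory used for the local install and newRoot the DistributedCache directory reported on the datanode at task start.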
Now that I have some more time to try to do things 'right', is there a
preferred way to leverage the API to make a portable PEAR when you
don't know the name of the directory in which it will ultimately live?
DistributedCache directories for a datanode are uniquely stamped, so I
can't change anything until the PEAR mechanisms have loaded the
description resources into memory.
Thanks for your time and effort. Using UIMA in MapReduce has been a treat
so far!
Rob
Re: Overriding PEAR Installation Metadata For Moving it Using Hadoop's DistributedCache
Posted by Marshall Schor <ms...@schor.com>.
Hi,
It sounds like you've got it mostly solved. I'm wondering if the "fix" might be
to run a step after your PEAR install step, before you zip things up, that
replaces absolute paths with some kind of relative path specifications that
would work when the thing is zipped up and distributed?
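
A rough sketch of what such a post-install step could look like, using only the Java standard library. Everything here is an assumption made for illustration: the class name PearPathRelativizer, the choice of file extensions to scan, UTF-8 encoding, and "." as the replacement. Whether a relative path actually resolves at run time depends on the task's working directory and on how each descriptor value is consumed, so that part would need testing on the grid.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: run after the local PEAR install, before zipping.
final class PearPathRelativizer {

    // Replace the absolute install directory in the text, in either slash style.
    static String rewrite(String content, String installDir, String replacement) {
        String forward = installDir.replace('\\', '/');
        String backward = installDir.replace('/', '\\');
        return content.replace(forward, replacement).replace(backward, replacement);
    }

    // Assumption: only these text files carry the substituted install path.
    static boolean isTextArtifact(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".xml") || name.endsWith(".txt") || name.endsWith(".properties");
    }

    // Rewrite matching files under the installed PEAR; returns how many changed.
    static int rewriteTree(Path installedPear, String installDir, String replacement) {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(installedPear)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(PearPathRelativizer::isTextArtifact)
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int changed = 0;
        for (Path f : files) {
            try {
                // Assumes UTF-8; files in another encoding should be excluded.
                String before = new String(Files.readAllBytes(f), StandardCharsets.UTF_8);
                String after = rewrite(before, installDir, replacement);
                if (!after.equals(before)) {
                    Files.write(f, after.getBytes(StandardCharsets.UTF_8));
                    changed++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: PearPathRelativizer <installedPearDir> [replacement]");
            return;
        }
        Path root = Paths.get(args[0]).toAbsolutePath().normalize();
        String replacement = args.length > 1 ? args[1] : ".";
        int n = rewriteTree(root, root.toString(), replacement);
        System.out.println("Rewrote " + n + " file(s) under " + root);
    }
}
```

Run it against the installed PEAR directory and then zip the result. The directory is modified in place, so keep a copy if the local install is still needed.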
-Marshall