Posted to user@accumulo.apache.org by "Slater, David M." <Da...@jhuapl.edu> on 2013/03/28 16:45:41 UTC
Distributed Cache - for iterators?
Hey everyone,
In Hadoop Map Reduce, the Configuration class can pass String parameters (via the Context argument to map and reduce). Likewise, the Map<String, String> options argument in Iterator init allows the same functionality for Accumulo iterators.
However, for more complex parameters, Hadoop has a DistributedCache which is available to all of the mappers and reducers. Is there any similar functionality for Accumulo iterators, or does all of the information need to be sent as a String through options?
Also, are there any problems with sending exceptionally long Strings in the options argument?
Thanks,
David
Re: Distributed Cache - for iterators?
Posted by Eric Newton <er...@gmail.com>.
He might. I know users who send a lot of configuration data to their
iterators. It's quite ugly when viewed with "listscans" in the shell. If
you are thinking of passing more than a megabyte, maybe it's better to send
it through a side channel like HDFS.
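
A minimal sketch of that rule of thumb, using only the Java standard library: small payloads are Base64-encoded into the iterator's Map<String, String> options, and larger ones are replaced by a path that the iterator would read itself from the side channel. The class name, the option keys, and the example path are all hypothetical, and the one-megabyte cutoff is just the figure mentioned above, not a limit enforced by Accumulo.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class IteratorOptionSketch {
    // Cutoff taken from the "more than a megabyte" suggestion above (an assumption, not an Accumulo limit).
    static final int MAX_INLINE_CHARS = 1024 * 1024;

    // Hypothetical option keys; an iterator would look these up in init().
    static final String INLINE_KEY = "params.inline";
    static final String PATH_KEY = "params.path";

    // Build the options map: inline the payload if it is small, otherwise pass only a path.
    static Map<String, String> buildOptions(byte[] payload, String sideChannelPath) {
        Map<String, String> options = new HashMap<>();
        String encoded = Base64.getEncoder().encodeToString(payload);
        if (encoded.length() <= MAX_INLINE_CHARS) {
            options.put(INLINE_KEY, encoded);
        } else {
            // The caller is assumed to have already written the payload to this path (e.g. in HDFS).
            options.put(PATH_KEY, sideChannelPath);
        }
        return options;
    }

    // Recover an inlined payload; returns null when the options carry only a path.
    static byte[] readInline(Map<String, String> options) {
        String encoded = options.get(INLINE_KEY);
        return encoded == null ? null : Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] small = "threshold=5;mode=fast".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[2 * 1024 * 1024];

        System.out.println(buildOptions(small, "/tmp/iter-params.bin").keySet());
        System.out.println(buildOptions(large, "/tmp/iter-params.bin").keySet());
    }
}
```

Passing only a path keeps the options short in "listscans" output, at the cost of each iterator instance having to read the file when it initializes.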
On Thu, Mar 28, 2013 at 12:03 PM, Keith Turner <ke...@deenlo.com> wrote:
> On Thu, Mar 28, 2013 at 11:45 AM, Slater, David M.
> <Da...@jhuapl.edu> wrote:
> > Hey everyone,
> >
> > In Hadoop Map Reduce, the Configuration class can pass String parameters
> > (via the Context argument to map and reduce). Likewise, the Map<String,
> > String> options argument in Iterator init allows the same functionality
> > for Accumulo iterators.
> >
> > However, for more complex parameters, Hadoop has a DistributedCache
> > which is available to all of the mappers and reducers. Is there any
> > similar functionality for Accumulo iterators, or does all of the
> > information need to be sent as a String through options?
>
> Accumulo does not provide anything out of the box. I wonder if
> putting a file in HDFS w/ a high replication factor would be a good
> way to pass this info.
>
> >
> > Also, are there any problems with sending exceptionally long Strings in
> > the options argument?
>
> Does anyone know if David would run into issues similar to ACCUMULO-1141?
>
> >
> > Thanks,
> > David
>
Re: Distributed Cache - for iterators?
Posted by Keith Turner <ke...@deenlo.com>.
On Thu, Mar 28, 2013 at 11:45 AM, Slater, David M.
<Da...@jhuapl.edu> wrote:
> Hey everyone,
>
> In Hadoop Map Reduce, the Configuration class can pass String parameters
> (via the Context argument to map and reduce). Likewise, the Map<String,
> String> options argument in Iterator init allows the same functionality for
> Accumulo iterators.
>
> However, for more complex parameters, Hadoop has a DistributedCache which is
> available to all of the mappers and reducers. Is there any similar
> functionality for Accumulo iterators, or does all of the information need to
> be sent as a String through options?
Accumulo does not provide anything out of the box. I wonder if
putting a file in HDFS w/ a high replication factor would be a good
way to pass this info.
>
> Also, are there any problems with sending exceptionally long Strings in the
> options argument?
Does anyone know if David would run into issues similar to ACCUMULO-1141?
>
> Thanks,
> David