Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2015/04/01 22:06:53 UTC

[jira] [Reopened] (ACCUMULO-3704) Localize client configuration for MapReduce

     [ https://issues.apache.org/jira/browse/ACCUMULO-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Elser reopened ACCUMULO-3704:
----------------------------------

[~billie.rinaldi] noticed that I missed one in getTabletLocator

> Localize client configuration for MapReduce
> -------------------------------------------
>
>                 Key: ACCUMULO-3704
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3704
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, mapreduce
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backstory is that I had a Kerberized Hadoop node and was running ContinuousVerify on it.
> The job launched successfully, but the mappers hung, unable to authenticate with the TabletServers. I knew that I had the configuration (mostly) right, because the Tool (client code) was able to fetch the split points for the job: the mappers were just unable to read from Accumulo.
> The Tool was able to talk to Accumulo because ACCUMULO_CONF_DIR was correctly set by config.sh (called from tool.sh). However, environment variables from the Tool are not passed into the child mappers/reducers. As such, the Mappers could only guess at a few locations where the client configuration file might be. In my case, they did not guess correctly. In short:
> 1. Client launches job with correct environment
> 2. Mappers reliably fail to talk to Accumulo
> [~billie.rinaldi] had the suggestion that we localize the client configuration in the Job itself. I think the easiest way to do this is to construct a ClientConfiguration in the Tool, serialize it as a property file and add it to the distributed cache.
> Then, when we construct the RecordReader, we can search for that file first, and then fall back to loading the default. This should make for a seamless experience for users and remove the need for Accumulo configuration on every YARN node.
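
The proposal in the description (serialize the client configuration in the Tool, ship it with the job, and have each task look for that file before falling back to the default) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code committed for this issue: java.util.Properties stands in for ClientConfiguration, a plain directory stands in for the distributed cache, and the class name, method names, and file name are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper; not part of Accumulo or Hadoop.
final class LocalizedClientConf {
    // Assumed name for the localized copy shipped with the job.
    static final String LOCALIZED_NAME = "accumulo-client-localized.properties";

    // Tool (client) side: write the client settings as a properties file
    // into the directory that stands in for the distributed cache.
    static Path serialize(Properties clientConf, Path jobDir) throws IOException {
        Path out = jobDir.resolve(LOCALIZED_NAME);
        try (OutputStream os = Files.newOutputStream(out)) {
            clientConf.store(os, "localized client configuration");
        }
        return out;
    }

    // Task (mapper/reducer) side: prefer the localized file if it exists,
    // otherwise fall back to whatever default configuration was found.
    static Properties load(Path taskDir, Properties defaults) throws IOException {
        Path localized = taskDir.resolve(LOCALIZED_NAME);
        if (!Files.isRegularFile(localized)) {
            return defaults;
        }
        Properties props = new Properties();
        try (InputStream is = Files.newInputStream(localized)) {
            props.load(is);
        }
        return props;
    }
}
```

With this shape, the Tool would call serialize once at job submission, and the code that builds the RecordReader would call load, so a task no longer depends on ACCUMULO_CONF_DIR being set on the node where it runs.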



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)