Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2014/01/22 23:12:21 UTC
[jira] [Resolved] (ACCUMULO-2234) Cannot run offline mapreduce over non-default instance.dfs.dir value
[ https://issues.apache.org/jira/browse/ACCUMULO-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Elser resolved ACCUMULO-2234.
----------------------------------
Resolution: Fixed
Always pull accumulo-site.xml out of ACCUMULO_CONF_DIR and provide it to the ContinuousVerify Tool. Use DistributedCache to ensure that it gets on the Mapper's classpath.
Tested against Apache Hadoop 1.2.1 and 2.2.0.
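To illustrate the first half of the fix described above (locating accumulo-site.xml under ACCUMULO_CONF_DIR), here is a minimal sketch in plain Java. The class and method names (SiteFileLocator, resolve, fromEnvironment) are hypothetical and are not taken from the actual patch. The step that hands the file to Hadoop's DistributedCache is deliberately left out, since it needs Hadoop on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Accumulo or of the actual patch.
// It only builds the path to accumulo-site.xml inside a configuration directory.
final class SiteFileLocator {
    static final String SITE_FILE = "accumulo-site.xml";

    // Resolve the site file inside the given configuration directory.
    // Fails fast if the directory is unset, rather than silently using defaults.
    static Path resolve(String confDir) {
        if (confDir == null || confDir.trim().isEmpty()) {
            throw new IllegalStateException("ACCUMULO_CONF_DIR is not set");
        }
        return Paths.get(confDir).resolve(SITE_FILE);
    }

    // Convenience wrapper that reads ACCUMULO_CONF_DIR from the environment.
    static Path fromEnvironment() {
        return resolve(System.getenv("ACCUMULO_CONF_DIR"));
    }
}
```

A job driver could call SiteFileLocator.fromEnvironment() before job submission and then ship the resulting file to the mappers. Failing fast when the variable is unset is a design choice of this sketch, not something the issue specifies.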
> Cannot run offline mapreduce over non-default instance.dfs.dir value
> --------------------------------------------------------------------
>
> Key: ACCUMULO-2234
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2234
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.4.4, 1.5.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Blocker
> Fix For: 1.4.5, 1.5.1, 1.6.0
>
>
> The javadoc for setting up offline scans over RFiles (InputFormatBase.setScanOffline in 1.4 or InputFormatBase.setOfflineTableScan in 1.5) notes that if a "non-standard" directory is used for Accumulo in HDFS (that is, if instance.dfs.dir is set to something other than its default value), accumulo-site.xml may need to be on the classpath for the mappers.
> As best I can tell, even if accumulo-site.xml is on the classpath, it makes no difference, because InputFormatBase creates a new ZooKeeperInstance which, in turn, only ever builds a DefaultConfiguration and never checks whether an accumulo-site.xml file is available. This makes it impossible for a non-default value of instance.dfs.dir to ever be used.
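For contrast with the defaults-only behavior described in the issue, the following is a rough sketch of a site-aware lookup: it reads instance.dfs.dir from an accumulo-site.xml on the classpath and falls back to a default only when the file or the property is missing. This is not Accumulo's configuration code. The class SiteConfigSketch and its methods are hypothetical, and the default of "/accumulo" is an assumption made for the example. The sketch relies on accumulo-site.xml using the Hadoop-style property/name/value layout.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; not the Accumulo implementation.
final class SiteConfigSketch {
    // Assumed default for instance.dfs.dir, used when no site file overrides it.
    static final String DEFAULT_DFS_DIR = "/accumulo";

    // Return the value of `key` from a Hadoop-style site XML stream,
    // or `fallback` if the stream is null or the property is absent.
    static String lookup(InputStream siteXml, String key, String fallback) throws Exception {
        if (siteXml == null) {
            return fallback;
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(siteXml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && key.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return fallback;
    }

    // Look for accumulo-site.xml on the classpath; a missing file yields the default.
    static String dfsDirFromClasspath() throws Exception {
        try (InputStream in = SiteConfigSketch.class.getClassLoader()
                .getResourceAsStream("accumulo-site.xml")) {
            return lookup(in, "instance.dfs.dir", DEFAULT_DFS_DIR);
        }
    }
}
```

With no accumulo-site.xml on the classpath, dfsDirFromClasspath() returns the assumed default, which matches what the mappers see according to the issue. With the file present, the overridden value is returned instead.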
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)