Posted to issues@hbase.apache.org by "deepankar (JIRA)" <ji...@apache.org> on 2014/06/12 01:59:03 UTC

[jira] [Created] (HBASE-11335) Fix the TABLE_DIR param in TableSnapshotInputFormat

deepankar created HBASE-11335:
---------------------------------

             Summary: Fix the TABLE_DIR param in TableSnapshotInputFormat
                 Key: HBASE-11335
                 URL: https://issues.apache.org/jira/browse/HBASE-11335
             Project: HBase
          Issue Type: Bug
          Components: mapreduce, snapshots
    Affects Versions: 0.98.3, 0.96.2
            Reporter: deepankar


In the class *TableSnapshotInputFormat* (or *TableSnapshotInputFormatImpl*),
in the function
{code}
public static void setInput(Job job, String snapshotName, Path restoreDir) throws IOException {
{code}
we set TABLE_DIR_KEY to restoreDir, which is the temporary root directory rather than the table directory:
{code}
conf.set(TABLE_DIR_KEY, restoreDir.toString());
{code}

The above parameter is used when computing the InputSplits, in particular for
calculating the preferred hosts in the function
{code}
Path tableDir = new Path(conf.get(TABLE_DIR_KEY));

List<String> hosts = getBestLocations(conf,
          HRegion.computeHDFSBlocksDistribution(conf, htd, hri, tableDir));
{code}

This returns an empty *HDFSBlocksDistribution*, because there is no
directory named after the region from hri directly under the restored
root directory, and that in turn leads to the scheduling of non-local tasks.

The fix is simple: derive the table directory in the getSplits function by calling {code}FSUtils.getTableDir(rootDir, tableDesc.getTableName()){code}
instead of using the root directory itself as the table directory.
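
To illustrate why the lookup misses, here is a minimal, self-contained sketch. It assumes the 0.96+ on-disk layout of <root>/data/<namespace>/<table>/<encodedRegionName> and uses java.nio.file in place of the Hadoop and HBase classes. The names TableDirSketch, tableDir and regionDirExists, as well as the table and region names, are hypothetical and are not part of HBase.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TableDirSketch {
    // Stand-in for FSUtils.getTableDir: assumes <root>/data/<namespace>/<table>.
    static Path tableDir(Path rootDir, String namespace, String table) {
        return rootDir.resolve("data").resolve(namespace).resolve(table);
    }

    // Models the locality lookup, which expects <tableDir>/<encodedRegionName>.
    static boolean regionDirExists(Path tableDir, String encodedRegionName) {
        return Files.isDirectory(tableDir.resolve(encodedRegionName));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory plays the role of restoreDir (the restored root).
        Path restoreDir = Files.createTempDirectory("restore");
        String region = "abc123"; // hypothetical encoded region name

        // Lay out one region the way a restored snapshot root would hold it.
        Files.createDirectories(tableDir(restoreDir, "default", "t1").resolve(region));

        // Current behaviour: the root is treated as the table directory, so the
        // region directory is not found and no block locations can be collected.
        System.out.println(regionDirExists(restoreDir, region)); // false

        // Proposed behaviour: resolve the table directory under the root first.
        System.out.println(regionDirExists(tableDir(restoreDir, "default", "t1"), region)); // true
    }
}
```

The second lookup corresponds to the proposed change: the table directory is resolved beneath the restored root before the region directory is looked up.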

More discussion is in the comment linked below:

https://issues.apache.org/jira/browse/HBASE-8369?focusedCommentId=14012085&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14012085




--
This message was sent by Atlassian JIRA
(v6.2#6252)