Posted to dev@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2016/09/30 00:49:20 UTC
[jira] [Created] (HIVE-14864) Distcp is not called from MoveTask when src is a directory
Vihang Karajgaonkar created HIVE-14864:
------------------------------------------
Summary: Distcp is not called from MoveTask when src is a directory
Key: HIVE-14864
URL: https://issues.apache.org/jira/browse/HIVE-14864
Project: Hive
Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar
In FileUtils.java, the distcp branch in the following code is never taken when src is a directory, even if the directory's total size is greater than HIVE_EXEC_COPYFILE_MAXSIZE, because
srcFS.getFileStatus(src).getLen() returns 0 for a directory. We should use srcFS.getContentSummary(src).getLength() instead, which returns the combined length of the files under src.
{noformat}
    /* Run distcp if source file/dir is too big */
    if (srcFS.getUri().getScheme().equals("hdfs") &&
        srcFS.getFileStatus(src).getLen() > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
      LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. (MAX: "
          + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
      LOG.info("Launch distributed copy (distcp) job.");
      HiveConfUtil.updateJobCredentialProviders(conf);
      copied = shims.runDistCp(src, dst, conf);
      if (copied && deleteSource) {
        srcFS.delete(src, true);
      }
    }
{noformat}
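The idea behind the proposed change can be sketched without a Hadoop cluster. The example below is only an illustration, not the Hive patch: it uses java.nio on the local filesystem as a stand-in for srcFS, and the class name CopySizeCheck and the methods contentLength and exceedsMaxSize are hypothetical. It sums the sizes of all regular files under a path, which is the role srcFS.getContentSummary(src).getLength() would play in the check above, and compares that total against a threshold standing in for HIVE_EXEC_COPYFILE_MAXSIZE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration: local-filesystem analogue of the size check.
public class CopySizeCheck {

    // Total size in bytes of all regular files under root.
    // Works for a single file as well as for a directory tree.
    static long contentLength(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .mapToLong(p -> {
                        try {
                            return Files.size(p);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    })
                    .sum();
        }
    }

    // maxSize stands in for HIVE_EXEC_COPYFILE_MAXSIZE (an assumption of this sketch).
    static boolean exceedsMaxSize(Path src, long maxSize) throws IOException {
        return contentLength(src) > maxSize;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copy-size-check");
        Files.write(dir.resolve("a.bin"), new byte[600]);
        Files.write(dir.resolve("b.bin"), new byte[500]);

        // The directory holds 1100 bytes of file data in total.
        System.out.println("content length: " + contentLength(dir));
        System.out.println("exceeds 1000:   " + exceedsMaxSize(dir, 1000L));
    }
}
```

With a check based on the summed content length, a directory whose files together exceed the threshold takes the large-copy path, which is the behavior the issue asks for.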
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)