Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2010/01/13 09:38:54 UTC
[jira] Created: (HIVE-1050) Reduce the memory foot-print of HiveInputSplit
Reduce the memory foot-print of HiveInputSplit
----------------------------------------------
Key: HIVE-1050
URL: https://issues.apache.org/jira/browse/HIVE-1050
Project: Hadoop Hive
Issue Type: Improvement
Reporter: Zheng Shao
{{HiveInputSplit}} currently inherits from {{FileSplit}} only because we want {{MapTask}} to forward the input file name to the mapper. This inheritance makes {{HiveInputSplit}} big (see MAPREDUCE-1374). The relevant check in {{MapTask}}:
{code}
private void updateJobWithSplit(final JobConf job, InputSplit inputSplit) {
  if (inputSplit instanceof FileSplit) {
    FileSplit fileSplit = (FileSplit) inputSplit;
    job.set("map.input.file", fileSplit.getPath().toString());
    job.setLong("map.input.start", fileSplit.getStart());
    job.setLong("map.input.length", fileSplit.getLength());
    LOG.info("split: " + job.get("map.input.file") + ", range: "
        + job.getLong("map.input.start", 0) + "-"
        + job.getLong("map.input.length", 0));
  }
}
{code}
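To make the coupling concrete, here is a minimal sketch of why the {{instanceof}} check pushes a wrapper split toward inheriting from {{FileSplit}}. All types below are hypothetical stand-ins written for illustration, not the real Hadoop or Hive classes, and a plain {{Map}} stands in for {{JobConf}}.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitForwardingSketch {

    // Hypothetical stand-in for the InputSplit interface.
    public interface InputSplit {
        long getLength();
    }

    // Hypothetical stand-in for FileSplit: a path plus a byte range.
    public static class FileSplit implements InputSplit {
        private final String path;
        private final long start;
        private final long length;

        public FileSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        public String getPath() { return path; }
        public long getStart() { return start; }
        @Override public long getLength() { return length; }
    }

    // Wrapper that inherits from FileSplit, so the check below recognizes it.
    // It carries the file fields itself in addition to the wrapped split.
    public static class InheritingSplit extends FileSplit {
        private final InputSplit wrapped;

        public InheritingSplit(FileSplit wrapped) {
            super(wrapped.getPath(), wrapped.getStart(), wrapped.getLength());
            this.wrapped = wrapped;
        }

        public InputSplit getWrapped() { return wrapped; }
    }

    // Wrapper that only delegates. It holds no file fields of its own,
    // but the check below does not recognize it.
    public static class WrappingSplit implements InputSplit {
        private final InputSplit wrapped;

        public WrappingSplit(InputSplit wrapped) {
            this.wrapped = wrapped;
        }

        @Override public long getLength() { return wrapped.getLength(); }
    }

    // Mirrors the shape of the check quoted above; returns the properties
    // that would be set on the job for the given split.
    public static Map<String, String> updateJobWithSplit(InputSplit inputSplit) {
        Map<String, String> job = new HashMap<>();
        if (inputSplit instanceof FileSplit) {
            FileSplit fileSplit = (FileSplit) inputSplit;
            job.put("map.input.file", fileSplit.getPath());
            job.put("map.input.start", Long.toString(fileSplit.getStart()));
            job.put("map.input.length", Long.toString(fileSplit.getLength()));
        }
        return job;
    }
}
```

In this sketch only the inheriting wrapper gets {{map.input.file}} set; the delegating wrapper leaves the job untouched, which is the behavior that ties the split to {{FileSplit}} as long as the check stays an {{instanceof}} test.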
Once we move to the new MapReduce framework, we should be able to make the {{HiveInputFormat}} splits smaller, which will reduce the amount of memory needed on the {{JobClient}}.