Posted to issues@spark.apache.org by "Hong Shen (JIRA)" <ji...@apache.org> on 2015/01/21 09:47:34 UTC
[jira] [Comment Edited] (SPARK-5347) InputMetrics bug when inputSplit is not instanceOf FileSplit
[ https://issues.apache.org/jira/browse/SPARK-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285366#comment-14285366 ]
Hong Shen edited comment on SPARK-5347 at 1/21/15 8:46 AM:
-----------------------------------------------------------
This happens because, in HadoopRDD, the fallback that sets inputMetrics only runs when the split is an instance of FileSplit, but CombineFileInputFormat uses a plain InputSplit. The split does not need to be an instance of FileSplit; it only has to be an instance of InputSplit.
{code}
override def close() {
  try {
    reader.close()
    if (bytesReadCallback.isDefined) {
      val bytesReadFn = bytesReadCallback.get
      inputMetrics.bytesRead = bytesReadFn()
    } else if (split.inputSplit.value.isInstanceOf[FileSplit]) {
      // If we can't get the bytes read from the FS stats, fall back to the split size,
      // which may be inaccurate.
      try {
        inputMetrics.bytesRead = split.inputSplit.value.getLength
        context.taskMetrics.inputMetrics = Some(inputMetrics)
      } catch {
        case e: java.io.IOException =>
          logWarning("Unable to get input size to set InputMetrics for task", e)
      }
    }
  } catch {
    case e: Exception => {
      if (!Utils.inShutdown()) {
        logWarning("Exception in RecordReader.close()", e)
      }
    }
  }
}
{code}
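As a rough illustration of the suggestion above (not the actual patch), the sketch below models the fallback with stand-in types so it runs without Hadoop or Spark on the classpath. The `InputSplit`, `FileSplit`, and `CombineFileSplit` definitions here are simplified, hypothetical mirrors of the Hadoop classes, and `bytesReadCurrent` / `bytesReadProposed` are names invented for this example. It assumes only that a combine-style split is an InputSplit but not a FileSplit.

```scala
// Hypothetical stand-ins for the Hadoop split hierarchy (not the real classes).
trait InputSplit { def getLength: Long }

final class FileSplit(len: Long) extends InputSplit {
  def getLength: Long = len
}

// Assumption: like a combine-style split, this is an InputSplit but NOT a FileSplit.
final class CombineFileSplit(lens: Seq[Long]) extends InputSplit {
  def getLength: Long = lens.sum
}

object InputMetricsSketch {
  // Models the behavior quoted above: without a bytes-read callback,
  // the split size is only used when the split is a FileSplit.
  def bytesReadCurrent(split: InputSplit, bytesReadCallback: Option[() => Long]): Option[Long] =
    bytesReadCallback match {
      case Some(fn) => Some(fn())
      case None if split.isInstanceOf[FileSplit] => Some(split.getLength)
      case None => None // no metrics are recorded for other split types
    }

  // Models the suggestion: fall back to getLength for any InputSplit.
  def bytesReadProposed(split: InputSplit, bytesReadCallback: Option[() => Long]): Option[Long] =
    Some(bytesReadCallback.map(fn => fn()).getOrElse(split.getLength))
}
```

With a `CombineFileSplit` and no callback, `bytesReadCurrent` yields no value while `bytesReadProposed` reports the split length, which matches the empty input metrics described in the issue. As the original code comment notes, the split size may still be inaccurate as a measure of bytes actually read.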
was (Author: shenhong):
This happens because, in HadoopRDD, the fallback that sets inputMetrics only runs when the split is an instance of FileSplit, but CombineFileInputFormat uses a plain InputSplit. The split does not need to be an instance of FileSplit; it only has to be an instance of InputSplit.
{code}
if (bytesReadCallback.isDefined) {
  val bytesReadFn = bytesReadCallback.get
  inputMetrics.bytesRead = bytesReadFn()
} else if (split.inputSplit.value.isInstanceOf[FileSplit]) {
  // If we can't get the bytes read from the FS stats, fall back to the split size,
  // which may be inaccurate.
  try {
    inputMetrics.bytesRead = split.inputSplit.value.getLength
    context.taskMetrics.inputMetrics = Some(inputMetrics)
  } catch {
    case e: java.io.IOException =>
      logWarning("Unable to get input size to set InputMetrics for task", e)
  }
}
{code}
> InputMetrics bug when inputSplit is not instanceOf FileSplit
> ------------------------------------------------------------
>
> Key: SPARK-5347
> URL: https://issues.apache.org/jira/browse/SPARK-5347
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.0
> Reporter: Hong Shen
>
> When inputFormatClass is set to CombineFileInputFormat, the input metrics show that the input is empty. This does not occur in spark-1.1.0.