Posted to issues@spark.apache.org by "Renxia Wang (JIRA)" <ji...@apache.org> on 2016/08/01 06:35:20 UTC
[jira] [Created] (SPARK-16830) Executors Keep Trying to Fetch Blocks from a Bad Host
Renxia Wang created SPARK-16830:
-----------------------------------
Summary: Executors Keep Trying to Fetch Blocks from a Bad Host
Key: SPARK-16830
URL: https://issues.apache.org/jira/browse/SPARK-16830
Project: Spark
Issue Type: Bug
Components: Spark Core, Streaming
Affects Versions: 1.6.2
Environment: EMR 4.7.2
Reporter: Renxia Wang
When a host becomes unreachable, the driver removes the executors and block managers on that host because it stops receiving their heartbeats. However, executors on other hosts keep trying to fetch blocks from the bad host.
I am running a Spark Streaming job that consumes data from Kinesis. As a result of these block fetches repeatedly retrying and failing, I started seeing ProvisionedThroughputExceededException on shards, SocketException from the AmazonHttpClient (to Kinesis), Kinesis ExpiredIteratorException, etc.
This issue also exposes a potential memory leak. From the time the bad host became unreachable, the physical memory usage of the executors that kept trying to fetch blocks from it grew steadily until they hit the physical memory limit and were killed by YARN.
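For reference, the retry behavior described above is governed by Spark's shuffle transport settings. A hedged sketch of how one might tighten them as a mitigation (these are standard Spark 1.6 configuration keys; the app jar and arguments are elided, and note this only bounds how long each fetch retries against the unreachable host, it does not remove the stale block locations that trigger the fetches):

```shell
# Sketch only: bound per-fetch retries and fail dead connections sooner.
# spark.shuffle.io.maxRetries  (default 3)   - retries per failed fetch
# spark.shuffle.io.retryWait   (default 5s)  - wait between retries
# spark.network.timeout        (default 120s) - network idle/connection timeout
spark-submit \
  --conf spark.shuffle.io.maxRetries=2 \
  --conf spark.shuffle.io.retryWait=3s \
  --conf spark.network.timeout=60s \
  my-streaming-app.jar
```

With the defaults, each blocked fetch can stall for on the order of maxRetries x retryWait before giving up, which is consistent with the buildup observed here.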
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org