You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "t oo (Jira)" <ji...@apache.org> on 2020/03/06 22:01:00 UTC
[jira] [Work started] (AIRFLOW-6994) SparkSubmitOperator re
launches spark driver even when original driver still running
[ https://issues.apache.org/jira/browse/AIRFLOW-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on AIRFLOW-6994 started by t oo.
-------------------------------------
> SparkSubmitOperator re launches spark driver even when original driver still running
> ------------------------------------------------------------------------------------
>
> Key: AIRFLOW-6994
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6994
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.8, 1.10.9
> Reporter: t oo
> Assignee: t oo
> Priority: Major
>
> https://issues.apache.org/jira/browse/AIRFLOW-6229 introduced a bug
> Due to temporary network blip in connection to spark the state goes to unknown (as no tags found in curl response) and forces retry
> fix in spark_submit_hook.py:
>
> {code:java}
> def _process_spark_status_log(self, itr):
> """
> parses the logs of the spark driver status query process
> :param itr: An iterator which iterates over the input of the subprocess
> """
> response_found = False
> driver_found = False
> # Consume the iterator
> for line in itr:
> line = line.strip()
> if "submissionId" in line:
> response_found = True
>
> # Check if the log line is about the driver status and extract the status.
> if "driverState" in line:
> self._driver_status = line.split(' : ')[1] \
> .replace(',', '').replace('\"', '').strip()
> driver_found = True
> self.log.debug("spark driver status log: {}".format(line))
> if response_found and not driver_found:
> self._driver_status = "UNKNOWN"
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)