Posted to commits@airflow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/03/06 22:01:00 UTC

[jira] [Commented] (AIRFLOW-6994) SparkSubmitOperator re launches spark driver even when original driver still running

    [ https://issues.apache.org/jira/browse/AIRFLOW-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17053777#comment-17053777 ] 

ASF GitHub Bot commented on AIRFLOW-6994:
-----------------------------------------

tooptoop4 commented on pull request #7637: [AIRFLOW-6994] SparkSubmitOperator re-launches spark driver even when original driver still running
URL: https://github.com/apache/airflow/pull/7637
 
 
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> SparkSubmitOperator re launches spark driver even when original driver still running
> ------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-6994
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6994
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.8, 1.10.9
>            Reporter: t oo
>            Assignee: t oo
>            Priority: Major
>
> https://issues.apache.org/jira/browse/AIRFLOW-6229 introduced a bug.
> A temporary network blip in the connection to Spark sends the driver state to UNKNOWN (because no tags are found in the curl response), which forces a retry.
> Fix in spark_submit_hook.py:
>   
> {code:python}
> def _process_spark_status_log(self, itr):
>     """
>     Parses the logs of the spark driver status query process.
>
>     :param itr: An iterator which iterates over the input of the subprocess
>     """
>     response_found = False
>     driver_found = False
>     # Consume the iterator
>     for line in itr:
>         line = line.strip()
>         # A "submissionId" field means the status query actually got a response.
>         if "submissionId" in line:
>             response_found = True
>
>         # Check if the log line is about the driver status and extract the status.
>         if "driverState" in line:
>             self._driver_status = line.split(' : ')[1] \
>                 .replace(',', '').replace('"', '').strip()
>             driver_found = True
>         self.log.debug("spark driver status log: %s", line)
>     # Fall back to UNKNOWN only when a response arrived without a driver state;
>     # if nothing came back at all (e.g. a network blip), keep the previous status.
>     if response_found and not driver_found:
>         self._driver_status = "UNKNOWN"
> {code}
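
The effect of the proposed change can be illustrated with a minimal standalone sketch. This is not the hook itself: `parse_driver_status`, its `previous` parameter, and the sample log lines are hypothetical, made up here only to mirror the logic of the method in the issue description outside of Airflow.

```python
from typing import Iterable, Optional


def parse_driver_status(lines: Iterable[str],
                        previous: Optional[str] = None) -> Optional[str]:
    """Hypothetical standalone mirror of the parsing logic in the issue."""
    status = previous
    response_found = False
    driver_found = False
    for raw in lines:
        line = raw.strip()
        # Any line mentioning "submissionId" counts as a real response.
        if "submissionId" in line:
            response_found = True
        # Extract the state from a line shaped like: "driverState" : "RUNNING",
        if "driverState" in line:
            status = line.split(" : ")[1].replace(",", "").replace('"', "").strip()
            driver_found = True
    # A response without a driver state is UNKNOWN; no response keeps `previous`.
    if response_found and not driver_found:
        status = "UNKNOWN"
    return status


# Sample lines are invented for illustration, not captured from a real cluster.
healthy = ['{', '  "driverState" : "RUNNING",', '  "submissionId" : "driver-1",', '}']
no_state = ['{', '  "submissionId" : "driver-1",', '}']
blip = ["curl: (7) Failed to connect"]

print(parse_driver_status(healthy, previous="SUBMITTED"))
print(parse_driver_status(no_state, previous="RUNNING"))
print(parse_driver_status(blip, previous="RUNNING"))
```

In this sketch the third call leaves the earlier state in place because no response was seen at all, which appears to be the behaviour the fix is after: a failed status query should not by itself mark the driver as UNKNOWN and trigger a re-launch.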



--
This message was sent by Atlassian Jira
(v8.3.4#803005)