Posted to commits@airflow.apache.org by "Fokko Driesprong (JIRA)" <ji...@apache.org> on 2017/09/04 19:04:00 UTC

[jira] [Updated] (AIRFLOW-1562) Spark-sql deadlock in logging

     [ https://issues.apache.org/jira/browse/AIRFLOW-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fokko Driesprong updated AIRFLOW-1562:
--------------------------------------
    Description: 
Related to Issue 1255

Logging in SparkSqlOperator does not work as intended (continuous logging of output as it is received from the subprocess). This is because spark-sql internally redirects all logs (including stderr) to stdout, which causes the current two-iterator logging to get stuck on an empty stderr pipe. This situation can also lead to a deadlock: the stderr pipe can fill up, at which point writes to it block until it is consumed, which only happens when the process ends, so the process stalls.

  was:
Related to Issue 1255

Logging in SparkSubmitOperator does not work as intended (continuous logging of output as it is received from the subprocess). This is because spark-submit internally redirects all logs (including stderr) to stdout, which causes the current two-iterator logging to get stuck on an empty stderr pipe. The logs are written only when the subprocess finishes, so the yarn_application_id is not available until the application ends.



> Spark-sql deadlock in logging
> -----------------------------
>
>                 Key: AIRFLOW-1562
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1562
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>    Affects Versions: Airflow 1.8
>            Reporter: Fokko Driesprong
>
> Related to Issue 1255
> Logging in SparkSqlOperator does not work as intended (continuous logging of output as it is received from the subprocess). This is because spark-sql internally redirects all logs (including stderr) to stdout, which causes the current two-iterator logging to get stuck on an empty stderr pipe. This situation can also lead to a deadlock: the stderr pipe can fill up, at which point writes to it block until it is consumed, which only happens when the process ends, so the process stalls.
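
One possible way to avoid the stall described above, sketched under assumptions rather than taken from the actual Airflow fix: merge the child's stderr into stdout and read that single pipe line by line, so no second pipe is left unread. The function name stream_merged_output and the stand-in child process are hypothetical; only the standard-library subprocess module is used, and the child merely imitates a chatty command such as spark-sql.

```python
import subprocess
import sys


def stream_merged_output(cmd, log=print):
    """Run cmd and pass each output line to `log` as soon as it arrives.

    Hypothetical helper: stderr is merged into stdout, so there is only
    one pipe to drain and no unread pipe that could fill up and block
    the child process.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # send stderr into the pipe we read
        universal_newlines=True,   # decode bytes to text lines
        bufsize=1,                 # line-buffered reads on our side
    )
    lines = []
    # readline returns "" only at end of file, i.e. when the child
    # has closed its end of the pipe.
    for line in iter(proc.stdout.readline, ""):
        line = line.rstrip("\n")
        lines.append(line)
        log(line)
    proc.stdout.close()
    returncode = proc.wait()
    return returncode, lines


if __name__ == "__main__":
    # Stand-in for spark-sql (an assumption for illustration): a child
    # that writes many lines to stderr and one line to stdout.
    child = (
        "import sys\n"
        "for i in range(20000):\n"
        "    sys.stderr.write('log line %d\\n' % i)\n"
        "sys.stdout.write('done\\n')\n"
    )
    code, lines = stream_merged_output(
        [sys.executable, "-c", child], log=lambda _line: None
    )
    print("exit code:", code, "lines read:", len(lines))
```

With two separate pipes read one after the other, a child like this stand-in could block once the unread pipe is full. With a single merged pipe the reader keeps draining output for as long as the child produces it, and values such as an application id could be picked out of the lines as they arrive.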


