Posted to commits@airflow.apache.org by "David Smith (Jira)" <ji...@apache.org> on 2019/09/25 21:27:00 UTC

[jira] [Updated] (AIRFLOW-5554) Airflow 1.10.4+ needs statsd>=3.3.0

     [ https://issues.apache.org/jira/browse/AIRFLOW-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smith updated AIRFLOW-5554:
---------------------------------
    Description: 
Our environment uses statsd 3.2.2, which Airflow's requirements allow. However, if statsd is enabled in the Airflow config, a critical bug prevents DAG runs from completing properly: a timedelta is passed to the statsd timing call, which tries to format it as a float:

[https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/airflow/models/dagrun.py#L354]

This raises an exception because of the unexpected timedelta type:

  File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1547, in process_file
    self._process_dags(dagbag, dags, ti_keys_to_schedule)
  File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1242, in _process_dags
    schedule_delay)
  File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/statsd/client.py", line 93, in timing
    self._send_stat(stat, '%0.6f|ms' % delta, rate)
TypeError: a float is required

The code that detects a timedelta and converts it to milliseconds was introduced in statsd 3.3.0:

[https://github.com/jsocol/pystatsd/blob/1c90b9fdf322680e2625da659abc2aa5d79b5bff/statsd/client/base.py#L28]
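
As a rough illustration of the difference (a minimal sketch, not the actual pystatsd source), the two helpers below are hypothetical stand-ins: format_timing_old mirrors the formatting line from the traceback, and format_timing_new assumes the newer behaviour amounts to an isinstance check followed by a conversion to milliseconds.

```python
from datetime import timedelta


def format_timing_old(delta):
    # Hypothetical stand-in for the older behaviour: the value is
    # formatted directly, so only plain numbers are accepted.
    return '%0.6f|ms' % delta


def format_timing_new(delta):
    # Hypothetical stand-in for the newer behaviour (an assumption based on
    # the description above): a timedelta is converted to milliseconds first.
    if isinstance(delta, timedelta):
        delta = delta.total_seconds() * 1000.0
    return '%0.6f|ms' % delta


schedule_delay = timedelta(seconds=1.5)

try:
    format_timing_old(schedule_delay)
except TypeError as exc:
    # The exact message depends on the Python version.
    print('old formatting fails:', exc)

print(format_timing_new(schedule_delay))  # 1500.000000|ms
```

Plain numeric values pass through both helpers unchanged, so only callers that pass a timedelta are affected.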

So the setup.py in Airflow should be updated to make 3.3.0 the minimum statsd version:

[https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/setup.py#L262]
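
A sketch of what the requested change might look like, assuming the statsd extra in setup.py is a plain list of requirement strings. The upper bound shown here is an assumption, and statsd_supports_timedelta is a hypothetical helper that only handles simple dotted numeric versions such as "3.2.2".

```python
# Hypothetical excerpt of the extras definition; the '<4.0' bound is assumed.
statsd = ['statsd>=3.3.0, <4.0']


def parse_version(text):
    # Turn a simple dotted version such as '3.2.2' into a comparable tuple.
    return tuple(int(part) for part in text.split('.'))


def statsd_supports_timedelta(installed_version):
    # Per the report, timedelta handling arrived in statsd 3.3.0.
    return parse_version(installed_version) >= (3, 3, 0)


print(statsd_supports_timedelta('3.2.2'))  # False
print(statsd_supports_timedelta('3.3.0'))  # True
```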

  was:
Our environment configuration uses statsd 2.2.3, which is a valid dependency version choice in airflow's requirements. However if statsd is enabled in airflow config, there is a critical bug preventing dag runs from completing properly when a timedelta is passed to the statsd timing to be formatted into a float:

[https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/airflow/models/dagrun.py#L354]

Causing an exception to be raised due to the unexpected delta data type:

File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1547, in process_file
 self._process_dags(dagbag, dags, ti_keys_to_schedule)
 File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1242, in _process_dags
 schedule_delay)
 File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/statsd/client.py", line 93, in timing
 self._send_stat(stat, '%0.6f|ms' % delta, rate)
TypeError: a float is required

The code to detect the timedelta type and convert to millis was introduced in 3.3.0:

[https://github.com/jsocol/pystatsd/blob/1c90b9fdf322680e2625da659abc2aa5d79b5bff/statsd/client/base.py#L28]

So the setup.py in airflow should be updated to make this the minimum dependency:

https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/setup.py#L262


> Airflow 1.10.4+ needs statsd>=3.3.0
> -----------------------------------
>
>                 Key: AIRFLOW-5554
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5554
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun, dependencies
>    Affects Versions: 1.10.4, 1.10.5
>            Reporter: David Smith
>            Assignee: David Smith
>            Priority: Major
>
> Our environment uses statsd 3.2.2, which Airflow's requirements allow. However, if statsd is enabled in the Airflow config, a critical bug prevents DAG runs from completing properly: a timedelta is passed to the statsd timing call, which tries to format it as a float:
> [https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/airflow/models/dagrun.py#L354]
> This raises an exception because of the unexpected timedelta type:
>   File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1547, in process_file
>     self._process_dags(dagbag, dags, ti_keys_to_schedule)
>   File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/airflow/jobs/scheduler_job.py", line 1242, in _process_dags
>     schedule_delay)
>   File "/opt/evidation/data-processing-pipeline/lib/python3.5/site-packages/statsd/client.py", line 93, in timing
>     self._send_stat(stat, '%0.6f|ms' % delta, rate)
> TypeError: a float is required
> The code that detects a timedelta and converts it to milliseconds was introduced in statsd 3.3.0:
> [https://github.com/jsocol/pystatsd/blob/1c90b9fdf322680e2625da659abc2aa5d79b5bff/statsd/client/base.py#L28]
> So the setup.py in Airflow should be updated to make 3.3.0 the minimum statsd version:
> [https://github.com/apache/airflow/blob/66a139d734bd434caa792007c3b980ca4cf8f931/setup.py#L262]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)