You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "alvaroserper (via GitHub)" <gi...@apache.org> on 2023/10/10 11:30:07 UTC
[I] Invalid stat name using opentelemetry [airflow]
alvaroserper opened a new issue, #34845:
URL: https://github.com/apache/airflow/issues/34845
### Apache Airflow version
2.7.1
### What happened
When running a dag an error ocurred. The error says that there is a metric with an invalid name. This causes that the task of the dag is set up for retry. Then the task executes again and is marked as success.
`2023-10-10 13:05:21 [2023-10-10T11:05:21.738+0000] {local_executor.py:135} ERROR - Failed to execute task Invalid stat name: ***.dag.cwf_path_inspector_generator.delete-xcom-task.queued_duration. Please see https://opentelemetry.io/docs/reference/specification/metrics/api/#instrument-name-syntax.
2023-10-10 13:05:21 Traceback (most recent call last):
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/executors/local_executor.py", line 131, in _execute_work_in_fork
2023-10-10 13:05:21 args.func(args)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/cli/cli_config.py", line 49, in command
2023-10-10 13:05:21 return func(*args, **kwargs)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/utils/cli.py", line 113, in wrapper
2023-10-10 13:05:21 return f(*args, **kwargs)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/task_command.py", line 430, in task_run
2023-10-10 13:05:21 task_return_code = _run_task_by_selected_method(args, _dag, ti)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/task_command.py", line 208, in _run_task_by_selected_method
2023-10-10 13:05:21 return _run_task_by_local_task_job(args, ti)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/task_command.py", line 270, in _run_task_by_local_task_job
2023-10-10 13:05:21 ret = run_job(job=job_runner.job, execute_callable=job_runner._execute)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/utils/session.py", line 77, in wrapper
2023-10-10 13:05:21 return func(*args, session=session, **kwargs)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/jobs/job.py", line 289, in run_job
2023-10-10 13:05:21 return execute_job(job, execute_callable=execute_callable)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/jobs/job.py", line 318, in execute_job
2023-10-10 13:05:21 ret = execute_callable()
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/jobs/local_task_job_runner.py", line 143, in _execute
2023-10-10 13:05:21 if not self.task_instance.check_and_change_state_before_execution(
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/utils/session.py", line 77, in wrapper
2023-10-10 13:05:21 return func(*args, session=session, **kwargs)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/models/taskinstance.py", line 1366, in check_and_change_state_before_execution
2023-10-10 13:05:21 self.emit_state_change_metric(TaskInstanceState.RUNNING)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/models/taskinstance.py", line 1450, in emit_state_change_metric
2023-10-10 13:05:21 Stats.timing(f"dag.{self.dag_id}.{self.task_id}.{metric_name}", timing)
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/metrics/otel_logger.py", line 266, in timing
2023-10-10 13:05:21 if self.metrics_validator.test(stat) and name_is_otel_safe(self.prefix, stat):
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/metrics/otel_logger.py", line 95, in name_is_otel_safe
2023-10-10 13:05:21 return bool(stat_name_otel_handler(prefix, name, max_length=OTEL_NAME_MAX_LENGTH))
2023-10-10 13:05:21 File "/usr/local/lib/python3.10/dist-packages/airflow/metrics/validators.py", line 142, in stat_name_otel_handler
2023-10-10 13:05:21 raise InvalidStatsNameException(
2023-10-10 13:05:21 airflow.exceptions.InvalidStatsNameException: Invalid stat name: ***.dag.cwf_path_inspector_generator.delete-xcom-task.queued_duration. Please see https://opentelemetry.io/docs/reference/specification/metrics/api/#instrument-name-syntax`
### What you think should happen instead
There should not be an error with the name of a default metric causing a task to retry.
### How to reproduce
Enable opentelemetry in airflow.cfg:
`otel_on = True
otel_host = breeze-otel-collector
otel_port = 4318
otel_prefix = airflow
otel_interval_milliseconds = 30000 # The interval between exports, defaults to 60000
otel_ssl_active = False`
Run opentelemetry collector docker:
`otel-collector:
image: otel/opentelemetry-collector-contrib:0.70.0
container_name: "breeze-otel-collector"
command: [--config=/etc/otel-collector-config.yml]
volumes:
- ./otel-collector/otel-collector-config.yml:/etc/otel-collector-config.yml
# - ./otel-collector/keys:/etc/keys
ports:
- "24318:4318" # OTLP http receiver
- "28889:8889" # Prometheus exporter metrics`
### Operating System
Ubuntu 22.04.3 LTS
### Versions of Apache Airflow Providers
_No response_
### Deployment
Docker-Compose
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1759996079
2.7.2 was released this morning!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1758654459
I just double-checked and this fix should be in Airflow 2.7.2 which is currently being voted on and should be out Very Soon :tm: ((next week I think, but don't hold me to that))
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi closed issue #34845: Invalid stat name using opentelemetry
URL: https://github.com/apache/airflow/issues/34845
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "Bisk1 (via GitHub)" <gi...@apache.org>.
Bisk1 commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1756602096
looks like some metric names are missing on exemptions list https://github.com/apache/airflow/pull/30873/files#diff-1cca954ec0be1aaf2c212e718c004cb0902a96ac60043bf0c97a782dee52cc32R55
@ferruzzi should we add them?
```
r"^dag\.(?P<dag_id>.*)\.(?P<task_id>.*)\.queued_duration$",
r"^dag\.(?P<dag_id>.*)\.(?P<task_id>.*)\.scheduled_duration$",
```
It looks like it was added before your change:
https://github.com/apache/airflow/blob/8fdf3582c2967161dd794f7efb53691d092f0ce6/airflow/models/taskinstance.py#L2184
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "alvaroserper (via GitHub)" <gi...@apache.org>.
alvaroserper commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1756854264
Okey thank you, this will be solved in 3.0 v ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1758214925
> should we add them to the exemption list?
Must have been some awkward timing, adding new non-compliant names should have been prevented by the unit tests.... Yeah, I guess in this case, let's add it to the exemption list. Can you cut the PR?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "ferruzzi (via GitHub)" <gi...@apache.org>.
ferruzzi commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1758253375
I know this isn't really an answer, but the root cause id that when you combine a long `dag_id` and a long `task_id`, the total length of the metric name exceeds OTel's max name length. A temporary workaround would be to use shorter names. Again: I know that's not a satisfactory solution, and the fix has already been applied to the next release, but if you want a temporary band-aid, that's one option.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Invalid stat name using opentelemetry [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34845:
URL: https://github.com/apache/airflow/issues/34845#issuecomment-1755138098
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org