Posted to commits@airflow.apache.org by "Shlomi Cohen (Jira)" <ji...@apache.org> on 2020/01/03 12:18:00 UTC
[jira] [Created] (AIRFLOW-6439) Sagemaker training operator does not consider metric_definition
Shlomi Cohen created AIRFLOW-6439:
-------------------------------------
Summary: Sagemaker training operator does not consider metric_definition
Key: AIRFLOW-6439
URL: https://issues.apache.org/jira/browse/AIRFLOW-6439
Project: Apache Airflow
Issue Type: Bug
Components: contrib
Affects Versions: 1.10.5
Reporter: Shlomi Cohen
Hi,
We have been using all the SageMaker operators in Airflow with great success.
We now want to see some metrics in CloudWatch for our training jobs.
I tried setting estimator.metric_definitions as described in the SageMaker documentation, and I even set the deprecated flag for sending metrics to True:
{code:python}
estimator.enable_cloudwatch_metrics = True
estimator.metric_definitions = [
    {
        "Name": "execution_time",
        "Regex": "Execution Time - ([0-9.]*)"
    }
]
{code}
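As a quick local sanity check that the regex itself is not the problem, it can be run against a sample log line. This is only a sketch: the helper name extract_metric and the sample log line are hypothetical, and it assumes the training script prints a line in the form the regex expects.

```python
import re

# Regex copied from the metric definition above.
METRIC_REGEX = r"Execution Time - ([0-9.]*)"


def extract_metric(log_line, pattern=METRIC_REGEX):
    """Return the captured metric value as a float, or None if nothing usable matches.

    Hypothetical helper for local testing only; it is not part of Airflow
    or the SageMaker SDK.
    """
    match = re.search(pattern, log_line)
    if match is None or match.group(1) == "":
        return None
    try:
        return float(match.group(1))
    except ValueError:
        # The character class also accepts strings such as "1.2.3" or ".",
        # which are not valid numbers.
        return None


if __name__ == "__main__":
    # Hypothetical log line, assumed to resemble the training script's output.
    print(extract_metric("Execution Time - 12.5"))
    print(extract_metric("no metric on this line"))
```

If the helper returns a number for real log lines, the regex is fine and the missing metrics are more likely caused by the definitions never reaching the job request.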
Despite these settings, no metric definitions appeared in the job definition shown in the SageMaker training jobs console.
While debugging the code, I saw that for tuning jobs the following block does take metric definitions into account, but there is no equivalent for training jobs:
{code:python}
if tuner.metric_definitions is not None:
    tune_config["TrainingJobDefinition"]["AlgorithmSpecification"][
        "MetricDefinitions"
    ] = tuner.metric_definitions
{code}
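For comparison, a training-job analogue of the tuning block above might look roughly like the sketch below. It is not the actual Airflow or SageMaker SDK code: the function name with_metric_definitions and the sample config values are hypothetical, and it assumes the training config is a plain dict shaped like a CreateTrainingJob request, where MetricDefinitions sits under AlgorithmSpecification.

```python
import copy


def with_metric_definitions(train_config, metric_definitions):
    """Return a copy of train_config with MetricDefinitions set, if any were given.

    Hypothetical helper, assuming train_config follows the CreateTrainingJob
    request layout.
    """
    config = copy.deepcopy(train_config)
    if metric_definitions is not None:
        # Mirrors the tuning-job block, minus the "TrainingJobDefinition" level.
        algorithm_spec = config.setdefault("AlgorithmSpecification", {})
        algorithm_spec["MetricDefinitions"] = metric_definitions
    return config


if __name__ == "__main__":
    # Placeholder values for illustration only.
    base_config = {
        "TrainingJobName": "example-job",
        "AlgorithmSpecification": {
            "TrainingImage": "example-image-uri",
            "TrainingInputMode": "File",
        },
    }
    metrics = [{"Name": "execution_time", "Regex": "Execution Time - ([0-9.]*)"}]
    print(with_metric_definitions(base_config, metrics))
```

The helper copies the config rather than mutating it, so the original dict can still be inspected when comparing the request with and without metrics.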
According to the SageMaker documentation, an Estimator accepts metric_definitions, and setting them should cause the metrics to be reported to CloudWatch.
Here are the docs:
[https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.EstimatorBase]