Posted to commits@airflow.apache.org by "Shlomi Cohen (Jira)" <ji...@apache.org> on 2020/01/03 12:18:00 UTC

[jira] [Created] (AIRFLOW-6439) Sagemaker training operator does not consider metric_definition

Shlomi Cohen created AIRFLOW-6439:
-------------------------------------

             Summary: Sagemaker training operator does not consider metric_definition
                 Key: AIRFLOW-6439
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6439
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib
    Affects Versions: 1.10.5
            Reporter: Shlomi Cohen


Hi,

We have been using all the SageMaker operators in Airflow with great success.

We now want to see some metrics in CloudWatch for our training jobs.
I tried setting estimator.metric_definitions as described in the SageMaker documentation, and I even set the deprecated flag for sending metrics to True.
{code:python}
estimator.enable_cloudwatch_metrics = True
estimator.metric_definitions = [
    {
        "Name": "execution_time",
        "Regex": "Execution Time - ([0-9.]*)"
    }
]
{code}
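As a local sanity check, the regex from the metric definition can be run against a sample log line. This is only a minimal sketch: the extract_metric helper and the sample line are hypothetical, since the training script's actual log output is not shown in this report.

```python
import re

# Same pattern as in the metric definition above.
METRIC_REGEX = r"Execution Time - ([0-9.]*)"


def extract_metric(log_line):
    """Return the captured value as a float, or None if nothing usable matched."""
    match = re.search(METRIC_REGEX, log_line)
    if match is None or match.group(1) == "":
        return None
    try:
        return float(match.group(1))
    except ValueError:
        # The character class [0-9.]* also accepts strings like "1.2.3".
        return None


# Hypothetical log line, used only for illustration.
sample_value = extract_metric("Execution Time - 12.5")
```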
However, this did not produce any metrics in the job definition, as seen in the SageMaker training jobs console.
While debugging the code, I saw that for tuning jobs the following block does take the metric definitions into account, but there is no equivalent for training jobs.
{code:python}
if tuner.metric_definitions is not None:
    tune_config["TrainingJobDefinition"]["AlgorithmSpecification"][
        "MetricDefinitions"
    ] = tuner.metric_definitions
{code}
According to the SageMaker documentation, an Estimator accepts metric_definitions, and setting it should cause the metrics to be reported to CloudWatch.
See the docs:
[https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.EstimatorBase]
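A possible shape for the fix, mirroring the tuning-job check, might look like the sketch below. It is not the actual Airflow or SageMaker SDK code: the add_metric_definitions helper is hypothetical, a SimpleNamespace stands in for a real estimator, and the config is a plain dict. The only assumption about the real API is that a training job request carries the definitions under AlgorithmSpecification.MetricDefinitions, as the tuning block does.

```python
from types import SimpleNamespace


def add_metric_definitions(train_config, estimator):
    """Copy estimator.metric_definitions into a training-job config dict, if set."""
    metric_definitions = getattr(estimator, "metric_definitions", None)
    if metric_definitions is not None:
        # Create AlgorithmSpecification if missing, keep existing keys otherwise.
        train_config.setdefault("AlgorithmSpecification", {})[
            "MetricDefinitions"
        ] = metric_definitions
    return train_config


# Stand-in for a real estimator (hypothetical, for illustration only).
estimator = SimpleNamespace(
    metric_definitions=[
        {"Name": "execution_time", "Regex": "Execution Time - ([0-9.]*)"}
    ]
)

# Minimal stand-in for a generated training config.
train_config = {"AlgorithmSpecification": {"TrainingInputMode": "File"}}
add_metric_definitions(train_config, estimator)
```

With a check like this, an estimator whose metric_definitions is None leaves the config unchanged, so existing training jobs would not be affected.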



--
This message was sent by Atlassian Jira
(v8.3.4#803005)