Posted to commits@airflow.apache.org by "Shlomi Cohen (Jira)" <ji...@apache.org> on 2020/01/03 13:20:00 UTC

[jira] [Reopened] (AIRFLOW-6439) Sagemaker training operator does not consider metric_definition

     [ https://issues.apache.org/jira/browse/AIRFLOW-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shlomi Cohen reopened AIRFLOW-6439:
-----------------------------------
      Assignee:     (was: Shlomi Cohen)

I think this still needs to be fixed.

> Sagemaker training operator does not consider metric_definition
> ---------------------------------------------------------------
>
>                 Key: AIRFLOW-6439
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6439
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>    Affects Versions: 1.10.5
>            Reporter: Shlomi Cohen
>            Priority: Major
>
> Hi,
> We have been using all of the SageMaker operators in Airflow with great success.
> We now want to see metrics in CloudWatch for our training jobs.
> I tried setting estimator.metric_definitions according to the SageMaker documentation, and I even set the deprecated flag for sending metrics (enable_cloudwatch_metrics) to True.
> {code:python}
> estimator.enable_cloudwatch_metrics = True
> estimator.metric_definitions = [
>     {
>         "Name": "execution_time",
>         "Regex": "Execution Time - ([0-9.]*)"
>     }
> ]
> {code}
> But this did not produce any metric definitions in the job definition shown in the SageMaker training jobs console.
> While debugging the code, I saw that for tuning jobs the following block does take metric definitions into account, but there is no equivalent for training jobs.
> {code:python}
> if tuner.metric_definitions is not None:
>     tune_config["TrainingJobDefinition"]["AlgorithmSpecification"][
>         "MetricDefinitions"
>     ] = tuner.metric_definitions
> {code}
> According to the SageMaker documentation, an Estimator accepts metric_definitions, which should cause the metrics to be reported to CloudWatch.
> Here are the docs:
> [https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.EstimatorBase]
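
For illustration only, here is a minimal sketch of what an equivalent check for training jobs might look like, mirroring the tuning-job block quoted above. The helper name add_metric_definitions, the stand-in estimator object, and the placeholder train_config dict are all assumptions made for this example; they are not taken from the Airflow or SageMaker source. The only part relied on as fact is that the CreateTrainingJob request accepts MetricDefinitions under AlgorithmSpecification.

```python
from types import SimpleNamespace


def add_metric_definitions(train_config, estimator):
    """Copy an estimator's metric definitions into a training config dict.

    Hypothetical helper: it mirrors the tuning-job check quoted in the
    issue, but it is not the actual Airflow or SageMaker SDK code.
    """
    metric_definitions = getattr(estimator, "metric_definitions", None)
    if metric_definitions is not None:
        # Create AlgorithmSpecification if it is missing, then attach the
        # definitions under the key the CreateTrainingJob API expects.
        train_config.setdefault("AlgorithmSpecification", {})[
            "MetricDefinitions"
        ] = metric_definitions
    return train_config


# Stand-in for a real sagemaker Estimator (assumption for this sketch).
estimator = SimpleNamespace(
    metric_definitions=[
        {"Name": "execution_time", "Regex": "Execution Time - ([0-9.]*)"}
    ]
)

# Placeholder training config; a real one carries many more fields.
train_config = {"AlgorithmSpecification": {"TrainingInputMode": "File"}}
add_metric_definitions(train_config, estimator)
```

With a check like this, an estimator whose metric_definitions is None leaves the config untouched, so existing training jobs without metrics would presumably behave as before.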



--
This message was sent by Atlassian Jira
(v8.3.4#803005)