Posted to commits@airflow.apache.org by "Shlomi Cohen (Jira)" <ji...@apache.org> on 2020/01/03 13:20:00 UTC
[jira] [Reopened] (AIRFLOW-6439) Sagemaker training operator does not consider metric_definition
[ https://issues.apache.org/jira/browse/AIRFLOW-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shlomi Cohen reopened AIRFLOW-6439:
-----------------------------------
Assignee: (was: Shlomi Cohen)
I think this still needs to be fixed.
> Sagemaker training operator does not consider metric_definition
> ---------------------------------------------------------------
>
> Key: AIRFLOW-6439
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6439
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib
> Affects Versions: 1.10.5
> Reporter: Shlomi Cohen
> Priority: Major
>
> Hi,
> We have been using all of the SageMaker operators in Airflow with great success.
> We now want to see some metrics in CloudWatch for our training jobs.
> I tried setting estimator.metric_definitions according to the SageMaker documentation, and I even set the deprecated flag for sending metrics to True.
> {code:java}
> estimator.enable_cloudwatch_metrics = True
> estimator.metric_definitions = [
>     {
>         "Name": "execution_time",
>         "Regex": "Execution Time - ([0-9.]*)"
>     }
> ]
> {code}
> But this didn't produce any metric definitions in the job definition shown in the SageMaker training jobs console.
> While debugging the code, I saw that for tuning jobs the following block does take metric definitions into account, but there is no equivalent for training jobs.
> {code:java}
> if tuner.metric_definitions is not None:
>     tune_config["TrainingJobDefinition"]["AlgorithmSpecification"][
>         "MetricDefinitions"
>     ] = tuner.metric_definitions
> {code}
> According to the SageMaker documentation, an Estimator should accept metric_definitions, which should cause the metrics to be reported to CloudWatch.
> Here are the docs:
> [https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.EstimatorBase]
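For illustration only, the training path could mirror the tuning block quoted above. The following is a minimal sketch, not the actual fix: the helper name add_metric_definitions is hypothetical, and it assumes the training config is a plain dict shaped like a CreateTrainingJob request, where metric definitions live under "AlgorithmSpecification" as "MetricDefinitions".

```python
from typing import Any, Dict, List, Optional


def add_metric_definitions(
    train_config: Dict[str, Any],
    metric_definitions: Optional[List[Dict[str, str]]],
) -> Dict[str, Any]:
    """Hypothetical helper: copy metric definitions into a training config dict.

    Assumes train_config follows the CreateTrainingJob request shape.
    """
    if metric_definitions is not None:
        # setdefault keeps any existing AlgorithmSpecification entries intact.
        train_config.setdefault("AlgorithmSpecification", {})[
            "MetricDefinitions"
        ] = metric_definitions
    return train_config


# Usage with the metric definition from the report above.
config = add_metric_definitions(
    {"AlgorithmSpecification": {"TrainingInputMode": "File"}},
    [{"Name": "execution_time", "Regex": "Execution Time - ([0-9.]*)"}],
)
print(config["AlgorithmSpecification"]["MetricDefinitions"])
```

When metric_definitions is None the config is returned unchanged, which matches the guard used in the tuning block.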
--
This message was sent by Atlassian Jira
(v8.3.4#803005)