You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/14 07:47:02 UTC

[GitHub] [airflow] olivermeyer opened a new issue #16386: SageMaker operators return non-serializable value

olivermeyer opened a new issue #16386:
URL: https://github.com/apache/airflow/issues/16386


   **Apache Airflow version**: 1.10.15, 2.1.0
   
   **What happened**:
   The `execute` methods of all SageMaker operators return a dictionary which contains `datetime` objects (e.g. [here](https://github.com/apache/airflow/blob/69a1a732a034406967e82a59be6d3c019e94a07b/airflow/providers/amazon/aws/operators/sagemaker_training.py#L120) but affects all operators). Execution fails consistently, throwing `TypeError: Object of type 'datetime' is not JSON serializable`.
   
   There is already a ticket for this issue [here](https://issues.apache.org/jira/browse/AIRFLOW-3772), I am opening this to add visibility in GitHub and reference it in an eventual PR.
   
   **What you expected to happen**:
   The operator's `execute` should return only serializable values. I suggest that these operators should return only the name of the job, instead of the full description, which can then be used elsewhere to pull further information about the job if needed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-932985402


   > I ended up forking the operators and using my own version. I don't know if there's a plan to address this, perhaps @eladkal or @kaxil can shed some light?
   
   @olivermeyer - I think the easiest/fastest way to get it released is to contribute your fork version back. Do you need some guidance on how to do it? I will be happy to review and merge it. Airflow is created by more than 1700 contributors, and you can easily become one of them - and it 's also great if you can clear the path to others!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uniquejtx commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
uniquejtx commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-1054659252


   what's the solution here? I am running  a SageMakerProcessingOperator Airflow task the processing job executed successfully but I got the error "ERROR - Could not serialize the XCom value into JSON. If you are using pickle instead of JSON for XCom, then you need to enable pickle support for XCom in your airflow config., TypeError: Object of type datetime is not JSON serializable" .   How should I get away from this ? I have downstream task depend on success of SageMakerProcessingOperator


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olivermeyer commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
olivermeyer commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-1060340498


   @potiuk I opened a PR some time ago (https://github.com/apache/airflow/pull/16395) but was told to wait for some other PRs which were, by the looks of it, never merged. Are you able to tell me if my PR is relevant? It seems to me that it would fix the issue. If that's the case, I can spend some time updating my solution and opening another PR. I just don't want to waste time if it's clear from the start that another solution is preferred.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] c-thiel commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
c-thiel commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-932022089


   @olivermeyer do you know what the state on this is? Looking at #8703 I don't think there is going to be a solution outside of this operator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olivermeyer commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
olivermeyer commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-932233807


   > @olivermeyer do you know what the state on this is? Looking at #8703 I don't think there is going to be a solution outside of this operator.
   
   I ended up forking the operators and using my own version. I don't know if there's a plan to address this, perhaps @eladkal or @kaxil can shed some light?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-1060017364


   > what's the solution here? I am running a SageMakerProcessingOperator Airflow task the processing job executed successfully but I got the error "ERROR - Could not serialize the XCom value into JSON. If you are using pickle instead of JSON for XCom, then you need to enable pickle support for XCom in your airflow config., TypeError: Object of type datetime is not JSON serializable" . How should I get away from this ? I have downstream task depend on success of SageMakerProcessingOperator
   
   a) set `do_xcom_push` to False:  https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/models/baseoperator/index.html
   
   b) derive your own custom operator from the SageMarkerOperator and return nothing (or converte return value to serializable. Smth like: 
   
   ```
   class MyOperator(SageMakerProcessingOperator):
       def execute():
          ret = super().execute()
         serializable_return = .....  # convert it to serializable
         return (serializable_return)
   ```
   
   c) enable pickle in Airflow configuration https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html?highlight=pickle#enable-xcom-pickling
   
   d) correct it in the operator and contribute it back (this is by far the best idea).
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-859572001


   Since we already implements custom (de-)serialisation in `airflow.utils.json`, we should use it in `XCom.serialize_value()` (and add a `AirflowJsonDecoder` for `XCom.deserialize_value()`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olivermeyer commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
olivermeyer commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-859648141


   That makes sense and seems simple enough. I'm happy to take a shot at it in the next few days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olivermeyer commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
olivermeyer commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-933194810


   > @olivermeyer - I think the easiest/fastest way to get it released is to contribute your fork version back. Do you need some guidance on how to do it? I will be happy to review and merge it. Airflow is created by more than 1700 contributors, and you can easily become one of them - and it 's also great if you can clear the path to others!
   
   I'm not sure my fork will work for others, since I bypassed the problem by having the operator return just the model name. I think my initial PR https://github.com/apache/airflow/pull/16395 is worth considering as it's a bit more generic, although it's a bit old now and might need to be updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-1060499192


   @kaxil  - do you think the reasons from https://github.com/apache/airflow/pull/16395 stil hold, or is @olivermeyer free to continue with the PR (I think it should be no problem to continue, but I would like to verfiy with you).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16386: SageMaker operators return non-serializable value

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16386:
URL: https://github.com/apache/airflow/issues/16386#issuecomment-932985402


   > I ended up forking the operators and using my own version. I don't know if there's a plan to address this, perhaps @eladkal or @kaxil can shed some light?
   
   @olivermeyer - I think the easiest way to get it released is to contribute your fork version back. Do you need some guidance on how to do it? I will be happy to review and merge it. Airflow is created by more than 1700 contributors, and you can easily become one of them - and it 's also great if you can clear the path to others!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org