Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/27 22:47:40 UTC

[GitHub] [airflow] dstandish opened a new pull request, #26735: Allow serialization of custom objects

dstandish opened a new pull request, #26735:
URL: https://github.com/apache/airflow/pull/26735

   We can make the serializer handle arbitrary custom objects provided they implement methods `airflow_serialize` and `airflow_deserialize`.
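
As a rough illustration of the protocol described above, a custom class might look like the following minimal sketch. The PR description only names the two methods, so the signatures used here (an instance method returning a JSON-safe dict, and a classmethod rebuilding the object from it) and the `Point` class are assumptions for illustration.

```python
import json


class Point:
    """Hypothetical custom object; not part of Airflow."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def airflow_serialize(self):
        # Assumed signature: return only JSON-safe primitives.
        return {"x": self.x, "y": self.y}

    @classmethod
    def airflow_deserialize(cls, data):
        # Assumed signature: rebuild the object from the serialized dict.
        return cls(x=data["x"], y=data["y"])


# Round trip through JSON, as a serializer honouring the protocol might do.
encoded = json.dumps(Point(1, 2).airflow_serialize())
restored = Point.airflow_deserialize(json.loads(encoded))
```

Keeping the serialized form to plain dicts, lists, strings, and numbers means the result can be stored anywhere JSON can.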
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1260333746

   Yes, preventing security issues would be the biggest thing. Currently we do that by requiring custom types to be registered via a plugin (e.g. timetables), but if we were to do that for any custom type, it may be easier to use a custom serialiser pattern instead, similar to how `json.dumps` handles this. A plugin can provide a set of serialise/deserialise hooks that would be called whenever an unknown object is encountered by the (de)serialiser.
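
For reference, the `json.dumps` pattern mentioned above works through the standard library's `default` and `object_hook` parameters. The sketch below shows that mechanism only; the `encode_unknown` and `decode_known` helpers and the `{"__type": ..., "value": ...}` layout are made up for this example and are not Airflow's format.

```python
import json
from datetime import date


def encode_unknown(obj):
    # json.dumps calls this only for objects it cannot serialise itself.
    if isinstance(obj, date):
        return {"__type": "date", "value": obj.isoformat()}
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


def decode_known(dct):
    # json.loads calls this for every decoded JSON object (dict).
    if dct.get("__type") == "date":
        return date.fromisoformat(dct["value"])
    return dct


text = json.dumps({"when": date(2022, 9, 27)}, default=encode_unknown)
data = json.loads(text, object_hook=decode_known)
```

The encoder hook is consulted only as a fallback, so built-in types keep their normal handling and the hook decides explicitly which extra types it accepts.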




[GitHub] [airflow] github-actions[bot] commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1312601830

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.




[GitHub] [airflow] ashb commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
ashb commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1260153962

   One thought: we need to be careful that we don't open up arbitrary object inflation vulnerabilities this way.
   
   (There were security problems in Rails where you could give it some session data and it would treat it as YAML and, due to oddness in the YAML spec, end up creating arbitrary Ruby objects, which were used to pop reverse shells on Rails installs.)
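
The Python analogue of this risk is a deserialiser that imports and instantiates whatever class name appears in the payload. A minimal sketch of the safer alternative, an explicit allowlist, follows. The `ALLOWED` registry, the `inflate` function, and the `__type`/`data` payload layout are all hypothetical and do not describe Airflow's actual implementation.

```python
from datetime import timedelta

# Hypothetical allowlist: only these type names may ever be instantiated.
ALLOWED = {
    "timedelta": lambda data: timedelta(seconds=data["seconds"]),
}


def inflate(payload):
    """Rebuild an object from untrusted data, refusing unknown type names."""
    type_name = payload["__type"]
    try:
        factory = ALLOWED[type_name]
    except KeyError:
        # Never fall back to importing a class named by the payload.
        raise ValueError(f"Refusing to deserialise unknown type {type_name!r}") from None
    return factory(payload["data"])


ok = inflate({"__type": "timedelta", "data": {"seconds": 90}})

try:
    inflate({"__type": "os.system", "data": {"command": "id"}})
    rejected = False
except ValueError:
    rejected = True
```

The point of the design is that the set of constructible types is fixed by code on the receiving side, never by the data being read.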




[GitHub] [airflow] uranusjr commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1260364031

   Something like this
   
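   ```python
   from airflow.plugins_manager import AirflowPlugin

   class MyCustomSerialization:
       def serialize(self, obj):
           # Return a (type name, JSON-safe data) pair for objects we handle.
           if isinstance(obj, MyObj):
               return "MyObj", {"a": obj.a, "b": obj.get_b()}
   
       def deserialize(self, type_, data):
           # Rebuild the object only for type names we recognise.
           if type_ == "MyObj":
               return MyObj(a=data["a"], ...)
   
   class MyAirflowPlugin(AirflowPlugin):
       # Proposed attribute; declared after the class it instantiates.
       custom_serde = MyCustomSerialization()
   ```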
   
   When the serializer encounters something it doesn’t recognise, it’d call the custom serialiser. Similarly, if the serialised data contains a `__type` that’s not in the known list, it calls the custom deserialiser to see if something can be produced.
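
The fallback described above could be sketched roughly as follows. Only the `__type` key and the shape of the hook come from the comment; the `serialize`/`deserialize` function names, the `CUSTOM_SERDES` list, the `__var` key, and the `Money` class are assumptions for illustration, and this is not Airflow's serializer.

```python
class Money:
    """Hypothetical custom object handled only by a plugin hook."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency


class MoneySerde:
    """Hook object with the serialize/deserialize shape proposed above."""

    def serialize(self, obj):
        if isinstance(obj, Money):
            return "Money", {"amount": obj.amount, "currency": obj.currency}
        return None  # not ours; let the next hook try

    def deserialize(self, type_, data):
        if type_ == "Money":
            return Money(data["amount"], data["currency"])
        return None


CUSTOM_SERDES = [MoneySerde()]  # assumed to be collected from plugins
PRIMITIVES = (str, int, float, bool, type(None))


def serialize(obj):
    if isinstance(obj, PRIMITIVES):
        return obj
    # Unknown object: ask each registered hook in turn.
    for serde in CUSTOM_SERDES:
        result = serde.serialize(obj)
        if result is not None:
            type_, data = result
            return {"__type": type_, "__var": data}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def deserialize(encoded):
    if not isinstance(encoded, dict):
        return encoded
    # No built-in types in this sketch, so every __type goes to the hooks.
    for serde in CUSTOM_SERDES:
        obj = serde.deserialize(encoded["__type"], encoded["__var"])
        if obj is not None:
            return obj
    raise TypeError(f"Unknown type {encoded['__type']!r}")


encoded = serialize(Money(5, "EUR"))
restored = deserialize(encoded)
```

One limitation of using `None` to mean "not handled" is that a hook cannot produce `None` as a legitimate result; a real design would likely need a dedicated sentinel.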




[GitHub] [airflow] dstandish commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1260359127

   > but if we were to do that for any custom types, it may be easier to use a custom serialiser pattern instead, similar to how json.dumps handles this. A plugin can provide a set of serialise/deserialise hooks that would be called whenever an unknown object is encountered by the (de)serialiser.
   
   can you add more detail? i'm interested in what you're talking about but don't follow.
   
   separately, concerning security risks... perhaps we need to be specific about the context. suppose we allow custom serialization in the xcom context, not in the base serialization code, which is used in many places. if someone wanted to do something malicious, and they had the ability to write a task that sent a malicious object through xcom, why would they bother sending it through xcom? they could do whatever malicious work they wanted in the task itself. we're not talking about, for example, taking user input strings from the web UI... and if it's just in the task execution context, it's not run in the scheduler or webserver.




[GitHub] [airflow] ashb commented on pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
ashb commented on PR #26735:
URL: https://github.com/apache/airflow/pull/26735#issuecomment-1320022794

   I think this is handled for _some_ use cases by #27540.




[GitHub] [airflow] github-actions[bot] closed pull request #26735: Allow serialization of custom objects

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #26735: Allow serialization of custom objects
URL: https://github.com/apache/airflow/pull/26735

