Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/12 16:50:23 UTC

[GitHub] [beam] TheNeuralBit opened a new issue, #22714: [Bug]: Python schema generated types cannot be pickled

TheNeuralBit opened a new issue, #22714:
URL: https://github.com/apache/beam/issues/22714

   ### What happened?
   
   The NamedTuple types we generate in `apache_beam.typehints.schemas` confound pickle libraries. We work around this in many places (e.g. `GeneratedClassRowTypeConstraint`, #22679). We should find a way to make these types picklable and clean up the workarounds.
   
   Making the types work with cloudpickle should be the priority.
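
One general reason runtime-generated classes trip up pickle libraries is that the standard `pickle` module stores a class by reference (module plus name), and a generated class is usually not bound to any importable name. The sketch below illustrates that failure with a plain `typing.NamedTuple`, plus one generic mitigation: pickling a recipe for the class instead of the class itself. `make_row_type`, `dumps_row_type`, and `loads_row_type` are hypothetical helpers invented for this illustration; they are not Beam APIs, and this is not necessarily the exact failure or fix in Beam.

```python
import pickle
from typing import NamedTuple


def make_row_type(type_name, fields):
    """Hypothetical generator: builds a NamedTuple class at runtime."""
    return NamedTuple(type_name, fields)


def dumps_row_type(row_type):
    """Serialize a recipe (name + fields) instead of the class object."""
    fields = [(name, row_type.__annotations__[name]) for name in row_type._fields]
    return pickle.dumps((row_type.__name__, fields))


def loads_row_type(payload):
    """Rebuild an equivalent (not identical) class from the recipe."""
    type_name, fields = pickle.loads(payload)
    return make_row_type(type_name, fields)


row_type = make_row_type("GeneratedRow", [("name", str)])

# pickle looks the class up by name in its module; no module attribute
# called "GeneratedRow" exists, so pickling by reference fails.
try:
    pickle.dumps(row_type)
    pickled_by_reference = True
except (pickle.PicklingError, AttributeError):
    pickled_by_reference = False

# The recipe round-trips, but yields a new class object.
rebuilt = loads_row_type(dumps_row_type(row_type))
```

Note that `rebuilt` has the same fields as `row_type` but is a distinct class object, so an identity or equality check between the two classes would still fail unless the loader also caches generated types (for example, keyed by a schema id).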
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: sdk-py-core


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit commented on issue #22714: [Bug]: Python schema generated types cannot be pickled

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1224419494

   Yes, I added a parameterized test that tries pickling with each library in #22679: https://github.com/apache/beam/blob/c7f64264451af12ff6c7c0ef4bc95fd7ce0f5418/sdks/python/apache_beam/typehints/schemas_test.py#L592-L605
   
   With cloudpickle we get:
   ```
   _______________________________________________________________________________________________ PickleTest_2.test_generated_class_pickle _______________________________________________________________________________________________
   
   self = <apache_beam.typehints.schemas_test.PickleTest_2 testMethod=test_generated_class_pickle>
   
       def test_generated_class_pickle(self):
         schema = schema_pb2.Schema(
             id="some-uuid",
             fields=[
                 schema_pb2.Field(
                     name='name',
                     type=schema_pb2.FieldType(atomic_type=schema_pb2.STRING),
                 )
             ])
         user_type = named_tuple_from_schema(schema)
       
         self.assertEqual(
   >         user_type, self.pickler.loads(self.pickler.dumps(user_type)))
   
   apache_beam/typehints/schemas_test.py:605: 
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
   ../../../../.pyenv/versions/beam/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py:73: in dumps
       cp.dump(obj)
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
   
   self = <cloudpickle.cloudpickle_fast.CloudPickler object at 0x7fc1c273c880>, obj = <class 'apache_beam.typehints.schemas.BeamSchema_some_uuid'>
   
       def dump(self, obj):
           try:
   >           return Pickler.dump(self, obj)
   E           TypeError: cannot pickle 'google.protobuf.pyext._message.MessageDescriptor' object
   
   ../../../../.pyenv/versions/beam/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py:633: TypeError
   ```
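
The traceback indicates that, while serializing the generated class, the pickler reached a protobuf `MessageDescriptor`, which cannot be pickled. cloudpickle serializes dynamically created classes by value, so anything stored in the class state has to be picklable too. The following is a minimal stdlib-only sketch of that failure mode, assuming the unpicklable object sits in a class attribute; `GeneratedRow`, `_descriptor`, and `class_state` are hypothetical names, and a `threading.Lock` stands in for the descriptor. It does not reproduce cloudpickle's actual implementation or the fix adopted in Beam.

```python
import pickle
import threading


class GeneratedRow:
    """Hypothetical generated class holding an unpicklable handle."""
    field_names = ("name",)
    _descriptor = threading.Lock()  # stand-in: locks cannot be pickled


def class_state(cls, skip=()):
    """Collect class attributes roughly as a by-value pickler would."""
    return {
        key: value
        for key, value in vars(cls).items()
        if not key.startswith("__") and key not in skip
    }


# Pickling the full state fails on the unpicklable attribute.
try:
    pickle.dumps(class_state(GeneratedRow))
    error = None
except TypeError as exc:
    error = str(exc)

# Excluding that attribute from the state lets the rest round-trip.
safe_payload = pickle.dumps(class_state(GeneratedRow, skip=("_descriptor",)))
restored_state = pickle.loads(safe_payload)
```

In a design along these lines, the excluded object would have to be re-derived after unpickling (for instance from a serialized schema), since it is dropped from the payload.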




[GitHub] [beam] chamikaramj commented on issue #22714: [Bug]: Python schema generated types cannot be pickled

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1289786239

   Can we close this since https://github.com/apache/beam/pull/23739 was merged?




[GitHub] [beam] TheNeuralBit commented on issue #22714: [Bug]: Python schema generated types cannot be pickled

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1289790150

   This is technically still an issue since dill can't pickle the types.




[GitHub] [beam] tvalentyn commented on issue #22714: [Bug]: Python schema generated types cannot be pickled

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1224392663

   Have we tried pickling these types with cloudpickle?

