Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/12 16:50:23 UTC
[GitHub] [beam] TheNeuralBit opened a new issue, #22714: [Bug]: Python schema generated types cannot be pickled
TheNeuralBit opened a new issue, #22714:
URL: https://github.com/apache/beam/issues/22714
### What happened?
The NamedTuple types we generate in `apache_beam.typehints.schemas` confound pickle libraries. We work around this in many places (e.g. `GeneratedClassRowTypeConstraint` in #22679). We should find a way to make these types picklable and then clean up the workarounds.
Making the types work with cloudpickle should be the priority.
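For readers unfamiliar with the general problem, here is a minimal standard-library sketch of one common reason runtime-generated NamedTuple classes trip up picklers. The standard `pickle` module serializes a class by reference (module name plus class name), so a class that is never bound to a module attribute cannot be looked up again. This is not Beam's code and not necessarily the exact failure seen in Beam: the module name `hypothetical_generated_types` and the helpers `make_row_type` and `can_pickle` are assumptions made up for illustration.

```python
import collections
import pickle
import sys
import types

# Hypothetical module standing in for the module that owns generated types.
_GENERATED = types.ModuleType("hypothetical_generated_types")
sys.modules[_GENERATED.__name__] = _GENERATED


def make_row_type(type_name, field_names):
    """Build a namedtuple class at runtime without binding it to a module attribute."""
    return collections.namedtuple(
        type_name, field_names, module=_GENERATED.__name__)


def can_pickle(obj):
    """Return True if obj survives a pickle round trip, False if pickling fails."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError):
        return False


row_type = make_row_type("BeamSchema_some_uuid", ["name"])

# The class claims to live in the module, but the module has no such attribute,
# so pickle's by-reference lookup fails.
before = can_pickle(row_type)

# One possible workaround: register the class under its own name so the
# lookup succeeds.
setattr(_GENERATED, row_type.__name__, row_type)
after = can_pickle(row_type)
round_tripped = pickle.loads(pickle.dumps(row_type))
```

Registering the class only helps when the same registration also exists in the process that unpickles. Picklers that serialize classes by value avoid this requirement, but then everything the class references has to be picklable as well.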
### Issue Priority
Priority: 2
### Issue Component
Component: sdk-py-core
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on issue #22714: [Bug]: Python schema generated types cannot be pickled
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1224419494
Yes, I added a parameterized test that tries pickling with each library in #22679: https://github.com/apache/beam/blob/c7f64264451af12ff6c7c0ef4bc95fd7ce0f5418/sdks/python/apache_beam/typehints/schemas_test.py#L592-L605
With cloudpickle we get:
```
_______________________________________________________________________________________________ PickleTest_2.test_generated_class_pickle _______________________________________________________________________________________________
self = <apache_beam.typehints.schemas_test.PickleTest_2 testMethod=test_generated_class_pickle>
    def test_generated_class_pickle(self):
      schema = schema_pb2.Schema(
          id="some-uuid",
          fields=[
              schema_pb2.Field(
                  name='name',
                  type=schema_pb2.FieldType(atomic_type=schema_pb2.STRING),
              )
          ])
      user_type = named_tuple_from_schema(schema)
      self.assertEqual(
>         user_type, self.pickler.loads(self.pickler.dumps(user_type)))
apache_beam/typehints/schemas_test.py:605:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../../.pyenv/versions/beam/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py:73: in dumps
    cp.dump(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <cloudpickle.cloudpickle_fast.CloudPickler object at 0x7fc1c273c880>, obj = <class 'apache_beam.typehints.schemas.BeamSchema_some_uuid'>
    def dump(self, obj):
        try:
>           return Pickler.dump(self, obj)
E           TypeError: cannot pickle 'google.protobuf.pyext._message.MessageDescriptor' object
../../../../.pyenv/versions/beam/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py:633: TypeError
```
[GitHub] [beam] chamikaramj commented on issue #22714: [Bug]: Python schema generated types cannot be pickled
Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1289786239
Can we close this now that https://github.com/apache/beam/pull/23739 has been merged?
[GitHub] [beam] TheNeuralBit commented on issue #22714: [Bug]: Python schema generated types cannot be pickled
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1289790150
This is technically still an issue since dill can't pickle the types.
[GitHub] [beam] tvalentyn commented on issue #22714: [Bug]: Python schema generated types cannot be pickled
Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22714:
URL: https://github.com/apache/beam/issues/22714#issuecomment-1224392663
Have we tried pickling these types with CloudPickle?