Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/10 04:59:02 UTC

[GitHub] [spark] mridulm commented on pull request #31540: [SPARK-20977][CORE] Use a non-final field for the state of CollectionAccumulator

mridulm commented on pull request #31540:
URL: https://github.com/apache/spark/pull/31540#issuecomment-776442655


   This does not necessarily solve the issue that @zsxwing detailed. The issue here is that `registerAccumulator` should not be called in `readObject` before subclasses have completed their own `readObject`.
   
   One possible solution would be to introduce two methods, `doHandleDriverSideAccumulator` and `handleDriverSideAccumulator`, as follows.
   
   a) Add a protected method `doHandleDriverSideAccumulator()`, which holds all the code that currently follows `defaultReadObject` in `readObject`.
   b) Call a second protected method, `handleDriverSideAccumulator`, after `defaultReadObject` in `AccumulatorV2`. In `AccumulatorV2`, this method simply delegates to `doHandleDriverSideAccumulator`.
   c) In subclasses with local state, override `handleDriverSideAccumulator` to do nothing, and invoke `doHandleDriverSideAccumulator` once `readObject` in the subclass is done.
   
   This will ensure that `AccumulatorV2` and its subclasses register only after their state has been initialized.
   (Rough sketch, please change logic/names/etc as relevant).
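
To make the ordering concrete, here is a minimal standalone Java model of the idea. It is not Spark's code: `BaseAccumulator` stands in for `AccumulatorV2`, and `ListAccumulator`, `REGISTERED`, `describe`, and `SerializationDemo` are hypothetical names used only for illustration. It assumes the no-op override in step (c) is on `handleDriverSideAccumulator`, so that the subclass can still call `doHandleDriverSideAccumulator` itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AccumulatorV2 (not the real Spark class).
abstract class BaseAccumulator implements Serializable {
    // Hypothetical stand-in for whatever registerAccumulator records.
    static final List<String> REGISTERED = new ArrayList<>();

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Subclass fields have NOT been deserialized yet at this point.
        handleDriverSideAccumulator();
    }

    // Hook called from the base readObject; delegates by default.
    protected void handleDriverSideAccumulator() {
        doHandleDriverSideAccumulator();
    }

    // Holds the logic that used to follow defaultReadObject.
    protected final void doHandleDriverSideAccumulator() {
        REGISTERED.add(describe());
    }

    protected abstract String describe();
}

// Hypothetical subclass with local state, like CollectionAccumulator.
class ListAccumulator extends BaseAccumulator {
    private List<String> values = new ArrayList<>();

    void add(String v) {
        values.add(v);
    }

    // Do nothing here: registration is deferred until state is restored.
    @Override
    protected void handleDriverSideAccumulator() {
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();          // restores `values`
        doHandleDriverSideAccumulator(); // now safe to register
    }

    @Override
    protected String describe() {
        return "values=" + values;
    }
}

// Hypothetical helper: serialize and deserialize an object in memory.
final class SerializationDemo {
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Java serialization runs `readObject` for the superclass before the subclass, so without the no-op override `describe()` would run while `values` is still `null`. With it, a round trip of a `ListAccumulator` through `SerializationDemo.roundTrip` registers exactly once, after the list has been restored.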
   
   Thoughts?

