You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/07 19:21:34 UTC

[GitHub] [beam] steveniemitz commented on pull request #22103: [BEAM-13015, #22050] Make SDK harness msec counters faster using ordered puts

steveniemitz commented on PR #22103:
URL: https://github.com/apache/beam/pull/22103#issuecomment-1178114663

   Ah this is cool and was next up on my hit list, although for a different reason.  I've never seen the actual sampler code itself show up as a significant CPU/memory offender, however, the contention around add/removeTracker is significant since all worker threads need to serialize through it.
   
   There's a couple options I've tried to remove the lock there:
   - Switch `activeTrackers` back to a concurrent set, the problem here was that deactivating a tracker could still race with the sampler thread and double-sample a sampler.  I'm not sure if this is a big enough deal to worry about?
   - Rather than worry about synchronization at all, push all add/remove/sample operations into a queue and have a single thread consuming from it and maintaining activeTrackers and doing the sampling.  Doing so removes the race above and simplifies the logic a lot (I think).  I used the LMAX Disruptor queue to handle the multi-consumer-single-producer setup here, but you could probably use a built-in java concurrent blocking queue for this as well.
   
   In any case, I was going to put up a review for the other version of this today-ish, we can apply whatever we end up deciding on to this as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org