You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "vamossagar12 (via GitHub)" <gi...@apache.org> on 2023/02/21 11:41:20 UTC

[GitHub] [kafka] vamossagar12 opened a new pull request, #13283: Kip 770 buffer size fix

vamossagar12 opened a new pull request, #13283:
URL: https://github.com/apache/kafka/pull/13283

   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "ableegoldman (via GitHub)" <gi...@apache.org>.
ableegoldman commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1815619664

   > do we need this pausing semantics for ReadOnlyTask as well?
   
   Alright so I'm not super familiar with this new code which seems to be related to the state updater, but IIUC the ReadOnlyTask is like a kind of StreamTask that is still in restoration and should not be processed/operated on/written to by the StreamThread -- it's "read only" presumably for IQ  which might need to read a restoring task when stale reads are enabled.
   
   Given that, I would conclude that no, we don't need pausing semantics, if anything these tasks are effectively already paused. They don't really even have an input buffer for us to pause.
   
   @vamossagar12  I think we should try to avoid casting anyways, we don't want to deal with this breaking any time a new task type is added. Looks like we were just doing this to invoke the new `StreamTask#totalBytesBuffered` method, so how about we move that method to the `Task` interface and just return 0 for the implementation of it in ReadOnlyTask and StandbyTask?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1811121235

   Hey @sean-rossignol , thanks for explaining your use case and how this can be useful for you! I hope to push it in the upcoming 3.7 release. I haven't gotten the chance to resolve the merge conflicts. Will try to do so this week and maybe you could help with the reviews?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "ableegoldman (via GitHub)" <gi...@apache.org>.
ableegoldman commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-2059442356

   Hey @vamossagar12 sorry the second half of this KIP has dragged on for so long. Are you still interested in getting this done? I think we can target it for 3.8 if you have time on your end


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1837036580

   @ableegoldman , just checking if you got a chance to look at my comment above? Don't mean to be pushy on this but since 3.7 release is approaching, I thought I would want to have the PR ready so that it gives you sufficient time to review as well. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1824302840

   Thanks @ableegoldman , that makes sense. Well I have another question again due to me not being upto date with the latest developments in streams. I have this test in `StreamThreadTest` namely `shouldPauseNonEmptyPartitionsWhenTotalBufferSizeExceedsMaxBufferSize` and which are failing for the case when `processingThreadsEnabled` is set to true. I have another test which is failing but the 2 seem related.
   
   The reason for the failure is [this](https://github.com/apache/kafka/pull/13283/files#diff-e75351ae918dc9830b36a2c407203bfc4212ef78a658f053a2905e312ed47edfR3519-R3524), basically we are expecting `paused()` and `resume()` to be invoked. I think when we don't have processingThreadEnabled, the buffer size is reduced after `pollPhase` but this PR was pre-processingThreadEnabled so, I didn't have the checks. Now the question is, does it make sense to have this buffer based checks for the latter case? Because, IIUC, when `processingThreadEnabled` is true, the processing is being done on separate threads and not on this stream thread. When we don't have separate processing threads, it makes sense to track the memory usage on the StreamThread and pause/resume (which is what the original JIRA was about). Is my understanding correct? I can work around the test cases if it is.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1765609193

   This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has  merge conflicts, please update it with the latest from trunk (or appropriate release branch) <p> If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "ableegoldman (via GitHub)" <gi...@apache.org>.
ableegoldman commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-2061827689

   No pressure! Just wanted to make sure you weren't still waiting on me for this. Or rather, just wanted to say, if you do want to try and get this in to 3.8 I will make sure to help you. But it's no problem if you don't have time or have other things on your plate 🙂 
   
   Again, just sorry this one was neglected for so long.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] vamossagar12 commented on pull request #13283: KAFKA-13152: Kip 770 buffer size fix

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1639692086

   @mjsax , @guozhangwang this has been marked for 3.6 release. If possible can it be reviewed? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "sean-rossignol (via GitHub)" <gi...@apache.org>.
sean-rossignol commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1814804253

   > Hey @sean-rossignol , thanks for explaining your use case and how this can be useful for you! I hope to push it in the upcoming 3.7 release. I haven't gotten the chance to resolve the merge conflicts. Will try to do so this week and maybe you could help with the reviews?
   
   Thanks for fixing the merge conflicts.  I am happy to help with a review once the pr-merge build passes.  Appreciate your work on this @vamossagar12 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-2060631731

   hey @ableegoldman .. i am interested in wrapping this up but i haven't looked at the streams codebase for some time and things seemed to have changed a bit. I am not aware of the 3.8 timelines, will have to check those.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "sean-rossignol (via GitHub)" <gi...@apache.org>.
sean-rossignol commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1809404175

   Hey, anyway I could help unblock this pr @vamossagar12 ?
   
   We have an internal kstreams app with a wildcard consumer and one bottleneck we're hitting is as the number of source topics grows so does our initial total heap size when we need to rehydrate the components down stream of our cooker.  The problem is that we set the max heap size across all instances of the kstreams app to handle the initial expected load when we first deploy the application.  Over time the number of source topics grow and come rehydrate time we will have to increase the resources dedicated to the instance, futz with the buffered.records.per.partition config, or reset the offsets of the source topic partitions in batches, otherwise all instances of our application will go OOM.  Being able to define and enforce a maximum input buffer size at the instance/thread level would allow us to handle these rehydration events without needing to change any other elements of our deployments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-1813728802

   hey @ableegoldman , quick question. I am noticing that tests like `shouldRecoverFromInvalidOffsetExceptionOnRestoreAndFinishRestore` are failing with an error like
   
   ```
   Exception in thread "clientId-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: java.lang.ClassCastException: org.apache.kafka.streams.processor.internals.ReadOnlyTask cannot be cast to org.apache.kafka.streams.processor.internals.StreamTask
   	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:715)
   	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:633)
   Caused by: java.lang.ClassCastException: org.apache.kafka.streams.processor.internals.ReadOnlyTask cannot be cast to org.apache.kafka.streams.processor.internals.StreamTask
   	at org.apache.kafka.streams.processor.internals.TaskManager.getInputBufferSizeInBytes(TaskManager.java:1922)
   	at org.apache.kafka.streams.processor.internals.StreamThread.runOnceWithoutProcessingThreads(StreamThread.java:880)
   	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:674)
   	... 1 more
   ```
    due to 
   https://github.com/apache/kafka/pull/13283/files#diff-8baa5d7209fc00074bf3fe24d709c2dcf2a44c1623d7ced8c0e29c1d832a3bcbR1919-R1926
   
   I haven't followed the latest developments of streams but it appears to me that a new type of task has been added called `ReadOnlyTask`. It had worked for StandbyTask because that task is not active but wanted to understand, do we need this pausing semantics for ReadOnlyTask as well?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] KAFKA-13152: Kip 770 buffer size fix [kafka]

Posted by "vamossagar12 (via GitHub)" <gi...@apache.org>.
vamossagar12 commented on PR #13283:
URL: https://github.com/apache/kafka/pull/13283#issuecomment-2071431998

   No worries and thanks for the proposed help! There are some things which I need to close out but I do hope to close it out. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org