Posted to github@beam.apache.org by "Amar3tto (via GitHub)" <gi...@apache.org> on 2023/05/24 09:29:22 UTC

[GitHub] [beam] Amar3tto opened a new pull request, #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Amar3tto opened a new pull request, #26862:
URL: https://github.com/apache/beam/pull/26862

   - Resolves #26621
   - Added a `startPollTimeoutSec` parameter to `SparkReceiverIO` and `CdapIO` (previously it was a hard-coded constant)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] aromanenko-dev commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1563064238

   @Amar3tto Thanks!
   
   Could you explain a bit why this change accelerates the test execution?
   
   Also, how did you calculate that this test now takes only 9.53 min? I see that [this build](https://ci-beam.apache.org/job/beam_PerformanceTests_SparkReceiver_IO/467/), which was run against this PR, took 36 min. Am I misunderstanding something?




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1563291561

   > @Amar3tto Thanks!
   > 
   > Could you explain a bit why this change accelerates the test execution?
   > 
   > Also, how did you calculate that this test now takes only 9.53 min? I see that [this build](https://ci-beam.apache.org/job/beam_PerformanceTests_SparkReceiver_IO/467/), which was run against this PR, took 36 min. Am I misunderstanding something?
   
   This change prevents the restriction from being split too often: it is no longer split while records remain in the queue. Frequent splitting caused the Receivers to start and stop too many times, and each start and stop takes a while.
   I ran the `SparkReceiverIO Performance test` twice on this PR and took the results from the Grafana dashboard (9.53 and 18.3 min). Previous successful runs (before the fix) took around 27 min, too close to the 30 min timeout.
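
   The effect of the split policy described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the code from this PR: the class `SplitPolicySketch`, the method `countRestarts`, and the premise that the runner requests a split after every processed record are all hypothetical, and each accepted split is assumed to cost one receiver stop/start.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model, not Beam's SparkReceiverIO implementation.
public class SplitPolicySketch {

    /**
     * Counts how many times a simulated receiver is restarted.
     * Assumption: every accepted split stops the receiver and a new one starts.
     */
    static int countRestarts(int batches, int recordsPerBatch, boolean splitOnlyWhenQueueEmpty) {
        Queue<Integer> queue = new ArrayDeque<>();
        int restarts = 0;
        for (int b = 0; b < batches; b++) {
            // The simulated receiver delivers one batch of records into the queue.
            for (int r = 0; r < recordsPerBatch; r++) {
                queue.add(r);
            }
            while (!queue.isEmpty()) {
                queue.poll(); // process one record
                // Worst-case assumption: a split is requested after every record.
                boolean splitAccepted = !splitOnlyWhenQueueEmpty || queue.isEmpty();
                if (splitAccepted) {
                    restarts++;
                }
            }
        }
        return restarts;
    }

    public static void main(String[] args) {
        System.out.println("split on every request:      " + countRestarts(3, 4, false));
        System.out.println("split only when queue empty: " + countRestarts(3, 4, true));
    }
}
```

   In this toy model, refusing to split while the queue is non-empty reduces the restart count from one per record to one per batch, which matches the direction of the speed-up reported above; the real timings depend on the runner and the receiver.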




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1562855813

   R: @aromanenko-dev 




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1560914532

   Run Java PreCommit




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1562648218

   Run Java SparkReceiverIO Performance Test




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1562855617

   The SparkReceiverIO performance test took 9.53 min with this fix, so there is no need to increase the timeout.




[GitHub] [beam] aromanenko-dev merged pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev merged PR #26862:
URL: https://github.com/apache/beam/pull/26862




[GitHub] [beam] github-actions[bot] commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1562857530

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control




[GitHub] [beam] Amar3tto commented on pull request #26862: [SparkReceiverIO] Add startPollTimeoutSec parameter. Fix splitting of restriction

Posted by "Amar3tto (via GitHub)" <gi...@apache.org>.
Amar3tto commented on PR #26862:
URL: https://github.com/apache/beam/pull/26862#issuecomment-1560851312

   Run Java SparkReceiverIO Performance Test

