Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/11/15 22:00:56 UTC

[GitHub] [beam] egalpin opened a new pull request, #24186: Uses _all to follow alias/datastreams when estimating index size

egalpin opened a new pull request, #24186:
URL: https://github.com/apache/beam/pull/24186

   Fixes #24117
   
   Ensures that index size estimation accounts for index aliases, patterns, and datastreams by using `_all` stats for an aggregate value of `size_in_bytes`.
   
   All credit to @JeffBolle for this patch.
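
As a rough illustration of the description above (not the actual patch), the sketch below models an index stats response as nested maps. It assumes the usual response shape, with per-index figures under `indices.<concrete index name>` and an aggregate under `_all`. The class name, helper methods, index names, alias name, and byte counts are all hypothetical. Under those assumptions, a lookup keyed by an alias name finds nothing under `indices`, while `_all.primaries.store.size_in_bytes` still yields the combined size.

```java
import java.util.Map;

public class IndexSizeSketch {

    // Walks a nested map along the given keys and returns the leaf as a long,
    // or 0 when any key along the path is missing.
    static long longAt(Map<String, Object> root, String... path) {
        Object node = root;
        for (String key : path) {
            if (!(node instanceof Map)) {
                return 0L;
            }
            node = ((Map<?, ?>) node).get(key);
        }
        return (node instanceof Number) ? ((Number) node).longValue() : 0L;
    }

    // Per-index lookup: only finds a value when `name` is a concrete index name.
    static long sizeViaIndexName(Map<String, Object> stats, String name) {
        return longAt(stats, "indices", name, "primaries", "store", "size_in_bytes");
    }

    // Aggregate lookup: independent of how the request named the indices.
    static long sizeViaAll(Map<String, Object> stats) {
        return longAt(stats, "_all", "primaries", "store", "size_in_bytes");
    }

    // Builds the {"primaries": {"store": {"size_in_bytes": N}}} fragment.
    static Map<String, Object> store(long bytes) {
        return Map.of("primaries", Map.of("store", Map.of("size_in_bytes", bytes)));
    }

    // Hypothetical stats for an alias backed by two concrete indices.
    static Map<String, Object> sampleStats() {
        return Map.of(
            "_all", store(3000L),
            "indices", Map.of(
                "logs-000001", store(1000L),
                "logs-000002", store(2000L)));
    }

    public static void main(String[] args) {
        Map<String, Object> stats = sampleStats();
        // The alias name is not a key under "indices", so this prints 0.
        System.out.println(sizeViaIndexName(stats, "logs-alias"));
        // The aggregate covers both backing indices, so this prints 3000.
        System.out.println(sizeViaAll(stats));
    }
}
```

The same reasoning would apply to index patterns and datastreams, where the requested name also differs from the concrete index names in the response.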
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] aromanenko-dev commented on pull request #24186: Uses _all to follow alias/datastreams when estimating index size

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24186:
URL: https://github.com/apache/beam/pull/24186#issuecomment-1318345340

   @egalpin Thanks!



[GitHub] [beam] egalpin commented on pull request #24186: Uses _all to follow alias/datastreams when estimating index size

Posted by GitBox <gi...@apache.org>.
egalpin commented on PR #24186:
URL: https://github.com/apache/beam/pull/24186#issuecomment-1317828053

   @aromanenko-dev yes, great call 👍 It should be trivial to add. I've got a WIP for the regression test that I'll add to this PR today.



[GitHub] [beam] egalpin commented on pull request #24186: Uses _all to follow alias/datastreams when estimating index size

Posted by GitBox <gi...@apache.org>.
egalpin commented on PR #24186:
URL: https://github.com/apache/beam/pull/24186#issuecomment-1317890407

   Confirmed locally by first seeing all tests pass, then reverting [a014637](https://github.com/apache/beam/pull/24186/commits/a014637106970a0a0e9eb7944aa5caf79fa5fd37) (i.e. removing @JeffBolle's patch) and re-running the test suite. With the patch removed, I received the following expected failure:
   
   ```
   org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest > testSizesWithAlias FAILED
       java.lang.AssertionError: Wrong estimated size
       Expected: a value greater than <1000L>
            but: <0L> was less than <1000L>
           at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
           at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTestCommon.testSizes(ElasticsearchIOTestCommon.java:218)
           at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest.testSizesWithAlias(ElasticsearchIOTest.java:95)
   ```



[GitHub] [beam] aromanenko-dev merged pull request #24186: Uses _all to follow alias/datastreams when estimating index size

Posted by GitBox <gi...@apache.org>.
aromanenko-dev merged PR #24186:
URL: https://github.com/apache/beam/pull/24186



[GitHub] [beam] github-actions[bot] commented on pull request #24186: Uses _all to follow alias/datastreams when estimating index size

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24186:
URL: https://github.com/apache/beam/pull/24186#issuecomment-1315958877

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @lukecwik for label java.
   R: @Abacn for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).

