Posted to issues@solr.apache.org by "igiguere (via GitHub)" <gi...@apache.org> on 2023/05/10 18:48:58 UTC

[GitHub] [solr] igiguere opened a new pull request, #1638: SOLR-8393: Component for resource usage planning

igiguere opened a new pull request, #1638:
URL: https://github.com/apache/solr/pull/1638

   https://issues.apache.org/jira/browse/SOLR-8393
   
   
   # Description
   
   New component that attempts to extrapolate resources needed in the future by looking at resources currently used.
   
   Original idea by Steve Molloy, with additional parameter based on comment from Shawn Heisey.
   
   Documentation copied from the Jira ticket.
   
   # Solution
   
   New component: SizeComponent.java
   
   Action 'clustersizing' is added to CollectionsHandler.  Cluster sizing calls the size component for each core.
   
   # Tests
   
   The size component is tested in SizeComponentTest.java
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [x] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "epugh (via GitHub)" <gi...@apache.org>.
epugh commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2045048667

   This looks very helpful, though I can't speak to whether it's accurate or not. I'd love to see something replace the old Excel spreadsheet that we recently removed, as it was no longer useful/accurate. Maybe @janhoy you have some thoughts on this...




Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "igiguere (via GitHub)" <gi...@apache.org>.
igiguere commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2048377234

   > I'd prefer to see a V2 API added instead of a V1 API. Adding more V1 APIs just adds to the backlog of work on our V2 migration, so I'd love to see that added instead!
   
   Agreed, but, as mentioned, this is from a pre-existing patch.
   I come back to Solr only about once a year, and usually to apply some old patch on a more recent version. That means I have a limited understanding of what ties into what and why.  So, implementing everything needed for clustersizing v2 would be a long and difficult process for me.
   
   Participation is welcomed!




Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "gerlowskija (via GitHub)" <gi...@apache.org>.
gerlowskija commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2050163434

   About to take a look at the code and see if I can help with the v2 side of things, but before I dive into that I figured it was worth asking:
   
   Does `size-estimator-lucene-solr.xls` actually work for folks?  Do you use it regularly @igiguere ?  Have you found it to be pretty accurate?  Any other folks have experience with it?
   
   I'm happy to be wrong if we have several groups of folks out there in the wild that are using it, but my initial reaction is to be a little skeptical that it's reliable enough to incorporate into Solr.
   
   Primarily because, well, modeling resource-usage is a really really tough problem.  There's a reason that the community's only response to sizing questions has always been pretty much "You'll have to Guess-and-Check".
   
   And secondarily, because the spreadsheet this is all based off of was added in 2011 and hasn't really seen much iteration in the decade since.  There's an absolute ton that's changed in both Lucene and Solr since then.




Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "epugh (via GitHub)" <gi...@apache.org>.
epugh commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2045041485

   I think the hyphenated (kebab-style?) `total-disk-size` pattern should be changed to camelCase `totalDiskSize`, since that is the pattern we use in the rest of our JSON output.
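
For illustration only, the renaming suggested above can be sketched as a small helper. The class and method names here (`KeyNames`, `toCamelCase`) are hypothetical and are not part of the PR or of Solr; this is a minimal sketch of the key conversion, not how the patch builds its response.

```java
public class KeyNames {

    /**
     * Hypothetical helper (not from the PR): converts a kebab-case key
     * such as "total-disk-size" into camelCase ("totalDiskSize").
     */
    public static String toCamelCase(String kebab) {
        StringBuilder out = new StringBuilder(kebab.length());
        boolean upperNext = false;
        for (int i = 0; i < kebab.length(); i++) {
            char c = kebab.charAt(i);
            if (c == '-') {
                // Drop the hyphen and capitalize the character that follows it.
                upperNext = true;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("total-disk-size")); // prints totalDiskSize
    }
}
```

In practice it would likely be simpler to rename the keys where the response is assembled; the helper only shows the intended mapping.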




Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "epugh (via GitHub)" <gi...@apache.org>.
epugh commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2045019636

   I'd prefer to see a V2 API added instead of a V1 API. Adding more V1 APIs just adds to the backlog of work on our V2 migration, so I'd love to see that added instead!




Re: [PR] SOLR-8393: Component for resource usage planning [solr]

Posted by "igiguere (via GitHub)" <gi...@apache.org>.
igiguere commented on PR #1638:
URL: https://github.com/apache/solr/pull/1638#issuecomment-2050511702

   @gerlowskija
   > Does `size-estimator-lucene-solr.xls` actually work for folks? Do you use it regularly @igiguere ? Have you found it to be pretty accurate? Any other folks have experience with it?
   
   Me, personally, no, I don't use it ;).  I'll try to find out from client-facing people in the company.  I doubt anyone has compiled usage vs success statistics.
    
   > ... the community's only response to sizing questions has always been pretty much "You'll have to Guess-and-Check".
   
   The cluster sizing feature is documented to estimate (i.e., guess) resource usage.  We could make the documentation clearer that it's not a fool-proof measure.  But at least it beats holding a finger to the wind.  And it's a bit less complicated than the xls and a calculator.
   
   > And secondarily, because the spreadsheet this is all based off of was added in 2011 and hasn't really seen much iteration in the decade since. There's an absolute ton that's changed in both Lucene and Solr since then.
   
   We're only calculating RAM, disk size, and document size.  Whatever has changed in Solr and Lucene, if it has an effect on RAM, disk space, or doc size, then it should be reflected in the results... No?
   
   Note that this feature is meant to be used on a current "staging" deployment, to evaluate the eventual size of a "production" environment, for the same version of Solr.  No one is expected to draw conclusions from a previous version, so changes from one version to another are not a concern in that way.
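
To make the staging-to-production idea concrete, here is a deliberately naive sketch of that kind of extrapolation. It is not the formula used by `SizeComponent`; it assumes, purely for illustration, that disk usage grows linearly with document count, and every name in it (`SizeExtrapolation`, `estimateDiskBytes`) is hypothetical.

```java
public class SizeExtrapolation {

    /**
     * Naive linear estimate (an assumption, not the PR's algorithm):
     * average bytes per document on staging, multiplied by the number
     * of documents expected in production.
     */
    public static long estimateDiskBytes(long currentDiskBytes, long currentDocs, long expectedDocs) {
        if (currentDocs <= 0) {
            throw new IllegalArgumentException("currentDocs must be positive");
        }
        double avgDocBytes = (double) currentDiskBytes / currentDocs;
        return Math.round(avgDocBytes * expectedDocs);
    }

    public static void main(String[] args) {
        // Staging: 2,000,000 docs occupying 5,000,000,000 bytes on disk.
        // Production is expected to hold 50,000,000 docs.
        long estimate = estimateDiskBytes(5_000_000_000L, 2_000_000L, 50_000_000L);
        System.out.println(estimate); // prints 125000000000
    }
}
```

Real index growth is rarely this linear (merging, deletes, and field types all matter), which is why any such number is only a starting point for "guess and check".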
   
   As a more general note, I should add that I'm a linguist converted to Java dev.  Not a mathematician ;)  If there's an error in the math, I will never see it.
   
   

