Posted to issues@lucene.apache.org by "Jan Høydahl (Jira)" <ji...@apache.org> on 2020/03/04 20:22:00 UTC

[jira] [Created] (SOLR-14305) Improve tooling for OPS and Support personnel

Jan Høydahl created SOLR-14305:
----------------------------------

             Summary: Improve tooling for OPS and Support personnel
                 Key: SOLR-14305
                 URL: https://issues.apache.org/jira/browse/SOLR-14305
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: scripts and tools
            Reporter: Jan Høydahl


Umbrella issue for tasks that will improve the lives of Operations and Support personnel running large Solr clusters. The following description is copied from a comment by Shalin on another issue:

There's plenty of information that is required for troubleshooting but is not available in clusterstatus or any other documented/public API. Sure, there's the undocumented /admin/zookeeper, which has a weird output format meant for I don't know who. But even that lacks a few things that I've found necessary to troubleshoot Solr.

Here's a non-exhaustive list of things you need to troubleshoot Solr:
 # Length of overseer queues (available in overseerstatus API)
 # Contents of overseer queue (mildly useful, available in /admin/zookeeper)
 # Overseer election queue and current leader (the former is available in /admin/zookeeper and the latter in the overseerstatus API)
 # Cluster state (clusterstatus API)
 # solr.xml (no API regardless of whether it is in ZK or on the filesystem)
 # Leader election queue and current leader for each shard (available in /admin/zookeeper)
 # Shard terms for each shard/replica (not available in any API)
 # Metrics/stats (metrics API)
 # Solr logs (log API? unless they have rolled over)
 # GC logs (no API)
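
As a rough illustration of the list above, the items that already have an HTTP endpoint could be gathered by a small script along these lines. This is only a sketch: the names ENDPOINTS, build_urls, http_get_json and collect, the default base URL, and the chosen ZooKeeper paths are assumptions made for the example, and the parameters of the undocumented /admin/zookeeper handler may differ between Solr versions. Items with no API (solr.xml, shard terms, GC logs) are deliberately left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint paths and parameters for the list items that have an API.
# The /admin/zookeeper entries rely on an undocumented handler, so treat
# their parameters as assumptions that may vary by Solr version.
ENDPOINTS = {
    "overseer_status": ("/admin/collections", {"action": "OVERSEERSTATUS"}),
    "cluster_status": ("/admin/collections", {"action": "CLUSTERSTATUS"}),
    "metrics": ("/admin/metrics", {}),
    "logging": ("/admin/info/logging", {"since": "0"}),
    "overseer_queue": ("/admin/zookeeper",
                       {"path": "/overseer/queue", "detail": "true"}),
    "overseer_election": ("/admin/zookeeper",
                          {"path": "/overseer_elect/election", "detail": "true"}),
}


def build_urls(base_url="http://localhost:8983/solr"):
    """Return a dict mapping each diagnostic name to a full request URL."""
    urls = {}
    for name, (path, params) in ENDPOINTS.items():
        query = urlencode({**params, "wt": "json"})
        urls[name] = f"{base_url.rstrip('/')}{path}?{query}"
    return urls


def http_get_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def collect(base_url="http://localhost:8983/solr", fetch=http_get_json):
    """Fetch every endpoint; record the error text instead of failing."""
    results = {}
    for name, url in build_urls(base_url).items():
        try:
            results[name] = fetch(url)
        except Exception as exc:  # one failing endpoint should not stop the rest
            results[name] = {"error": str(exc)}
    return results
```

Calling collect() against a running node returns one dictionary keyed by diagnostic name. The fetch argument is injectable so the URL mapping can be checked without a live cluster.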

Please link related tasks or create new sub tasks as necessary.

Fixing SOLR-7796 would probably help a lot in the short term, since there would be a well-defined way to zip up info and send it to support. But it wouldn't hurt to add better APIs, small tools and new AdminUI panels for simplified live troubleshooting as well.
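
To make the "zip up info" idea concrete, here is a minimal sketch of bundling already-collected diagnostics into one archive. It does not reflect how SOLR-7796 is or will be implemented; the function name write_support_bundle and the one-JSON-file-per-item layout are assumptions made for illustration.

```python
import json
import zipfile


def write_support_bundle(snapshots, zip_path):
    """Write each named snapshot as <name>.json inside one zip archive.

    `snapshots` is a dict of name -> JSON-serializable data, for example
    the parsed responses of the clusterstatus and metrics APIs.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for name, payload in sorted(snapshots.items()):
            bundle.writestr(f"{name}.json",
                            json.dumps(payload, indent=2, sort_keys=True))
    return zip_path
```

Writing entries in sorted order with sorted keys keeps two bundles taken at different times easy to diff.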

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)