Posted to solr-user@lucene.apache.org by Jonathan Tan <jt...@gmail.com> on 2020/10/22 23:35:48 UTC

Metric Trigger not being recognised & picked up

Hi All

I've been trying to get a metric trigger set up in SolrCloud 8.4.1, but
it's not working, and I was hoping for some help.

I've created a metric trigger using this:

```
POST /solr/admin/autoscaling {
  "set-trigger": {
    "name": "metric_trigger",
    "event": "metric",
    "waitFor": "10s",
    "metric": "metrics:solr.jvm:os.systemCpuLoad",
    "above": 0.7,
    "preferredOperation": "MOVEREPLICA",
    "enabled": true
  }
}
```

And I get a successful response.

I can also see the new trigger in the `files -> tree -> autoscaling.json`.
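
For anyone wanting to reproduce the setup, the request and the read-back check can be scripted roughly as follows. This is only a sketch: the base URL passed to `post_trigger` is whatever your node answers on, the helper names are made up for illustration, and the assumption that the stored config keeps triggers in a `triggers` map keyed by name is based on how `autoscaling.json` looks in the tree view.

```python
import json
from urllib.request import Request, urlopen

# Same path as in the request above, relative to the Solr base URL.
AUTOSCALING_PATH = "/admin/autoscaling"


def build_set_trigger(name, metric, above, wait_for="10s",
                      preferred_operation="MOVEREPLICA"):
    """Build the set-trigger command body shown in the request above."""
    return {
        "set-trigger": {
            "name": name,
            "event": "metric",
            "waitFor": wait_for,
            "metric": metric,
            "above": above,
            "preferredOperation": preferred_operation,
            "enabled": True,
        }
    }


def trigger_is_registered(autoscaling_config, name):
    """Check a parsed autoscaling config for an enabled trigger by name.

    Assumes triggers are stored as a map keyed by trigger name.
    """
    trigger = autoscaling_config.get("triggers", {}).get(name)
    return trigger is not None and bool(trigger.get("enabled", True))


def post_trigger(base_url, command):
    """POST the command to the autoscaling endpoint; returns the parsed reply."""
    req = Request(
        base_url.rstrip("/") + AUTOSCALING_PATH,
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)
```

With this, `post_trigger("http://localhost:8983/solr", build_set_trigger("metric_trigger", "metrics:solr.jvm:os.systemCpuLoad", 0.7))` would send the same body as above (the host is a placeholder).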

However, I don't see any difference in the logs (I had the autoscaling
logging set to debug), and it's definitely not moving any replicas around
under load, even though the node's overall systemCpuLoad is consistently
above 85%. (I can see this as well when I query the `/metrics` endpoint
with the above key.)
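
The metrics check mentioned above can be scripted along these lines. Treat it as a sketch under assumptions: it uses the Metrics API's `key` parameter and expects the value to come back under `metrics` keyed by the same string; the function names and the base URL are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# The key used in the trigger, without the leading "metrics:" prefix.
METRIC_KEY = "solr.jvm:os.systemCpuLoad"


def metrics_url(base_url, key=METRIC_KEY):
    """Build the metrics request URL for a single key."""
    query = urlencode({"key": key, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/metrics?{query}"


def exceeds_threshold(response_body, key=METRIC_KEY, above=0.7):
    """Return True if the metric in a metrics response is over the threshold.

    Assumes the response has the shape {"metrics": {"<key>": <number>}}.
    """
    value = json.loads(response_body)["metrics"][key]
    return float(value) > above


def check_node(base_url, above=0.7):
    """Fetch the metric from one node and compare it with the threshold."""
    with urlopen(metrics_url(base_url)) as resp:
        return exceeds_threshold(resp.read().decode("utf-8"), above=above)
```

Running `check_node` against each node in turn shows which ones are above the trigger's `above` value at that moment; it says nothing about whether the trigger itself is being evaluated.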


I then restarted all the nodes and saw the error below on startup. It says
it couldn't set the trigger state during a restore, and the worrying part
is that it then discards the replayed event for the trigger...

I'd really like some help with this.

We've been seeing that out of the 3 nodes, there's always one, seemingly
at random, that is massively utilised on CPU (all 8 cores maxed out, and
it's not always the one with the overseer), so we were hoping that we
could let the Metric Trigger sort it out in the short term.

```
2020-10-22 23:03:19.905 ERROR (ScheduledTrigger-7-thread-3) [   ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state jvm_cpu_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-22 23:03:19.912 ERROR (ScheduledTrigger-7-thread-1) [   ] o.a.s.c.a.ScheduledTriggers Failed to re-play event, discarding: {
  "id":"dd2ebf3d56bTboddkoovyjxdvy1hauq2zskpt",
  "source":"metric_trigger",
  "eventTime":15199552918891,
  "eventType":"METRIC",
  "properties":{
    "node":{"mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr":0.7322834645669292},
    "_dequeue_time_":261690991035,
    "metric":"metrics:solr.jvm:os.systemCpuLoad",
    "preferredOperation":"MOVEREPLICA",
    "_enqueue_time_":15479182216601,
    "requestedOps":[{
        "action":"MOVEREPLICA",
        "hints":{"SRC_NODE":["mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr"]}}],
    "replaying":true}}
2020-10-22 23:03:19.913 INFO  (OverseerStateUpdate-144115201265369088-mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr-n_0000000199) [   ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"addreplica",
  "collection":"mycoll-2",
  "shard":"shard5",
  "core":"mycoll-2_shard5_replica_n122",
  "state":"down",
  "base_url":"http://mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983/solr",
  "node_name":"mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr",
  "type":"NRT"}
2020-10-22 23:03:19.921 ERROR (ScheduledTrigger-7-thread-1) [   ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state metric_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
```


Any help please?
Thank you
Jonathan