You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mynewt.apache.org by GitBox <gi...@apache.org> on 2021/10/28 07:21:45 UTC

[GitHub] [mynewt-nimble] haukepetersen opened a new pull request #1063: controller: fix assert bug in ble_ll_sched_insert()

haukepetersen opened a new pull request #1063:
URL: https://github.com/apache/mynewt-nimble/pull/1063


   The `ble_ll_sched_insert()` function in the controllers scheduler contains a bug, that under certain conditions triggers an assertion failure although all inputs are in valid state. 
   
   ## Problem description
   General state of our node:
   
   - a node has maintains 2 BLE connections to other nodes, lets call them `C0` and `C1`
   - at the same time the node is sending (legacy) advertisements, lets call the procedure `A0`
   
   After a non-deterministic amount of time (depending on timing parameters for advertising and connections), the node will crash with a failed assertion in line 246 (`BLE_LL_ASSERT(sch->enqueued);`). This is caused by the following (timings are simplified for illustration):
   - before calling `ble_ll_sched_insert()`, the scheduler queue contains 2 events, one for `A0` and once for `C0`
     - `A0n(start_time: 0, end_time: 10)`
     - `C0n(start_time: 15, ent_time: 25)`
   - now the next connection event for `C1` is to be scheduled: `C1n(start_time: 7, end_time: 17)`. Important to note, that the new event overlaps with both events already on the queue. So here is what happens:
     - `ble_ll_sched_check_overlap()` detects, that `C1n` overlaps with `A0n`. As `A0n` is an advertising event, it can be preemted. So `preemt_first` is set to `A0n` and the loop is continued
     - in the next iteration of the loop, `ble_ll_sched_check_overlap()` returns true again, as `C1n` also overlaps with `C0n`. This time however, `C0n` can not be preempted, so the loop ends up in the `else` branch of the preemtion check. As we are handling connection events here, `max_delay` is `0` for `C1n`. This means, `C1n` can not be moved passed `C0n`, so scheduling `C1n` fails and the event can simply not be scheduled -> this is the expected result.
     - Problem: once we detect, that `C1n` can not be scheduled, we set `sch->enqueued` to `0` and leave the loop by jumping to the `done` marker. But from the previous run of the loop, `preempt_first` is still set, leading the condition in line 245 to become true. And as we just set `sch->enqueued := 0`, the assertion in line 249 will fail.
   
   ## Fix
   The fix is simple: when we run into the `else` branch of the `if (preempt_cb(sch, entry))` condition, we need to reset the value of `preempt_first` in any case: in case the new event can be moved, there is no need for preemting (as in the current state of the code). In case we are not able to schedule the new event (e.g. `max_delay == 0`), there is also no need to preemt. 
   
   ## Verification
   I oberserved this bug on a IP over BLE border router node, that is connecting to 2 other nodes and running 24/7. With the used connection and advertising parameters, I observed the failed assertion roughly every 30 to 45 minutes, triggering the node to reboot. With this fix, the assertion was not triggered for 48 hours now...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mynewt.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [mynewt-nimble] haukepetersen commented on pull request #1063: controller: fix assert bug in ble_ll_sched_insert()

Posted by GitBox <gi...@apache.org>.
haukepetersen commented on pull request #1063:
URL: https://github.com/apache/mynewt-nimble/pull/1063#issuecomment-953595386


   All green, lets go then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mynewt.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [mynewt-nimble] haukepetersen merged pull request #1063: controller: fix assert bug in ble_ll_sched_insert()

Posted by GitBox <gi...@apache.org>.
haukepetersen merged pull request #1063:
URL: https://github.com/apache/mynewt-nimble/pull/1063


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mynewt.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org