Posted to dev@mynewt.apache.org by Christopher Collins <cc...@apache.org> on 2016/04/18 00:57:34 UTC

Proposed changes to Nimble host

Hello all,

The Mynewt BLE stack is called Nimble.  Nimble consists of two packages:
    * Controller (link-layer) [net/nimble/controller]
    * Host (upper layers)     [net/nimble/host]

This email concerns the Nimble host.  

As I indicated in an email a few weeks ago, the code size of the Nimble
host had increased beyond what I considered a reasonable level.  When
built for the ARM Cortex-M4, with security enabled and the log level set
to INFO, the host code size was about 48 kB.  In recent days, I came up
with a few ideas for reducing the host code size.  As I explored these
ideas, I realized that they open the door for some major improvements in
the fundamental design of the host.  Making these changes would
introduce some backwards-compatibility issues, but I believe it is
absolutely the right thing to do.  If we do this, it needs to be done
now while Mynewt is still in its beta phase.  I have convinced myself
that this is the right way forward; now I would like to see what the
community thinks.  As always, all feedback is greatly appreciated.

There are two major changes that I am proposing:

1. All HCI command/acknowledgement exchanges are blocking.

Background: The host and controller communicate with one another via the
host-controller-interface (HCI) protocol.  The host sends _commands_ to
the controller; the controller sends _events_ to the host.  Whenever the
controller receives a command from the host, it immediately responds
with an acknowledgement event.  In addition, the controller also sends
unsolicited events to the host to indicate state changes or to request
information in a subsequent command.
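
For reference, both packet types carry a very small fixed header (per the
Bluetooth core specification).  A rough C sketch, for illustration only;
the on-the-wire layout is packed:

    #include <stdint.h>

    struct hci_cmd_hdr {
        uint16_t opcode;     /* OGF (6 bits) | OCF (10 bits) */
        uint8_t  length;     /* number of parameter bytes that follow */
        /* parameters follow */
    };

    struct hci_evt_hdr {
        uint8_t event_code;  /* e.g. 0x0e = Command Complete (the ack) */
        uint8_t length;      /* number of parameter bytes that follow */
        /* parameters follow */
    };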

In the current host, all HCI commands are sent asynchronously
(non-blocking).  When the host wants to send an HCI command, it
schedules a transmit operation by putting an OS event on its own event
queue.  The event points to a callback which does the actual HCI
transmission.  The callback also configures a second callback to be
executed when the expected acknowledgement is received from the
controller.  Each time the host receives an HCI event from the
controller, an OS event is put on the host's event queue.  Processing of
this OS event ultimately calls the configured callback (if it is an
acknowledgement), or a hardcoded callback (if it is an unsolicited HCI
event).

This design works, but it introduces a number of problems.  First, it
requires the host code to maintain some quite complex state machines for
what seem like simple HCI exchanges.  This FSM machinery translates into
a lot of extra code.  There is also a lot of ugliness involved in
canceling scheduled HCI transmits.

Another complication with non-blocking HCI commands is that they require
the host to jump through a lot of hoops to provide feedback to the
application.  Since all the work is done in parallel by the host task,
the host has to notify the application of failures by executing
callbacks configured by the application.  I did not want to place any
restrictions on what the application is allowed to do during these
callbacks, which means the host has to ensure that it is in a valid
state whenever a callback gets executed (no mutexes are locked, for
example).  This requires the code to use a large number of mutexes and
temporary copies of host data structures, resulting in a lot of
complicated code.

Finally, non-blocking HCI operations complicate the API presented to
the application.  A single return code from a blocking operation is
easier to manage than a return code plus the possibility of a callback
being executed sometime in the future from a different task.  A blocking
operation collapses several failure scenarios into a single function
return.

Making HCI command/acknowledgement exchanges blocking addresses all of
the above issues:
    * FSM machinery goes away; controller response is indicated in the
      return code of the HCI send function.
    * Nearly all HCI failures are indicated to the application
      immediately, so there is no need for lots of mutexes and temporary
      copies of data structures.
    * API is simplified; operation results are indicated via a simple
      function return code.
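
To make this concrete, here is a minimal sketch of how a blocking exchange
could be built on the OS semaphore primitives.  All of the names below
(hci_cmd_tx_blocking, hci_cmd_send, hci_ack_received) are hypothetical and
not the actual Nimble code:

    #include "os/os.h"

    /* Hypothetical raw transmit routine; declared only so the sketch is
     * complete.
     */
    int hci_cmd_send(const uint8_t *cmd, int len);

    /* Initialized with zero tokens at startup: os_sem_init(&hci_ack_sem, 0) */
    static struct os_sem hci_ack_sem;
    static int hci_ack_status;

    /* Called from the HCI receive path when the acknowledgement event for
     * the outstanding command arrives.
     */
    void
    hci_ack_received(int status)
    {
        hci_ack_status = status;
        os_sem_release(&hci_ack_sem);
    }

    /* Send a command and block the calling task until the controller
     * acknowledges it (or the timeout expires).  The controller's response
     * is folded into the single return code.
     */
    int
    hci_cmd_tx_blocking(const uint8_t *cmd, int len, uint32_t timeout_ticks)
    {
        int rc;

        rc = hci_cmd_send(cmd, len);
        if (rc != 0) {
            return rc;
        }

        rc = os_sem_pend(&hci_ack_sem, timeout_ticks);
        if (rc == OS_TIMEOUT) {
            return rc;
        }

        return hci_ack_status;
    }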

2. The Nimble host is "taskless"

Currently the Nimble host runs in its own OS task.  This is not
necessarily a bad thing, but in the case of the host, I think the costs
outweigh the benefits.  I can think of three benefits to running a
library in its own task:
    * Guarantee that timing requirements are met; just configure the
      task with an appropriate priority.
    * (related to the above point) The library task can continue to work
      while the application task is blocked.
    * Facilitates stack sizing. Since the library performs its
      operations in its own stack, it is easier to predict stack usage
      of both the library task and the application task.

I don't think any of these benefits are very compelling in the case of
the Nimble host for the following reasons:
    * The host has nothing resembling real-time timing requirements.
      There should be absolutely no problem with running the host task
      at the lowest priority, unless the hardware is simply
      overburdened, in which case there is no way to avoid issues no
      matter what you do.

    * The host code makes heavy use of application callbacks, making it
      quite difficult to estimate stack usage.  Since the host stack
      requirements depend on what the application does during these
      callbacks, the application would need to specify the host stack
      size during initialization anyway.

My proposal is to turn the Nimble host into a "flat" library that runs
in an application task.  When the application initializes the host, it
indicates which OS event queue should be used for host-related events.
Host operations would be captured in OS_EVENT_TIMER events that the
application task would need to handle generically, as it most likely
already does.  Note that these events would not be produced by an actual
timer; the events would be placed on the event queue immediately.  The
OS_EVENT_TIMER event type would just be used because it provides a basic
callback structure.
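
Concretely, the application's existing event loop would pick these up like
any other callout.  A minimal sketch, assuming the current Mynewt event
queue API (os_eventq_get(), the OS_EVENT_T_TIMER type, struct
os_callout_func); exact structure and field names are approximate:

    static struct os_eventq app_evq;    /* queue handed to the host at init */

    static void
    app_task_handler(void *arg)
    {
        struct os_callout_func *cf;
        struct os_event *ev;

        while (1) {
            ev = os_eventq_get(&app_evq);
            switch (ev->ev_type) {
            case OS_EVENT_T_TIMER:
                /* Real callouts and host-generated events both land here;
                 * just run the callback that the event carries.
                 */
                cf = (struct os_callout_func *)ev;
                cf->cf_func(cf->cf_arg);
                break;

            default:
                /* Application-specific events. */
                break;
            }
        }
    }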

I should note also that it is fairly trivial for an application to turn
such a flat library into its own task if that is desired.  The
application developer would just need to create a simple task that
handles the OS_EVENT_TIMER events.
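
For instance, a dedicated task whose body is just the generic loop sketched
above would do it (the priority and stack-size values below are placeholders
the application would choose):

    #define HOST_TASK_PRIO   10                     /* placeholder */
    #define HOST_STACK_SIZE  OS_STACK_ALIGN(428)    /* placeholder */

    static struct os_task host_task;
    static os_stack_t host_stack[HOST_STACK_SIZE];

    /* In the application's init code: */
    os_task_init(&host_task, "ble_host", app_task_handler, NULL,
                 HOST_TASK_PRIO, OS_WAIT_FOREVER,
                 host_stack, HOST_STACK_SIZE);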

I think these two changes will have the following implications:
    1. Simpler API.
    2. Less RAM usage (no more FSM state, no parallel stacks).
    3. More RAM usage (larger stack).
    4. Major reduction in code size (I estimate a total size of 35 kB).

Hopefully points 2 and 3 will cancel each other out.

Thanks for reading,
Chris

Re: Proposed changes to Nimble host

Posted by Sterling Hughes <st...@apache.org>.
>>
>> 0 - 63: Core event types (TIMER, MQUEUE_DATA, etc.)
>> 64+: Per-task event types.
>>
>> So, the options for the host package are:
>> 1. Reserve new core event IDs.  This avoids conflicts, but permanently
>>    uses up a limited resource.
>> 2. Use arbitrary per-task event IDs.  This has the potential for
>>    conflicts, and doesn't strike me as a particularly good solution.
>> 3. Use a separate host task.  This allows the host to use IDs in the per-task
>>    ID space without the risk of conflict.
>> 4. Leverage existing core events.  This is what I proposed.  It avoids
>>    conflicts and doesn't require any new event IDs, but it does feel a
>>    bit hacky to use the TIMER event ID for something that isn't a timer.
>>


What should use these core events?  I think reserving events is fine.

The other option is to reserve a generic "trampoline" event, which is 
basically like a callout in that it carries a pointer to a function and 
an arg, and anything can post one to a task.
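
Something along these lines, as a rough sketch (the structure and the
OS_EVENT_T_TRAMPOLINE name are made up here):

    /* A core event type that only carries a callback and an argument. */
    struct os_event_trampoline {
        struct os_event ev;         /* ev.ev_type = OS_EVENT_T_TRAMPOLINE */
        void (*fn)(void *arg);
        void *arg;
    };

    /* Whichever task owns the queue handles it generically: */
    static void
    os_handle_trampoline(struct os_event *ev)
    {
        struct os_event_trampoline *t;

        t = (struct os_event_trampoline *)ev;
        t->fn(t->arg);
    }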

I think we should have this regardless, but I'm not opposed to burning 
events for something as critical as the networking stack, for example.

Sterling

Re: Proposed changes to Nimble host

Posted by will sanfilippo <wi...@runtime.io>.
Yeah, I can see why you chose OS_EVENT_TIMER. It is almost like we should rename that event type :-) But I agree with everything you say below; creating a new event type for this seems wasteful. I am not quite sure what you mean by "My concern there is that applications may want to add special handling for certain event types…”. Are you referring to the events that a package may require of an application?

Anyway, solving this generically is definitely what we need to do.

Will


> On Apr 18, 2016, at 10:06 AM, Christopher Collins <cc...@apache.org> wrote:
> 
> On Mon, Apr 18, 2016 at 09:43:35AM -0700, Christopher Collins wrote:
>> On Mon, Apr 18, 2016 at 09:18:16AM -0700, will sanfilippo wrote:
>>> For #2, my only “concerns” (if you could call them such) are:
>>> * Using OS_EVENT_TIMER as opposed to some other event. Should all
>>> OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What
>>> events are going to be processed here? Do you envision many host
>>> events?
>> 
>> Yes, I agree.  I think a more appropriate event type would be
>> OS_EVENT_CALLBACK or similar.  I am a bit leery about adding a new OS
>> event type for this case, because it would require all applications to
>> handle an extra event type without any practical benefit.  Perhaps
>> mynewt could relieve this burden with an "os_handle_event()" function
>> which processes these generic events.  My concern there is that
>> applications may want to add special handling for certain event types,
>> so they wouldn't want to call the helper function anyway.
>> 
>> The OS events that the host would generate are:
>>    * Incoming ACL data packets.
>>    * Incoming HCI events.
>>    * Expired timers.
> 
> (I meant "process", not "generate"!)
> 
> Oops... I went down a rabbit hole and forgot to address the main point
> :).  What we would *really* want here is something like:
>    * BLE_HS_EVENT_ACL_DATA_IN
>    * BLE_HS_EVENT_HCI_EVENT_IN
> 
> However, the issue here is that the event type IDs are defined in a
> single "number-space".  If the host package reserves IDs for its own
> events, then no other packages can use those IDs for its own events
> without a conflict.  The 8-bit ID space is divided into two parts:
> 
> 0 - 63: Core event types (TIMER, MQUEUE_DATA, etc.)
> 64+: Per-task event types.
> 
> So, the options for the host package are:
> 1. Reserve new core event IDs.  This avoids conflicts, but permanently
>   uses up a limited resource.
> 2. Use arbitrary per-task event IDs.  This has the potential for
>   conflicts, and doesn't strike me as a particularly good solution.
> 3. Use a separate host task.  This allows the host to use IDs in the per-task
>   ID space without the risk of conflict.
> 4. Leverage existing core events.  This is what I proposed.  It avoids
>   conflicts and doesn't require any new event IDs, but it does feel a
>   bit hacky to use the TIMER event ID for something that isn't a timer.
> 
> I think this might be a common problem for other packages in the future.
> I don't think it is that unusual for a package to not create its own
> task, but still have the need to generate OS events.  So perhaps we
> should think about how to solve this general problem.
> 
> Chris


Re: Proposed changes to Nimble host

Posted by Christopher Collins <cc...@apache.org>.
On Mon, Apr 18, 2016 at 09:43:35AM -0700, Christopher Collins wrote:
> On Mon, Apr 18, 2016 at 09:18:16AM -0700, will sanfilippo wrote:
> > For #2, my only “concerns” (if you could call them such) are:
> > * Using OS_EVENT_TIMER as opposed to some other event. Should all
> > OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What
> > events are going to be processed here? Do you envision many host
> > events?
> 
> Yes, I agree.  I think a more appropriate event type would be
> OS_EVENT_CALLBACK or similar.  I am a bit leery about adding a new OS
> event type for this case, because it would require all applications to
> handle an extra event type without any practical benefit.  Perhaps
> mynewt could relieve this burden with an "os_handle_event()" function
> which processes these generic events.  My concern there is that
> applications may want to add special handling for certain event types,
> so they wouldn't want to call the helper function anyway.
> 
> The OS events that the host would generate are:
>     * Incoming ACL data packets.
>     * Incoming HCI events.
>     * Expired timers.

(I meant "process", not "generate"!)

Oops... I went down a rabbit hole and forgot to address the main point
:).  What we would *really* want here is something like:
    * BLE_HS_EVENT_ACL_DATA_IN
    * BLE_HS_EVENT_HCI_EVENT_IN

However, the issue here is that the event type IDs are defined in a
single "number-space".  If the host package reserves IDs for its own
events, then no other packages can use those IDs for its own events
without a conflict.  The 8-bit ID space is divided into two parts:

0 - 63: Core event types (TIMER, MQUEUE_DATA, etc.)
64+: Per-task event types.

So, the options for the host package are:
1. Reserve new core event IDs.  This avoids conflicts, but permanently
   uses up a limited resource.
2. Use arbitrary per-task event IDs.  This has the potential for
   conflicts, and doesn't strike me as a particularly good solution.
3. Use a separate host task.  This allows the host to use IDs in the per-task
   ID space without the risk of conflict.
4. Leverage existing core events.  This is what I proposed.  It avoids
   conflicts and doesn't require any new event IDs, but it does feel a
   bit hacky to use the TIMER event ID for something that isn't a timer.
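
To make option 2's conflict risk concrete, a sketch (the BLE_HS_* names are
the hypothetical ones from above):

    /* Per-task IDs start at 64, so the host could simply pick values
     * there...
     */
    #define BLE_HS_EVENT_ACL_DATA_IN    (64 + 0)
    #define BLE_HS_EVENT_HCI_EVENT_IN   (64 + 1)

    /* ...but another package sharing the same task is equally free to
     * pick the same numbers, which is exactly the conflict.
     */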

I think this might be a common problem for other packages in the future.
I don't think it is that unusual for a package to not create its own
task, but still have the need to generate OS events.  So perhaps we
should think about how to solve this general problem.

Chris

Re: Proposed changes to Nimble host

Posted by Christopher Collins <cc...@apache.org>.
On Mon, Apr 18, 2016 at 09:18:16AM -0700, will sanfilippo wrote:
> For #2, my only “concerns” (if you could call them such) are:
> * Using OS_EVENT_TIMER as opposed to some other event. Should all
> OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What
> events are going to be processed here? Do you envision many host
> events?

Yes, I agree.  I think a more appropriate event type would be
OS_EVENT_CALLBACK or similar.  I am a bit leery about adding a new OS
event type for this case, because it would require all applications to
handle an extra event type without any practical benefit.  Perhaps
mynewt could relieve this burden with an "os_handle_event()" function
which processes these generic events.  My concern there is that
applications may want to add special handling for certain event types,
so they wouldn't want to call the helper function anyway.

The OS events that the host would generate are:
    * Incoming ACL data packets.
    * Incoming HCI events.
    * Expired timers.

> * I wonder about the complexity of this from an application developer's
> standpoint. Not saying that what you propose would be more or less
> complex; just something we should consider when making these changes.

I think the taskless design reduces complexity for the application
developer.  If there is no host task, the developer can worry less about
task priorities and stack sizes.  

> On a side note (I guess it is related), we should consider how
> applications are going to initialize the host and/or the controller in
> regards to system memory requirements (i.e. mbufs). While our current
> methodology to create a BLE app is not rocket science, I think we
> could make it a bit simpler.

Yes, definitely.  As you say, the setup is not terribly complicated, but
it does involve a fair number of steps, so it will seem complicated to
someone not familiar with Mynewt.

Chris

Re: Proposed changes to Nimble host

Posted by will sanfilippo <wi...@runtime.io>.
All sounds excellent!

+1 for #1. That only seems like a good thing.

For #2, my only “concerns” (if you could call them such) are:
* Using OS_EVENT_TIMER as opposed to some other event. Should all OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What events are going to be processed here? Do you envision many host events?
* I wonder about the complexity of this from an application developer's standpoint. Not saying that what you propose would be more or less complex; just something we should consider when making these changes.

On a side note (I guess it is related), we should consider how applications are going to initialize the host and/or the controller in regards to system memory requirements (i.e. mbufs). While our current methodology to create a BLE app is not rocket science, I think we could make it a bit simpler.

