Posted to commits@nuttx.apache.org by ac...@apache.org on 2020/01/03 15:34:18 UTC

[incubator-nuttx] branch pr34 updated: Todo (#34)

This is an automated email from the ASF dual-hosted git repository.

acassis pushed a commit to branch pr34
in repository https://gitbox.apache.org/repos/asf/incubator-nuttx.git


The following commit(s) were added to refs/heads/pr34 by this push:
     new 5869bda  Todo (#34)
5869bda is described below

commit 5869bda3ce1727477e0dc8fdbb185bde62f4194e
Author: patacongo <sp...@yahoo.com>
AuthorDate: Fri Jan 3 09:34:09 2020 -0600

    Todo (#34)
    
    * Documentation/NuttXCCodingStandard.html:  Remove requirement to decorate ignored returned values with (void).
    
    * TODO:  Update TODO list
    
    Co-authored-by: Gregory Nutt <gn...@nuttx.org>
---
 TODO | 80 +++++++++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 53 insertions(+), 27 deletions(-)
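
The SMP entry changed in the diff below describes g_cpu_irqset reaching 0x3
when tasks on two CPUs both hold the critical section. It also proposes
keeping tasks that need the critical section out of the ready-to-run list
while it is held. The following is a minimal toy model of that reasoning, not
NuttX code. Only the names g_cpu_irqset, g_readytorun and g_csection_wait come
from the TODO text; every toy_* identifier is hypothetical, and the
bit-per-CPU reading of the mask is an assumption inferred from the 0x1 and 0x3
values in the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not NuttX code.  Assumption: bit n of g_cpu_irqset is set
 * while CPU n holds the critical section.  All toy_* names are
 * hypothetical.
 */

#define TOY_NCPUS 2

enum toy_list_e
{
  TOY_READYTORUN,    /* Stands in for g_readytorun[] */
  TOY_CSECTION_WAIT  /* Stands in for the proposed g_csection_wait[] */
};

static uint32_t g_cpu_irqset;

/* Record that 'cpu' now holds the critical section. */

static void toy_take_csection(int cpu)
{
  assert(cpu >= 0 && cpu < TOY_NCPUS);
  g_cpu_irqset |= (UINT32_C(1) << cpu);
}

/* Record that 'cpu' no longer holds the critical section. */

static void toy_release_csection(int cpu)
{
  assert(cpu >= 0 && cpu < TOY_NCPUS);
  g_cpu_irqset &= ~(UINT32_C(1) << cpu);
}

/* True if more than one CPU holds the critical section at once. */

static bool toy_multiple_holders(void)
{
  /* x & (x - 1) clears the lowest set bit; non-zero means >= 2 bits. */

  return (g_cpu_irqset & (g_cpu_irqset - 1)) != 0;
}

/* Proposed rule from the TODO entry: a task that needs the critical
 * section may not sit in the ready-to-run list while the critical
 * section is held.
 */

static enum toy_list_e toy_list_for_unblocked(bool needs_csection)
{
  if (needs_csection && g_cpu_irqset != 0)
    {
      return TOY_CSECTION_WAIT;
    }

  return TOY_READYTORUN;
}
```

In this model, calling toy_take_csection(0) and then toy_take_csection(1)
leaves the mask at 0x3, the state shown in the reported log. Under the
proposed rule, the second task would instead have been routed to the wait
list by toy_list_for_unblocked() while CPU 0 still held the critical section.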

diff --git a/TODO b/TODO
index 7a17287..029a46f 100644
--- a/TODO
+++ b/TODO
@@ -1,4 +1,4 @@
-NuttX TODO List (Last updated November 21, 2019)
+NuttX TODO List (Last updated January 3, 2020)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 This file summarizes known NuttX bugs, limitations, inconsistencies with
@@ -589,32 +589,58 @@ o SMP
                can that occur?  I think it can occur in the following
                situation:
 
-               CPU0 - Task A is running.
-                    - The CPU0 IDLE task is the only other task in the
-                      CPU0 ready-to-run list.
-               CPU1 - Task B is running.
-                    - Task C is blocked but remains in the g_assignedtasks[]
-                      list because of a CPU affinity selection.  Task C
-                      also holds the critical section which is temporarily
-                      relinquished because Task C is blocked by Task B.
-                    - The CPU1 IDLE task is at the end of the list.
-
-               Actions:
-               1. Task A/CPU 0 takes the critical section.
-               2. Task B/CPU 1 suspends waiting for an event
-               3. Task C is restarted.
-
-               Now both Task A and Task C hold the critical section.
-
-               This problem has never been observed, but seems to be a
-               possibility.  I believe it could only occur if CPU affinity
-               is used (otherwise, tasks will pend must as when pre-
-               emption is disabled).
-
-               A proper solution would probably involve re-designing how
-               CPU affinity is implemented.  The CPU1 IDLE thread should
-               more appropriately run, but cannot because the Task C TCB
-               is in the g_assignedtasks[] list.
+               The log below was reported with NuttX running on a two-core
+               Cortex-A7 architecture in SMP mode.  Notice that when
+               sched_addreadytorun() was called, g_cpu_irqset is 3.
+
+                 sched_addreadytorun: irqset cpu 1, me 0 btcbname init, irqset 1 irqcount 2.
+                 sched_addreadytorun: sched_addreadytorun line 338 g_cpu_irqset = 3.
+
+               This can happen, but only under one very specific condition.
+               g_cpu_irqset only exists to support this specific condition:
+
+                 a. A task running on CPU 0 takes the critical section.  So
+                    g_cpu_irqset == 0x1.
+
+                 b. A task exits on CPU 1 and a waiting, ready-to-run task
+                    is re-started on CPU 1.  This new task also holds the
+                    critical section.  So when the task is re-started on
+                    CPU 1, we then have g_cpu_irqset == 0x3.
+
+               So we are in a very perverse state!  There are two tasks
+               running on two different CPUs and both hold the critical
+               section.  I believe that is a dangerous situation and there
+               could be undiscovered bugs that could happen in that case.
+               However, as of this moment, I have not heard of any specific
+               problems caused by this weird behavior.
+
+               A possible solution would be to add a new task state that
+               would exist only for SMP.
+
+               - Add a new SMP-only task list and state.  Say,
+                 g_csection_wait[].  It should be prioritized.
+               - When a task acquires the critical section, all tasks in
+                 g_readytorun[] that need the critical section would be
+                 moved to g_csection_wait[].
+               - When any task is unblocked for any reason and moved to the
+                 g_readytorun[] list, if that unblocked task needs the
+                 critical section, it would also be moved to the
+                 g_csection_wait[] list.  No task that needs the critical
+                 section can be in the ready-to-run list if the critical
+                 section is not available.
+               - When the task releases the critical section, all tasks in
+                 g_csection_wait[] need to be moved back to
+                 g_readytorun[].
+               - This may result in a context switch.  The tasks should be
+                 moved back to g_readytorun[] highest priority first.  If a
+                 context switch occurs and the critical section is re-taken
+                 by the re-started task, the lower priority tasks in
+                 g_csection_wait[] must stay in that list.
+
+               That is really not as much work as it sounds.  It is
+               something that could be done in 2-3 days of work if you know
+               what you are doing.  Getting the proper test setup and
+               verifying the change would be the more difficult task.
 
 Status:        Open
 Priority:      Unknown.  Might be high, but first we would need to confirm