You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nuttx.apache.org by je...@apache.org on 2020/11/03 07:19:49 UTC
[incubator-nuttx] branch master updated: sched: irq: Fix
enter_critical_section() in an irq handler for SMP
This is an automated email from the ASF dual-hosted git repository.
jerpelea pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nuttx.git
The following commit(s) were added to refs/heads/master by this push:
new 6ec9408 sched: irq: Fix enter_critical_section() in an irq handler for SMP
6ec9408 is described below
commit 6ec94082a10a914acc50330ab40f63a425fa67d0
Author: Masayuki Ishikawa <ma...@gmail.com>
AuthorDate: Mon Nov 2 17:06:21 2020 +0900
sched: irq: Fix enter_critical_section() in an irq handler for SMP
Summary:
- I found a deadlock during Wi-Fi audio streaming test plus stress test
- The testing environment was spresense:wifi_smp (NCPUS=4)
- The deadlock happened because two CPUs called up_cpu_pause() almost simultaneously
- This situation should not happen, because up_cpu_pause() is called in a critical section
- Actually, the latter call was from nxsem_post() in an IRQ handler
- And when enter_critical_section() was called, irq_waitlock() detected a deadlock
- Then it called up_cpu_paused() to break the deadlock
- However, this resulted in setting g_cpu_irqset on the CPU
- Even though another CPU had held a g_cpu_irqlock
- This situation violates the critical section and should be avoided
- To avoid the situation, if a CPU sets g_cpu_irqset after calling up_cpu_paused()
- The CPU must release g_cpu_irqlock first
- Then retry irq_waitlock() to acquire g_cpu_irqlock
Impact:
- Affect SMP
Testing:
- Tested with spresense:wifi_smp (NCPUS=2 and 4)
- Tested with spresense:smp
- Tested with sim:smp
- Tested with sabre-6quad:smp (QEMU)
- Tested with maix-bit:smp (QEMU)
- Tested with esp32-core:smp (QEMU)
- Tested with lc823450-xgevk:rndis
Signed-off-by: Masayuki Ishikawa <Ma...@jp.sony.com>
---
sched/irq/irq_csection.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/sched/irq/irq_csection.c b/sched/irq/irq_csection.c
index 6993641..9eeb4b4 100644
--- a/sched/irq/irq_csection.c
+++ b/sched/irq/irq_csection.c
@@ -265,6 +265,7 @@ try_again:
* no longer blocked by the critical section).
*/
+try_again_in_irq:
if (!irq_waitlock(cpu))
{
/* We are in a deadlock condition due to a pending
@@ -273,6 +274,24 @@ try_again:
*/
DEBUGVERIFY(up_cpu_paused(cpu));
+
+ /* NOTE: As the result of up_cpu_paused(cpu), this CPU
+ * might set g_cpu_irqset in nxsched_resume_scheduler()
+ * However, another CPU might hold g_cpu_irqlock.
+ * To avoid this situation, releae g_cpu_irqlock first.
+ */
+
+ if ((g_cpu_irqset & (1 << cpu)) != 0)
+ {
+ spin_clrbit(&g_cpu_irqset, cpu, &g_cpu_irqsetlock,
+ &g_cpu_irqlock);
+ }
+
+ /* NOTE: Here, this CPU does not hold g_cpu_irqlock,
+ * so call irq_waitlock(cpu) to acquire g_cpu_irqlock.
+ */
+
+ goto try_again_in_irq;
}
}