You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nuttx.apache.org by GitBox <gi...@apache.org> on 2022/04/22 11:22:20 UTC

[GitHub] [incubator-nuttx] pussuw opened a new pull request, #6134: RISC-V: Fix system crash when FPU is in use

pussuw opened a new pull request, #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134

   FPU registers need to be written prior to updating CSR_STATUS
   
   ## Summary
   Fixes full system crash due to illegal instruction when accessing FPU when it is too late
   ## Impact
   Fixes system crash
   ## Testing
   icicle:nsh and icicle:knsh, ostest fails at once prior fix, passes with fix
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] masayuki2009 commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
masayuki2009 commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1110343817

   @pussuw 
   
   >I confirmed that maix-bit:smp works on real hardware.
   
   I noticed that fpu_test works ** without ** this PR on maix-bit (real hardware).
   So I think that something is wrong with your icicle.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109545726

   I have never used any emulated environment so obviously I did not notice this issue. I ran several tests with FPU enabled/disabled and with flat/protected mode, and both work with this patch, but at least protected mode fails randomly without. 
   
   I'll continue monitoring this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] masayuki2009 commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
masayuki2009 commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109525750

   >@masayuki2009 I am not able to reproduce on hardware (mpfs icicle:nsh):
   
   @pussuw 
   
   I confirmed that `maix-bit:smp` works on real hardware.
   
   ```
   nsh> uname -a
   NuttX 10.3.0-RC0 aaa5316235 Apr 26 2022 17:30:16 risc-v maix-bit
   nsh> mount
     /proc type procfs
   nsh> ps
     PID GROUP CPU PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK   STACK   USED  FILLED COMMAND
       0     0   0   0 FIFO     Kthread N-- Assigned           00000000 002000 001072  53.6%  CPU0 IDLE
       1     1   1   0 FIFO     Kthread N-- Running            00000000 002000 000592  29.6%  CPU1 IDLE
       2     2   0 100 RR       Task    --- Running            00000000 003040 002256  74.2%  nsh_main
   nsh> free
                      total       used       free    largest  nused  nfree
           Umem:    2079920      11984    2067936    2067936     38      1
   
   ...
   
   nsh> ostest
   stdio_test: write fd=1
   stdio_test: Standard I/O Check: printf
   stdio_test: write fd=2
   stdio_test: Standard I/O Check: fprintf to stderr
   ostest_main: putenv(Variable1=BadValue3)
   ostest_main: setenv(Variable1, GoodValue1, TRUE)
   ostest_main: setenv(Variable2, BadValue1, FALSE)
   ostest_main: setenv(Variable2, GoodValue2, TRUE)
   ostest_main: setenv(Variable3, Variable3, FALSE)
   ostest_main: setenv(Variable3, Variable3, FALSE)
   show_variable: Variable=Variable1 has value=GoodValue1
   show_variable: Variable=Variable2 has value=GoodValue2
   show_variable: Variable=Variable3 has value=GoodValue3
   ostest_main: Started user_main at PID=24
   ...
   
   user_main: FPU test
   Starting task FPU#1
   FPU#1: pass 1
   fpu_test: Started task FPU#1 at PID=25
   Starting task FPU#2
   FPU#2: pass 1
   fpu_test: Started task FPU#2 at PID=26
   FPU#1: pass 2
   FPU#2: pass 2
   FPU#1: pass 3
   FPU#2: pass 3
   FPU#1: pass 4
   FPU#2: pass 4
   FPU#1: pass 5
   FPU#2: pass 5
   FPU#1: pass 6
   FPU#2: pass 6
   FPU#1: pass 7
   FPU#2: pass 7
   FPU#1: pass 8
   FPU#2: pass 8
   FPU#1: pass 9
   FPU#2: pass 9
   FPU#1: pass 10
   FPU#2: pass 10
   FPU#1: pass 11
   FPU#2: pass 11
   FPU#1: pass 12
   FPU#2: pass 12
   FPU#1: pass 13
   FPU#2: pass 13
   FPU#1: pass 14
   FPU#2: pass 14
   FPU#1: pass 15
   FPU#2: pass 15
   FPU#1: pass 16
   FPU#2: pass 16
   FPU#1: Succeeded
   FPU#2: Succeeded
   fpu_test: Returning
   
   ....
   
   ```
   
   So the issue might only happen with `rv-virt:nsh64` on QEMU.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] masayuki2009 commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
masayuki2009 commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109173828

   @pussuw 
   
   I noticed that fpu_test with rv-virt:nsh64 on QEMU failed.
   If I revert this PR, it works.
   
   $ ~/opensource/QEMU/qemu-6.2/build/riscv64-softmmu/qemu-system-riscv64 -nographic -cpu rv64 -smp 8 -M virt -bios none -kernel ./nuttx
   
   ```
   user_main: FPU test
   Starting task FPU#1
   fpu_test: Started task FPU#1 at PID=4
   riscv_exception: EXCEPTION: Illegal instruction. MCAUSE: 0000000000000002
   riscv_exception: PANIC!!! Exception = 0000000000000002
   up_assert: Assertion failed at file:common/riscv_exception.c line: 89 task: FPU#1
   riscv_registerdump: EPC: 000000008000f574
   riscv_registerdump: A0: 0000000000000000 A1: 0000000080025f30 A2: 0000000080025f30 A3: 000000008000f518
   riscv_registerdump: A4: 0000000000000020 A5: 0000000000000440 A6: 0000000000000000 A7: 0000000000000000
   riscv_registerdump: T0: 0000000000000000 T1: 0000000000000000 T2: 0000000000000000 T3: 0000000000000000
   riscv_registerdump: T4: 0000000000000000 T5: 0000000000000000 T6: 0000000000000000
   riscv_registerdump: S0: 0000000000000000 S1: 0000000000000001 S2: 0000000000000000 S3: 0000000000000000
   riscv_registerdump: S4: 0000000000000000 S5: 000000008001f608 S6: 0000000000000000 S7: 0000000000000000
   riscv_registerdump: S8: 0000000000000000 S9: 0000000000000000 S10: 0000000000000000 S11: 0000000000000000
   riscv_registerdump: SP: 0000000080026670 FP: 0000000000000000 TP: 0000000000000000 RA: 000000008000f566
   riscv_dumpstate: sp:     000000008001ea70
   riscv_dumpstate: IRQ stack:
   riscv_dumpstate:   base: 000000008001e310
   riscv_dumpstate:   size: 0000000000000800
   riscv_stackdump: 000000008001ea60: 000007e0 00000000 80004baa 00000000 00000059 00000000 80015478 00000000
   riscv_stackdump: 000000008001ea80: 00000000 00000000 00000000 00000000 00000000 00000000 80026460 00000000
   riscv_stackdump: 000000008001eaa0: 00000002 00000000 8000307c 00000000 00000009 00000000 800006a4 00000000
   riscv_stackdump: 000000008001eac0: 8001ff18 00000000 00000001 00000000 8001ff18 00000000 80000aa8 00000000
   riscv_stackdump: 000000008001eae0: deadbeef deadbeef 80000400 00000000 00000002 00000000 800003c2 00000000
   riscv_dumpstate: sp:     0000000080026670
   riscv_dumpstate: User stack:
   riscv_dumpstate:   base: 0000000080025f50
   riscv_dumpstate:   size: 00000000000007e0
   riscv_dumpstate: User Stack
   riscv_stackdump: 0000000080026660: 00000000 00000000 8000f566 00000000 8001ec38 00000000 00000c00 00000000
   riscv_stackdump: 0000000080026680: 80005082 00000000 00000000 00000000 80021540 00000000 80000bec 00000000
   riscv_stackdump: 00000000800266a0: 80021580 00000000 80000fb0 00000000 80021580 00000000 80000cfa 00000000
   riscv_stackdump: 00000000800266c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
   riscv_stackdump: 00000000800266e0: 00000000 00000000 00000000 00000000 8000f518 00000000 80003160 00000000
   riscv_stackdump: 0000000080026700: 80025f30 00000000 00000001 00000000 00000000 00000000 8000100a 00000000
   riscv_showtasks:    PID    PRI      USED     STACK   FILLED    COMMAND
   riscv_showtasks:   ----   ----       648      2048    31.6%    irq
   riscv_dump_task:      0      0      1040      2000    52.0%    Idle Task
   riscv_dump_task:      1    100      1616      3040    53.1%    nsh_main
   riscv_dump_task:      2    100      1328      2016    65.8%    ostest
   riscv_dump_task:      3    100      1440      8112    17.7%    ostest
   riscv_dump_task:      4    100       720      2016    35.7%    FPU#1
   
   
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1301895125

   @masayuki2009 It has been a while but I was revisiting the context saving code and remembered you had an issue with FPU on QEMU. I found at least one issue, that might be the cause for my crashes:
   
   The int/fpu register restoration ordering in exception_common:
   ```
     load_ctx   sp
     riscv_loadfpu sp
   ```
   is bad, because riscv_loadfpu uses registers t0,t1,t2
   ```
   .macro riscv_loadfpu out
   
   #ifdef CONFIG_ARCH_FPU
     REGLOAD      t0, REG_INT_CTX(\out)
     li           t1, MSTATUS_FS
     and          t2, t0, t1
     li           t1, MSTATUS_FS_INIT
     ble          t2, t1, skip_load_fpu
   ```
   
   So the bad ordering restores the original integer registers including t0,t1,t2, and then these are immediately destroyed by riscv_loadfpu.
   
   One potential fix would be use the "bad" ordering above, but to restore t0,t1,t2 from the context after the FPU context is restored:
   
   ```
   skip_load_fpu:
   ... restore t0,t1,t2
   ```
   
   I will try this again when I have time, but if you ever get the urge to get FPU working on QEMU this might be a potential fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1110623077

   @masayuki2009 
   
   I also did testing on icicle:nsh (BUILD_FLAT) and it works with and without this patch. 
   
   However, with icicle:knsh (BUILD_PROTECTED) the issue is 100% reproducible. FPU gets enabled, but ostest does not run fpu_test as it only compiles with build flat. With knsh, the system crashes at random locations during ostest, which has never completed successfully **without** this patch, and I have not seen an error **with* this patch.
   
   The way I found this (potential) issue, is by bisecting:
   5b6dd876b87196cc82dd45258d77e22d0c8b033b <- good commit (risc-v/riscv_assert.c Fix dumping of status from ISR)
   c29a3b7bd89c45d3781133679ae875e0e835bdd4 <- bad commit (pipe: Increase buffer size by one byte to ompensate the full indicator)
   
   So somewhere between those two commits there is an issue.
   
   48b81bda0904a426360fa809fd747e3966d29024 <- (arch/risc-v: Change riscv_savefpu/riscv_loadfpu to macro) by partially reverting this commit the crash goes away. The partial revert is to move loadfpu back to where it was prior to 48b8...
   
   So this is how I made this patch. Of course there can be some other issue and I just masked it with this patch. One clear candidate can be `riscv_get_newintctx()`, because ostest creates / destroys new tasks for certain tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] xiaoxiang781216 merged pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
xiaoxiang781216 merged PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109396888

   @masayuki2009 I'll run fpu test on hardware. I get an immediate crash due to illegal instruction if this patch is not in place, so the exact opposite behavior to yours.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109419451

   @masayuki2009 I am not able to reproduce on hardware (mpfs icicle:nsh):
   ```
   user_main: FPU test
   Starting task FPU#1
   fpu_test: Started task FPU#1 at PID=6
   FPU#1: pass 1
   Starting task FPU#2
   fpu_test: Started task FPU#2 at PID=7
   FPU#2: pass 1
   FPU#1: pass 2
   FPU#2: pass 2
   FPU#1: pass 3
   FPU#2: pass 3
   FPU#1: pass 4
   FPU#2: pass 4
   FPU#1: pass 5
   FPU#2: pass 5
   FPU#1: pass 6
   FPU#2: pass 6
   FPU#1: pass 7
   FPU#2: pass 7
   FPU#1: pass 8
   FPU#2: pass 8
   FPU#1: pass 9
   FPU#2: pass 9
   FPU#1: pass 10
   FPU#2: pass 10
   FPU#1: pass 11
   FPU#2: pass 11
   FPU#1: pass 12
   FPU#2: pass 12
   FPU#1: pass 13
   FPU#2: pass 13
   FPU#1: pass 14
   FPU#2: pass 14
   FPU#1: pass 15
   FPU#2: pass 15
   FPU#1: pass 16
   FPU#2: pass 16
   FPU#1: Succeeded
   FPU#2: Succeeded
   fpu_test: Returning
   
   user_main: Exiting
   ostest_main: Exiting with status 0
   nsh> 
   
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-nuttx] pussuw commented on pull request #6134: RISC-V: Fix system crash when FPU is in use

Posted by GitBox <gi...@apache.org>.
pussuw commented on PR #6134:
URL: https://github.com/apache/incubator-nuttx/pull/6134#issuecomment-1109449088

   I'm wondering if this is somehow related:
   https://github.com/apache/incubator-nuttx/pull/6125
   
   This PR:
   https://github.com/apache/incubator-nuttx/pull/6101
   
   Changed the order in which FPU and CSR_STATUS are restored, 6134 (this PR) basically just restores the original ordering. In my case, without 6134 the system crashes totally randomly, not even necessarily during the FPU test. With 6134 I don't get crashes and FPU test also works fine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org