You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nuttx.apache.org by "MainframeReboot (via GitHub)" <gi...@apache.org> on 2024/04/29 16:20:03 UTC

[I] MPFS Icicle Kit knsh config failure [nuttx]

MainframeReboot opened a new issue, #12259:
URL: https://github.com/apache/nuttx/issues/12259

   Hi,
   
   I am new to NuttX and have hit a wall when it comes to running NuttX as a kernel build. The provided flat build (nsh) config worked and I was able to run NuttX without issue. Building NuttX using the provided kernel build config (knsh) causes errors when running on the Icicle kit:
   
   ![image](https://github.com/apache/nuttx/assets/168458700/81334cd5-2b29-46d6-bf77-2c5dd5051eac)
   
   I have attempted to debug this issue on my own and while comparing the Icicle kit knsh config to other RISC-V platform knsh configs, I've noticed that a number of parameters that differ. For instance, in the Icicle kit knsh config, the number of pages for DATA, HEAP and TEXT are all set to 0. As are the addresses for TEXT and DATA.
   
   ```
   CONFIG_ARCH_DATA_NPAGES=0
   CONFIG_ARCH_DATA_VBASE=0xC0000000
   CONFIG_ARCH_HEAP_NPAGES=0
   CONFIG_ARCH_HEAP_VBASE=0x00000000
   CONFIG_ARCH_TEXT_NPAGES=0
   CONFIG_ARCH_TEXT_VBASE=0x00000000
   ```
   
   Is this potentially the problem or am I completely off track here? Any direction here would be greatly appreciated!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "acassis (via GitHub)" <gi...@apache.org>.
acassis commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2083323903

   @jkivilin @pussuw do you know about this issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "MainframeReboot (via GitHub)" <gi...@apache.org>.
MainframeReboot commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2099435798

   I have spent another day debugging and although some small progress has been made in advancing when the kernel build crashes, I am still stuck on getting the build to run.
   
   So far I have done the following:
   
   - Confirmed pmpcfg0 is set to 0x1F (NAPOT/R/W/X)
   - Confirmed pmpaddr0 is set to max (0xFFFFFFFF)
   - Confirmed that the satp register is being configured correctly to allow virtual addressing.
   - Attempted this build with the skip-opensbi flag in the hss-payload-generator set to both true and false while toggling the NuttSBI feature on and off.
   - Increased stack sizes.
   - Followed the code to the initial spot of failure, which is not related to any interrupts so apologies for the confusion in my last post.
   
   It appears that in the default knsh config, CONFIG_RAM_VSTART is set to 0x0 and CONFIG_RAM_START is set to 0x80200000 (this matches the linker script). When the code enters the following function with the passed in paddr value set to 0x8020E000:
   ```
   static inline uintptr_t riscv_pgvaddr(uintptr_t paddr)
   {
     if (paddr >= CONFIG_ARCH_PGPOOL_PBASE && paddr < CONFIG_ARCH_PGPOOL_PEND)
       {
         return paddr - CONFIG_ARCH_PGPOOL_PBASE + CONFIG_ARCH_PGPOOL_VBASE;
       }
     else if (paddr >= CONFIG_RAM_START && paddr < CONFIG_RAM_END)
       {
         return paddr - CONFIG_RAM_START + CONFIG_RAM_VSTART;
       }
   
     return 0;
   }
   ```
   The returned value is 0xE000. This value is then subsequently passed to _mmu_ln_getentry_ as the value for _lnvaddr_ and is set as memory address for _lntable_.  The DEBUGASSERT call succeeds but the following index of _lntable_ immediately causes a crash:
   
   ```
   uintptr_t mmu_ln_getentry(uint32_t ptlevel, uintptr_t lnvaddr,
                             uintptr_t vaddr)
   {
     uintptr_t *lntable = (uintptr_t *)lnvaddr;
     uint32_t  index;
   
     DEBUGASSERT(ptlevel > 0 && ptlevel <= RV_MMU_PT_LEVELS);
   
     index = (vaddr >> RV_MMU_VADDR_SHIFT(ptlevel)) & RV_MMU_VPN_MASK
   
     return lntable[index]; // Crashes here.
   }
   ```
   Out of curiousity I changed _CONFIG_RAM_VSTART_ to 0x80200000 so that _lnvaddr_ is no longer 0xE000 as this does not seem right but this just causes issues further down the line (_riscv_stack_color_ function to be exact) so I am assuming this is not correct.
   
   Any ideas on what I could be doing wrong here given all of the things I have tried?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "sastel (via GitHub)" <gi...@apache.org>.
sastel commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2088424776

   @pussuw to summarize what you are saying, we have to set up the HSS OpenSBI memory areas and access rights in a way that match/support the NuttX knsh config. Did I get that right?
   
   @MainframeReboot I'm pretty sure the PMP configuration is done through Microchip's MSS Configurator tool, which generates an .xml file. The HSS makefile runs this .xml file through a Python script to generate a bunch of header/source files for the HSS build. This config is what the HSS applies to the MSS during boot.
   
   You can see here that the sample .xml for the Icicle kit has config for PMP [ICICLE_MSS_mss_cfg.xml#L797-L800:](https://github.com/polarfire-soc/hart-software-services/blob/29e88d4b8acfc5bcc5d0a8e930393aee0d1f6915/boards/mpfs-icicle-kit-es/soc_fpga_design/xml/ICICLE_MSS_mss_cfg.xml#L797-L800)
   ```          
               <register address="0x00000016" description="pmp setup register, 64 bits" name="MPU_CFG_PMP2">
               <field Type="RW" name="PMP" offset="0" width="38">0xFFFFFFFFF</field>
               <field Type="RW" name="RESERVED" offset="38" width="18">0x0</field>
               <field Type="RW" name="MODE" offset="56" width="8">0x1F</field>
   ```
   
   Here are some resources on OpenSBI.
   https://www.thegoodpenguin.co.uk/blog/an-overview-of-opensbi/
   https://riscv.org/wp-content/uploads/2019/12/Summit_bootflow.pdf


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "pussuw (via GitHub)" <gi...@apache.org>.
pussuw commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2100101473

   Yes that makes sense, forgot I enabled kmap for icicle. Adding CONFIG_RAM_VSTART to the defconfig should fix this issue. What's causing the stack coloring issue depends on which stack is in question.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "pussuw (via GitHub)" <gi...@apache.org>.
pussuw commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2096032118

   Easiest way is to just open up everything and not worry about PMP. PMP is used for inter-hart isolation, if you don't use AMP I would not worry about it.
   
   Our own stripped down / limited SBI does it during boot:
   
   ```
     /* Open everything for PMP */
   
     WRITE_CSR(pmpaddr0, -1);
     WRITE_CSR(pmpcfg0, (PMPCFG_A_NAPOT | PMPCFG_R | PMPCFG_W | PMPCFG_X));
   
   ```
   Don't know how this is done for HSS bootloader but there must be some way.
   
   Adding one rule in the first pmpcfg register is enough, as the CPU goes through the PMP list in-order, and if a rule that matches the memory area is found, access is either immediately given or revoked, so it does not matter what the rest of the pmpcfg registers even contain.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "pussuw (via GitHub)" <gi...@apache.org>.
pussuw commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2084485774

   Look like a pmp/memory protection issue. The knsh config does not work out of the box, you need opensbi ton run this config


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "pussuw (via GitHub)" <gi...@apache.org>.
pussuw commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2086962430

   Opensbi domains control pmp configuration. Set the memory areas and access rights accordingly. Or open up everything. Disabling pmp is not enough as supervisor/user modes need to be *given* explicit access to memory.
   
   I don't use hss so can't help you with that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "MainframeReboot (via GitHub)" <gi...@apache.org>.
MainframeReboot commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2103149725

   I did some more debugging and have managed to boot NuttX in kernel mode, _sort of_.
   
   After updating CONFIG_RAM_VSTART to 0x80200000, the initial crash went away but led to another crash that I tracked down to the function `up_initial_state` within `riscv_initialstate.c`. 
   
   Within this function, the `stack_alloc_ptr` field within the passed in stack control block is set to a value that is obtained from the array `g_cpux_idlestack`. This array is indexed using the value provided by a call to `riscv_mhartid()`.
   
   The value returned by `riscv_mhartid()` for my current build is 1. This makes sense as I am using HSS and building the NuttX payload to run on the U54_1 core, while HSS runs on the E54 (hart id 0).
   
   What isn't good is that `g_cpux_idlestack` is defined within `riscv_common_memorymap.h` as `EXTERN const uint8_t *g_cpux_idlestack[CONFIG_SMP_NCPUS];`. I am not using SMP so in the config this value is not set and as such defaults to 1. Then, inside the function `riscv_set_basestack`, `g_cpux_idlestack` is populated using `CONFIG_SMP_NCPUS`:
   
   ```
   static inline void riscv_set_basestack(uintptr_t base, uintptr_t size)
   {
     unsigned int i;
   
     for (i = 0; i < CONFIG_SMP_NCPUS; i++)
       {
         g_cpux_idlestack[i] = (const uint8_t *)(base + size * i);
       }
   }
   ```
   So we can see a value for the idle stack is being created and set but then it's being accessed incorrectly within the `up_initial_state` function as it uses the hart id obtained from `riscv_mhartid()`, which in my build is 1 and there is nothing set there:
   
   ```
    if (tcb->pid == IDLE_PROCESS_ID)
       {
         tcb->stack_alloc_ptr = (void *)g_cpux_idlestack[riscv_mhartid()]; // hartid is 1 but only index 0 exists
         tcb->stack_base_ptr  = tcb->stack_alloc_ptr;
         tcb->adj_stack_size  = SMP_STACK_SIZE;
   ```
   
   Is the idea that NuttX kernel must be used alongside SMP? Even so, in then scenarios where the HSS is used to boot say a non-NuttX application on cores 1&2 and NuttX on cores 3&4 will mean `CONFIG_SMP_NCPUS` would be set to 2, meaning `g_cpux_idlestack` would have entries in indices 0&1 but the `riscv_mhartid()` calls would return values of 3 and 4 respectively, so this fault would occur. Not to mention that anyone using the Icicle kit in its default flow of having HSS boot payloads would always run into this fault as the HSS runs on hart 0, so even if all 4 U54 cores were used and the array was filled with 4 values, the call to obtain the hartid will inevitably return a value of 4 and cause a fault.
   
   For the purposes of my debug, I have indexed the array directly at 0 just to get NuttX to boot in kernel mode. It now does so, albeit I get an error when it attempts to load the ELF file `/bin/init` but I am assuming the default mpfs code doesn't mount a file system so I can work on getting that going now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "MainframeReboot (via GitHub)" <gi...@apache.org>.
MainframeReboot commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2097011737

   Thanks for the continued suggestions. Today I recompiled the HSS and ensured PMP was disabled. I can confirm that pmpcfg0 was set to 0x1F (NAPOT, R/W/X) and pmpaddr0 set to its max value and I continued to monitor these CSRs during debug. Unfortunately, this did not appear to fix my issue. I have already increased the stack sizes in case this was the issue but again that did not help either. Here is a snippet of the error from the output (as you can see OpenSBI is running as well):
   
   
   ![image](https://github.com/apache/nuttx/assets/168458700/b7a709a9-261f-4daf-8c90-a0a3f3bfc74c)
   
   From what I can tell from the error in the image above and from reading the NuttX source code, it appears the system is raising an interrupt that is captured by NuttX and immediately routed to the irq_unexpected_isr() function as all interrupt vectors are pointed to the irq_unexpected_isr by NuttX. I will continue to dig into this but in the meantime, can you confirm if having interrupts enabled by the HSS will cause issues with NuttX? I am assuming I will have to ensure NuttX does not set any interrupt vectors enabled outside of NuttX to point to the irq_unexpected_isr handler or else this will continue to occur.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "MainframeReboot (via GitHub)" <gi...@apache.org>.
MainframeReboot commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2086911477

   > Looks like a pmp/memory protection issue. The knsh config does not work out of the box, you need opensbi ton run this config
   
   I took a look and OpenSBI is running (at least the logo displays so I assume it is running as part of the HSS). Is there any more information I can get regarding what needs to be done to get this to work?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "pussuw (via GitHub)" <gi...@apache.org>.
pussuw commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2099463410

   I have not tried the knsh config in many months, maybe there is some regression in upstream.
   
   The mapping should be vaddr=paddr so CONFIG_RAM_START==CONFIG_RAM_VSTART==0x80200000. I think this has been the default behavior before, maybe something has changed that breaks this?
   
   If setting CONFIG_RAM_VSTART explicitly in the defconfig fixes this, then it is a correct fix. I'm just wondering, where does this query for address 0x8020E000 come from ? As far as I can remember riscv_pgvaddr is used only in the MMU / address environment logic and those addresses should be in page pool memory.
   
   Problems in riscv_stack_color could indicate a problem with user heap, as the process's stack is allocated from there.
   
   I run a downstream mpfs target with kernel mode all the time so that should work just fine. Thus, it would make sense the icicle:knsh defconfig has some issues that cause your crashes. Or a regression in upstream we have not detected yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] MPFS Icicle Kit knsh config failure [nuttx]

Posted by "MainframeReboot (via GitHub)" <gi...@apache.org>.
MainframeReboot commented on issue #12259:
URL: https://github.com/apache/nuttx/issues/12259#issuecomment-2099651570

   I can provide some more information regarding the queries in question.
   
   I started debugging from the function _mpfs_start_ which contains a function call to _mpfs_mm_init_. This function itself performs two things:
   
   1. Sets up the Kernel mapping through a function call to _mpfs_kernel_mappings_. In this step, _lnvaddr_ looks reasonable (0x8020D000).
   2. Enables the MMU through a call to the _mmu_enable_. 
   
   After step 2, _nx_start_ is called, which runs until it hits the function _kmm_map_initialize_. Within this function is a call to _up_addrenv_kmap_init_ and this is where the problems begin. In here, the value of _paddr_ (0x8020E000) is obtained from _g_kernel_pgt_pbase_ and the value of _vaddr_ is obtained from _CONFIG_ARCH_KMAP_VBASE_ (which is set to 0xBF000000 in the default config). _paddr_ is then passed_ on to the function _riscv_pgvaddr_. Since this value is not >= _CONFIG_ARCH_PGPOOL_PBASE_ (0x80400000) but >= to _CONFIG_RAM_START_ (0x8020000), the value of 0xE000 is returned (until I explicitly set _CONFIG_RAM_VSTART_ to 0x80200000).
   
   Hopefully this provides more clarity into what is occurring in my system at the moment. I will attempt a few more things tomorrow, including going over the older versions of the knsh config to see if I can spot anything, as well as looking into my user heap to ensure it's being configured correctly.
   
   Thanks again for your support through this debug.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org