You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mynewt.apache.org by Aditya Xavier <ad...@me.com.INVALID> on 2018/08/31 10:47:26 UTC
Mynewt crash when releasing semaphore
Hi !
Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
Action Received over MESH Length :- 15
012273 Unhandled interrupt (3), exception sp 0x2000abd0
012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
Am sending a group mesh message for testing. The sequence of events are as follows.
Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
I tried increasing the STACK size of the LOG task, however that didn’t help.
Could someone let me know how to understand where / why the crash is happening ?
Thanks,
Aditya Xavier.
Re: Mynewt crash when releasing semaphore
Posted by marko kiiskila <ma...@runtime.io>.
it’s easiest to inspect these addresses with gdb :)
arm-none-eabi-gdb bin/targets/……. .elf
and then start feeding those addresses to see which ones look likely to be part
of callchain.
x/i 0x0003b4d8
x/i 0x000246a7
x/i 0x0003b4d8
etc
> On Aug 31, 2018, at 3:30 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>
> Am really bad at GDB. Also its like a rabbit hole :)
>
> I ported over my application with the git version of Mynewt-core, and enabled OS_CRASH_STACKTRACE.
>
> With it enabled, the following is the dump.
>
> #mesh-onoff STATUS: Sent !
> Action Received over MESH Length :- 14
> 000486 Unhandled interrupt (3), exception sp 0x2000aba0
> 000486 r0:0xcf0f98cb r1:0x5c5a76b3 r2:0x681af5c8 r3:0xb1334673
> 000486 r4:0x2000ac68 r5:0x00000007 r6:0x00000000 r7:0x200008a9
> 000486 r8:0x2000acf0 r9:0x00012101 r10:0xd7229882 r11:0xd929b3bb
> 000486 r12:0x7e3cdeb8 lr:0x2266a80b pc:0x59d8de5b psr:0xe8eb9828
> 000486 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00040000
> 000486 BFAR:0xe000ed38 MMFAR:0xe000ed34
> 000486 task:DECODE_TASK
> 000486 0x2000abec: 0x0003b4d8
> 000486 0x2000abf4: 0x000246a7
> 000486 0x2000ac04: 0x0003b4d8
> 000486 0x2000ac0c: 0x0002488d
> 000486 0x2000ac4c: 0x00012101
> 000486 0x2000ad0c: 0x0000c1e7
> 000486 0x2000ad1c: 0x0000c1e7
> 000486 0x2000ad2c: 0x0000c211
> 000486 0x2000ad30: 0x0003ad44
> 000486 0x2000ad3c: 0x00013023
> 000486 0x2000ad58: 0x000238e1
> 000486 0x2000ad60: 0x00037f81
> 000486 0x2000ad6c: 0x00023a79
> 000486 0x2000ad70: 0x00039b80
> 000486 0x2000ad74: 0x00039b7f
> 000486 0x2000ad84: 0x00023587
> 000486 0x2000ada8: 0x000087cd
> 000486 0x2000adc4: 0x0000d51d
> 000486 0x2000adc8: 0x0000d51c
> 000486 0x2000add8: 0x000398cd
> 000486 0x2000ade4: 0x000087e9
> 000486 0x2000ae08: 0x00010001
> 000486 0x2000ae0c: 0x0001c239
> 000486 0x2000ae10: 0x0003b35c
> 000486 0x2000ae1c: 0x00020001
> 000486 0x2000ae20: 0x0001c38d
> 000486 0x2000ae30: 0x00030001
> 000486 0x2000ae34: 0x0001c509
> 000486 0x2000ae48: 0x0001c38d
> 000486 0x2000ae5c: 0x0001c509
> 000486 0x2000ae70: 0x0001c239
> 000486 0x2000ae74: 0x0003b37c
> 000486 0x2000ae84: 0x0001c38d
> 000486 0x2000ae98: 0x0001c509
> 000486 0x2000aeac: 0x0001c54d
> 000486 0x2000aec0: 0x0001c239
> 000486 0x2000aec4: 0x0003ba28
> 000486 0x2000aed4: 0x0001c38d
> 000486 0x2000aee8: 0x0001c509
> 000486 0x2000aefc: 0x0001c38d
> 000486 0x2000af10: 0x0001c509
> 000486 0x2000af24: 0x0001c54d
> 000486 0x2000af38: 0x0001c38d
> 000486 0x2000af4c: 0x0001c509
> 000486 0x2000af60: 0x0001c38d
> 000486 0x2000af74: 0x0001c509
> 000486 0x2000af88: 0x0001c54d
> 000486 0x2000af9c: 0x0001c38d
> 000486 0x2000afb0: 0x0001c509
>
>
>> On 31-Aug-2018, at 5:21 PM, marko kiiskila <ma...@runtime.io> wrote:
>>
>> Some suggestions (inline).
>>
>>> On Aug 31, 2018, at 2:32 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>
>>> Gosh, this doesn’t make much sense to me :(
>>>
>>> (gdb) monitor go
>>> (gdb) monitor reset
>>> Resetting target
>>> (gdb) c
>>> Continuing.
>>>
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50 asm("bkpt");
>>> (gdb) bt
>>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>> #3 <signal handler called>
>>> #4 0x00000000 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>> (gdb) frame 1
>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> 170 hal_system_reset();
>>> (gdb) p/x *tf
>>> $1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0, r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr = 0xfffffffd}
>>> (gdb) p/x *tf->ef
>>> $2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3, r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
>>> (gdb) x/32x 0xd7229882
>>> 0xd7229882: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd7229892: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298a2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298b2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298c2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298d2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298e2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298f2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> (gdb) x/32x 0x2000abd0
>>> 0x2000abd0: 0xd7229882 0xd929b3bb 0xcf0f98cb 0x5c5a76b3
>>> 0x2000abe0: 0x681af5c8 0xb1334673 0x7e3cdeb8 0x2266a80b
>>> 0x2000abf0: 0x59d8de5b 0xe8eb9828 0x96d74690 0xb4b1ee9b
>>> 0x2000ac00: 0x95f0cad6 0x7d1b52fe 0xebcc146e 0x5f7dfaf5
>>> 0x2000ac10: 0x62dd2c19 0x1fc67ee7 0xf40a6a89 0xab77907c
>>
>> ^^^^^ looks bad, especially the top area. Should have dump of registers
>> stored at the time the crash.
>>
>>
>>> 0x2000ac20: 0x00000010 0x00039c74 0x2000ad28 0x0002329f
>>> 0x2000ac30: 0xd87c5730 0xa203a288 0x00000010 0x00039c74
>>> 0x2000ac40: 0x2000ad28 0x00023485 0x00000000 0x00000000
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>>> (gdb) p &__etext
>>> $3 = (<data variable, no debug info> *) 0x3a9c8
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>>
>> This was probably added at the same time as OS_STACK_BACKTRACE.
>> You’re looking for values between start of your image slot and 0x3a9c8.
>>
>>> (gdb) x/i 0xd7229882
>>> 0xd7229882: movs r0, r0
>>> (gdb) list *0xd7229882
>>> (gdb) x/i 0x681af5c8
>>> 0x681af5c8: movs r0, r0
>>> (gdb) x/i 0x59d8de5b
>>> 0x59d8de5b: movs r0, r0
>>> (gdb) x/i 0x62dd2c19
>>> 0x62dd2c19: movs r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls r0, r2, #6
>>> (gdb) x/i 0x1fc67ee7
>>> 0x1fc67ee7: movs r0, r0
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs r0, r0
>>> (gdb) x/i 0xe8eb9828
>>> 0xe8eb9828: movs r0, r0
>>> (gdb) x/i 0xcf0f98cb
>>> 0xcf0f98cb: movs r0, r0
>>> (gdb) x/i 0x96d74690
>>> 0x96d74690: movs r0, r0
>>> (gdb) x/i 0xf40a6a89
>>> 0xf40a6a89: movs r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls r0, r2, #6
>>> (gdb) x/i 0x00000010
>>> 0x10: movs r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>: add sp, #20
>>> (gdb) x/i 0x00039c74
>>> 0x39c74 <sbox>: ldrb r3, [r4, #17]
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>: add sp, #20
>>> (gdb) list *0x0002329f
>>> 0x2329f is in shift_rows (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
>>> 151 t[0] = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
>>> 152 t[4] = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
>>> 153 t[8] = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
>>> 154 t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
>>> 155 (void) _copy(s, sizeof(t), t, sizeof(t));
>>> 156 }
>>> 157
>>> 158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const TCAesKeySched_t s)
>>> 159 {
>>> 160 uint8_t state[Nk*Nb];
>>
>> That could be writing that random looking data in the stack. encrypted data should
>> look like gibberish.
>> Follow the stack a bit further starting continuing from 0x2000ac50. See if you
>> find who called it. I’m hazarding a guess that one of those args passed to aes_encrypt()
>> is pointing to stack, and there’s not enough memory allocated to hold that data.
>>
>>
>>>> On 31-Aug-2018, at 4:46 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>>
>>>> Sure. Something like this:
>>>>
>>>> 000933 compat> crash div0
>>>> crash div0
>>>> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
>>>> 003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
>>>> 003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
>>>> 003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
>>>> 003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
>>>> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
>>>> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>
>>>> Then from gdb:
>>>>
>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>> hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> 50 asm("bkpt");
>>>> (gdb) bt
>>>> #0 hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> #2 0x0000a5b6 in os_default_irq_asm ()
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>> #3 <signal handler called>
>>>> #4 0x00000000 in ?? ()
>>>> #5 0x0000812c in Reset_Handler ()
>>>> at repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>> (gdb) frame 1
>>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> 171 hal_system_reset();
>>>> (gdb) p/x *tf
>>>> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
>>>> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
>>>> (gdb) p/x *tf->ef
>>>> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
>>>> pc = 0x14978, psr = 0x61000000}
>>>> (gdb) x/32x 0x20001dd8
>>>> 0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
>>>> 0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
>>>> 0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
>>>> 0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
>>>> 0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
>>>> 0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
>>>> 0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
>>>> 0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
>>>> (gdb) p &__text
>>>> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
>>>> (gdb) p &__etext
>>>> $4 = (<data variable, no debug info> *) 0x175f0
>>>> (gdb) x/i 0x00017161
>>>> 0x17161: movs r0, r0
>>>> (gdb) x/i 0x00014949
>>>> 0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
>>>> (gdb) x/i 0x00014978
>>>> 0x14978 <crash_device+60>: sdiv r3, r3, r2
>>>> (gdb) x/i 0x00014a21
>>>> 0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
>>>> (gdb) x/i 0x00014a15
>>>> 0x14a15 <crash_cli_cmd>: push {r3, lr}
>>>> (gdb) list *0x14949
>>>> 0x14949 is in crash_device (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
>>>> warning: Source file is more recent than executable.
>>>> 37 int
>>>> 38 crash_device(char *how)
>>>> 39 {
>>>> 40 volatile int val1, val2, val3;
>>>> 41
>>>> 42 if (!strcmp(how, "div0")) {
>>>> 43
>>>> 44 val1 = 42;
>>>> 45 val2 = 0;
>>>> 46
>>>> (gdb) list *0x00014a21
>>>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42 return 0;
>>>> 43 }
>>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45 return 0;
>>>> (gdb) list *0x14a21
>>>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42 return 0;
>>>> 43 }
>>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45 return 0;
>>>>
>>>> good luck.
>>>>
>>>>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>>
>>>>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
>>>>>
>>>>> If I change the release, I believe there would be many API changes to be done on MESH side.
>>>>>
>>>>> Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
>>>>>
>>>>> My gdb skill are pretty weak.
>>>>>
>>>>> I tried gdb where, with the following outcome.
>>>>>
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>>
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50 asm("bkpt");
>>>>> (gdb)
>>>>> Continuing.
>>>>>
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50 asm("bkpt");
>>>>> (gdb) where
>>>>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>>>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>>> #3 <signal handler called>
>>>>> #4 0x00000000 in ?? ()
>>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>>>
>>>>>
>>>>>
>>>>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>>>>
>>>>>>> Hi !
>>>>>>>
>>>>>>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>>>>>>
>>>>>>> Action Received over MESH Length :- 15
>>>>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>>>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>>>>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>>>>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>>>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>>>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>>>>
>>>>>>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>>>>>>
>>>>>>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>>>>>>
>>>>>>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>>>>>>
>>>>>>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>>>>>>
>>>>>>> Could someone let me know how to understand where / why the crash is happening ?
>>>>>>
>>>>>> Looking at your registers they seem to be garbage, so I’m guessing stack
>>>>>> corruption of some sort; does not have to be overflow.
>>>>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
>>>>>> look like pointers to text.
>
Re: Mynewt crash when releasing semaphore
Posted by Aditya Xavier <ad...@me.com.INVALID>.
Am really bad at GDB. Also its like a rabbit hole :)
I ported over my application with the git version of Mynewt-core, and enabled OS_CRASH_STACKTRACE.
With it enabled, the following is the dump.
#mesh-onoff STATUS: Sent !
Action Received over MESH Length :- 14
000486 Unhandled interrupt (3), exception sp 0x2000aba0
000486 r0:0xcf0f98cb r1:0x5c5a76b3 r2:0x681af5c8 r3:0xb1334673
000486 r4:0x2000ac68 r5:0x00000007 r6:0x00000000 r7:0x200008a9
000486 r8:0x2000acf0 r9:0x00012101 r10:0xd7229882 r11:0xd929b3bb
000486 r12:0x7e3cdeb8 lr:0x2266a80b pc:0x59d8de5b psr:0xe8eb9828
000486 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00040000
000486 BFAR:0xe000ed38 MMFAR:0xe000ed34
000486 task:DECODE_TASK
000486 0x2000abec: 0x0003b4d8
000486 0x2000abf4: 0x000246a7
000486 0x2000ac04: 0x0003b4d8
000486 0x2000ac0c: 0x0002488d
000486 0x2000ac4c: 0x00012101
000486 0x2000ad0c: 0x0000c1e7
000486 0x2000ad1c: 0x0000c1e7
000486 0x2000ad2c: 0x0000c211
000486 0x2000ad30: 0x0003ad44
000486 0x2000ad3c: 0x00013023
000486 0x2000ad58: 0x000238e1
000486 0x2000ad60: 0x00037f81
000486 0x2000ad6c: 0x00023a79
000486 0x2000ad70: 0x00039b80
000486 0x2000ad74: 0x00039b7f
000486 0x2000ad84: 0x00023587
000486 0x2000ada8: 0x000087cd
000486 0x2000adc4: 0x0000d51d
000486 0x2000adc8: 0x0000d51c
000486 0x2000add8: 0x000398cd
000486 0x2000ade4: 0x000087e9
000486 0x2000ae08: 0x00010001
000486 0x2000ae0c: 0x0001c239
000486 0x2000ae10: 0x0003b35c
000486 0x2000ae1c: 0x00020001
000486 0x2000ae20: 0x0001c38d
000486 0x2000ae30: 0x00030001
000486 0x2000ae34: 0x0001c509
000486 0x2000ae48: 0x0001c38d
000486 0x2000ae5c: 0x0001c509
000486 0x2000ae70: 0x0001c239
000486 0x2000ae74: 0x0003b37c
000486 0x2000ae84: 0x0001c38d
000486 0x2000ae98: 0x0001c509
000486 0x2000aeac: 0x0001c54d
000486 0x2000aec0: 0x0001c239
000486 0x2000aec4: 0x0003ba28
000486 0x2000aed4: 0x0001c38d
000486 0x2000aee8: 0x0001c509
000486 0x2000aefc: 0x0001c38d
000486 0x2000af10: 0x0001c509
000486 0x2000af24: 0x0001c54d
000486 0x2000af38: 0x0001c38d
000486 0x2000af4c: 0x0001c509
000486 0x2000af60: 0x0001c38d
000486 0x2000af74: 0x0001c509
000486 0x2000af88: 0x0001c54d
000486 0x2000af9c: 0x0001c38d
000486 0x2000afb0: 0x0001c509
> On 31-Aug-2018, at 5:21 PM, marko kiiskila <ma...@runtime.io> wrote:
>
> Some suggestions (inline).
>
>> On Aug 31, 2018, at 2:32 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>
>> Gosh, this doesn’t make much sense to me :(
>>
>> (gdb) monitor go
>> (gdb) monitor reset
>> Resetting target
>> (gdb) c
>> Continuing.
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> 50 asm("bkpt");
>> (gdb) bt
>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>> #3 <signal handler called>
>> #4 0x00000000 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) frame 1
>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>> 170 hal_system_reset();
>> (gdb) p/x *tf
>> $1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0, r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr = 0xfffffffd}
>> (gdb) p/x *tf->ef
>> $2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3, r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
>> (gdb) x/32x 0xd7229882
>> 0xd7229882: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd7229892: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298a2: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298b2: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298c2: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298d2: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298e2: 0x00000000 0x00000000 0x00000000 0x00000000
>> 0xd72298f2: 0x00000000 0x00000000 0x00000000 0x00000000
>> (gdb) x/32x 0x2000abd0
>> 0x2000abd0: 0xd7229882 0xd929b3bb 0xcf0f98cb 0x5c5a76b3
>> 0x2000abe0: 0x681af5c8 0xb1334673 0x7e3cdeb8 0x2266a80b
>> 0x2000abf0: 0x59d8de5b 0xe8eb9828 0x96d74690 0xb4b1ee9b
>> 0x2000ac00: 0x95f0cad6 0x7d1b52fe 0xebcc146e 0x5f7dfaf5
>> 0x2000ac10: 0x62dd2c19 0x1fc67ee7 0xf40a6a89 0xab77907c
>
> ^^^^^ looks bad, especially the top area. Should have dump of registers
> stored at the time the crash.
>
>
>> 0x2000ac20: 0x00000010 0x00039c74 0x2000ad28 0x0002329f
>> 0x2000ac30: 0xd87c5730 0xa203a288 0x00000010 0x00039c74
>> 0x2000ac40: 0x2000ad28 0x00023485 0x00000000 0x00000000
>> (gdb) p &__text
>> No symbol "__text" in current context.
>> (gdb) p &__etext
>> $3 = (<data variable, no debug info> *) 0x3a9c8
>> (gdb) p &__text
>> No symbol "__text" in current context.
>
> This was probably added at the same time as OS_STACK_BACKTRACE.
> You’re looking for values between start of your image slot and 0x3a9c8.
>
>> (gdb) x/i 0xd7229882
>> 0xd7229882: movs r0, r0
>> (gdb) list *0xd7229882
>> (gdb) x/i 0x681af5c8
>> 0x681af5c8: movs r0, r0
>> (gdb) x/i 0x59d8de5b
>> 0x59d8de5b: movs r0, r0
>> (gdb) x/i 0x62dd2c19
>> 0x62dd2c19: movs r0, r0
>> (gdb) x/i 0x2000ad28
>> 0x2000ad28: lsls r0, r2, #6
>> (gdb) x/i 0x1fc67ee7
>> 0x1fc67ee7: movs r0, r0
>> (gdb) x/i 0xa203a288
>> 0xa203a288: movs r0, r0
>> (gdb) x/i 0xe8eb9828
>> 0xe8eb9828: movs r0, r0
>> (gdb) x/i 0xcf0f98cb
>> 0xcf0f98cb: movs r0, r0
>> (gdb) x/i 0x96d74690
>> 0x96d74690: movs r0, r0
>> (gdb) x/i 0xf40a6a89
>> 0xf40a6a89: movs r0, r0
>> (gdb) x/i 0x2000ad28
>> 0x2000ad28: lsls r0, r2, #6
>> (gdb) x/i 0x00000010
>> 0x10: movs r0, r0
>> (gdb) x/i 0x0002329f
>> 0x2329f <shift_rows+108>: add sp, #20
>> (gdb) x/i 0x00039c74
>> 0x39c74 <sbox>: ldrb r3, [r4, #17]
>> (gdb) x/i 0xa203a288
>> 0xa203a288: movs r0, r0
>> (gdb) x/i 0x0002329f
>> 0x2329f <shift_rows+108>: add sp, #20
>> (gdb) list *0x0002329f
>> 0x2329f is in shift_rows (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
>> 151 t[0] = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
>> 152 t[4] = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
>> 153 t[8] = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
>> 154 t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
>> 155 (void) _copy(s, sizeof(t), t, sizeof(t));
>> 156 }
>> 157
>> 158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const TCAesKeySched_t s)
>> 159 {
>> 160 uint8_t state[Nk*Nb];
>
> That could be writing that random looking data in the stack. encrypted data should
> look like gibberish.
> Follow the stack a bit further starting continuing from 0x2000ac50. See if you
> find who called it. I’m hazarding a guess that one of those args passed to aes_encrypt()
> is pointing to stack, and there’s not enough memory allocated to hold that data.
>
>
>>> On 31-Aug-2018, at 4:46 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>
>>> Sure. Something like this:
>>>
>>> 000933 compat> crash div0
>>> crash div0
>>> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
>>> 003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
>>> 003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
>>> 003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
>>> 003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
>>> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
>>> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>
>>> Then from gdb:
>>>
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset ()
>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50 asm("bkpt");
>>> (gdb) bt
>>> #0 hal_system_reset ()
>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>> #2 0x0000a5b6 in os_default_irq_asm ()
>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>> #3 <signal handler called>
>>> #4 0x00000000 in ?? ()
>>> #5 0x0000812c in Reset_Handler ()
>>> at repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>> (gdb) frame 1
>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>> 171 hal_system_reset();
>>> (gdb) p/x *tf
>>> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
>>> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
>>> (gdb) p/x *tf->ef
>>> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
>>> pc = 0x14978, psr = 0x61000000}
>>> (gdb) x/32x 0x20001dd8
>>> 0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
>>> 0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
>>> 0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
>>> 0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
>>> 0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
>>> 0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
>>> 0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
>>> 0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
>>> (gdb) p &__text
>>> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
>>> (gdb) p &__etext
>>> $4 = (<data variable, no debug info> *) 0x175f0
>>> (gdb) x/i 0x00017161
>>> 0x17161: movs r0, r0
>>> (gdb) x/i 0x00014949
>>> 0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
>>> (gdb) x/i 0x00014978
>>> 0x14978 <crash_device+60>: sdiv r3, r3, r2
>>> (gdb) x/i 0x00014a21
>>> 0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
>>> (gdb) x/i 0x00014a15
>>> 0x14a15 <crash_cli_cmd>: push {r3, lr}
>>> (gdb) list *0x14949
>>> 0x14949 is in crash_device (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
>>> warning: Source file is more recent than executable.
>>> 37 int
>>> 38 crash_device(char *how)
>>> 39 {
>>> 40 volatile int val1, val2, val3;
>>> 41
>>> 42 if (!strcmp(how, "div0")) {
>>> 43
>>> 44 val1 = 42;
>>> 45 val2 = 0;
>>> 46
>>> (gdb) list *0x00014a21
>>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>> 36 };
>>> 37
>>> 38 static int
>>> 39 crash_cli_cmd(int argc, char **argv)
>>> 40 {
>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>> 42 return 0;
>>> 43 }
>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>> 45 return 0;
>>> (gdb) list *0x14a21
>>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>> 36 };
>>> 37
>>> 38 static int
>>> 39 crash_cli_cmd(int argc, char **argv)
>>> 40 {
>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>> 42 return 0;
>>> 43 }
>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>> 45 return 0;
>>>
>>> good luck.
>>>
>>>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>
>>>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
>>>>
>>>> If I change the release, I believe there would be many API changes to be done on MESH side.
>>>>
>>>> Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
>>>>
>>>> My gdb skill are pretty weak.
>>>>
>>>> I tried gdb where, with the following outcome.
>>>>
>>>> (gdb) c
>>>> Continuing.
>>>>
>>>>
>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> 50 asm("bkpt");
>>>> (gdb)
>>>> Continuing.
>>>>
>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> 50 asm("bkpt");
>>>> (gdb) where
>>>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>> #3 <signal handler called>
>>>> #4 0x00000000 in ?? ()
>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>>
>>>>
>>>>
>>>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>>>
>>>>>> Hi !
>>>>>>
>>>>>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>>>>>
>>>>>> Action Received over MESH Length :- 15
>>>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>>>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>>>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>>>
>>>>>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>>>>>
>>>>>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>>>>>
>>>>>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>>>>>
>>>>>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>>>>>
>>>>>> Could someone let me know how to understand where / why the crash is happening ?
>>>>>
>>>>> Looking at your registers they seem to be garbage, so I’m guessing stack
>>>>> corruption of some sort; does not have to be overflow.
>>>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
>>>>> look like pointers to text.
Re: Mynewt crash when releasing semaphore
Posted by marko kiiskila <ma...@runtime.io>.
Some suggestions (inline).
> On Aug 31, 2018, at 2:32 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>
> Gosh, this doesn’t make much sense to me :(
>
> (gdb) monitor go
> (gdb) monitor reset
> Resetting target
> (gdb) c
> Continuing.
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> 50 asm("bkpt");
> (gdb) bt
> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
> #3 <signal handler called>
> #4 0x00000000 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb) frame 1
> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
> 170 hal_system_reset();
> (gdb) p/x *tf
> $1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0, r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr = 0xfffffffd}
> (gdb) p/x *tf->ef
> $2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3, r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
> (gdb) x/32x 0xd7229882
> 0xd7229882: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd7229892: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298a2: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298b2: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298c2: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298d2: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298e2: 0x00000000 0x00000000 0x00000000 0x00000000
> 0xd72298f2: 0x00000000 0x00000000 0x00000000 0x00000000
> (gdb) x/32x 0x2000abd0
> 0x2000abd0: 0xd7229882 0xd929b3bb 0xcf0f98cb 0x5c5a76b3
> 0x2000abe0: 0x681af5c8 0xb1334673 0x7e3cdeb8 0x2266a80b
> 0x2000abf0: 0x59d8de5b 0xe8eb9828 0x96d74690 0xb4b1ee9b
> 0x2000ac00: 0x95f0cad6 0x7d1b52fe 0xebcc146e 0x5f7dfaf5
> 0x2000ac10: 0x62dd2c19 0x1fc67ee7 0xf40a6a89 0xab77907c
^^^^^ looks bad, especially the top area. Should have dump of registers
stored at the time the crash.
> 0x2000ac20: 0x00000010 0x00039c74 0x2000ad28 0x0002329f
> 0x2000ac30: 0xd87c5730 0xa203a288 0x00000010 0x00039c74
> 0x2000ac40: 0x2000ad28 0x00023485 0x00000000 0x00000000
> (gdb) p &__text
> No symbol "__text" in current context.
> (gdb) p &__etext
> $3 = (<data variable, no debug info> *) 0x3a9c8
> (gdb) p &__text
> No symbol "__text" in current context.
This was probably added at the same time as OS_STACK_BACKTRACE.
You’re looking for values between start of your image slot and 0x3a9c8.
> (gdb) x/i 0xd7229882
> 0xd7229882: movs r0, r0
> (gdb) list *0xd7229882
> (gdb) x/i 0x681af5c8
> 0x681af5c8: movs r0, r0
> (gdb) x/i 0x59d8de5b
> 0x59d8de5b: movs r0, r0
> (gdb) x/i 0x62dd2c19
> 0x62dd2c19: movs r0, r0
> (gdb) x/i 0x2000ad28
> 0x2000ad28: lsls r0, r2, #6
> (gdb) x/i 0x1fc67ee7
> 0x1fc67ee7: movs r0, r0
> (gdb) x/i 0xa203a288
> 0xa203a288: movs r0, r0
> (gdb) x/i 0xe8eb9828
> 0xe8eb9828: movs r0, r0
> (gdb) x/i 0xcf0f98cb
> 0xcf0f98cb: movs r0, r0
> (gdb) x/i 0x96d74690
> 0x96d74690: movs r0, r0
> (gdb) x/i 0xf40a6a89
> 0xf40a6a89: movs r0, r0
> (gdb) x/i 0x2000ad28
> 0x2000ad28: lsls r0, r2, #6
> (gdb) x/i 0x00000010
> 0x10: movs r0, r0
> (gdb) x/i 0x0002329f
> 0x2329f <shift_rows+108>: add sp, #20
> (gdb) x/i 0x00039c74
> 0x39c74 <sbox>: ldrb r3, [r4, #17]
> (gdb) x/i 0xa203a288
> 0xa203a288: movs r0, r0
> (gdb) x/i 0x0002329f
> 0x2329f <shift_rows+108>: add sp, #20
> (gdb) list *0x0002329f
> 0x2329f is in shift_rows (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
> 151 t[0] = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
> 152 t[4] = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
> 153 t[8] = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
> 154 t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
> 155 (void) _copy(s, sizeof(t), t, sizeof(t));
> 156 }
> 157
> 158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const TCAesKeySched_t s)
> 159 {
> 160 uint8_t state[Nk*Nb];
That could be writing that random looking data in the stack. encrypted data should
look like gibberish.
Follow the stack a bit further starting continuing from 0x2000ac50. See if you
find who called it. I’m hazarding a guess that one of those args passed to aes_encrypt()
is pointing to stack, and there’s not enough memory allocated to hold that data.
>> On 31-Aug-2018, at 4:46 PM, marko kiiskila <ma...@runtime.io> wrote:
>>
>> Sure. Something like this:
>>
>> 000933 compat> crash div0
>> crash div0
>> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
>> 003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
>> 003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
>> 003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
>> 003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
>> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
>> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>
>> Then from gdb:
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> hal_system_reset ()
>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> 50 asm("bkpt");
>> (gdb) bt
>> #0 hal_system_reset ()
>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>> #2 0x0000a5b6 in os_default_irq_asm ()
>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>> #3 <signal handler called>
>> #4 0x00000000 in ?? ()
>> #5 0x0000812c in Reset_Handler ()
>> at repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) frame 1
>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>> 171 hal_system_reset();
>> (gdb) p/x *tf
>> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
>> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
>> (gdb) p/x *tf->ef
>> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
>> pc = 0x14978, psr = 0x61000000}
>> (gdb) x/32x 0x20001dd8
>> 0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
>> 0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
>> 0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
>> 0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
>> 0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
>> 0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
>> 0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
>> 0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
>> (gdb) p &__text
>> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
>> (gdb) p &__etext
>> $4 = (<data variable, no debug info> *) 0x175f0
>> (gdb) x/i 0x00017161
>> 0x17161: movs r0, r0
>> (gdb) x/i 0x00014949
>> 0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
>> (gdb) x/i 0x00014978
>> 0x14978 <crash_device+60>: sdiv r3, r3, r2
>> (gdb) x/i 0x00014a21
>> 0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
>> (gdb) x/i 0x00014a15
>> 0x14a15 <crash_cli_cmd>: push {r3, lr}
>> (gdb) list *0x14949
>> 0x14949 is in crash_device (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
>> warning: Source file is more recent than executable.
>> 37 int
>> 38 crash_device(char *how)
>> 39 {
>> 40 volatile int val1, val2, val3;
>> 41
>> 42 if (!strcmp(how, "div0")) {
>> 43
>> 44 val1 = 42;
>> 45 val2 = 0;
>> 46
>> (gdb) list *0x00014a21
>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>> 36 };
>> 37
>> 38 static int
>> 39 crash_cli_cmd(int argc, char **argv)
>> 40 {
>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>> 42 return 0;
>> 43 }
>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>> 45 return 0;
>> (gdb) list *0x14a21
>> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>> 36 };
>> 37
>> 38 static int
>> 39 crash_cli_cmd(int argc, char **argv)
>> 40 {
>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>> 42 return 0;
>> 43 }
>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>> 45 return 0;
>>
>> good luck.
>>
>>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>
>>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
>>>
>>> If I change the release, I believe there would be many API changes to be done on MESH side.
>>>
>>> Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
>>>
>>> My gdb skill are pretty weak.
>>>
>>> I tried gdb where, with the following outcome.
>>>
>>> (gdb) c
>>> Continuing.
>>>
>>>
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50 asm("bkpt");
>>> (gdb)
>>> Continuing.
>>>
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50 asm("bkpt");
>>> (gdb) where
>>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>> #3 <signal handler called>
>>> #4 0x00000000 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>
>>>
>>>
>>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>>
>>>>
>>>>
>>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>>
>>>>> Hi !
>>>>>
>>>>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>>>>
>>>>> Action Received over MESH Length :- 15
>>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>>
>>>>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>>>>
>>>>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>>>>
>>>>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>>>>
>>>>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>>>>
>>>>> Could someone let me know how to understand where / why the crash is happening ?
>>>>
>>>> Looking at your registers they seem to be garbage, so I’m guessing stack
>>>> corruption of some sort; does not have to be overflow.
>>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
>>>> look like pointers to text.
>>>>
>>>>
>>>
>>
>
Re: Mynewt crash when releasing semaphore
Posted by Aditya Xavier <ad...@me.com.INVALID>.
Gosh, this doesn’t make much sense to me :(
(gdb) monitor go
(gdb) monitor reset
Resetting target
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
50 asm("bkpt");
(gdb) bt
#0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
#1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
#2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
#3 <signal handler called>
#4 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) frame 1
#1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
170 hal_system_reset();
(gdb) p/x *tf
$1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0, r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr = 0xfffffffd}
(gdb) p/x *tf->ef
$2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3, r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
(gdb) x/32x 0xd7229882
0xd7229882: 0x00000000 0x00000000 0x00000000 0x00000000
0xd7229892: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298a2: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298b2: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298c2: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298d2: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298e2: 0x00000000 0x00000000 0x00000000 0x00000000
0xd72298f2: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/32x 0x2000abd0
0x2000abd0: 0xd7229882 0xd929b3bb 0xcf0f98cb 0x5c5a76b3
0x2000abe0: 0x681af5c8 0xb1334673 0x7e3cdeb8 0x2266a80b
0x2000abf0: 0x59d8de5b 0xe8eb9828 0x96d74690 0xb4b1ee9b
0x2000ac00: 0x95f0cad6 0x7d1b52fe 0xebcc146e 0x5f7dfaf5
0x2000ac10: 0x62dd2c19 0x1fc67ee7 0xf40a6a89 0xab77907c
0x2000ac20: 0x00000010 0x00039c74 0x2000ad28 0x0002329f
0x2000ac30: 0xd87c5730 0xa203a288 0x00000010 0x00039c74
0x2000ac40: 0x2000ad28 0x00023485 0x00000000 0x00000000
(gdb) p &__text
No symbol "__text" in current context.
(gdb) p &__etext
$3 = (<data variable, no debug info> *) 0x3a9c8
(gdb) p &__text
No symbol "__text" in current context.
(gdb) x/i 0xd7229882
0xd7229882: movs r0, r0
(gdb) list *0xd7229882
(gdb) x/i 0x681af5c8
0x681af5c8: movs r0, r0
(gdb) x/i 0x59d8de5b
0x59d8de5b: movs r0, r0
(gdb) x/i 0x62dd2c19
0x62dd2c19: movs r0, r0
(gdb) x/i 0x2000ad28
0x2000ad28: lsls r0, r2, #6
(gdb) x/i 0x1fc67ee7
0x1fc67ee7: movs r0, r0
(gdb) x/i 0xa203a288
0xa203a288: movs r0, r0
(gdb) x/i 0xe8eb9828
0xe8eb9828: movs r0, r0
(gdb) x/i 0xcf0f98cb
0xcf0f98cb: movs r0, r0
(gdb) x/i 0x96d74690
0x96d74690: movs r0, r0
(gdb) x/i 0xf40a6a89
0xf40a6a89: movs r0, r0
(gdb) x/i 0x2000ad28
0x2000ad28: lsls r0, r2, #6
(gdb) x/i 0x00000010
0x10: movs r0, r0
(gdb) x/i 0x0002329f
0x2329f <shift_rows+108>: add sp, #20
(gdb) x/i 0x00039c74
0x39c74 <sbox>: ldrb r3, [r4, #17]
(gdb) x/i 0xa203a288
0xa203a288: movs r0, r0
(gdb) x/i 0x0002329f
0x2329f <shift_rows+108>: add sp, #20
(gdb) list *0x0002329f
0x2329f is in shift_rows (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
151 t[0] = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
152 t[4] = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
153 t[8] = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
154 t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
155 (void) _copy(s, sizeof(t), t, sizeof(t));
156 }
157
158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const TCAesKeySched_t s)
159 {
160 uint8_t state[Nk*Nb];
> On 31-Aug-2018, at 4:46 PM, marko kiiskila <ma...@runtime.io> wrote:
>
> Sure. Something like this:
>
> 000933 compat> crash div0
> crash div0
> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
> 003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
> 003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
> 003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
> 003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>
> Then from gdb:
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> hal_system_reset ()
> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> 50 asm("bkpt");
> (gdb) bt
> #0 hal_system_reset ()
> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
> #2 0x0000a5b6 in os_default_irq_asm ()
> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
> #3 <signal handler called>
> #4 0x00000000 in ?? ()
> #5 0x0000812c in Reset_Handler ()
> at repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb) frame 1
> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
> 171 hal_system_reset();
> (gdb) p/x *tf
> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
> (gdb) p/x *tf->ef
> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
> pc = 0x14978, psr = 0x61000000}
> (gdb) x/32x 0x20001dd8
> 0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
> 0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
> 0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
> 0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
> 0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
> 0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
> 0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
> 0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
> (gdb) p &__text
> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
> (gdb) p &__etext
> $4 = (<data variable, no debug info> *) 0x175f0
> (gdb) x/i 0x00017161
> 0x17161: movs r0, r0
> (gdb) x/i 0x00014949
> 0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
> (gdb) x/i 0x00014978
> 0x14978 <crash_device+60>: sdiv r3, r3, r2
> (gdb) x/i 0x00014a21
> 0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
> (gdb) x/i 0x00014a15
> 0x14a15 <crash_cli_cmd>: push {r3, lr}
> (gdb) list *0x14949
> 0x14949 is in crash_device (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
> warning: Source file is more recent than executable.
> 37 int
> 38 crash_device(char *how)
> 39 {
> 40 volatile int val1, val2, val3;
> 41
> 42 if (!strcmp(how, "div0")) {
> 43
> 44 val1 = 42;
> 45 val2 = 0;
> 46
> (gdb) list *0x00014a21
> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
> 36 };
> 37
> 38 static int
> 39 crash_cli_cmd(int argc, char **argv)
> 40 {
> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
> 42 return 0;
> 43 }
> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
> 45 return 0;
> (gdb) list *0x14a21
> 0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
> 36 };
> 37
> 38 static int
> 39 crash_cli_cmd(int argc, char **argv)
> 40 {
> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
> 42 return 0;
> 43 }
> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
> 45 return 0;
>
> good luck.
>
>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>
>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
>>
>> If I change the release, I believe there would be many API changes to be done on MESH side.
>>
>> Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
>>
>> My gdb skill are pretty weak.
>>
>> I tried gdb where, with the following outcome.
>>
>> (gdb) c
>> Continuing.
>>
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> 50 asm("bkpt");
>> (gdb)
>> Continuing.
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> 50 asm("bkpt");
>> (gdb) where
>> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>> #3 <signal handler called>
>> #4 0x00000000 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>
>>
>>
>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>>>
>>>
>>>
>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>>
>>>> Hi !
>>>>
>>>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>>>
>>>> Action Received over MESH Length :- 15
>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>
>>>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>>>
>>>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>>>
>>>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>>>
>>>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>>>
>>>> Could someone let me know how to understand where / why the crash is happening ?
>>>
>>> Looking at your registers they seem to be garbage, so I’m guessing stack
>>> corruption of some sort; does not have to be overflow.
>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
>>> look like pointers to text.
>>>
>>>
>>
>
Re: Mynewt crash when releasing semaphore
Posted by marko kiiskila <ma...@runtime.io>.
Sure. Something like this:
000933 compat> crash div0
crash div0
003157 Unhandled interrupt (3), exception sp 0x20001dd8
003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
Then from gdb:
Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset ()
at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
50 asm("bkpt");
(gdb) bt
#0 hal_system_reset ()
at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
#1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
#2 0x0000a5b6 in os_default_irq_asm ()
at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
#3 <signal handler called>
#4 0x00000000 in ?? ()
#5 0x0000812c in Reset_Handler ()
at repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) frame 1
#1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
171 hal_system_reset();
(gdb) p/x *tf
$1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
(gdb) p/x *tf->ef
$2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
pc = 0x14978, psr = 0x61000000}
(gdb) x/32x 0x20001dd8
0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
(gdb) p &__text
$3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
(gdb) p &__etext
$4 = (<data variable, no debug info> *) 0x175f0
(gdb) x/i 0x00017161
0x17161: movs r0, r0
(gdb) x/i 0x00014949
0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
(gdb) x/i 0x00014978
0x14978 <crash_device+60>: sdiv r3, r3, r2
(gdb) x/i 0x00014a21
0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
(gdb) x/i 0x00014a15
0x14a15 <crash_cli_cmd>: push {r3, lr}
(gdb) list *0x14949
0x14949 is in crash_device (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
warning: Source file is more recent than executable.
37 int
38 crash_device(char *how)
39 {
40 volatile int val1, val2, val3;
41
42 if (!strcmp(how, "div0")) {
43
44 val1 = 42;
45 val2 = 0;
46
(gdb) list *0x00014a21
0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
36 };
37
38 static int
39 crash_cli_cmd(int argc, char **argv)
40 {
41 if (argc >= 2 && crash_device(argv[1]) == 0) {
42 return 0;
43 }
44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
45 return 0;
(gdb) list *0x14a21
0x14a21 is in crash_cli_cmd (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
36 };
37
38 static int
39 crash_cli_cmd(int argc, char **argv)
40 {
41 if (argc >= 2 && crash_device(argv[1]) == 0) {
42 return 0;
43 }
44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
45 return 0;
good luck.
> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>
> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
>
> If I change the release, I believe there would be many API changes to be done on MESH side.
>
> Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
>
> My gdb skill are pretty weak.
>
> I tried gdb where, with the following outcome.
>
> (gdb) c
> Continuing.
>
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> 50 asm("bkpt");
> (gdb)
> Continuing.
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> 50 asm("bkpt");
> (gdb) where
> #0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
> #2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
> #3 <signal handler called>
> #4 0x00000000 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>
>
>
>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>>
>>
>>
>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>>
>>> Hi !
>>>
>>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>>
>>> Action Received over MESH Length :- 15
>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>
>>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>>
>>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>>
>>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>>
>>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>>
>>> Could someone let me know how to understand where / why the crash is happening ?
>>
>> Looking at your registers they seem to be garbage, so I’m guessing stack
>> corruption of some sort; does not have to be overflow.
>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
>> look like pointers to text.
>>
>>
>
Re: Mynewt crash when releasing semaphore
Posted by Aditya Xavier <ad...@me.com.INVALID>.
It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence not in the release.
If I change the release, I believe there would be many API changes to be done on MESH side.
Can you guide me on how to "manually walk the stack for looking for things which look like pointers to text” ?
My gdb skill are pretty weak.
I tried gdb where, with the following outcome.
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
50 asm("bkpt");
(gdb)
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
50 asm("bkpt");
(gdb) where
#0 hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
#1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
#2 0x0000da56 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
#3 <signal handler called>
#4 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> On 31-Aug-2018, at 4:30 PM, marko kiiskila <ma...@runtime.io> wrote:
>
>
>
>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>>
>> Hi !
>>
>> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>>
>> Action Received over MESH Length :- 15
>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>
>> Am sending a group mesh message for testing. The sequence of events are as follows.
>>
>> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>>
>> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>>
>> I tried increasing the STACK size of the LOG task, however that didn’t help.
>>
>> Could someone let me know how to understand where / why the crash is happening ?
>
> Looking at your registers they seem to be garbage, so I’m guessing stack
> corruption of some sort; does not have to be overflow.
> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
> look like pointers to text.
>
>
Re: Mynewt crash when releasing semaphore
Posted by marko kiiskila <ma...@runtime.io>.
> On Aug 31, 2018, at 1:47 PM, Aditya Xavier <ad...@me.com.INVALID> wrote:
>
> Hi !
>
> Am having an issue with Sending and Receiving a Mesh Message. Though am positive the problem is more towards releasing the semaphore.
>
> Action Received over MESH Length :- 15
> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>
> Am sending a group mesh message for testing. The sequence of events are as follows.
>
> Button TASK -> send message over MESH -> Mesh receives message on model -> copies the data and starts releases the Semaphore for another task -> LOG Task starts and logs the message.
>
> In this entire flow, the moment I receive the message and release the semaphore the firmware crashes.
>
> I tried increasing the STACK size of the LOG task, however that didn’t help.
>
> Could someone let me know how to understand where / why the crash is happening ?
Looking at your registers they seem to be garbage, so I’m guessing stack
corruption of some sort; does not have to be overflow.
Try turning on OS_CRASH_STACKTRACE, or manually walk the stack for looking for things which
look like pointers to text.