Posted to dev@nuttx.apache.org by Tim Hardisty <ti...@hardisty.co.uk> on 2023/03/03 18:07:11 UTC

Help me understand file open/close behaviours?

Hi all,

The bug I thought I had in a driver I'm developing (well, one of them!) appears to be related to file closing.

- I have a related example-type app I'm using to exercise and check the driver. It opens 2 "files" (O_RDONLY) to read data from the device driver
- I have enabled CONFIG_SIGKILL_ACTION to allow me to ctrl-c from the console if the app is misbehaving or, I thought, just to exit it.

The behaviours I see are:

1) If I ctrl-c, the open files are not closed and if I re-run the app, the system crashes. It is the very first printf statement of the app that causes the crash, at the point the printf routine calls lib_fflush (not looked further yet).
2) If I ensure the test app reads console input too, and map a received character (e.g. 'x') to a clean exit, I can then re-run the app without issue.

I don't think I saw this behaviour with the previous driver I did, so I have probably changed something via menuconfig, but I still would have thought/hoped that this sort of behaviour wouldn't happen regardless?

Anyone got any suggestions or hints (other than go back to school!)?

TimH

RE: Help me understand file open/close behaviours?

Posted by Tim Hardisty <ti...@hardisty.co.uk>.
>From: Nathan Hartman <ha...@gmail.com>
>Sent: 03 March 2023 18:36
>To: dev@nuttx.apache.org
>Subject: Re: Help me understand file open/close behaviours?
>
>Have you tried installing signal handlers for SIGQUIT and SIGINT and
>ensuring that the files are closed before the program is quit?
>

CONFIG_SIGKILL_ACTION covers SIGQUIT and SIGINT, so yes. Also SIGKILL and SIGTERM as it happens.

When I trap the 'x' from the console, the test app does close both files. A ctrl-c abort, of course, doesn't run the app's code that closes the files, so the close would have to be handled by the NuttX system itself, I think?
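
For anyone following along, the "clean exit on 'x'" path described above might look roughly like the sketch below. The function name poll_console_for_exit and its parameters are made up for illustration and are not from the actual test app. It also uses a blocking read(), so a real app would have to fit it around the device reads somehow.

```c
#include <unistd.h>

/* Hypothetical helper: read one byte from the console descriptor.
 * Returns 1 if it was 'x' (both device descriptors are then closed and
 * set to -1), 0 for any other character, and -1 on EOF or read error.
 */

int poll_console_for_exit(int console_fd, int *fd_a, int *fd_b)
{
  char ch;
  ssize_t nread = read(console_fd, &ch, 1);

  if (nread != 1)
    {
      return -1;
    }

  if (ch != 'x')
    {
      return 0;
    }

  /* Clean exit requested: close whatever is still open. */

  if (*fd_a >= 0)
    {
      close(*fd_a);
      *fd_a = -1;
    }

  if (*fd_b >= 0)
    {
      close(*fd_b);
      *fd_b = -1;
    }

  return 1;
}
```

The caller would leave its main loop when this returns 1. Setting the descriptors to -1 is just a guard against closing them a second time.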

RE: Help me understand file open/close behaviours?

Posted by Tim Hardisty <ti...@hardisty.co.uk>.
>From: Gregory Nutt <sp...@gmail.com>
>Sent: 03 March 2023 19:03
>To: dev@nuttx.apache.org
>Subject: Re: Help me understand file open/close behaviours?
>
>"Closing" per se is probably not the root of the problem: the file
>descriptors are deallocated when the task group terminates, so all of the
>descriptors are freed.  However, I suspect that there may be an open
>reference count in the drivers which is not decremented and which could
>subsequently interfere with the correct behavior of a driver.
>
>
Aha! Yes, I added open counts just a few days ago so that is most likely it. I understand it all a bit better now (which is why I asked) so thank you Gregory. I will check other drivers that use open counts and see what daft thing I've done this time <roll eyes>

RE: Help me understand file open/close behaviours?

Posted by Tim Hardisty <ti...@hardisty.co.uk>.
I was wrong. My app was mishandling console input :(

>-----Original Message-----
>From: Tim Hardisty <ti...@hardisty.co.uk>
>Sent: 09 March 2023 15:34
>To: dev@nuttx.apache.org
>Subject: RE: Help me understand file open/close behaviours?
>
>Guess what - this behaviour is only on Master. Spent an hour or 2 getting
>the test app and new driver to work under 12.0 release and I can ctrl-C
>out of my test app without problem and re-run it without an apocalyptic
>crash.
>
>I will now raise it as an issue on GitHub.

RE: Help me understand file open/close behaviours?

Posted by Tim Hardisty <ti...@hardisty.co.uk>.
Guess what - this behaviour is only on Master. Spent an hour or 2 getting the test app and new driver to work under 12.0 release and I can ctrl-C out of my test app without problem and re-run it without an apocalyptic crash.

I will now raise it as an issue on GitHub.

>-----Original Message-----
>From: Tim Hardisty <ti...@hardisty.co.uk>
>Sent: 08 March 2023 17:37
>To: dev@nuttx.apache.org
>Subject: RE: Help me understand file open/close behaviours?
>
>To divert from procedural discussion...
>
>I think I take it from the above that a CTRL-C is not a clean exit; but
>is not the root cause of the issue I'm seeing.
>
>The crash is at a printf on the very first line of the app. The same is
>true if I run another example app that uses printf after a ctrl-c out of
>my new test app. It is not to do with any interaction with file
>open/close/ioctl etc. as I first thought.
>
>I don't get why a ctrl-c out of an app causes printf to completely crash
>the board with no useful debug info to be had! The call stack simply
>shows it was due to an ARM data abort exception.
>
>I am quite sure I have made some "daft" Kconfig change (or not made a
>selection I should have done) - I welcome any suggestions before I shrug
>my shoulders and conclude this is just "one of those things" :)

RE: Help me understand file open/close behaviours?

Posted by Tim Hardisty <ti...@hardisty.co.uk>.
>From: Gregory Nutt <sp...@gmail.com>
>Sent: 03 March 2023 19:03
>
>"Closing" per se is probably not the root of the problem: the file
>descriptors are deallocated when the task group terminates, so all of the
>descriptors are freed.  However, I suspect that there may be an open
>reference count in the drivers which is not decremented and which could
>subsequently interfere with the correct behavior of a driver.
>
>

To divert from procedural discussion...

I think I take it from the above that a CTRL-C is not a clean exit; but is not the root cause of the issue I'm seeing.

The crash is at a printf on the very first line of the app. The same is true if I run another example app that uses printf after a ctrl-c out of my new test app. It is not to do with any interaction with file open/close/ioctl etc. as I first thought.

I don't get why a ctrl-c out of an app causes printf to completely crash the board with no useful debug info to be had! The call stack simply shows it was due to an ARM data abort exception.

I am quite sure I have made some "daft" Kconfig change (or not made a selection I should have done) - I welcome any suggestions before I shrug my shoulders and conclude this is just "one of those things" :)

Re: Help me understand file open/close behaviours?

Posted by Gregory Nutt <sp...@gmail.com>.
On 3/3/2023 12:56 PM, Gregory Nutt wrote:
> SIGINT, SIGKILL, etc. don't do graceful shutdowns like exit() does.
> They should behave as though _exit() were called, which does an
> immediate termination.  However, _exit() is still required to close
> all open file descriptors (per the Linux man page) and, if it does not,
> that would be a bug.
>
> SIGKILL can't be caught (again per the Linux man page)
>
> https://man7.org/linux/man-pages/man2/exit.2.html
> https://man7.org/linux/man-pages/man7/signal.7.html
>
"Closing" per se is probably not the root of the problem: the file
descriptors are deallocated when the task group terminates, so all of the
descriptors are freed.  However, I suspect that there may be an open
reference count in the drivers which is not decremented and which could
subsequently interfere with the correct behavior of a driver.
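
To illustrate the kind of open reference count meant here, the sketch below models a driver that starts its hardware on the first open and stops it on the last close. It is a host-side toy and not the NuttX driver API: struct demo_dev_s, demo_open() and demo_close() are hypothetical names. The point is only that if a close is never delivered, the count stays non-zero and the next open skips the start-up path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state, for illustration only. */

struct demo_dev_s
{
  int  open_count;   /* Number of opens not yet matched by a close */
  bool hw_started;   /* True while the (imaginary) hardware is running */
  int  start_calls;  /* How many times the start-up path has run */
};

int demo_open(struct demo_dev_s *dev)
{
  /* Only the first opener starts the hardware. */

  if (dev->open_count == 0)
    {
      dev->hw_started = true;
      dev->start_calls++;
    }

  dev->open_count++;
  return 0;
}

int demo_close(struct demo_dev_s *dev)
{
  /* An unbalanced close would be a caller bug. */

  assert(dev->open_count > 0);

  dev->open_count--;

  /* Only the last closer stops the hardware. */

  if (dev->open_count == 0)
    {
      dev->hw_started = false;
    }

  return 0;
}
```

If an app opened the device twice and went away without either demo_close() being called, open_count would stay at 2. A re-run would then see a non-zero count and never re-run the start-up code, which is the sort of stale state worth checking for.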




Re: Help me understand file open/close behaviours?

Posted by Gregory Nutt <sp...@gmail.com>.
On 3/3/2023 12:36 PM, Nathan Hartman wrote:
> Have you tried installing signal handlers for SIGQUIT and SIGINT and
> ensuring that the files are closed before the program is quit?
>
> Nathan

SIGINT, SIGKILL, etc. don't do graceful shutdowns like exit() does.
They should behave as though _exit() were called, which does an immediate
termination.  However, _exit() is still required to close all open file
descriptors (per the Linux man page) and, if it does not, that would be a bug.

SIGKILL can't be caught (again per the Linux man page)

https://man7.org/linux/man-pages/man2/exit.2.html
https://man7.org/linux/man-pages/man7/signal.7.html
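
A small way to see the "_exit() still closes descriptors" rule on a POSIX host such as Linux is sketched below. It is not NuttX-specific: it relies on fork(), which a NuttX build may not provide, and the function name child_exit_releases_pipe is made up for the example. The child leaves the write end of a pipe open and calls _exit(). The parent can only see end-of-file if the system closed that descriptor for the child.

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent saw end-of-file after the child called _exit()
 * without closing its copy of the pipe's write end, 0 if it did not,
 * and -1 on a setup error.
 */

int child_exit_releases_pipe(void)
{
  int fds[2];
  char byte;
  ssize_t nread;
  pid_t pid;

  if (pipe(fds) < 0)
    {
      return -1;
    }

  pid = fork();
  if (pid < 0)
    {
      close(fds[0]);
      close(fds[1]);
      return -1;
    }

  if (pid == 0)
    {
      /* Child: leave fds[1] open on purpose and terminate immediately. */

      _exit(0);
    }

  /* Parent: drop our own write end so only the child's copy remains. */

  close(fds[1]);

  /* read() returns 0 only once every write end of the pipe is closed. */

  nread = read(fds[0], &byte, 1);

  close(fds[0]);
  waitpid(pid, NULL, 0);
  return nread == 0 ? 1 : 0;
}
```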




Re: Help me understand file open/close behaviours?

Posted by Nathan Hartman <ha...@gmail.com>.
On Fri, Mar 3, 2023 at 1:07 PM Tim Hardisty <ti...@hardisty.co.uk> wrote:
> The bug I thought I had in a driver I'm developing (well, one of them!) appears to be related to file closing.
>
> - I have a related example-type app I'm using to exercise and check the driver. It opens 2 "files" (O_RDONLY) to read data from the device driver
> - I have enabled CONFIG_SIGKILL_ACTION to allow me to ctrl-c from the console if the app is misbehaving or, I thought, just to exit it.
>
> The behaviours I see are:
>
> 1) If I ctrl-c, the open files are not closed and if I re-run the app, the system crashes. It is the very first printf statement of the app that causes the crash, at the point the printf routine calls lib_fflush (not looked further yet).
> 2) If I ensure the test app reads console input too, and map a received character (e.g. 'x') to a clean exit, I can then re-run the app without issue.
>
> I don't think I saw this behaviour with the previous driver I did, so I have probably changed something via menuconfig, but I still would have thought/hoped that this sort of behaviour wouldn't happen regardless?
>
> Anyone got any suggestions or hints (other than go back to school!)?


Have you tried installing signal handlers for SIGQUIT and SIGINT and
ensuring that the files are closed before the program is quit?

Nathan
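
A minimal sketch of what such handlers could look like, using only standard POSIX calls (sigaction, sigemptyset, close). The helper names install_stop_handlers, stop_requested and close_device_files are invented for the example. The handler only sets a flag, and the main loop is assumed to check it and close the files itself, since the handler should do as little as possible. Whether an app-installed handler interacts with the default actions enabled in Kconfig exactly as on Linux is an assumption to verify on the target.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Written by the handler, read by the main loop. */

static volatile sig_atomic_t g_stop_requested = 0;

static void on_stop_signal(int signo)
{
  (void)signo;
  g_stop_requested = 1;
}

/* Install the same handler for SIGINT and SIGQUIT. Returns 0 on success. */

int install_stop_handlers(void)
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_stop_signal;
  sigemptyset(&sa.sa_mask);

  if (sigaction(SIGINT, &sa, NULL) < 0)
    {
      return -1;
    }

  if (sigaction(SIGQUIT, &sa, NULL) < 0)
    {
      return -1;
    }

  return 0;
}

/* Non-zero once either signal has been received. */

int stop_requested(void)
{
  return g_stop_requested != 0;
}

/* Called from the main loop (not from the handler) before returning. */

void close_device_files(int fd_a, int fd_b)
{
  if (fd_a >= 0)
    {
      close(fd_a);
    }

  if (fd_b >= 0)
    {
      close(fd_b);
    }
}
```

The app would call install_stop_handlers() once at start-up, poll stop_requested() in its read loop, and call close_device_files() before returning from main.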