Posted to issues@commons.apache.org by "Klaus Malorny (Jira)" <ji...@apache.org> on 2023/04/19 15:44:00 UTC

[jira] [Updated] (DAEMON-459) Restart only works once (regression)

     [ https://issues.apache.org/jira/browse/DAEMON-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Klaus Malorny updated DAEMON-459:
---------------------------------
    Description: 
For certain functions, especially code updates, we rely on the ability to restart the child process. This seems to work only once. On the subsequent attempt, the child process hangs.

I tracked the problem down to the {{jsvc-unix.c}} file. The {{main_reload}} function is meant to make the child send the signal to itself, but this does not happen. On the first restart, the {{controlled}} variable holds the value 0. This works by chance, as the signal is sent to the parent, which sends it back to the child. On the second attempt, the variable holds the PID of the previous child, so the signal is sent to a process that no longer exists.
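
The second-attempt failure can be reproduced outside jsvc. The following is a minimal sketch, not code from {{jsvc-unix.c}}: the helper name {{probe_reaped_child}} is made up for illustration, and it probes with signal 0, which only checks whether the target exists, so nothing is actually delivered. Once a child has exited and been reaped, {{kill}} on its old PID is expected to fail with {{ESRCH}}.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of jsvc: fork a short-lived child, reap it,
 * then probe its old PID the way a stale `controlled` value would be used.
 * Returns -1 if the setup fails, 0 if the PID names a process again,
 * otherwise the errno reported by kill().
 */
int probe_reaped_child(void)
{
    pid_t old_child = fork();
    if (old_child < 0)
        return -1;                 /* fork failed */
    if (old_child == 0)
        _exit(0);                  /* child: exit immediately */

    assert(old_child > 0);         /* only the parent gets here */
    if (waitpid(old_child, NULL, 0) < 0)
        return -1;                 /* could not reap the child */

    /* Signal 0 performs the existence and permission checks only. */
    if (kill(old_child, 0) == 0)
        return 0;                  /* PID is valid again (reused) */
    return errno;                  /* expected: ESRCH, no such process */
}
```

As for the first restart: POSIX defines {{kill}} with a PID of 0 as signalling every process in the caller's process group, which is presumably how the parent came to receive the signal while {{controlled}} was still 0.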

The {{controlled}} variable is used by both the parent and the child process. In earlier versions of the file, the child process determined its own PID using the {{getpid}} system function. This call was removed, likely accidentally, in version 1.3.3 or earlier. As a result, the variable still contains the value the parent held before the fork that created the child.

The solution is simple: in the function {{child}}, add

{{    controlled = getpid ();}}

between the {{sigaction}} calls and the {{log_debug ("Waiting for a signal to be delivered")}} call (line 913 in my copy of the file), i.e.

{{    ...}}
{{    memset(&act, '\0', sizeof(act));}}
{{    act.sa_handler = handler;}}
{{    sigemptyset(&act.sa_mask);}}
{{    act.sa_flags = SA_RESTART | SA_NOCLDSTOP;}}
{{    sigaction(SIGHUP, &act, NULL);}}
{{    sigaction(SIGUSR1, &act, NULL);}}
{{    sigaction(SIGUSR2, &act, NULL);}}
{{    sigaction(SIGTERM, &act, NULL);}}
{{    sigaction(SIGINT, &act, NULL);}}
{{    controlled = getpid ();    /* added line */}}
{{    log_debug("Waiting for a signal to be delivered");}}
{{    create_tmp_file(args);}}
{{    while (!stopping) {}}
{{    ...}}


> Restart only works once (regression)
> ------------------------------------
>
>                 Key: DAEMON-459
>                 URL: https://issues.apache.org/jira/browse/DAEMON-459
>             Project: Commons Daemon
>          Issue Type: Bug
>          Components: Jsvc
>    Affects Versions: 1.3.3
>            Reporter: Klaus Malorny
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)