You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2008/08/02 20:57:35 UTC

[Bug 5950] New: spamd disconnects from terminal before writing pid file

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950

           Summary: spamd disconnects from terminal before writing pid file
           Product: Spamassassin
           Version: 3.2.4
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: tom.viza@gmail.com


Hi all

I'm having a problem with spamd.

When I use both the -rPIDFILE and -d options, the program disconnects from
the terminal before the pidfile is created.  This means that the script
that calls the spamd starts the next program before the pidfile exists. 
If this next command requires to read from that file, it will fail.

Doesn't it make sense to write the pidfile before daemonizing?

simple example:

/usr/local/bin/spamd -d -s "$sdlog" -r "$sdpid" \
    --socketpath="$sdsock" --socketmode 600 || exit

kill -INT `cat "$sdpid"`

I just get "cat: No such file" and the signal doesn't get sent.  The pidfile
shows up shortly afterwards.

Perhaps there is a sensible way to do this that I haven't thought of.

It is possible to poll for the file, eg:

while ( ! [ -e "$sdpid" ] ) || ( ! [ -e "$sdsock" ] )
do
  sleep 1
done

But that is awkward and in more complex situations creates race conditions.

viza


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #5 from Tom Vajzovic <to...@gmail.com>  2008-08-04 13:05:21 PST ---
You are quite right that I am being a little closed-minded in thinking that
this is all about the pid file.

The pidfile is only part of the issue.  What I would like is that the process I
start doesn't disconnect from the terminal until the daemon is ready to accept
connections, too.

In daemonize() we have:

  exit if $pid;

what I would like is that this does:

 if($pid)
   wait until daemon is ready to receive connections,
   AND has written its pidfile if requested, 
   then exit.

Perhaps my requirements are quite different from the usual situation where
spamd is started at boot and it is best to be fast and to disconnect ASAP.

If that is the case, would an alternative option --daemonize-when-ready be
acceptable to help me in this case?

I am happy if the answer is "No, you have to read the log", but rather than
having shell scripts try to make sense of the log, wouldn't it be better to
keep this internal to SA?  I'd rather not have to check if the log format has
changed in every new version.

Thanks for your time,
viza


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #1 from Justin Mason <jm...@jmason.org>  2008-08-02 13:46:38 PST ---
agreed, this is suboptimal.  patches welcome! ;)


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #17 from Sidney Markowitz <si...@sidney.com>  2008-08-06 11:21:02 PST ---
(In reply to comment #15)

Tom,

What Justin said. I did test it by commenting out the signal and killing the
child, and waitpid() does detect that correctly.

> you could perhaps get rid of the timeout altogether (no
> need to just give up on slow systems if they are still going)

The timeout is supposed to be long enough that if you hit it something is wrong
and you are better off getting a warning message and continuing in case the
child process is actually running ok and the problem was just with the signal.
That's supposed to be better than having the script that calls spamd hung up
indefinitely. If you have experience that real heavily loaded systems take
longer than 30 seconds to start up spamd I can increase the number.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #10 from Justin Mason <jm...@jmason.org>  2008-08-05 12:51:25 PST ---
(In reply to comment #8)
> (in reply to comment #7)
> >and for that to abort waiting, if the child itself exits.
> 
> I've been reading about how to use SIGCHLD to do that and it seems complicated.
> Example code I found uses a SIGCHLD handler, uses waitpid(-1, &WNOHANG) and
> checks WIFEXITED($?) to avoid responding to a SIGCHLD sent when the child is
> suspended but not exited. Worst, I see that WNOHANG is not supported on all
> platforms and I didn't see an alternative way of doing this that does not
> require a non-blocking waitpid().

thankfully perl insulates us from the hard stuff a lot here -- see
child_handler() for a sane WNOHANG check.  there's no need to deal with
SIGCHLD, or indeed modify that at all. the logic would be (warning, messy
p-code):

# in daemonize()

$got_pidfile = 0;   # global
$SIG{'USR2'} = \&got_pidfile;

  defined( my $pid = fork ) or die "spamd: cannot fork: $!\n";

# this code is new:
if ($pid) {  #parent
for my $retry (0 .. $somenumber) {
sleep 1;
if ($got_pidfile) { last; }
my $waited = waitpid($pid, WNOHANG);
if ($waited == $pid) {
last; # pid has exited
} else {
warn "waitpid failed: $!";
}
}

if (!$got_pidfile) {
die "child process exited or timed out without producing a PID file";
} else {
exit 0;
}
}
else {
delete $SIG{'USR2'};
}


  exit if $pid;
  setsid or die "spamd: cannot start new session: $!\n";
[..etc.]

sub got_pidfile {
$got_pidfile = 1;
}


> It seems to me that for this purpose it would work fine to wait no more than,
> say 5 seconds. All we are doing is making sure that the parent process does not
> exit before the log entry (and pid file) is written by the child so that the
> invoking shell script does not have to bother waiting for it. If the log entry
> is not written by the child in the first few seconds (probably even in the
> first second) then something is wrong anyway and it will probably not be
> written.

5 secs is pretty short under extremely heavy load.... but 30 would be plenty
IMO ;)


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #9 from Tom Vajzovic <to...@gmail.com>  2008-08-05 12:25:06 PST ---
(In reply to comment #8)
> I see that WNOHANG is not supported on all platforms and I didn't see an
> alternative way of doing this that does not require a non-blocking waitpid().

Do the platforms in question support sending signal 0 as a test for a process?


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #16 from Justin Mason <jm...@jmason.org>  2008-08-06 05:40:12 PST ---
(In reply to comment #15)
> Thanks for your effort on this.
> 
> Just a suggestion:
> Why not make the parent process test if the child process is still running? 
> That way if the server dies after one second you don't end up waiting another
> 29 seconds needlessly.

Tom, reread the code- if the server dies after 1 sec the waitpid() call will
indicate this, and the parent pid will in turn die.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #2 from Tom Vajzovic <to...@gmail.com>  2008-08-02 17:33:33 PST ---
Created an attachment (id=4352)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4352)
patch to move writing pidfile before daemonize

I really don't know if this is safe, but it looks like it should work.  It
needs someone who is familiar with the code to check it.

As the comments in the file indicate not to write the pidfile before installing
the handlers, I moved them up before daemonize(), and then moved writing the
pidfile after them and immediately before daemonize().

If I was writing this in C, I would fsync() (or fdatasync) both the pidfile and
its parent directory before detaching from the terminal, to ensure that
commands following "spamd -d" could find it.  I don't know that that can be
done in perl.  What I have done here is probably enough if it doesn't break
anything else.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950


Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




--- Comment #14 from Sidney Markowitz <si...@sidney.com>  2008-08-06 04:24:00 PST ---
(In reply to comment #13)

Ok, I'll close this bug then.

Tom, please note that the patch I attached will apply cleanly to spamd.raw in
the 3.2 branch so you can use it locally if you want to.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950


Tom Vajzovic <to...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tom.viza@gmail.com




--- Comment #3 from Tom Vajzovic <to...@gmail.com>  2008-08-04 09:16:41 PST ---
Gah!

Ignore that patch.

It won't work, because writing the pid before calling daemonize() will write
the wrong process id.

What is needed it to break up daemonize().  Currently it does:

fork ->

old process:
exit.

new process:
write pid.
continue...


It should go:

fork ->

old process:
write child's pid.
exit.

new process:
continue...


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #11 from Sidney Markowitz <si...@sidney.com>  2008-08-05 14:15:18 PST ---
(In reply to comment #10)

Ah, a second look showed that I had misread the problem about portability of
WNOHANG. I thought I had read that not all platforms support WNOHANG. What it
actually said was:

Most systems support a non-blocking waitpid. Use Perl's standard Config.pm
module to find out:

use Config;
$has_nonblocking = $Config{d_waitpid} eq "define" ||
                   $Config{d_wait4}   eq "define";

So it is waitpid itself that is not completely portable, and since we use it
(as well as WNOHANG) I won't worry about using it here.

> 5 secs is pretty short under extremely heavy load.... but 30 would be plenty

Yes, I was surprised to discover that it takes 4 seconds on my Macbook with
nothing loadinng it down. And 30 is too long to simply sleep without using
SIGUSR1. I'll code up a patch now.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #15 from Tom Vajzovic <to...@gmail.com>  2008-08-06 04:57:34 PST ---
Thanks for your effort on this.

Just a suggestion:
Why not make the parent process test if the child process is still running? 
That way if the server dies after one second you don't end up waiting another
29 seconds needlessly.

I don't know perl but certainly in C/C++ etc. you can test for the existence of
a process by trying to do kill( pid, 0 ).  If there is no such process, you get
errno= ESRCH.

In fact, if this works you could perhaps get rid of the timeout altogether (no
need to just give up on slow systems if they are still going).


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #4 from Sidney Markowitz <si...@sidney.com>  2008-08-04 12:27:21 PST ---
The comments say

# now allow waiting processes to connect, if they're watching the log.
# The test suite does this!

It looks like much care is taken to get things like the signal handlers done
before the pid is written to the log. That implies to me that your script
should look at the log file like the test suite does rather than simply
continue after calling spamd. I guess it could test and wait for the existence
of the pid file if you know that spamd is being called with the pidfile option,
but looking at the log file seems more correct.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #7 from Justin Mason <jm...@jmason.org>  2008-08-05 05:22:45 PST ---
(In reply to comment #6)
> Would it make sense for the daemonize function to either add a SIGUSR1 handler
> to the parent or create a pipe, have the parent wait for a signal or something
> sent over the pipe from the child before it calls exit() and if the daemonize
> option is set have the child signal the parent (via kill or writing tothe pipe)
> after it has written the pid to the log file?

yes, that would be the ideal fix -- and for that to abort waiting, if the child
itself exits.

> Perl experts, is there any problem with using a signal or a pipe in this case
> and is one better to use than the other?

in this setup, either would be fine; signal would be easiest I think.


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950


Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #4352 is|0                           |1
           obsolete|                            |




--- Comment #12 from Sidney Markowitz <si...@sidney.com>  2008-08-05 18:32:52 PST ---
Created an attachment (id=4353)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4353)
patch: wait for child to signal parent so spamd parent doesn't exit until pid
file is written by child

This patch should take care of what we have discussed in the comments.

Committed to trunk revision 683074, but please look it over and let me know if
you think anything should be changed.

Do we want to review this for the 3.2 branch?


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950


Justin Mason <jm...@jmason.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.3.0




--- Comment #13 from Justin Mason <jm...@jmason.org>  2008-08-06 01:47:49 PST ---
(In reply to comment #12)
> Committed to trunk revision 683074, but please look it over and let me know if
> you think anything should be changed.

looks good to me.  thanks!

> Do we want to review this for the 3.2 branch?

IMO, I'd prefer not to. I think it's a significant change to existing
behaviour, particularly for packagers... for it to appear in 3.2.6 could be a
surprise.  and I don't think users are clamouring for it to be fixed in huge
numbers ;)


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #8 from Sidney Markowitz <si...@sidney.com>  2008-08-05 11:31:59 PST ---
(in reply to comment #7)
>and for that to abort waiting, if the child itself exits.

I've been reading about how to use SIGCHLD to do that and it seems complicated.
Example code I found uses a SIGCHLD handler, uses waitpid(-1, &WNOHANG) and
checks WIFEXITED($?) to avoid responding to a SIGCHLD sent when the child is
suspended but not exited. Worst, I see that WNOHANG is not supported on all
platforms and I didn't see an alternative way of doing this that does not
require a non-blocking waitpid().

It seems to me that for this purpose it would work fine to wait no more than,
say 5 seconds. All we are doing is making sure that the parent process does not
exit before the log entry (and pid file) is written by the child so that the
invoking shell script does not have to bother waiting for it. If the log entry
is not written by the child in the first few seconds (probably even in the
first second) then something is wrong anyway and it will probably not be
written.

So how about having a SIGUSR1 handler that sets a flag and a loop that for a
maximum of 5 times sleeps 1 second until the flag is set?


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5950] spamd disconnects from terminal before writing pid file

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5950





--- Comment #6 from Sidney Markowitz <si...@sidney.com>  2008-08-04 15:29:16 PST ---
Would it make sense for the daemonize function to either add a SIGUSR1 handler
to the parent or create a pipe, have the parent wait for a signal or something
sent over the pipe from the child before it calls exit() and if the daemonize
option is set have the child signal the parent (via kill or writing tothe pipe)
after it has written the pid to the log file?

Perl experts, is there any problem with using a signal or a pipe in this case
and is one better to use than the other?


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.