Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/10/22 12:25:10 UTC

[Bug 5695] New: flock interrupted by system call

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5695

           Summary: flock interrupted by system call
           Product: Spamassassin
           Version: 3.2.3
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Libraries
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: dietmar@maurer-it.com


I get tons of warnings on a heavily loaded system:

WARNING: bayes: cannot open bayes databases /root/.spamassassin/bayes_* R/W: 
lock failed: Interrupted system call

Looking at the code reveals that EINTR is not caught:

Flock.pm line 84:   if (flock ($fh, LOCK_EX)) 

It should be something more reasonable, like:

    use Errno qw(EINTR);

    while (!flock($fh, LOCK_EX)) {
          next if $! == EINTR;    # interrupted by a signal: try again
          die "unable to get lock ...";
    }

- Dietmar



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5695] flock interrupted by system call






------- Additional Comments From lwilton@earthlink.net  2007-10-22 05:04 -------
You should look at the docs on EINTR.  It is a Unix oddity that comes about 
when a program is in the middle of some kernel routine, usually waiting for 
some near-term event to complete (like an IO buffer read) and some other 
completely random unrelated action comes along that can't be handled in that 
state.  So the routine simply throws up its hands and bails out of the kernel 
with EINTR.  This means "routine interrupted, try again".  

It can theoretically happen on virtually any system call, meaning that in 
theory almost all system calls have to check for EINTR afterwards and retry the 
operation if it is received.  Realistically though it will normally only show 
up on things that can wait for a kernel event of some sort.  Which means most 
any kind of IO operation, including open and close.

Under some rare conditions you might want to bail on EINTR, but they are quite 
rare.
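
The retry-on-EINTR idea described above could be sketched in Perl roughly as
follows. This is only an illustration, not the actual Flock.pm code: the helper
name flock_retry and the default cap of 5 attempts are assumptions made for
this example, and the scratch file stands in for the real bayes lock file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Errno qw(EINTR);
use File::Temp qw(tempfile);

# Hypothetical helper (not part of SpamAssassin): take a flock, retrying
# only when the call fails with EINTR, and at most $max_tries times.
sub flock_retry {
    my ($fh, $op, $max_tries) = @_;
    $max_tries = 5 unless defined $max_tries;   # assumed default
    for my $try (1 .. $max_tries) {
        return 1 if flock($fh, $op);
        # Any error other than EINTR is a real failure: report it at once.
        return 0 unless $! == EINTR;
    }
    return 0;   # still being interrupted after $max_tries attempts
}

# Usage: lock a scratch file, then release it.
my ($fh, $path) = tempfile(UNLINK => 1);
my $locked = flock_retry($fh, LOCK_EX);
die "lock failed: $!" unless $locked;
flock($fh, LOCK_UN);
close($fh);
```

Capping the number of retries is a design choice for this sketch: a process
that keeps being signalled still gives up eventually and reports the failure
instead of looping forever.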





[Bug 5695] flock interrupted by system call






------- Additional Comments From jm@jmason.org  2007-10-22 03:37 -------
why should we catch an interrupt there?  if it's interrupted, we probably do
*not* want to continue attempting to get the lock.

what is sending the interrupt signal?




[Bug 5695] flock interrupted by system call






------- Additional Comments From jm@jmason.org  2007-10-22 05:33 -------
'some other completely random unrelated action comes along that can't be handled
in that state.'

we'd need to have a good idea of what exactly that unrelated action *is*.  On
Linux, at least, EINTR in flock() means specifically a signal:

       EINTR  While waiting to acquire a lock, the call was interrupted by
              delivery of a signal caught by a handler.

And if we get a signal in that state, we need to stop trying to get the lock,
since -- yes -- we've been interrupted.

If something is delivering signals to spamd child processes while they're trying
to get database locks, we need to figure out what that something is, and why
it's signalling them -- basically, the EINTR is a symptom and we need to be
treating the cause, not the symptom.
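
The behaviour quoted from the man page, and one way to see which signal is
responsible, can be reproduced with a small Perl sketch. It assumes a platform
with native flock(2) semantics such as Linux, where two independent handles on
the same file contend for the lock even inside one process. The SIGALRM raised
by alarm is only a stand-in for whatever is really signalling the spamd
children, and the variable names are made up for this example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Errno qw(EINTR);
use File::Temp qw(tempfile);

# First handle takes the exclusive lock and keeps it.
my ($holder, $path) = tempfile(UNLINK => 1);
flock($holder, LOCK_EX) or die "first lock failed: $!";

# A second, independent handle on the same file: its lock request must wait.
open(my $waiter, '+<', $path) or die "open: $!";

# Record the name of any signal that arrives while we are waiting.
my $caught = '';
$SIG{ALRM} = sub { $caught = shift };   # handler is passed the signal name
alarm 1;                                # stand-in for the unknown signal source

my $ok    = flock($waiter, LOCK_EX);    # blocks until the signal arrives
my $errno = $! + 0;                     # save errno before anything resets it
alarm 0;

if (!$ok && $errno == EINTR) {
    print "flock interrupted by SIG$caught\n";
}
```

Logging the signal name like this, in a handler for each signal of interest,
is one possible way to find out what is interrupting the lock attempt before
deciding whether retrying is appropriate.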


