You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/08/19 07:01:06 UTC

[Bug 4546] New: spamc/spamd learning has potentially dangerous side-effects

http://bugzilla.spamassassin.org/show_bug.cgi?id=4546

           Summary: spamc/spamd learning has potentially dangerous side-
                    effects
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
               URL: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=278030
        OS/Version: other
            Status: NEW
          Severity: blocker
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: duncf@debian.org


Now that spamd supports learning, but still trusts spamc's Username: option, it
is trivial for users to posion (accidentally or intentionally) other users'
Bayes DBs.

For SQL, this isn't a big deal. Users can access the database directly and screw
with it all they want, because the username and password are in the config file
(usually local.cf). Or, they can use the SpamAssassin internals to do the job,
or they can use sa-learn -u.

For DBM/SDBM, this is a big deal. Before spamd/spamc learning the only way to
write to another users' database is to be root (or have the db permissions wide
open).


There are ways to limit users from learning as others in both of the above
scenarios (I think).

In the SQL case, make local.cf or sql.cf or something readable only by root AND
use spamd's --auth-ident option. Users won't be able to directly access the
database, or use sa-learn, they will only be able to learn via spamc/spamd.

In the DBM case, spamd's --auth-ident option should prevent this problem, as it
prevents use of spamc -u <random user>.

Unfortunately, it's my understanding that --auth-ident doesn't work very well,
and not at all with some identd servers. See for example, Debian bug 278030 (URL
above).

It would be far better IMHO, to be able to disable (via command line option?)
spamd learning.


There of course still exist the above problems with respect to autolearning --
people can craft messages to be autolearned and poison a DB, but this can be
solved by disabling autolearning for the truly (even more) paranoid.

At the very least the above should be documented in README.spamd so sysadmins
can decide for themselves.

Set as blocker -- we need to talk about this before we release. At the very
least we should document it, but I think it'd be better to add the command-line
options to spamd to prevent it.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From duncf@debian.org  2005-08-21 21:36 -------
Created an attachment (id=3092)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3092&action=view)
Patch to add --allow-tell / -l option to spamd

This patch adds an --allow-tell or -l option to spamd to enable TELLs. If
disabled, EX_UNAVAILABLE is returned to spamc (which promptly ignores it, see
bug 4551).

I picked -l because -t and -T were taken, and -l is the first letter of learn
and the last letter of tell, and because it wasnt in use already.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-21 15:42 -------
I've opened bug 4550 as an RFE to continue the discussion of my comment 7 since
clearly we are going to go with the simple command line option change that
Duncan is coding up for 3.1rc2 in the near term.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


duncf@debian.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |needs 2 votes






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-19 22:52 -------
I also agree that learning disabled by default is correct.

I've been thinking about how secure authentication can be done. The biggest
problems I see:

1) The spamd protocol does not have any notion of session. Authentication that
does not pass a password in the clear requires a back and forth protocol and the
establishment of a session. A session based protocol is a huge change.

2) I really don't like the idea of rolling our own secure authentication
protocol. Security protocols never come out right when you just make them up.

3) Some ideas I thought about don't scale well in, for example, an ISP
environment. For example, let's say that we store a password in each user's
bayes database and require that the password be somehow passed in the spamc
call. Ignoring for the moment how to do that without the password being in the
clear on the network (a solvable problem), it seems like it would be difficult
for an ISP to set up for all users. Similarly for setting up a client-side SSL
certificate for each user, which would have to go both on the client and server
side.

Here is a simple solution. Add a password configuration option to both the spamc
configuration file and the regular (spamd) per-user configuration file. Spamc
only uses it when -S is specified. The protocol gets an new field for password.
If spamd sees a password configured in the specified user's preferences, it only
will process a message if SSL is being used and the correct password is sent by
spamc.

The sysadmin already has to provide users with the ability to set their own
local preferences and the user configuration files or database entries already
have to be kept secure from other people. The user can enable learning via spamc
-L by setting matching password options in the two config files.

By requiring SSL, there is no security hole in passing the password in the clear
from spamc to spamd.

Note that this is not necessary if auth-identd can be trusted. That will be the
case if spamd is configured to only accept connections from a trusted IP address
(such as localhost) on a system that supports identd. In that case, the password
is not necessary, --auth-ident takes care of the authentication, and everything
just works. If SSL adds too much overhead for a system with many users and high
load, then the sysadmin just has to set things up so that the spamd server can
trust the spamc client machines' identd.

Comments?




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From duncf@debian.org  2005-08-21 23:02 -------
Subject: Re:  [review] spamc/spamd learning has potentially dangerous side-effects

> "This is not ususally a concern with an SQL Bayes store as users
> will typically have read-write access directly to the database, and
> can also use C<sa-learn> with the B<-u> option to achieve the same
> result"

I'm not sure I understand your comment. I think the word "This" above
is a bit vague.

What I'm trying to say is that enabling TELLs is not ususally a
problem for systems with an SQL Bayes Store as users could previously
use sa-learn -u.

Maybe what I should say is:

If the Bayes database backend is a DBM or SDBM file, B<--allow-tell>
allows any user to write to any other user's database, which is not
possible using just C<sa-learn>.

If the Bayes database is being stored in SQL, any user can already to
write to any other user's database using the B<-u> option to
C<sa-learn>, so B<--allow-tell> is not less secure. In addition,
B<--allow-tell> alleviates the need to give all users direct database
access.

Is this clearer? 

I don't think we need another patch for this -- feel free to vote on
this one, the doc can be changed after this is committed, without the
need for another vote. Also, I'm not going to be around this week, so
waiting for another patch from me is a bad idea!




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


duncf@debian.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.1.0






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From duncf@debian.org  2005-08-18 22:16 -------
Subject: Re:  spamc/spamd learning has potentially dangerous side-effects

> I brought this up when first adding the functionality, and no one cared
> then.  I find it odd that suddenly this is SO important and a release
> blocker.

Either I missed it, or it was primarily in the context of SQL. In SQL,
adding this functionality makes NO difference, because of sa-learn -u
or direct database edits. I don't think anyone brought up that it
would affect DBM like this.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|needs 1 vote                |ready to apply




------- Additional Comments From jm@jmason.org  2005-08-23 19:18 -------
+1. looks good



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-22 02:58 -------
Duncan, Ok, I think I understand. With DBM or SDBM users each have their own
database file, so running sa-learn can give them no access to another user's
database. With SQL, there is only one shared database, so access via sa-learn
implies access to all users' data. With DBM and SDBM you can get security by
setting things up so users can run sa-learn and either not allow spamc -L or set
up --auth-ident properly. With SQL you can't get security if they can run
sa-learn. But you still would have to either not allow spamc -L or else set up
--auth-ident properly.

In any case, I agree that doc can be dealt with later and the voting for the
patch can proceed without it being perfect. I'll vote on this as is after I get
a chance to build and run with the patch.

Loren, --allow-tell is not an option that means
"--allow-malicious-data-corruption". It means allow spamc to tell spamd the
spam/ham status of the message being sent for the purpose of bayes learning. It
just so happens that there is a side effect that allows any user to affect the
Bayes database of any other user. The name reflects what it is for, not the side
effect. In any case, I think that getting this fix in is what is important. If
we figure out a way to get real user authentication for version 3.2, this option
could change anyway.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-18 22:38 -------
I know I didn't notice this until now. As a matter of fact, only a few days ago
I set up spamd on my home server and figured out a way to pipe email through it
from procmail on my ISP's server so that I can run the latest SA and use Bayes.
It wasn't until then that I realized that there was no authentication of the
user name when I set up a system for piping mistakes through spamc -L. I was
going to look to see if perhaps I just missed some docuentation on how to do it
before bringing it up.

I agree that this is a serious matter.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


duncf@debian.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|spamc/spamd learning has    |[review] spamc/spamd
                   |potentially dangerous side- |learning has potentially
                   |effects                     |dangerous side-effects




------- Additional Comments From duncf@debian.org  2005-08-21 21:37 -------
Fixed in trunk BTW



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From duncf@debian.org  2005-08-19 15:36 -------
Subject: Re:  spamc/spamd learning has potentially dangerous side-effects

> I agree that a command-line flag is the way to fix this -- however,
> I'd prefer it to be the other way around -- IOW, the flag is
> required to *enable* remote learning.  otherwise we'll expose
> sysadmins to this danger once they upgrade, on existing systems,
> which I don't think is good.

That was my preference also. I'll see if I can code something up.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|needs 2 votes               |needs 1 vote




------- Additional Comments From sidney@sidney.com  2005-08-22 04:56 -------
+1




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From parkerm@pobox.com  2005-08-18 22:07 -------
Subject: Re:  spamc/spamd learning has potentially dangerous side-effects

I brought this up when first adding the functionality, and no one cared
then.  I find it odd that suddenly this is SO important and a release
blocker.

Michael




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-21 22:29 -------
I don't get the following in the POD comments:

"This is not ususally a concern with an SQL Bayes store as users will typically
have read-write access directly to the database, and can also use C<sa-learn>
with the B<-u> option to achieve the same result"

If the user is able to run sa-learn, it doesn't matter if Bayes is using SQL or
something else, as sa-learn will only work if the user's userid has permission
to access the database (or to access the password that grants access to the
database). If the user is not able to run sa-learn, then spamc -L will have this
problem whether the database is SQL or anything else.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From sidney@sidney.com  2005-08-19 14:09 -------
Client/server user authentication is difficult, or at least it will take a lot
of thought and work to get right. Unless someone comes up with some standard way
of doing it that can be just dropped in to our code, it isn't something to add
for the 3.1 time frame. auth-ident is not meant to be a security measure except
under specialised circumstances. We could think about whether adding the ability
to use client side SSL certificates would make it possible to configure
spamc/spamd with secure user authentication without much change to the code, but
I think even that small change is too much for 3.1 and is not a sufficiently
complete solution.

On the other hand, as Duncan points out in the bug description, we have ended up
with something that has a vulnerability in any configuration that allows
untrusted people to have personal Bayes databases. I don't see why the problem
is any different with SQL vs DBM, as in both anyone can call spamc -L with any
-u argument that they want.

If we only document this, we would have to say that spamd can be used with
individual Bayes databases only in configurations in which identd authentication
can be trusted, or in which something like a VPN or ssh tunnel is used along
with accepting connections only from localhost. That seems to me to be too
restrictive.

I agree with Duncan that adding a spamd command line option that disables spamc
-L, plus adding documentation, is a sufficient step to take that is simple
enough to add in for the 3.1 ship. That would allow a sysadmin to run spamd on a
server, maintain individual Bayes databases for users, not have to count on
--auth-ident, and set up something ad hoc for users to train on errors such as
special IMAP folders or special email address to forward mistakes to.

We can then open an enhancement bugzilla ticket to consider how to get secure
spamc/spamd user authentication for 3.2.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From lwilton@earthlink.net  2005-08-21 23:34 -------
Subject: Re:  [review] spamc/spamd learning has potentially dangerous side-effects

> If the Bayes database backend is a DBM or SDBM file, B<--allow-tell>
> allows any user to write to any other user's database, which is not
> possible using just C<sa-learn>.

Maybe I'm missing some fine point of DB semantics here, but from the
viewpoint of the English language, the concept of something called
"allow-tell" allowing a tool to have world write access seems more than a
little strange.  I would more expect an "allow-tell" option to grant world
*read* access, if anything of the sort.

Would it maybe make more sense to name the
option --allow-mallicious-data-corruption or some such, that might give a
person looking at a system config file a hint that this option is related to
granting write access to the database?

        Loren





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546





------- Additional Comments From jm@jmason.org  2005-08-19 15:06 -------
I agree that a command-line flag is the way to fix this -- however, I'd prefer
it to be the other way around -- IOW, the flag is required to *enable* remote
learning.   otherwise we'll expose sysadmins to this danger once they upgrade,
on existing systems, which I don't think is good.


having said that, I don't think this is a 3.1.0 blocker.  IMO it could wait for
3.1.1, if nobody produces a patch in time.  (Duncan?)



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4546] [review] spamc/spamd learning has potentially dangerous side-effects

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4546


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jm@jmason.org  2005-08-24 20:40 -------
Committed revision 239986.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.