You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2014/05/31 02:45:29 UTC

[Bug 7046] New: spamc --randomize/-H switch broken

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

            Bug ID: 7046
           Summary: spamc --randomize/-H switch broken
           Product: Spamassassin
           Version: 3.4.2
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: spamc/spamd
          Assignee: dev@spamassassin.apache.org
          Reporter: kgc@sonic.net

spamc does not correctly randomize multiple target hosts when the
-H/--randomize switch is used.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

Karsten Bräckelmann <gu...@rudersport.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|3.4.2                       |unspecified

--- Comment #1 from Karsten Bräckelmann <gu...@rudersport.de> ---
Kelsey, please describe the issue in detail and add information on how to
reproduce.

* In what way does --randomize not work? What is the behavior you observe,
  and what do you expect instead?
* Are there any other spamc options used?
* Is DNS involved? What does it resolve to, single or multiple IP addresses?


Resetting version to unspecified. The originally reported version of 3.4.2 does
not exist. Please update with the correct version.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #4 from Kevin A. McGrail <km...@pccc.com> ---
(In reply to RW from comment #3)
> I was just looking at the code and it appear that it doesn't attempt to
> randomize the order, it just rotates the list a random number of time. This
> may be a bit confusing if you are expecting proper randomization.
> 
> Irrespective of whether it's working as intended, this is a poor way of
> doing load balancing. For example if there are four ip addresses A,B,C,D and
> A and B are offline, C gets three times as much load as D.

I believe you are misunderstanding what the switch does.

       -H, --randomize
           For TCP/IP sockets, randomize the IP addresses returned for the
hosts given by the -d switch. This provides for a simple kind of load
balancing.  It will try only three times though.

if you do -d host1,host2,host3... it randomizes those hosts.  It does not
randomize the A records if you have a host1 that gives 3 A records, for
example.

Are you proposing that you would like it to randomize all entries at a weight
of 1 whether explicit or from an A/AAAA record look up?

Regards,
KAM

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

--- Comment #5 from Kelsey Cummings <kg...@corp.sonic.net> ---
Regardless, the 3.3.2 spamc client, used with identical arguments, distributes
load evenly amongst the multiple servers returned by a host record with
multiple IP addresses.  The 3.4.0 client no longer exhibits this desirable
behavior.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

--- Comment #7 from Kevin A. McGrail <km...@pccc.com> ---
Thanks.  Likely someone adding quad A record support and not realizing the rest
of the loop impact is my off the cuff guess.  

Overall, I believed Kelsey and you've helped that I don't have to find the code
difference.

But I was on the same path because it's part of this svn change:

http://mail-archives.apache.org/mod_mbox/spamassassin-commits/201210.mbox/%3C20121002001948.25E0523888E4@eris.apache.org%3E

So apparently he matched the -H to the documentation.

Mark, do you remember why you did this?  Do we need to add yet another option
for the more truly randomization?

regards,
KAM

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

Kelsey Cummings <kg...@corp.sonic.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kgc@corp.sonic.net
            Version|unspecified                 |3.4.0

--- Comment #2 from Kelsey Cummings <kg...@corp.sonic.net> ---
--randomize appears to have no affect ;)

This is easily reproducible with any DNS record that returns multiple A records
or, presumably, multiple servers passed in with -d


# strace -econnect spamc -d spam-cluster -H < test.msg  2>&1 | grep 783

Will display the hosts tried in a order consistent with how they are returned
by the NS request - this is particularly problematic for users where their name
server does not shuffle multiple records (powerdns.)

eg:

connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.221.234")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.208.75")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.208.73")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.221.236")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.208.76")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.221.233")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.221.235")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(783),
sin_addr=inet_addr("1.1.208.74")}, 16) = 0

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

RW <rw...@googlemail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rwmaillists@googlemail.com

--- Comment #3 from RW <rw...@googlemail.com> ---
I was just looking at the code and it appear that it doesn't attempt to
randomize the order, it just rotates the list a random number of time. This may
be a bit confusing if you are expecting proper randomization.

Irrespective of whether it's working as intended, this is a poor way of doing
load balancing. For example if there are four ip addresses A,B,C,D and  A and B
are offline, C gets three times as much load as D.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

--- Comment #8 from RW <rw...@googlemail.com> ---

The old behaviour wouldn't have worked optimally with multiple host entries
that are multi-homed, since it tries 3 consecutive IP addresses rather than 3
consecutive hosts. 

IMO  it would be better to leave that code as it is and fix the randomization
function so it moves a random 3 host entries to the beginning of the array and
then randomizes the address list for those entries. Note, you can see from the
above strace output above that all the IP addresses for a host are tried at
present.

It would be better to avoid random rotations altogether.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7046] spamc --randomize/-H switch broken

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7046

William Taylor <wi...@napanet.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |williamt@napanet.net

--- Comment #6 from William Taylor <wi...@napanet.net> ---
What Kelsey is saying is true. Ever since we upgraded to 3.4.0 the -H flag to
spamc has no effect when multiple ip addresses are returned. It always keeps
the first one it gets.

This appears to be the code that broke -H. The for loop is the old behavior. If
you comment out the tp->hosts line and un-comment the for loop you will get the
old behavior.


            /* treat all A or AAAA records of each host as one entry */
            tp->hosts[tp->nhosts++] = res;

            /* alternatively, treat multiple A or AAAA records
               of one host as individual entries */
/*
          for (addrp = res; addrp != NULL; ) {
               tp->hosts[tp->nhosts] = addrp;
               addrp = addrp->ai_next;     /* before NULLing ai_next */
               tp->hosts[tp->nhosts]->ai_next = NULL;
               tp->nhosts++;
           }
*/

-- 
You are receiving this mail because:
You are the assignee for the bug.