Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/12/19 11:20:02 UTC

[Bug 4742] New: rounding of score number and score stars is different

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742

           Summary: rounding of score number and score stars is different
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Score Generation
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: sehh@altered.com


Currently, after the final score has been generated, it is rounded. Thus a score
of 7.9 will become score=8.0 on the email.

Unfortunately, while a score of 7.9 becomes 8.0, the rounding for the stars
works by rounding down even a 7.9 score, thus the email gets 7 stars!

As a result, anyone using a filter to catch emails with a score of 8 stars or
above will miss many emails that show score=8.0 but only have 7 stars.

For consistent rounding, if a score of X.Y becomes X+1 then it should get the
equivalent stars of X+1 as well.
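
A minimal sketch of the mismatch described above, written in Python rather than
taken from SpamAssassin's own code. The function names are hypothetical, and the
example score of 7.98 is an assumption chosen so that it displays as 8.0 at one
decimal place:

```python
import math


def displayed_score(score: float, decimals: int = 1) -> str:
    """Format the score with normal rounding, as it would appear in score=..."""
    return f"{score:.{decimals}f}"


def star_count(score: float) -> int:
    """Count stars by rounding the score down (never below zero)."""
    return max(0, math.floor(score))


score = 7.98
print("score=" + displayed_score(score))  # rounded for display
print("stars=" + "*" * star_count(score))  # rounded down
```

With these assumptions the same message shows score=8.0 next to seven stars,
which is the inconsistency this report is about.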

Thank you.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From sehh@altered.com  2005-12-19 19:08 -------
I fully understand what you mean. I'm already using the X-Spam-Status header to
detect spam, although I use the stars to delete spam with a score of 8 or higher.

I do understand that score=8.0 probably means a real score of 7.9.

Unfortunately, I still believe that it's confusing when we see spam not being
deleted even though it still gets score=8.0 with 7 stars.

Anyway, thank you for taking the time to explain the rounding problem; I
appreciate it.





[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From lwilton@earthlink.net  2005-12-19 18:03 -------
> Unfortunately, that's the easy case, here is what happens in reality:

> If your threshold is 5, then 5.1 is spam and 5.0 is ham, as well as 4.9 is
> ham.

No.  What happens in *reality* is that an indefinite-precision number is 
compared to a definite-precision constant, in your case 5.00000000000. Any 
number in internal storage format with a value less than the 5.000000000000000 
will be shown as ham, for instance 4.999999999999999.  Any value of 5.0 or 
greater is spam.  Contrary to your statement, 5.000000000000 is SPAM, not HAM.

*AFTER* this comparison is made, using the internal values, the *internal* 
value is rounded to a *display* value, using the *number of rounding digits you 
specify*.

The number of display rounding digits you specify does not affect the 
comparison; nor should it.  It is a DISPLAY option, not a COMPARISON option.

If you wish to set an integral display rounding, and still want to know if the 
mail is ham or spam, there are three possible ways:

1. Examine the X-Spam-Status Yes/No header.  It represents the results of the 
internal comparison and does not lie.

2. Use an integral spam threshold (not fractional) and examine the number of 
stars.  The number of stars does not lie IF the spam threshold is integral.

3. Examine the Subject header for the spam flag if configured, typically *** 
SPAM ***.  It also does not lie.

If you want an *erroneous* determination of whether the message is ham or spam, 
set the rounding to less than three fractional digits and examine the spam 
score.  Since it will round up or down to match the requested number of 
fractional digits, it can lie any time the *actual* score is close to the 
threshold, and gets rounded to the other side due to loss of significant digits 
in the rounding.
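
The order of operations described above can be sketched as follows. This is an
illustration in Python under stated assumptions, not SpamAssassin's actual code:
header_values is a hypothetical helper, and the threshold of 5.0 and the score
of 4.981 are taken from examples elsewhere in this thread.

```python
def header_values(score: float, threshold: float = 5.0, decimals: int = 1) -> dict:
    """Hypothetical helper: derive the three values discussed in this thread."""
    return {
        # The verdict uses the full-precision score, never the rounded one.
        "is_spam": score >= threshold,
        # The displayed score is rounded to the configured number of digits.
        "score": f"{score:.{decimals}f}",
        # Stars are truncated, so they never overstate the score.
        "stars": "*" * max(0, int(score)),
    }


print(header_values(4.981))              # displayed as 5.0, yet not spam
print(header_values(4.981, decimals=3))  # more digits remove the ambiguity

# With an integral threshold, the star count always agrees with the verdict.
for s in (4.5, 4.981, 5.0, 5.1, 7.98):
    values = header_values(s, threshold=5.0)
    assert (len(values["stars"]) >= 5) == values["is_spam"]
```

Only the rounded score can end up on the wrong side of the threshold; the
verdict and the stars are both derived from the unrounded value.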





[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE




------- Additional Comments From sidney@sidney.com  2005-12-19 11:57 -------
Closing as duplicate of bug 2659.

Bug 2659, comment #5 explains our rationale for the current behavior.
It is confusing enough to users that it is in the FAQ, but the rationale still
stands.


*** This bug has been marked as a duplicate of 2659 ***




[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From sehh@altered.com  2005-12-19 12:41 -------
comment #5: "Stars are rounded down because they are often used as a filtering
criteria (so 5 stars should be spam)"

I'm sorry, I don't understand.

I'm using stars as filtering criteria via procmail.

Why is that a reason to have less stars than the real score??

If I want to delete spam with score 8 or above, I'd expect to filter 8 stars or
above.





[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From sehh@altered.com  2005-12-19 17:42 -------
"If your threshold is 5, then 5.1 is spam and 4.9 is ham."

Unfortunately, that's the easy case; here is what happens in reality:

If your threshold is 5, then 5.1 is spam and 5.0 is ham, as well as 4.9 is ham.

That's what we see: score=5.0 is treated as ham, because you use different
rounding for the score and the stars. This is confusing.

I see two solutions: either do the spam check AFTER you've rounded the score,
so that both score and stars are consistent, or specify more decimal places as
suggested by Sidney.

Thus the first solution gives score=5.0 and 5 stars while the second solution
gives score=4.981 with 4 stars.

Both solutions are of course much more consistent than the current method and
don't confuse the user.
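
The two proposals above can be compared with a small Python sketch. Both
functions are hypothetical illustrations of the suggestions in this comment,
not existing SpamAssassin behaviour; the threshold of 5.0 and the score of 4.981
come from the examples in this thread.

```python
def classify_after_rounding(score: float, threshold: float = 5.0, decimals: int = 1) -> dict:
    """Proposal 1: round first, then derive verdict and stars from the rounded value."""
    rounded = round(score, decimals)
    return {
        "is_spam": rounded >= threshold,
        "score": f"{rounded:.{decimals}f}",
        "stars": "*" * max(0, int(rounded)),
    }


def show_more_decimals(score: float, threshold: float = 5.0, decimals: int = 3) -> dict:
    """Proposal 2: keep the current logic but display more fractional digits."""
    return {
        "is_spam": score >= threshold,
        "score": f"{score:.{decimals}f}",
        "stars": "*" * max(0, int(score)),
    }


print(classify_after_rounding(4.981))  # score=5.0, 5 stars, now counted as spam
print(show_more_decimals(4.981))       # score=4.981, 4 stars, still ham
```

Note that the first proposal changes the verdict itself: a message scoring 4.981
would be classified as spam, whereas the second proposal only changes what is
displayed.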







[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From sehh@altered.com  2005-12-19 13:27 -------
But my email was rounded up to 8.0, thus that is a valid 8-star score!

Why round up the numeric score and round down the star score?

It would only make sense, if 7.9 was rounded to 7! Thus, it should get 7 stars.




[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From sidney@sidney.com  2005-12-19 15:59 -------
If your threshold is 5, then 5.1 is spam and 4.9 is ham.

The first case gets a header of X-Spam-Status: Yes, the latter gets
X-Spam-Status: No

The number of stars is intended to be used for filtering spam from ham. The
first case is spam and so should have five stars. The second case is ham and so
should have four stars. Rounding up the number of stars would break that
filtering.

Displaying the score follows normal rules of rounding for displaying numbers
with a certain precision. If you really care to see the scores with more
precision, specify more decimal places in your configuration options. But in any
case use the X-Spam-Status: Yes or No, or use the number of stars for filtering
the spam from the ham.





[Bug 4742] rounding of score number and score stars is different

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4742





------- Additional Comments From bas@debian.org  2005-12-19 13:18 -------
> If i want to delete spam with score 8 or above, i'd expect to filter 8 stars or
> above.

Exactly.  So if the score is 7.98, you don't want to delete the message, which
is exactly what happens.
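
As an illustration of that point, here is a rough Python sketch of a star-based
filter. It assumes the stars are carried in a header named X-Spam-Level (this
thread does not name the header), and both the helper has_at_least_stars and
the sample headers are made up for the example; a procmail recipe would apply
the same kind of pattern match.

```python
import re


def has_at_least_stars(headers: str, n: int, header_name: str = "X-Spam-Level") -> bool:
    """Return True if the named header starts with at least n stars."""
    pattern = rf"^{re.escape(header_name)}:\s*\*{{{n}}}"
    return re.search(pattern, headers, flags=re.MULTILINE) is not None


# Illustrative headers for a message whose real score is 7.98.
sample = (
    "X-Spam-Status: Yes, score=8.0 required=5.0\n"
    "X-Spam-Level: *******\n"
)

print(has_at_least_stars(sample, 8))  # the "delete at 8 or above" rule
print(has_at_least_stars(sample, 7))  # a rule set one star lower
```

Under these assumptions the "8 stars or above" rule leaves the 7.98 message
alone, because its unrounded score is below 8 even though it displays as 8.0.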


