Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/09/15 17:44:25 UTC

[Bug 3780] ability to decode less than full attachment with decode()

http://bugzilla.spamassassin.org/show_bug.cgi?id=3780





------- Additional Comments From felicity@kluge.net  2004-09-15 08:44 -------
Subject: Re:  New: ability to decode less than full attachment with decode()

On Tue, Sep 14, 2004 at 10:04:51PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> It should be possible to define how much data needs to be decoded for
> non-textual parts.  It might even be a good idea to allow this functionality
> to be generic to any MIME type (as long as there is a way to say "non-text"
> still).

That doesn't really make any sense.  Decode will do what you ask it to do --
it doesn't need to know the part is text/* vs attachment/*.  That's up to the
caller.  All decode needs to know is: what part (implicit in the function
call), how many bytes to return (already does that), and in this case how many
bytes to decode.

The easiest thing, actually, would be to change the "number of bytes to
return" to mean "decode up to this many bytes and return them".  And what
we're really talking about is base64 since quoted-printable isn't much bigger
than the raw version.

So something like:

$part->decode(10);

would actually mean "take 10 * 8/6 (rounded up to the nearest multiple of 4)
encoded bytes, decode the base64, then return the first 10 decoded bytes".
Here that works out to 16 encoded bytes, which decode to 12.  Then if other
calls asked for the same or less data, it could be returned directly;
otherwise decode would have to process the next set, and to avoid issues only
the next bits of data that weren't already decoded.  The main issue here, of
course, is that the base64 decode function takes care of encoded sections that
aren't valid, whitespace, etc.  So that would have to be handled in here
somehow.
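
As a rough sketch of the arithmetic above, and nothing more: the helper names
encoded_len_for() and decode_prefix() are made up for illustration and are not
part of SpamAssassin, and the code assumes clean base64 with no whitespace or
invalid characters in the part being sliced.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: how many base64 characters are needed to
# recover at least $want decoded bytes.
sub encoded_len_for {
    my ($want) = @_;
    # Every 4 encoded characters carry 3 decoded bytes, so round the
    # byte count up to a whole group of 3 and multiply by 4.
    return 4 * int(($want + 2) / 3);
}

# Hypothetical helper: decode only as much of $encoded as is needed,
# then trim to the requested length.
# Assumption: $encoded contains nothing but valid base64 characters.
sub decode_prefix {
    my ($encoded, $want) = @_;
    my $chunk = substr($encoded, 0, encoded_len_for($want));
    return substr(decode_base64($chunk), 0, $want);
}

my $raw     = join '', map { chr($_) } 0 .. 99;
my $encoded = encode_base64($raw, '');    # '' suppresses line breaks
my $first10 = decode_prefix($encoded, 10);
```

For a request of 10 bytes this slices 16 encoded characters and throws away
the last 2 of the 12 decoded bytes.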

I would do this as:

When decode is called the first time, create an object variable a la
"bytes_decoded" set to 0.  If it exists, decode was previously called.  If
it's undef, the whole thing was decoded.  If it's a number, only part was.
There'd be some logic at the top to figure out what needs to happen.

> The main reason is so that binary attachments can be decoded partially if
> only a small amount is needed.  The configuration setting should be something
> that can be upped as needed by plugins and callers (while configuration is
> read).

It would have to be an option to decode(), which would have to be able to
know that the full part wasn't already decoded and dynamically decode more
bytes as the need arises.

Another option is slightly reverting to the SA3 version of code and
using temp files for large decoded/rendered parts.


Overall, I don't know if this functionality is worth the effort.  The vast
majority of calls are for text parts where you want the full part decoded.  I
don't know how many binaries we're going to be decoding.  I still haven't
heard why you'd want to do this.




