Posted to dev@santuario.apache.org by Scott Cantor <ca...@osu.edu> on 2003/09/03 21:38:55 UTC

Extra linefeed in base64 encoder?

Not sure if this is a bug or not, but the Base64 encoder used by the Java
xmlsec code inserts an extra linefeed if an encoded object is exactly the
length of a multiple of the line length setting. (IOW, if you wrap after 76
bytes, it puts an extra linefeed in if the object is a multiple of 76 after
encoding). There's even a comment in the code about it, and it says Sun did
something similar.
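
To illustrate the behaviour described above, here is a minimal sketch (not the xmlsec code) of a wrapper that puts a linefeed only between lines, so output whose length is an exact multiple of the line length gets no trailing linefeed. The class and method names are hypothetical, and it assumes java.util.Base64 from Java 8 or later, which postdates this thread.

```java
import java.util.Base64;

class Base64Wrap {
    // Hypothetical helper: wrap the encoded text at lineLength characters,
    // writing a linefeed only BETWEEN lines, never after the last one.
    static String encodeWrapped(byte[] data, int lineLength) {
        String raw = Base64.getEncoder().encodeToString(data);
        StringBuilder out = new StringBuilder(raw.length() + raw.length() / lineLength + 1);
        for (int i = 0; i < raw.length(); i += lineLength) {
            if (i > 0) {
                out.append('\n'); // separator before every line except the first
            }
            out.append(raw, i, Math.min(i + lineLength, raw.length()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 57 input bytes encode to exactly 76 base64 characters (the boundary case).
        String encoded = encodeWrapped(new byte[57], 76);
        System.out.println(encoded.endsWith("\n")); // no trailing linefeed expected
    }
}
```

An input of 57 bytes is the smallest case that lands exactly on a 76-character boundary, so it is a handy regression test for this kind of encoder.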

Well, Xerces-C claims it's not valid base64 when it schema-validates.

Now, Xerces-C as of 2.2 is horrendously strict, so much so that I'm pretty
sure it rejects legal stuff, and I've sent in a note about it. But they may
actually be following the letter of the spec, I don't know. I'm not sure
there *is* a clear spec for putting base64 in XML, frankly. It's a mess, to
me.

So I guess my question is, does anybody know who's wrong? And if xmlsec is,
can that code be patched?

-- Scott


RE: Extra linefeed in base64 encoder?

Posted by Scott Cantor <ca...@osu.edu>.
> base64 doesn't seem to care too much about whitespace and no 
> decoder (general purpose decoder that is) should fail on any 
> base64 encoding with or without whitespace.

Xerces-C admittedly doesn't promise that their decoder class is for public
consumption, but this error is being thrown by the validator anyway, so
that's technically correct no matter what their decoder might allow.

> They are right and I think it should be fixed. If anyone has 
> any objections, please let me know sometime soon. I will be 
> looking to commit a patch at the earliest opportunity from 
> Sunday on, but Sunday is not all that unlikely. Patches are welcome :)

I'm jammed myself, and we won't be shipping a new release immediately
anyway, so I'm not in a major hurry. So far only one cert ended up aligning
on a 76 byte boundary in the whole testing community. ;-)

Thanks...I mostly wanted to make sure it got addressed before any new
release of the codebase, which I saw was being tossed around.

-- Scott


RE: Extra linefeed in base64 encoder?

Posted by Erwin van der Koogh <vd...@apache.org>.
> Ok, here's some info:
> http://marc.theaimsgroup.com/?l=xerces-c-dev&m=106261978927834&w=2
>
> Probably what xmlsec is producing is legal base64, but it's not legal
> XML Schema base64:
>
> http://www.w3.org/2001/05/xmlschema-rec-comments.html#pfibase64

The spec is pretty specific, although somewhat confusing: officially it
doesn't allow more than one linefeed after the base64 quartets, and it
doesn't allow any other whitespace characters either.

> The base64 spec itself doesn't have much to say about this issue, but
> there are specific text-encodings of base64 that do. XML itself has no
> such encoding specified that I've ever found. XML Schema apparently now
> does, in this errata he referenced.

base64 doesn't seem to care too much about whitespace and no decoder
(general purpose decoder that is) should fail on any base64 encoding with
or without whitespace.
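
As a rough illustration of the difference between a lenient general-purpose decoder and a strict one, the sketch below uses the JDK's java.util.Base64 (Java 8 or later, so not something available when this thread was written). The MIME decoder skips linefeeds and other characters outside the base64 alphabet, while the basic decoder rejects them. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class LenientDecodeDemo {
    // Hypothetical helper: decode base64 text with the MIME decoder, which
    // ignores linefeeds and any other characters outside the alphabet.
    static String decodeLenient(String text) {
        byte[] bytes = Base64.getMimeDecoder().decode(text);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // The basic decoder, by contrast, throws on embedded linefeeds.
    static boolean strictAccepts(String text) {
        try {
            Base64.getDecoder().decode(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withExtraLinefeeds = "QUJD\n\nREVG\n";
        System.out.println(decodeLenient(withExtraLinefeeds)); // lenient decode succeeds
        System.out.println(strictAccepts(withExtraLinefeeds)); // strict decode does not
    }
}
```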
> It now dictates that lexically, linefeeds are not required, which is
> nice. Too bad so many decoders choke if they're not present, in my
> experience, though maybe Xerces-J no longer does. ;-)

I would definitely be in favor of keeping the linefeeds in the base64
encoded strings within Xerces, if nothing else because it looks prettier :)

> Lexically speaking, Xerces-C claims that you can't have multiple
> whitespace characters in a row according to that grammar, which would
> show up if you have the extra linefeed at the end.

They are right and I think it should be fixed. If anyone has any
objections, please let me know sometime soon. I will be looking to commit
a patch at the earliest opportunity from Sunday on, but Sunday is not all
that unlikely. Patches are welcome :)

>> I am horrendously busy this and next week, but I would be
>> more than willing to patch it in the meantime if someone else
>> would be willing to do a little research and c&p the
>> following in an email to the list:
>> - Base64 spec about wrapping
>> - XML spec about base64 encoding (if anything)
>> - XML Signature spec about base64 encoding (if anything)
>
> I don't think the Signature spec says anything, but I could be wrong. I
> think the issue here is with XML Schema, and I would think it would be
> preferable for anything outputting base64 to shoot for
> schema-compliance lest it produce XML that can't validate.

I don't expect the signature spec to say anything but "the xml must be
well-formed according to XML-Schema" or something similar, but I'll check
just to be sure.

Erwin



RE: Extra linefeed in base64 encoder?

Posted by Scott Cantor <ca...@osu.edu>.
> I would have to go back and read the Base64 spec.. I have 
> done it so many times, but I still get headaches when trying 
> to remember details from it

+1

> > Well, Xerces-C claims it's not valid base64 when it schema-validates.
> 
> It might be right :)

Ok, here's some info:
http://marc.theaimsgroup.com/?l=xerces-c-dev&m=106261978927834&w=2

Probably what xmlsec is producing is legal base64, but it's not legal XML
Schema base64:

http://www.w3.org/2001/05/xmlschema-rec-comments.html#pfibase64

The base64 spec itself doesn't have much to say about this issue, but there
are specific text-encodings of base64 that do. XML itself has no such
encoding specified that I've ever found. XML Schema apparently now does, in
this errata he referenced.

It now dictates that lexically, linefeeds are not required, which is nice.
Too bad so many decoders choke if they're not present, in my experience,
though maybe Xerces-J no longer does. ;-)

Lexically speaking, Xerces-C claims that you can't have multiple whitespace
characters in a row according to that grammar, which would show up if you
have the extra linefeed at the end.
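
A quick way to spot output that would trip this rule is to scan the encoded text for two whitespace characters in a row. The sketch below encodes only the rule as described in this thread, not the XML Schema grammar itself, and the class and method names are hypothetical.

```java
class Base64WhitespaceCheck {
    // Hypothetical helper: true if the text contains two or more
    // whitespace characters in a row (e.g. a doubled trailing linefeed).
    static boolean hasConsecutiveWhitespace(String text) {
        boolean previousWasWhitespace = false;
        for (int i = 0; i < text.length(); i++) {
            boolean isWhitespace = Character.isWhitespace(text.charAt(i));
            if (isWhitespace && previousWasWhitespace) {
                return true;
            }
            previousWasWhitespace = isWhitespace;
        }
        return false;
    }

    public static void main(String[] args) {
        // Encoder linefeed followed by the document's own linefeed before the end tag.
        System.out.println(hasConsecutiveWhitespace("QUJDREVG\n\n"));
        // A single linefeed between lines is fine under this check.
        System.out.println(hasConsecutiveWhitespace("QUJD\nREVG"));
    }
}
```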

> I am horrendously busy this and next week, but I would be 
> more than willing to patch it in the meantime if someone else 
> would be willing to do a little research and c&p the 
> following in an email to the list:
> - Base64 spec about wrapping
> - XML spec about base64 encoding (if anything)
> - XML Signature spec about base64 encoding (if anything)

I don't think the Signature spec says anything, but I could be wrong. I
think the issue here is with XML Schema, and I would think it would be
preferable for anything outputting base64 to shoot for schema-compliance
lest it produce XML that can't validate.

-- Scott


Re: Extra linefeed in base64 encoder?

Posted by Erwin van der Koogh <vd...@apache.org>.
> Not sure if this is a bug or not, but the Base64 encoder used by the
> Java xmlsec code inserts an extra linefeed if an encoded object is
> exactly the length of a multiple of the line length setting. (IOW, if
> you wrap after 76 bytes, it puts an extra linefeed in if the object is
> a multiple of 76 after encoding). There's even a comment in the code
> about it, and it says Sun did something similar.

I would have to go back and read the Base64 spec. I have done it so many
times, but I still get headaches when trying to remember details from it :)

> Well, Xerces-C claims it's not valid base64 when it schema-validates.

It might be right :)

> Now, Xerces-C as of 2.2 is horrendously strict, so much so that I'm
> pretty sure it rejects legal stuff, and I've sent in a note about it.
> But they may actually be following the letter of the spec, I don't
> know. I'm not sure there *is* a clear spec for putting base64 in XML,
> frankly. It's a mess, to me.
>
> So I guess my question is, does anybody know who's wrong? And if xmlsec
> is, can that code be patched?

I am horrendously busy this and next week, but I would be more than
willing to patch it in the meantime if someone else would be willing to do
a little research and c&p the following in an email to the list:
- Base64 spec about wrapping
- XML spec about base64 encoding (if anything)
- XML Signature spec about base64 encoding (if anything)

If there is nothing in the specs I will commit a patch if it passes the
interop testing.

Erwin