You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Alex Herbert <al...@gmail.com> on 2020/01/20 15:25:42 UTC

[codec] Base32/Base64 trailing bit validation is incomplete

In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes 
that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was 
to correct the use of a mask to check the trailing bytes.

The implementation checks trailing bits that are to be discarded are all 
zero.

However the check does not go so far as to throw an exception when there 
are trailing bits below the count of 8. In this case there cannot be any 
more bytes and the bits are discarded.

I assume this is because before the trailing bit validation it was not 
necessary to do anything when the number of trailing bits was less than 
a single byte.

However this means it is still possible to decode some bytes and then 
encode them to create different bytes as shown here:

@Test
public void testTrailing6bits() {
     final Base64 codec = new Base64();
     // A 4 byte block plus an extra one
     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
     byte[] decoded = codec.decode(bytes);
     byte[] encoded = codec.encode(decoded);
     Assert.assertTrue("The round trip is possible", 
Arrays.equals(bytes, encoded));
     // A 4 byte block plus an extra one
     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
     decoded = codec.decode(bytes);
     encoded = codec.encode(decoded);
     Assert.assertFalse("The round trip is not possible", 
Arrays.equals(bytes, encoded));
}

@Test
public void testTrailing5bits() {
     final Base32 codec = new Base32();
     // An 8 byte block plus an extra one
     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' };
     byte[] decoded = codec.decode(bytes);
     byte[] encoded = codec.encode(decoded);
     Assert.assertTrue("The round trip is possible", 
Arrays.equals(bytes, encoded));
     // An 8 byte block plus an extra one
     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
     decoded = codec.decode(bytes);
     encoded = codec.encode(decoded);
     Assert.assertFalse("The round trip is not possible", 
Arrays.equals(bytes, encoded));
}

Previously when fixing the trailing bit validation I suggested 
validating all trailing bits, even when they cannot be converted into 
any bytes. However this was not done and there are still some invalid 
inputs that do not trigger an exception.

I propose to update the fix by throwing an IllegalArgumentException if 
there are left-over bits that cannot be decoded. That seems to be the 
intention of CODEC-134 to prevent security exploitation.

However note that stricter validation is causing users to complain about 
exceptions when they should be questioning their own data. Recently a 
user raised Codec-279 which stated that exceptions were being thrown 
[2]. I believe the code is functioning as expected. So adding extra 
validation may prove to be more of a headache to users who have somehow 
obtained invalid encodings.

One workaround is to fix the implementation to reject anything that 
cannot be re-encoded to the same bytes. Then add a property that allows 
this behaviour to be suppressed allowing for example:

Base64 codec = new Base64();
byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
byte[] decoded;
try {
     decoded = codec.decode(bytes);
} catch (IllegalArgumentException ignore) {
     codec.setAllowTrailingBits(true);
     // OK
     decoded = codec.decode(bytes);
}

WDYT?

1. Fully fix CODEC-134
2. Optionally allow CODEC-134 behaviour to be suppressed

I would vote for option 1. It requires 1 line change in the code for 
Base32 and Base64.

I am open to opinions for option 2. It would allow users to upgrade and 
then opt-in to previous functionality.

Alex


[1] https://issues.apache.org/jira/browse/CODEC-134

[2] https://issues.apache.org/jira/browse/CODEC-279


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Base32/Base64 trailing bit validation is incomplete

Posted by Alex Herbert <al...@gmail.com>.
This has turned out to be non-trivial. Current progress in in PR [1].


Easy

- Added a property to BaseNCodec to enable strict decoding

All other properties of the codec are set in the constructor and are final. So using a property is a change. I can change to overloaded constructors but there are already a lot of these so a property or builder type pattern is cleaner e.g:

// Returns a new codec with strict decoding, all other properties the same
Base64 withStrictDecoding();

I can remove the isStrictEncoding() method. However this would have to replaced by the package/protected level access to the property for Base32/Base64. Other properties in the class are protected. So maybe drop the isStrictEncoding() method and change the accessor level for the property. However there is another property in Base64 (isUrlSafe()) so properties are not excluded from the design pattern.

- Added tests to show Base64 strict mode works

This is fine. Base64 encoding cannot create a final 6-bit character. Previously this was ignored. It now optionally throws in strict mode.

All other validation of trailing 12 or 18 bits works. I added tests to show that a decoded byte[] is re-encoded to the same byte[] when in strict mode but not in lenient mode.


Issues

- Added tests to show Base32 strict mode works

It does. It also shows that the current decoder is capable of decoding non-sense.

Referring to the final section for the encode method shows encoding can create a count of 2, 4, 5, or 7 trailing characters. So a valid Base32 encoding will never have 1, 3, or 6 trailing characters.

Previously 1 trailing character was ignored. It now optionally throws. So far, so good.

But the previous behaviour for 3 or 6 trailing characters is to decode them into as many bytes as is possible. This does not make sense. They could only have come from an invalid encoding:

3 chars = 15-bits = 1 byte and 7 remainder. So this should have used 2 trailing characters and a pad.
6 chars = 30-bits = 3 bytes and 6 remainder. So this should have used 5 trailing characters and a pad.

In the PR I have preserved this behaviour for lenient mode. This is backwards compatible. In strict mode this will now throw. So lenient mode is allowing non-pad characters to act as padding in certain cases.

As before I added tests to show that a decoded byte[] is re-encoded to the same byte[] when in strict mode but not in lenient mode.


Unresolved

Base64 is used by codec.net.BCodec. This also had tests for impossible encodings. These now fail as the default is to revert to lenient mode.

So this requires a decision on whether BCodec should:

- Regress to lenient mode
- Move to strict mode and throw more exceptions than before
- Also add a property to enable strict mode


Have a look at the PR and give some suggestions.

Alex



[1] https://github.com/apache/commons-codec/pull/35 <https://github.com/apache/commons-codec/pull/35>


Re: [codec] Base32/Base64 trailing bit validation is incomplete

Posted by Alex Herbert <al...@gmail.com>.

> On 20 Jan 2020, at 23:54, Gary Gregory <ga...@gmail.com> wrote:
> 
> On Mon, Jan 20, 2020 at 6:45 PM Alex Herbert <alex.d.herbert@gmail.com <ma...@gmail.com>>
> wrote:
> 
>> 
>> 
>>> On 20 Jan 2020, at 23:20, Gary Gregory <ga...@gmail.com> wrote:
>>> 
>>> I hope this would be at the level of BaseNCodec, not just Base64.
>>> 
>>> I am not sure I like BaseNCodec#setAllowTrailingBits(boolean), maybe
>>> someone more general like BaseNCodec.setStrictDecoding(boolean) where the
>>> default is false for backward compatibility.
>> 
>> I prefer that name.
>> 
>>> 
>>> I do not think we want to go as far as creating an enum for various
>>> enforcement features.
>> 
>> At the moment the implementation is half way between strict and relaxed.
>> Setting it back to relaxed by default removes changes attempted in
>> CODEC-134. This perhaps needs some input from the originator of CODEC-134
>> for their use case. If it is for an uncommon edge case with a secure
>> context then having a non-strict default seems more sensible. The default
>> just throws away stuff that it cannot decode at the end. In most cases this
>> is fine. User reports on this contain statements along the lines of "other
>> libraries can do the decoding, why does codec error?"
>> 
>> So set this to strict=false by default and then allow a strict=true via a
>> BaseNCodec property.
>> 
>> I don’t think we want to add more static overrides with this flag so the
>> default for the static methods would go back to non-strict.
>> 
>> If no objections I’ll raise a Jira ticket and PR to implement this.
>> 
> 
> Sure, and let's use the term 'lenient' instead of 'non-strict’.

OK. I’ll work on this in the next few days.

> 
> Gary
> 
> 
>>> 
>>> Gary
>>> 
>>> On Mon, Jan 20, 2020 at 10:25 AM Alex Herbert <al...@gmail.com>
>>> wrote:
>>> 
>>>> In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes
>>>> that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was
>>>> to correct the use of a mask to check the trailing bytes.
>>>> 
>>>> The implementation checks trailing bits that are to be discarded are all
>>>> zero.
>>>> 
>>>> However the check does not go so far as to throw an exception when there
>>>> are trailing bits below the count of 8. In this case there cannot be any
>>>> more bytes and the bits are discarded.
>>>> 
>>>> I assume this is because before the trailing bit validation it was not
>>>> necessary to do anything when the number of trailing bits was less than
>>>> a single byte.
>>>> 
>>>> However this means it is still possible to decode some bytes and then
>>>> encode them to create different bytes as shown here:
>>>> 
>>>> @Test
>>>> public void testTrailing6bits() {
>>>>    final Base64 codec = new Base64();
>>>>    // A 4 byte block plus an extra one
>>>>    byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
>>>>    byte[] decoded = codec.decode(bytes);
>>>>    byte[] encoded = codec.encode(decoded);
>>>>    Assert.assertTrue("The round trip is possible",
>>>> Arrays.equals(bytes, encoded));
>>>>    // A 4 byte block plus an extra one
>>>>    bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>>>>    decoded = codec.decode(bytes);
>>>>    encoded = codec.encode(decoded);
>>>>    Assert.assertFalse("The round trip is not possible",
>>>> Arrays.equals(bytes, encoded));
>>>> }
>>>> 
>>>> @Test
>>>> public void testTrailing5bits() {
>>>>    final Base32 codec = new Base32();
>>>>    // An 8 byte block plus an extra one
>>>>    byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'
>> };
>>>>    byte[] decoded = codec.decode(bytes);
>>>>    byte[] encoded = codec.encode(decoded);
>>>>    Assert.assertTrue("The round trip is possible",
>>>> Arrays.equals(bytes, encoded));
>>>>    // An 8 byte block plus an extra one
>>>>    bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
>>>>    decoded = codec.decode(bytes);
>>>>    encoded = codec.encode(decoded);
>>>>    Assert.assertFalse("The round trip is not possible",
>>>> Arrays.equals(bytes, encoded));
>>>> }
>>>> 
>>>> Previously when fixing the trailing bit validation I suggested
>>>> validating all trailing bits, even when they cannot be converted into
>>>> any bytes. However this was not done and there are still some invalid
>>>> inputs that do not trigger an exception.
>>>> 
>>>> I propose to update the fix by throwing an IllegalArgumentException if
>>>> there are left-over bits that cannot be decoded. That seems to be the
>>>> intention of CODEC-134 to prevent security exploitation.
>>>> 
>>>> However note that stricter validation is causing users to complain about
>>>> exceptions when they should be questioning their own data. Recently a
>>>> user raised Codec-279 which stated that exceptions were being thrown
>>>> [2]. I believe the code is functioning as expected. So adding extra
>>>> validation may prove to be more of a headache to users who have somehow
>>>> obtained invalid encodings.
>>>> 
>>>> One workaround is to fix the implementation to reject anything that
>>>> cannot be re-encoded to the same bytes. Then add a property that allows
>>>> this behaviour to be suppressed allowing for example:
>>>> 
>>>> Base64 codec = new Base64();
>>>> byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>>>> byte[] decoded;
>>>> try {
>>>>    decoded = codec.decode(bytes);
>>>> } catch (IllegalArgumentException ignore) {
>>>>    codec.setAllowTrailingBits(true);
>>>>    // OK
>>>>    decoded = codec.decode(bytes);
>>>> }
>>>> 
>>>> WDYT?
>>>> 
>>>> 1. Fully fix CODEC-134
>>>> 2. Optionally allow CODEC-134 behaviour to be suppressed
>>>> 
>>>> I would vote for option 1. It requires 1 line change in the code for
>>>> Base32 and Base64.
>>>> 
>>>> I am open to opinions for option 2. It would allow users to upgrade and
>>>> then opt-in to previous functionality.
>>>> 
>>>> Alex
>>>> 
>>>> 
>>>> [1] https://issues.apache.org/jira/browse/CODEC-134
>>>> 
>>>> [2] https://issues.apache.org/jira/browse/CODEC-279
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>> 
>>>> 
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org <ma...@commons.apache.org>
>> For additional commands, e-mail: dev-help@commons.apache.org <ma...@commons.apache.org>

Re: [codec] Base32/Base64 trailing bit validation is incomplete

Posted by Gary Gregory <ga...@gmail.com>.
On Mon, Jan 20, 2020 at 6:45 PM Alex Herbert <al...@gmail.com>
wrote:

>
>
> > On 20 Jan 2020, at 23:20, Gary Gregory <ga...@gmail.com> wrote:
> >
> > I hope this would be at the level of BaseNCodec, not just Base64.
> >
> > I am not sure I like BaseNCodec#setAllowTrailingBits(boolean), maybe
> > someone more general like BaseNCodec.setStrictDecoding(boolean) where the
> > default is false for backward compatibility.
>
> I prefer that name.
>
> >
> > I do not think we want to go as far as creating an enum for various
> > enforcement features.
>
> At the moment the implementation is half way between strict and relaxed.
> Setting it back to relaxed by default removes changes attempted in
> CODEC-134. This perhaps needs some input from the originator of CODEC-134
> for their use case. If it is for an uncommon edge case with a secure
> context then having a non-strict default seems more sensible. The default
> just throws away stuff that it cannot decode at the end. In most cases this
> is fine. User reports on this contain statements along the lines of "other
> libraries can do the decoding, why does codec error?"
>
> So set this to strict=false by default and then allow a strict=true via a
> BaseNCodec property.
>
> I don’t think we want to add more static overrides with this flag so the
> default for the static methods would go back to non-strict.
>
> If no objections I’ll raise a Jira ticket and PR to implement this.
>

Sure, and let's use the term 'lenient' instead of 'non-strict'.

Gary


> >
> > Gary
> >
> > On Mon, Jan 20, 2020 at 10:25 AM Alex Herbert <al...@gmail.com>
> > wrote:
> >
> >> In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes
> >> that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was
> >> to correct the use of a mask to check the trailing bytes.
> >>
> >> The implementation checks trailing bits that are to be discarded are all
> >> zero.
> >>
> >> However the check does not go so far as to throw an exception when there
> >> are trailing bits below the count of 8. In this case there cannot be any
> >> more bytes and the bits are discarded.
> >>
> >> I assume this is because before the trailing bit validation it was not
> >> necessary to do anything when the number of trailing bits was less than
> >> a single byte.
> >>
> >> However this means it is still possible to decode some bytes and then
> >> encode them to create different bytes as shown here:
> >>
> >> @Test
> >> public void testTrailing6bits() {
> >>     final Base64 codec = new Base64();
> >>     // A 4 byte block plus an extra one
> >>     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
> >>     byte[] decoded = codec.decode(bytes);
> >>     byte[] encoded = codec.encode(decoded);
> >>     Assert.assertTrue("The round trip is possible",
> >> Arrays.equals(bytes, encoded));
> >>     // A 4 byte block plus an extra one
> >>     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
> >>     decoded = codec.decode(bytes);
> >>     encoded = codec.encode(decoded);
> >>     Assert.assertFalse("The round trip is not possible",
> >> Arrays.equals(bytes, encoded));
> >> }
> >>
> >> @Test
> >> public void testTrailing5bits() {
> >>     final Base32 codec = new Base32();
> >>     // An 8 byte block plus an extra one
> >>     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'
> };
> >>     byte[] decoded = codec.decode(bytes);
> >>     byte[] encoded = codec.encode(decoded);
> >>     Assert.assertTrue("The round trip is possible",
> >> Arrays.equals(bytes, encoded));
> >>     // An 8 byte block plus an extra one
> >>     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
> >>     decoded = codec.decode(bytes);
> >>     encoded = codec.encode(decoded);
> >>     Assert.assertFalse("The round trip is not possible",
> >> Arrays.equals(bytes, encoded));
> >> }
> >>
> >> Previously when fixing the trailing bit validation I suggested
> >> validating all trailing bits, even when they cannot be converted into
> >> any bytes. However this was not done and there are still some invalid
> >> inputs that do not trigger an exception.
> >>
> >> I propose to update the fix by throwing an IllegalArgumentException if
> >> there are left-over bits that cannot be decoded. That seems to be the
> >> intention of CODEC-134 to prevent security exploitation.
> >>
> >> However note that stricter validation is causing users to complain about
> >> exceptions when they should be questioning their own data. Recently a
> >> user raised Codec-279 which stated that exceptions were being thrown
> >> [2]. I believe the code is functioning as expected. So adding extra
> >> validation may prove to be more of a headache to users who have somehow
> >> obtained invalid encodings.
> >>
> >> One workaround is to fix the implementation to reject anything that
> >> cannot be re-encoded to the same bytes. Then add a property that allows
> >> this behaviour to be suppressed allowing for example:
> >>
> >> Base64 codec = new Base64();
> >> byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
> >> byte[] decoded;
> >> try {
> >>     decoded = codec.decode(bytes);
> >> } catch (IllegalArgumentException ignore) {
> >>     codec.setAllowTrailingBits(true);
> >>     // OK
> >>     decoded = codec.decode(bytes);
> >> }
> >>
> >> WDYT?
> >>
> >> 1. Fully fix CODEC-134
> >> 2. Optionally allow CODEC-134 behaviour to be suppressed
> >>
> >> I would vote for option 1. It requires 1 line change in the code for
> >> Base32 and Base64.
> >>
> >> I am open to opinions for option 2. It would allow users to upgrade and
> >> then opt-in to previous functionality.
> >>
> >> Alex
> >>
> >>
> >> [1] https://issues.apache.org/jira/browse/CODEC-134
> >>
> >> [2] https://issues.apache.org/jira/browse/CODEC-279
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: dev-help@commons.apache.org
> >>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [codec] Base32/Base64 trailing bit validation is incomplete

Posted by Alex Herbert <al...@gmail.com>.

> On 20 Jan 2020, at 23:20, Gary Gregory <ga...@gmail.com> wrote:
> 
> I hope this would be at the level of BaseNCodec, not just Base64.
> 
> I am not sure I like BaseNCodec#setAllowTrailingBits(boolean), maybe
> someone more general like BaseNCodec.setStrictDecoding(boolean) where the
> default is false for backward compatibility.

I prefer that name.

> 
> I do not think we want to go as far as creating an enum for various
> enforcement features.

At the moment the implementation is half way between strict and relaxed. Setting it back to relaxed by default removes changes attempted in CODEC-134. This perhaps needs some input from the originator of CODEC-134 for their use case. If it is for an uncommon edge case with a secure context then having a non-strict default seems more sensible. The default just throws away stuff that it cannot decode at the end. In most cases this is fine. User reports on this contain statements along the lines of "other libraries can do the decoding, why does codec error?"

So set this to strict=false by default and then allow a strict=true via a BaseNCodec property.

I don’t think we want to add more static overrides with this flag so the default for the static methods would go back to non-strict.

If no objections I’ll raise a Jira ticket and PR to implement this.

> 
> Gary
> 
> On Mon, Jan 20, 2020 at 10:25 AM Alex Herbert <al...@gmail.com>
> wrote:
> 
>> In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes
>> that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was
>> to correct the use of a mask to check the trailing bytes.
>> 
>> The implementation checks trailing bits that are to be discarded are all
>> zero.
>> 
>> However the check does not go so far as to throw an exception when there
>> are trailing bits below the count of 8. In this case there cannot be any
>> more bytes and the bits are discarded.
>> 
>> I assume this is because before the trailing bit validation it was not
>> necessary to do anything when the number of trailing bits was less than
>> a single byte.
>> 
>> However this means it is still possible to decode some bytes and then
>> encode them to create different bytes as shown here:
>> 
>> @Test
>> public void testTrailing6bits() {
>>     final Base64 codec = new Base64();
>>     // A 4 byte block plus an extra one
>>     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
>>     byte[] decoded = codec.decode(bytes);
>>     byte[] encoded = codec.encode(decoded);
>>     Assert.assertTrue("The round trip is possible",
>> Arrays.equals(bytes, encoded));
>>     // A 4 byte block plus an extra one
>>     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>>     decoded = codec.decode(bytes);
>>     encoded = codec.encode(decoded);
>>     Assert.assertFalse("The round trip is not possible",
>> Arrays.equals(bytes, encoded));
>> }
>> 
>> @Test
>> public void testTrailing5bits() {
>>     final Base32 codec = new Base32();
>>     // An 8 byte block plus an extra one
>>     byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' };
>>     byte[] decoded = codec.decode(bytes);
>>     byte[] encoded = codec.encode(decoded);
>>     Assert.assertTrue("The round trip is possible",
>> Arrays.equals(bytes, encoded));
>>     // An 8 byte block plus an extra one
>>     bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
>>     decoded = codec.decode(bytes);
>>     encoded = codec.encode(decoded);
>>     Assert.assertFalse("The round trip is not possible",
>> Arrays.equals(bytes, encoded));
>> }
>> 
>> Previously when fixing the trailing bit validation I suggested
>> validating all trailing bits, even when they cannot be converted into
>> any bytes. However this was not done and there are still some invalid
>> inputs that do not trigger an exception.
>> 
>> I propose to update the fix by throwing an IllegalArgumentException if
>> there are left-over bits that cannot be decoded. That seems to be the
>> intention of CODEC-134 to prevent security exploitation.
>> 
>> However note that stricter validation is causing users to complain about
>> exceptions when they should be questioning their own data. Recently a
>> user raised Codec-279 which stated that exceptions were being thrown
>> [2]. I believe the code is functioning as expected. So adding extra
>> validation may prove to be more of a headache to users who have somehow
>> obtained invalid encodings.
>> 
>> One workaround is to fix the implementation to reject anything that
>> cannot be re-encoded to the same bytes. Then add a property that allows
>> this behaviour to be suppressed allowing for example:
>> 
>> Base64 codec = new Base64();
>> byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>> byte[] decoded;
>> try {
>>     decoded = codec.decode(bytes);
>> } catch (IllegalArgumentException ignore) {
>>     codec.setAllowTrailingBits(true);
>>     // OK
>>     decoded = codec.decode(bytes);
>> }
>> 
>> WDYT?
>> 
>> 1. Fully fix CODEC-134
>> 2. Optionally allow CODEC-134 behaviour to be suppressed
>> 
>> I would vote for option 1. It requires 1 line change in the code for
>> Base32 and Base64.
>> 
>> I am open to opinions for option 2. It would allow users to upgrade and
>> then opt-in to previous functionality.
>> 
>> Alex
>> 
>> 
>> [1] https://issues.apache.org/jira/browse/CODEC-134
>> 
>> [2] https://issues.apache.org/jira/browse/CODEC-279
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Base32/Base64 trailing bit validation is incomplete

Posted by Gary Gregory <ga...@gmail.com>.
I hope this would be at the level of BaseNCodec, not just Base64.

I am not sure I like BaseNCodec#setAllowTrailingBits(boolean), maybe
someone more general like BaseNCodec.setStrictDecoding(boolean) where the
default is false for backward compatibility.

I do not think we want to go as far as creating an enum for various
enforcement features.

Gary

On Mon, Jan 20, 2020 at 10:25 AM Alex Herbert <al...@gmail.com>
wrote:

> In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes
> that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was
> to correct the use of a mask to check the trailing bytes.
>
> The implementation checks trailing bits that are to be discarded are all
> zero.
>
> However the check does not go so far as to throw an exception when there
> are trailing bits below the count of 8. In this case there cannot be any
> more bytes and the bits are discarded.
>
> I assume this is because before the trailing bit validation it was not
> necessary to do anything when the number of trailing bits was less than
> a single byte.
>
> However this means it is still possible to decode some bytes and then
> encode them to create different bytes as shown here:
>
> @Test
> public void testTrailing6bits() {
>      final Base64 codec = new Base64();
>      // A 4 byte block plus an extra one
>      byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
>      byte[] decoded = codec.decode(bytes);
>      byte[] encoded = codec.encode(decoded);
>      Assert.assertTrue("The round trip is possible",
> Arrays.equals(bytes, encoded));
>      // A 4 byte block plus an extra one
>      bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>      decoded = codec.decode(bytes);
>      encoded = codec.encode(decoded);
>      Assert.assertFalse("The round trip is not possible",
> Arrays.equals(bytes, encoded));
> }
>
> @Test
> public void testTrailing5bits() {
>      final Base32 codec = new Base32();
>      // An 8 byte block plus an extra one
>      byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' };
>      byte[] decoded = codec.decode(bytes);
>      byte[] encoded = codec.encode(decoded);
>      Assert.assertTrue("The round trip is possible",
> Arrays.equals(bytes, encoded));
>      // An 8 byte block plus an extra one
>      bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
>      decoded = codec.decode(bytes);
>      encoded = codec.encode(decoded);
>      Assert.assertFalse("The round trip is not possible",
> Arrays.equals(bytes, encoded));
> }
>
> Previously when fixing the trailing bit validation I suggested
> validating all trailing bits, even when they cannot be converted into
> any bytes. However this was not done and there are still some invalid
> inputs that do not trigger an exception.
>
> I propose to update the fix by throwing an IllegalArgumentException if
> there are left-over bits that cannot be decoded. That seems to be the
> intention of CODEC-134 to prevent security exploitation.
>
> However note that stricter validation is causing users to complain about
> exceptions when they should be questioning their own data. Recently a
> user raised Codec-279 which stated that exceptions were being thrown
> [2]. I believe the code is functioning as expected. So adding extra
> validation may prove to be more of a headache to users who have somehow
> obtained invalid encodings.
>
> One workaround is to fix the implementation to reject anything that
> cannot be re-encoded to the same bytes. Then add a property that allows
> this behaviour to be suppressed allowing for example:
>
> Base64 codec = new Base64();
> byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
> byte[] decoded;
> try {
>      decoded = codec.decode(bytes);
> } catch (IllegalArgumentException ignore) {
>      codec.setAllowTrailingBits(true);
>      // OK
>      decoded = codec.decode(bytes);
> }
>
> WDYT?
>
> 1. Fully fix CODEC-134
> 2. Optionally allow CODEC-134 behaviour to be suppressed
>
> I would vote for option 1. It requires 1 line change in the code for
> Base32 and Base64.
>
> I am open to opinions for option 2. It would allow users to upgrade and
> then opt-in to previous functionality.
>
> Alex
>
>
> [1] https://issues.apache.org/jira/browse/CODEC-134
>
> [2] https://issues.apache.org/jira/browse/CODEC-279
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>