Posted to dev@tomcat.apache.org by Mark Thomas <ma...@apache.org> on 2022/08/23 20:42:55 UTC
[DISCUSS] MessageBytes refactoring
Hi all,
I've been looking at a fix for bug 66196 [1]. My ideas so far have
revolved around MessageBytes, but the solutions are being made more
complex by the current behaviour of MessageBytes in some cases.
For example (I'm using strings in place of byte[] and char[] to keep it
simple):
mb.setBytes("aaa");
mb.setChars("bbb");
mb.toBytes();
mb.getByteChunk() returns "aaa" whereas I'd expect it to be "bbb".
I'd like to refactor MessageBytes so it always behaves as if it has a
single current value regardless of whether that value was set as a
String, byte[] or char[]. If a get() method is called for a different
type, conversion occurs on demand.
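
To make the proposed behaviour concrete, here is a minimal Java sketch of a holder with a single current value and on-demand conversion. It is an illustration only, not Tomcat's MessageBytes: the class name SingleValueSketch, its fields, and the fixed ISO-8859-1 charset are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; NOT Tomcat's MessageBytes implementation.
final class SingleValueSketch {
    private enum Type { NONE, BYTES, CHARS, STRING }

    private Type type = Type.NONE; // which form was set most recently
    private byte[] bytes;
    private char[] chars;
    private String string;

    void setBytes(byte[] b) { clear(); bytes = b.clone(); type = Type.BYTES; }

    void setChars(char[] c) { clear(); chars = c.clone(); type = Type.CHARS; }

    void setString(String s) {
        clear();
        string = s;
        type = (s == null) ? Type.NONE : Type.STRING;
    }

    // Each getter converts from the current value on demand and caches the result.
    String getString() {
        if (string == null) {
            if (type == Type.BYTES) {
                // Assumption for this sketch: bytes are ISO-8859-1 encoded.
                string = new String(bytes, StandardCharsets.ISO_8859_1);
            } else if (type == Type.CHARS) {
                string = new String(chars);
            }
        }
        return string;
    }

    byte[] getBytes() {
        if (bytes == null && type != Type.NONE) {
            bytes = getString().getBytes(StandardCharsets.ISO_8859_1);
        }
        return bytes;
    }

    char[] getChars() {
        if (chars == null && type != Type.NONE) {
            chars = getString().toCharArray();
        }
        return chars;
    }

    // Setting a new value discards every cached form of the old one.
    private void clear() {
        bytes = null;
        chars = null;
        string = null;
        type = Type.NONE;
    }

    public static void main(String[] args) {
        SingleValueSketch mb = new SingleValueSketch();
        mb.setBytes("aaa".getBytes(StandardCharsets.ISO_8859_1));
        mb.setChars("bbb".toCharArray());
        // The byte view follows the most recent set call.
        System.out.println(new String(mb.getBytes(), StandardCharsets.ISO_8859_1)); // prints "bbb"
    }
}
```

The key design choice in the sketch is that every setter clears all other representations, so a stale "aaa" can never be returned after "bbb" has been set. The getters hand out the internal arrays, so a caller that modifies them directly is still responsible for the consequences.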
I'm reasonably confident that changing MessageBytes to always have a
single, consistent value will also enable a few useful optimizations,
particularly around ISO-8859-1 String to byte conversions, which get
used a lot for HTTP response headers.
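
The message does not say which optimizations are meant. As background, one property that makes ISO-8859-1 conversion cheap is that code points 0 to 255 map directly to the byte with the same value, so no encoder state is needed. The following Java sketch shows that property; the class Latin1Sketch and the method toLatin1 are hypothetical names, and this is not a claim about what Tomcat does or will do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the ISO-8859-1 mapping; not Tomcat code.
final class Latin1Sketch {

    // Converts a String to ISO-8859-1 bytes with a per-char narrowing cast.
    // Chars above 0xFF have no mapping and are replaced with '?' here.
    // Note: for surrogate pairs this differs from the JDK encoder, which
    // emits a single '?' per pair, so the sketch is only meant for text
    // in the Latin-1 range.
    static byte[] toLatin1(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < out.length; i++) {
            char c = s.charAt(i);
            out[i] = (c <= 0xFF) ? (byte) c : (byte) '?';
        }
        return out;
    }

    public static void main(String[] args) {
        String header = "text/html;charset=ISO-8859-1";
        byte[] viaCast = toLatin1(header);
        byte[] viaJdk = header.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(viaCast, viaJdk)); // prints "true"
    }
}
```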
Note: As is currently the case, a caller that writes to the ByteChunk or
CharChunk directly is expected to take responsibility for keeping the
values in sync or dealing with the consequences.
Thoughts?
Mark
[1] https://bz.apache.org/bugzilla/show_bug.cgi?id=66196
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org
Re: [DISCUSS] MessageBytes refactoring
Posted by Han Li <li...@apache.org>.
> On Aug 24, 2022, at 16:16, Mark Thomas <ma...@apache.org> wrote:
>
> On 24/08/2022 09:08, Rémy Maucherat wrote:
>> On Tue, Aug 23, 2022 at 10:43 PM Mark Thomas <ma...@apache.org> wrote:
>>>
>>> Hi all,
>>>
>>> I've been looking at a fix for bug 66196. My ideas so far have revolved
>>> around MessageBytes but the solutions are being made more complex by the
>>> current behaviour of MessageBytes in some cases.
>>>
>>> For example (I'm using strings in place of byte[] and char[] to keep it
>>> simple):
>>>
>>> mb.setBytes("aaa");
>>> mb.setChars("bbb");
>>> mb.toBytes();
>>>
>>> mb.getByteChunk() returns "aaa" whereas I'd expect it to be "bbb".
>>>
>>> I'd like to refactor MessageBytes so it always behaves as if it has a
>>> single current value regardless of whether that value was set as a
>>> String, byte[] or char[]. If a get() method is called for a different
>>> type, conversion occurs on demand.
>>>
>>> I'm reasonably confident that changing MessageBytes to always have a
>>> single, consistent value will also enable a few useful optimizations -
>>> particularly around ISO-8859-1 String to byte conversions which gets
>>> used a lot for HTTP response headers.
>>>
>>> Note: As currently, if you write to the ByteChunk or CharChunk directly
>>> the caller is expected to take responsibility for keeping the values in
>>> sync or dealing with the consequences.
>>>
>>> Thoughts?
>> Well, this is a bit risky obviously but you can attempt it.
>
> Fair point.
>
> On my first pass I found that the RewriteValve was accessing the internals directly. That case looked to be manageable. I agree the risk is that this is happening in other places that don't get spotted.
>
> One option would be to refactor 10.1.x but delay the back-port to see if any regressions emerge.
Are there any sub-tasks I can do? I would be happy to help!
Han
>
> Mark
Re: [DISCUSS] MessageBytes refactoring
Posted by Mark Thomas <ma...@apache.org>.
On 24/08/2022 09:08, Rémy Maucherat wrote:
> On Tue, Aug 23, 2022 at 10:43 PM Mark Thomas <ma...@apache.org> wrote:
>>
>> Hi all,
>>
>> I've been looking at a fix for bug 66196. My ideas so far have revolved
>> around MessageBytes but the solutions are being made more complex by the
>> current behaviour of MessageBytes in some cases.
>>
>> For example (I'm using strings in place of byte[] and char[] to keep it
>> simple):
>>
>> mb.setBytes("aaa");
>> mb.setChars("bbb");
>> mb.toBytes();
>>
>> mb.getByteChunk() returns "aaa" whereas I'd expect it to be "bbb".
>>
>> I'd like to refactor MessageBytes so it always behaves as if it has a
>> single current value regardless of whether that value was set as a
>> String, byte[] or char[]. If a get() method is called for a different
>> type, conversion occurs on demand.
>>
>> I'm reasonably confident that changing MessageBytes to always have a
>> single, consistent value will also enable a few useful optimizations -
>> particularly around ISO-8859-1 String to byte conversions which gets
>> used a lot for HTTP response headers.
>>
>> Note: As currently, if you write to the ByteChunk or CharChunk directly
>> the caller is expected to take responsibility for keeping the values in
>> sync or dealing with the consequences.
>>
>> Thoughts?
>
> Well, this is a bit risky obviously but you can attempt it.
Fair point.
On my first pass I found that the RewriteValve was accessing the
internals directly. That case looked to be manageable. I agree the risk
is that this is happening in other places that don't get spotted.
One option would be to refactor 10.1.x but delay the back-port to see if
any regressions emerge.
Mark
Re: [DISCUSS] MessageBytes refactoring
Posted by Rémy Maucherat <re...@apache.org>.
On Tue, Aug 23, 2022 at 10:43 PM Mark Thomas <ma...@apache.org> wrote:
>
> Hi all,
>
> I've been looking at a fix for bug 66196. My ideas so far have revolved
> around MessageBytes but the solutions are being made more complex by the
> current behaviour of MessageBytes in some cases.
>
> For example (I'm using strings in place of byte[] and char[] to keep it
> simple):
>
> mb.setBytes("aaa");
> mb.setChars("bbb");
> mb.toBytes();
>
> mb.getByteChunk() returns "aaa" whereas I'd expect it to be "bbb".
>
> I'd like to refactor MessageBytes so it always behaves as if it has a
> single current value regardless of whether that value was set as a
> String, byte[] or char[]. If a get() method is called for a different
> type, conversion occurs on demand.
>
> I'm reasonably confident that changing MessageBytes to always have a
> single, consistent value will also enable a few useful optimizations -
> particularly around ISO-8859-1 String to byte conversions which gets
> used a lot for HTTP response headers.
>
> Note: As currently, if you write to the ByteChunk or CharChunk directly
> the caller is expected to take responsibility for keeping the values in
> sync or dealing with the consequences.
>
> Thoughts?
Well, this is a bit risky obviously but you can attempt it.
Rémy
> Mark
>
>
> [1] https://bz.apache.org/bugzilla/show_bug.cgi?id=66196