You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@sling.apache.org by Jörg Hoh <jh...@googlemail.com.INVALID> on 2018/09/07 17:18:37 UTC

[Discussion] | should buffer output in memory

In my sling application I have a number of nested components, which are
called via the <sling:include> tag. If a component late in the rendering
process throws an error, I would like to return an HTTP statuscode 500. But
if more data than the size of the output buffer are written, the response
headers are already comitted and written to the network, so I cannot change
the statuscode anymore and I get in the log something like "response
already comitted". Thus the client receives a HTTP status 200, but the
rendering is not complete.

The problem could be mitigated, if the <sling:include> tag could provide a
ByteArrayOutputStream as part of the Response object, so there is never an
overflow of the OutputBuffer. Instead everything is buffered until an
explicit flush is invoked (which is already happening). This would cause
the entire rendering output being stored in-memory while retaining the
ability to set the HTTP statuscode (and less important: all other response
headers) until all the request processing is completed.

This would avoid my problem of sending a statuscode while the rendering is
not completed yet. As downsides:

   - The response cannot be streamed to the client while the rendering is
   still happening, potentially slowing down the response.
   - The memory usage to process a request will increase.

If you need to stream data or send large amounts of data, you could still
implement a servlet and avoid the use of the taglib; but for the usecase
where the sling:include is used, the output is unlikely to grow into the
megabytes.

WDYT? Should this part of sling, or should be rather create a custom tag
library with our own version of the non-streaming "include" tag?

Jörg
-- 
Cheers,
Jörg Hoh,

http://cqdump.wordpress.com
Twitter: @joerghoh

Re: [Discussion] | should buffer output in memory

Posted by Jörg Hoh <jh...@googlemail.com.INVALID>.
Hi Jason,

Am Fr., 7. Sep. 2018 um 19:33 Uhr schrieb Jason E Bailey <je...@apache.org>:

> Have you tried utilizing the var attribute in the sling:include tag?
>
>
> https://sling.apache.org/documentation/bundles/scripting/scripting-jsp.html#include
>
> That may be what you're looking for.
>

This might be a workaround, but I don't the markup rendered by the include
as variable around; also this would require us to change all occurrences of
the include, which is something I would like to avoid (also hard to
automate). I would rather change the default :-)

Jörg

-- 
Cheers,
Jörg Hoh,

http://cqdump.wordpress.com
Twitter: @joerghoh

Re: [Discussion] | should buffer output in memory

Posted by Jason E Bailey <je...@apache.org>.
Have you tried utilizing the var attribute in the sling:include tag?

https://sling.apache.org/documentation/bundles/scripting/scripting-jsp.html#include

That may be what you're looking for.

- Jason

On Fri, Sep 7, 2018, at 1:18 PM, Jörg Hoh wrote:
> In my sling application I have a number of nested components, which are
> called via the <sling:include> tag. If a component late in the rendering
> process throws an error, I would like to return an HTTP statuscode 500. But
> if more data than the size of the output buffer are written, the response
> headers are already comitted and written to the network, so I cannot change
> the statuscode anymore and I get in the log something like "response
> already comitted". Thus the client receives a HTTP status 200, but the
> rendering is not complete.
> 
> The problem could be mitigated, if the <sling:include> tag could provide a
> ByteArrayOutputStream as part of the Response object, so there is never an
> overflow of the OutputBuffer. Instead everything is buffered until an
> explicit flush is invoked (which is already happening). This would cause
> the entire rendering output being stored in-memory while retaining the
> ability to set the HTTP statuscode (and less important: all other response
> headers) until all the request processing is completed.
> 
> This would avoid my problem of sending a statuscode while the rendering is
> not completed yet. As downsides:
> 
>    - The response cannot be streamed to the client while the rendering is
>    still happening, potentially slowing down the response.
>    - The memory usage to process a request will increase.
> 
> If you need to stream data or send large amounts of data, you could still
> implement a servlet and avoid the use of the taglib; but for the usecase
> where the sling:include is used, the output is unlikely to grow into the
> megabytes.
> 
> WDYT? Should this part of sling, or should be rather create a custom tag
> library with our own version of the non-streaming "include" tag?
> 
> Jörg
> -- 
> Cheers,
> Jörg Hoh,
> 
> http://cqdump.wordpress.com
> Twitter: @joerghoh

Re: [Discussion] | should buffer output in memory

Posted by Jörg Hoh <jh...@googlemail.com.INVALID>.
Am Do., 13. Sep. 2018 um 18:45 Uhr schrieb Stefan Seifert <
sseifert@pro-vision.de>:

>
> yes, but currently we are quite convinced that if you have some places in
> the response rendering (e.g. JSP pages) that do a manual flush, there is no
> chance the prevent the problem you describe, only with hacks like mocked
> requests to avoid the flush command hits the servlet engine. not sure if we
> really should go this way. we assume (have not tested it) the proposal you
> describe will not help if there are still flush() calls in the code.
>

Sure. I will try with a filter and some mock response object wrapping the
original one.

Jörg


-- 
Cheers,
Jörg Hoh,

http://cqdump.wordpress.com
Twitter: @joerghoh

RE: [Discussion] | should buffer output in memory

Posted by Stefan Seifert <ss...@pro-vision.de>.
>> 3. additionally make sure the JSP files of your application or that are
>> used from the platform do not contain calls to response.flush()!
>>
>>
>Yes, that's something we can validate and figure out, but I've seen this in
>too many cases, and not always a flush() was repsonsible. Also it's some
>very hard to validate all scripts and occassions (especially if they are
>part of the framework you use).

yes, but currently we are quite convinced that if you have some places in the response rendering (e.g. JSP pages) that do a manual flush, there is no chance the prevent the problem you describe, only with hacks like mocked requests to avoid the flush command hits the servlet engine. not sure if we really should go this way. we assume (have not tested it) the proposal you describe will not help if there are still flush() calls in the code.

stefan


Re: [Discussion] | should buffer output in memory

Posted by Jörg Hoh <jh...@googlemail.com.INVALID>.
HI Stefan,

Am Do., 13. Sep. 2018 um 18:16 Uhr schrieb Stefan Seifert <
sseifert@pro-vision.de>:

> hello jörg.
>
> we had a short discussion about this on our sling hackathon today:
>
> 1. check the configured response buffer size of the jetty engine that is
> currently configured for your instance. if the response you expected is
> smaller than the buffer size (in recent version this is 24KB i think),
> error handling should work fine.
>

I don't want to change that buffer size, because it's hard to estimate what
the maximum size of the rendered content is (and will be). Just setting to
100k and hope for the best is not what I would like to do. Also because
it's very hard to monitor if your response exceeds that size.

>
> 2. if your response is smaller than this and it is still happening it is
> possible that here are calls to response.flush() that should not be done.
> the recent sling modules should be correct and not using response.flush()
> during rendering, but check your environment if this is the case there as
> well
>


> 3. additionally make sure the JSP files of your application or that are
> used from the platform do not contain calls to response.flush()!
>
>
Yes, that's something we can validate and figure out, but I've seen this in
too many cases, and not always a flush() was repsonsible. Also it's some
very hard to validate all scripts and occassions (especially if they are
part of the framework you use).

Jörg

-- 
Cheers,
Jörg Hoh,

http://cqdump.wordpress.com
Twitter: @joerghoh

RE: [Discussion] | should buffer output in memory

Posted by Stefan Seifert <ss...@pro-vision.de>.
hello jörg.

we had a short discussion about this on our sling hackathon today:

1. check the configured response buffer size of the jetty engine that is currently configured for your instance. if the response you expected is smaller than the buffer size (in recent version this is 24KB i think), error handling should work fine.

2. if your response is smaller than this and it is still happening it is possible that here are calls to response.flush() that should not be done. the recent sling modules should be correct and not using response.flush() during rendering, but check your environment if this is the case there as well.

3. additionally make sure the JSP files of your application or that are used from the platform do not contain calls to response.flush()!

stefan


>-----Original Message-----
>From: Jörg Hoh <jh...@googlemail.com.INVALID>
>Sent: Friday, September 7, 2018 7:19 PM
>To: users@sling.apache.org
>Subject: [Discussion] |<sling:include> should buffer output in memory
>
>In my sling application I have a number of nested components, which are
>called via the <sling:include> tag. If a component late in the rendering
>process throws an error, I would like to return an HTTP statuscode 500. But
>if more data than the size of the output buffer are written, the response
>headers are already comitted and written to the network, so I cannot change
>the statuscode anymore and I get in the log something like "response
>already comitted". Thus the client receives a HTTP status 200, but the
>rendering is not complete.
>
>The problem could be mitigated, if the <sling:include> tag could provide a
>ByteArrayOutputStream as part of the Response object, so there is never an
>overflow of the OutputBuffer. Instead everything is buffered until an
>explicit flush is invoked (which is already happening). This would cause
>the entire rendering output being stored in-memory while retaining the
>ability to set the HTTP statuscode (and less important: all other response
>headers) until all the request processing is completed.
>
>This would avoid my problem of sending a statuscode while the rendering is
>not completed yet. As downsides:
>
>   - The response cannot be streamed to the client while the rendering is
>   still happening, potentially slowing down the response.
>   - The memory usage to process a request will increase.
>
>If you need to stream data or send large amounts of data, you could still
>implement a servlet and avoid the use of the taglib; but for the usecase
>where the sling:include is used, the output is unlikely to grow into the
>megabytes.
>
>WDYT? Should this part of sling, or should be rather create a custom tag
>library with our own version of the non-streaming "include" tag?
>
>Jörg
>--
>Cheers,
>Jörg Hoh,
>
>http://cqdump.wordpress.com
>Twitter: @joerghoh