Posted to dev@hc.apache.org by Craig Skinfill <cr...@gmail.com> on 2014/06/04 21:38:02 UTC

Caching and async http client

HTTP Components team - I have some approved time this summer to work on an
open source project, and I'd like to work on improving the caching support
in the async http client.  Currently, the requests to the origin are
non-blocking, but the requests to the cache are blocking.  The async
caching support appears to be implemented as a decorator of the http
client, while in the blocking client case it's implemented by decorating the
internal ClientExecChain instance.

My initial idea was to follow the same pattern in the async client as with
the blocking client, and use an internal ExecutorService to submit requests
to the cache, and then block (with a timeout) the returned Future with the
cache lookup result.  This is of course still blocking, but at least
provides a potentially configurable timeout when checking the cache.
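For concreteness, a rough sketch of that idea follows. All class and method names here are hypothetical stand-ins, not actual HttpComponents types: the point is only the Future-with-timeout pattern.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: offload the cache lookup to an ExecutorService
// and bound the wait with a timeout. CacheEntry and readFromCache are
// placeholders, not real HttpComponents types.
public class TimedCacheLookup {

    static final class CacheEntry {
        final String key;
        CacheEntry(String key) { this.key = key; }
    }

    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    // Submit the (potentially slow) cache read and wait at most
    // timeoutMillis for the result; on timeout, return null so the
    // caller falls through to the origin server.
    CacheEntry lookup(String key, long timeoutMillis) {
        Future<CacheEntry> future = executor.submit(() -> readFromCache(key));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException ex) {
            future.cancel(true);   // give up on the cache, go to origin
            return null;
        } catch (InterruptedException | ExecutionException ex) {
            return null;
        }
    }

    private CacheEntry readFromCache(String key) {
        return new CacheEntry(key); // placeholder for a real cache read
    }
}
```

Note this still ties up a pooled thread per lookup; it only caps how long the calling code waits, which is exactly the limitation described above.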

How should I approach this?  I see a comment in
https://issues.apache.org/jira/browse/HTTPASYNC-76 regarding the likely
need to make changes to the existing blocking http client caching
implementation along with changes to the core async http client protocol
pipeline processing.  Are there any existing ideas, plans, etc., for making
the caching non-blocking for the async client?  Or what changes would be
needed in the blocking client's caching implementation?

Is there enough need to make this improvement?

Thanks.

Re: Caching and async http client

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2014-06-04 at 15:38 -0400, Craig Skinfill wrote:
> HTTP Components team - I have some approved time this summer to work on an
> open source project, and I'd like to work on improving the caching support
> in the async http client.  Currently, the requests to the origin are
> non-blocking, but the requests to the cache are blocking.  The async
> caching support appears to be implemented as a decorator of the http
> client, while in the blocking client case it's implemented by decorating the
> internal ClientExecChain instance.
> 
> My initial idea was to follow the same pattern in the async client as with
> the blocking client, and use an internal ExecutorService to submit requests
> to the cache, and then block (with a timeout) the returned Future with the
> cache lookup result.  This is of course still blocking, but at least
> provides a potentially configurable timeout when checking the cache.
> 
> How should I approach this?  I see a comment in
> https://issues.apache.org/jira/browse/HTTPASYNC-76 regarding the likely
> need to make changes to the existing blocking http client caching
> implementation along with changes to the core async http client protocol
> pipeline processing.  Are there any existing ideas, plans, etc., for making
> the caching non-blocking for the async client?  Or what changes would be
> needed in the blocking client's caching implementation?
> 
> Is there enough need to make this improvement?
> 
> Thanks.

Hi Craig

Async HTTP caching is a much neglected area in HC. Any contribution
there would be enormously welcome. I, for one, am very happy to have you
on board.

Async HTTP caching is a difficult task from a purely design perspective
and is likely to require several iterations to get things right. In
general, non-blocking I/O makes certain things easier but it also makes
other things much more complex. Content (data) streaming is one of those
things. The standard Java InputStream / OutputStream API is simple and
effective, but it is inherently blocking and simply does not work well
with event-driven designs. For non-blocking transports we use a consumer /
producer based model that enables a reactive programming style and works
well for data-intensive applications. The problem is that it is damn hard to
organize those consumers and producers into a pipeline based on the
chain of responsibility pattern. The ability to model protocol
processing logic as a sequence of related and interdependent elements is
what makes integration of caching aspects into the blocking client
seamless and efficient. Ideally, we should be able to do the same for
the non-blocking client. Another major issue is that presently HTTP
cache components are tightly coupled with InputStream and the whole
design of the caching APIs is effectively blocking. 
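As a deliberately simplified illustration of that consumer / producer model (this is not the actual HttpCore NIO API, whose interfaces are considerably richer): instead of the cache pulling bytes from a blocking InputStream, the transport pushes chunks into a consumer as they arrive and signals completion with an event.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical push-style consumer interface: the I/O reactor calls
// consume() whenever data is ready and completed() when the stream ends,
// so no thread ever blocks waiting for bytes.
interface ContentConsumer {
    void consume(ByteBuffer chunk);
    void completed();
}

// A consumer that accumulates everything it receives into a String.
class BufferingConsumer implements ContentConsumer {
    private final StringBuilder body = new StringBuilder();
    private boolean done;

    @Override public void consume(ByteBuffer chunk) {
        body.append(StandardCharsets.UTF_8.decode(chunk));
    }

    @Override public void completed() {
        done = true;
    }

    String body()    { return body.toString(); }
    boolean isDone() { return done; }
}
```

The difficulty described above is precisely that a chain of such consumers and producers, each possibly transforming or short-circuiting the flow, is much harder to compose than a chain of decorated InputStreams.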

I must confess that I do not see an easy solution to those design
issues. No matter what we do we are likely to end up breaking existing
APIs, which is also a problem. So, I can also well imagine that we make
the decision to _not_ support data streaming with caching at all (at
least initially). If we always buffer messages in memory it would make
it much easier to come up with a reasonable processing pipeline design,
which is asynchronous but only at the HTTP message level. This would
also enable us to fully re-use blocking caching elements without having
to alter them. It might be an unpleasant but necessary compromise.
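A minimal sketch of that message-level compromise, again with hypothetical names: the async transport assembles the entire message in memory before the cache is consulted, so the cache API itself can stay synchronous and the existing cache logic needs no streaming support.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical fully-buffered response: no streams, just bytes.
class BufferedResponse {
    final int status;
    final byte[] body;   // fully materialized before caching

    BufferedResponse(int status, byte[] body) {
        this.status = status;
        this.body = body;
    }
}

// Hypothetical message-level cache. Because put() is only called once
// the whole message has been buffered, its implementation can be plain
// synchronous code; no blocking I/O happens on the I/O event thread.
class InMemoryMessageCache {
    private final Map<String, BufferedResponse> store = new HashMap<>();

    void put(String uri, BufferedResponse response) {
        store.put(uri, response);
    }

    BufferedResponse get(String uri) {
        return store.get(uri);
    }
}
```

The obvious cost, as noted above, is memory: large responses can no longer be streamed through the cache, which is why this would be an initial compromise rather than the end state.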

If all this does not sound too depressing, this issue might be a good
starting point. It would also give you good exposure to the existing
code base and API design.

Cheers

Oleg


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org