You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@openwhisk.apache.org by Tyson Norris <tn...@adobe.com.INVALID> on 2018/06/25 16:23:40 UTC

Concurrency in invoker http client

Hi -
As part of the support for concurrency in action containers, I have this PR open in the nodejs action container: https://github.com/apache/incubator-openwhisk-runtime-nodejs/pull/41

I’ve been having real trouble getting concurrency tests to operate properly in travis env.

The behavior appears something like the http client is only allowing 2 concurrent requests - additional requests are processed later. This is only happening in travis env, unfortunately.

A couple of questions:
* does anyone have advice for the org.apache.http.client to make it behave in travis env? Using PoolingHttpClientConnectionManager for concurrency locally works swell - it only gives me grief in travis, and only by processing requests in “more serial fashion” than when I run it locally.
* at this point, I’m actively looking at replacing org.apache.http.client (which has been discussed in the past) with PoolingRestClient (wrapper for akka http), but I’m not sure how to best extend it to support retries - any pointers here?
    * Related to PoolingRestClient: A non-retry use of PoolingRestClient works great locally with 100s of concurrent requests, but fails only in travis with "akka.stream.StreamTcpException: Tcp command [Connect(172.17.0.9:8080,None,List(),Some(10 seconds),true)] failed because of Connection refused” - I assume because the container takes longer to startup in travis, and requires retries?

Thanks for any tips!
Tyson

Re: Concurrency in invoker http client

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Right, got it.

Got an initial impl working here:
https://github.com/apache/incubator-openwhisk/pull/3812/files#diff-3fca67626fe5f92ce431902dd2318a10R170

So now in that PR, the entity is either consumed via Unmarshal().to[String], when within the allowed response size, or else via the truncated() function, when content length is too large.

Thanks
Tyson

On Jul 14, 2018, at 5:04 AM, Markus Thoemmes <ma...@de.ibm.com>> wrote:

Hi Tyson,

I think that is (kinda) expected. The ByteStrings are coming through in chunks, usually in sizes like 4k or 8k depending on the underlying connection. You won't know beforehand how big these chunks are.

Have you tried with a big response (like 1M?). The chunk should not be 1M in size, in theory at least.

Cheers,
Markus



Re: Concurrency in invoker http client

Posted by Markus Thoemmes <ma...@de.ibm.com>.
Hi Tyson,

I think that is (kinda) expected. The ByteStrings are coming through in chunks, usually in sizes like 4k or 8k depending on the underlying connection. You won't know beforehand how big these chunks are.

Have you tried with a big response (like 1M?). The chunk should not be 1M in size, in theory at least.

Cheers,
Markus


Re: Concurrency in invoker http client

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
BTW, trying prefixAndTail, this doesn’t work for me
e.g.
response.entity.dataBytes
  .prefixAndTail(1)
  .runWith(Sink.head)
  .map({
    case (Seq(r), _) =>
      Right(ContainerResponse(response.status.intValue, r.utf8String, Some(contentLength.B, maxResponse)))
  })

Does not result with r.size==1, rather r.size == response.entity.size

I will try to fiddle with it some more(and ignore the tail)


> On Jul 13, 2018, at 5:29 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
> 
> Thanks
> 
> Currently HttpUtils does return (almost) the same response when it is truncated.. so that the truncated version can later be included in the ErrorResponse…
> 
> I would be OK with changing the error message to not include ANY of the response, since either way it is an error message. 
> Then the error message would ONLY be:
> "The action produced a response that exceeded the allowed length: ${length.toBytes} > ${maxLength.toBytes} bytes.”
> Instead of 
> "The action produced a response that exceeded the allowed length: ${length.toBytes} > ${maxLength.toBytes} bytes.  The truncated response was: $trunk"
> 
> WDYT?
> 
> 
>> On Jul 13, 2018, at 5:21 PM, Markus Thoemmes <ma...@de.ibm.com> wrote:
>> 
>> Hi Tyson,
>> 
>> Chetan solved a similar problem in "inlineOrAttach" in "AttachmentSupport.scala". He did this by continuously running a stream through "prefixAndTail", where he'd pick just one element and then run the stream to check whether he already crossed the limit.
>> 
>> This case now is a lot more performance critical, so I propose we add a "prefixAndTailWeighted" stage, which runs similar to "prefixAndTail" but weighs the inputs by their siuze so you can define how much you actually want to consume in bytes. You can then run the "tail" stream to an ignore Sink. Further, I'd propose to implement a "quick path" (like there is on in the current HttpUtils), which checks the "Content-Length" field of the response and just consumes the whole stream into a string if it's safe to do so to only use this special method on the truncation path.
>> 
>> As a more general, existential question: Do we even need the truncation path? Could we just deny the response straight away if the user's action returns a bigger value than allowed?
>> 
>> Hope this helps
>> 
>> Cheers,
>> Markus
>> 
> 


Re: Concurrency in invoker http client

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Thanks

Currently HttpUtils does return (almost) the same response when it is truncated.. so that the truncated version can later be included in the ErrorResponse…

I would be OK with changing the error message to not include ANY of the response, since either way it is an error message. 
Then the error message would ONLY be:
"The action produced a response that exceeded the allowed length: ${length.toBytes} > ${maxLength.toBytes} bytes.”
Instead of 
"The action produced a response that exceeded the allowed length: ${length.toBytes} > ${maxLength.toBytes} bytes.  The truncated response was: $trunk"

WDYT?
 

> On Jul 13, 2018, at 5:21 PM, Markus Thoemmes <ma...@de.ibm.com> wrote:
> 
> Hi Tyson,
> 
> Chetan solved a similar problem in "inlineOrAttach" in "AttachmentSupport.scala". He did this by continuously running a stream through "prefixAndTail", where he'd pick just one element and then run the stream to check whether he already crossed the limit.
> 
> This case now is a lot more performance critical, so I propose we add a "prefixAndTailWeighted" stage, which runs similar to "prefixAndTail" but weighs the inputs by their siuze so you can define how much you actually want to consume in bytes. You can then run the "tail" stream to an ignore Sink. Further, I'd propose to implement a "quick path" (like there is on in the current HttpUtils), which checks the "Content-Length" field of the response and just consumes the whole stream into a string if it's safe to do so to only use this special method on the truncation path.
> 
> As a more general, existential question: Do we even need the truncation path? Could we just deny the response straight away if the user's action returns a bigger value than allowed?
> 
> Hope this helps
> 
> Cheers,
> Markus
> 


Re: Concurrency in invoker http client

Posted by Rodric Rabbah <ro...@gmail.com>.
> On Jul 13, 2018, at 8:21 PM, Markus Thoemmes <ma...@de.ibm.com> wrote:
> 
> As a more general, existential question: Do we even need the truncation path? Could we just deny the response straight away if the user's action returns a bigger value than allowed?

Thanks for asking the question. We could change the behavior. I don’t know how often we’ve truncated a response in production (is  it even common)?

The thought was you’ve paid the activation, you should see part of the result. Otherwise you have to check the size of your response and be mindful of that limit as a developer. If you’re passing large blobs in-line you’re already living dangerously perhaps. 

We do truncate logs but unlike a truncated result the latter is less useful and will terminate your action in a sequence. In a conductor you’d end up with an exception unless you have a “try/catch” handler.

-r

Re: Concurrency in invoker http client

Posted by Markus Thoemmes <ma...@de.ibm.com>.
Hi Tyson,

Chetan solved a similar problem in "inlineOrAttach" in "AttachmentSupport.scala". He did this by continuously running a stream through "prefixAndTail", where he'd pick just one element and then run the stream to check whether he already crossed the limit.

This case now is a lot more performance critical, so I propose we add a "prefixAndTailWeighted" stage, which runs similar to "prefixAndTail" but weighs the inputs by their siuze so you can define how much you actually want to consume in bytes. You can then run the "tail" stream to an ignore Sink. Further, I'd propose to implement a "quick path" (like there is on in the current HttpUtils), which checks the "Content-Length" field of the response and just consumes the whole stream into a string if it's safe to do so to only use this special method on the truncation path.

As a more general, existential question: Do we even need the truncation path? Could we just deny the response straight away if the user's action returns a bigger value than allowed?

Hope this helps

Cheers,
Markus


Re: Concurrency in invoker http client

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Getting back to this, I’m close, but not sure how to get akka http client to truncate a response?

There is response.entity.withSizeLimit() - but this throws an exception, not really what I want. 

I think what I want is a variant of the HttpEntity.Limitable GraphStage that just stops emitting characters, but I’m not sure how exactly to do this?

Thanks
Tyson

> On Jun 25, 2018, at 10:32 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
> the retires are only on failure to establish a connection - no other
> retries should be happening iirc.
> 
> -r
> 
> On Mon, Jun 25, 2018 at 1:29 PM Tyson Norris <tn...@adobe.com.invalid>
> wrote:
> 
>> Thanks Markus - one other question:
>> 
>> Assuming retry is the current missing piece to using PoolingRestClient (or
>> akka http directly), I’m also wondering if “retry” is the proper approach
>> here?
>> It may be worthwhile to initiate a port connection (with its own
>> timeout/retry behavior) before the /init so that we can distinguish between
>> “container startup is slow” and “bad behavior in action container after
>> startup”?
>> 
>> Also, I’m wondering if there are cases where rampant retry causes
>> unintended side affects, etc - this could be worse with concurrency
>> enabled, but I don’t know if this should be considered a real problem.
>> 
>> FWIW We avoid this (http request to container that is not yet listening)
>> in mesos by not returning the container till the mesos health check passes
>> (which currently just check the port connection), so this would be a
>> similar setup at the invoker layer.
>> 
>> Thanks
>> Tyson
>> 
>>> On Jun 25, 2018, at 10:08 AM, Markus Thoemmes <
>> markus.thoemmes@de.ibm.com> wrote:
>>> 
>>> Hi Tyson,
>>> 
>>> Ha, I was thinking about moving back to akka the other day. A few
>> comments:
>>> 
>>> 1. Travis build environments have 1.5 CPU cores which might explain the
>> "strange" behavior you get from the apache client? Maybe it adjusts its
>> thread pool based on the number of cores available?
>>> 2. To implement retries based on akka-http, have a look at what we used
>> to use for invoker communication:
>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fincubator-openwhisk%2Fcommit%2F31946029cad740a00c6e6f367637a1bcfea5dd18%23diff-5c6f165d3e8395b6fe915ef0d24e5d1f&data=02%7C01%7Ctnorris%40adobe.com%7Ca07af0966c50438a270908d5dabe4b3a%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C1%7C636655433199601572&sdata=ZHjbf0ukNQaEFq2i58f5hxMt3zRa3JCHdHR0MRAn8Uo%3D&reserved=0
>> (NewHttpUtils.scala to be precise).
>>> 3. I guess you're going to create a new PoolingRestClient per container?
>> I could imagine it is problematic if new containers come and go with a
>> global connection pool. Just something to be aware of.
>>> 
>>> Oh, another thing to be aware of: We **used** to have akka-http there
>> and never tried again after the revert. We're certainly on a much newer
>> version now but we had issues of indefinitly hanging connections when we
>> first implemented it. Running some high-load scenarios before pushing this
>> into master will be needed.
>>> 
>>> I don't want to put you off the task though, give it a shot, I'd love to
>> have this back :). Thanks for attacking!
>>> 
>>> Cheers,
>>> -m
>>> 
>> 
>> 


Re: Concurrency in invoker http client

Posted by Rodric Rabbah <ro...@gmail.com>.
the retires are only on failure to establish a connection - no other
retries should be happening iirc.

-r

On Mon, Jun 25, 2018 at 1:29 PM Tyson Norris <tn...@adobe.com.invalid>
wrote:

> Thanks Markus - one other question:
>
> Assuming retry is the current missing piece to using PoolingRestClient (or
> akka http directly), I’m also wondering if “retry” is the proper approach
> here?
> It may be worthwhile to initiate a port connection (with its own
> timeout/retry behavior) before the /init so that we can distinguish between
> “container startup is slow” and “bad behavior in action container after
> startup”?
>
> Also, I’m wondering if there are cases where rampant retry causes
> unintended side affects, etc - this could be worse with concurrency
> enabled, but I don’t know if this should be considered a real problem.
>
> FWIW We avoid this (http request to container that is not yet listening)
> in mesos by not returning the container till the mesos health check passes
> (which currently just check the port connection), so this would be a
> similar setup at the invoker layer.
>
> Thanks
> Tyson
>
> > On Jun 25, 2018, at 10:08 AM, Markus Thoemmes <
> markus.thoemmes@de.ibm.com> wrote:
> >
> > Hi Tyson,
> >
> > Ha, I was thinking about moving back to akka the other day. A few
> comments:
> >
> > 1. Travis build environments have 1.5 CPU cores which might explain the
> "strange" behavior you get from the apache client? Maybe it adjusts its
> thread pool based on the number of cores available?
> > 2. To implement retries based on akka-http, have a look at what we used
> to use for invoker communication:
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fincubator-openwhisk%2Fcommit%2F31946029cad740a00c6e6f367637a1bcfea5dd18%23diff-5c6f165d3e8395b6fe915ef0d24e5d1f&data=02%7C01%7Ctnorris%40adobe.com%7Ca07af0966c50438a270908d5dabe4b3a%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C1%7C636655433199601572&sdata=ZHjbf0ukNQaEFq2i58f5hxMt3zRa3JCHdHR0MRAn8Uo%3D&reserved=0
> (NewHttpUtils.scala to be precise).
> > 3. I guess you're going to create a new PoolingRestClient per container?
> I could imagine it is problematic if new containers come and go with a
> global connection pool. Just something to be aware of.
> >
> > Oh, another thing to be aware of: We **used** to have akka-http there
> and never tried again after the revert. We're certainly on a much newer
> version now but we had issues of indefinitly hanging connections when we
> first implemented it. Running some high-load scenarios before pushing this
> into master will be needed.
> >
> > I don't want to put you off the task though, give it a shot, I'd love to
> have this back :). Thanks for attacking!
> >
> > Cheers,
> > -m
> >
>
>

Re: Concurrency in invoker http client

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Thanks Markus - one other question: 

Assuming retry is the current missing piece to using PoolingRestClient (or akka http directly), I’m also wondering if “retry” is the proper approach here?
It may be worthwhile to initiate a port connection (with its own timeout/retry behavior) before the /init so that we can distinguish between “container startup is slow” and “bad behavior in action container after startup”?

Also, I’m wondering if there are cases where rampant retry causes unintended side affects, etc - this could be worse with concurrency enabled, but I don’t know if this should be considered a real problem. 

FWIW We avoid this (http request to container that is not yet listening) in mesos by not returning the container till the mesos health check passes (which currently just check the port connection), so this would be a similar setup at the invoker layer.

Thanks
Tyson

> On Jun 25, 2018, at 10:08 AM, Markus Thoemmes <ma...@de.ibm.com> wrote:
> 
> Hi Tyson,
> 
> Ha, I was thinking about moving back to akka the other day. A few comments:
> 
> 1. Travis build environments have 1.5 CPU cores which might explain the "strange" behavior you get from the apache client? Maybe it adjusts its thread pool based on the number of cores available?
> 2. To implement retries based on akka-http, have a look at what we used to use for invoker communication: https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fincubator-openwhisk%2Fcommit%2F31946029cad740a00c6e6f367637a1bcfea5dd18%23diff-5c6f165d3e8395b6fe915ef0d24e5d1f&data=02%7C01%7Ctnorris%40adobe.com%7Ca07af0966c50438a270908d5dabe4b3a%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C1%7C636655433199601572&sdata=ZHjbf0ukNQaEFq2i58f5hxMt3zRa3JCHdHR0MRAn8Uo%3D&reserved=0 (NewHttpUtils.scala to be precise).
> 3. I guess you're going to create a new PoolingRestClient per container? I could imagine it is problematic if new containers come and go with a global connection pool. Just something to be aware of.
> 
> Oh, another thing to be aware of: We **used** to have akka-http there and never tried again after the revert. We're certainly on a much newer version now but we had issues of indefinitly hanging connections when we first implemented it. Running some high-load scenarios before pushing this into master will be needed.
> 
> I don't want to put you off the task though, give it a shot, I'd love to have this back :). Thanks for attacking!
> 
> Cheers,
> -m
> 


Re: Concurrency in invoker http client

Posted by Markus Thoemmes <ma...@de.ibm.com>.
Hi Tyson,

Ha, I was thinking about moving back to akka the other day. A few comments:

1. Travis build environments have 1.5 CPU cores which might explain the "strange" behavior you get from the apache client? Maybe it adjusts its thread pool based on the number of cores available?
2. To implement retries based on akka-http, have a look at what we used to use for invoker communication: https://github.com/apache/incubator-openwhisk/commit/31946029cad740a00c6e6f367637a1bcfea5dd18#diff-5c6f165d3e8395b6fe915ef0d24e5d1f (NewHttpUtils.scala to be precise).
3. I guess you're going to create a new PoolingRestClient per container? I could imagine it is problematic if new containers come and go with a global connection pool. Just something to be aware of.

Oh, another thing to be aware of: We **used** to have akka-http there and never tried again after the revert. We're certainly on a much newer version now but we had issues of indefinitly hanging connections when we first implemented it. Running some high-load scenarios before pushing this into master will be needed.

I don't want to put you off the task though, give it a shot, I'd love to have this back :). Thanks for attacking!

Cheers,
-m