You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@openwhisk.apache.org by Dominic Kim <st...@gmail.com> on 2018/12/21 08:52:58 UTC

[Discussion] Log collection enhancement

Dear whiskers.

I am starting this thread to discuss the better way to collect action logs.
After this commit(
https://github.com/apache/incubator-openwhisk/commit/57475367b509fd2d4c14f5678d0c26642c52cc91)
is merged, TPS for blocking call is significantly decreased.
It's about 1/10 of previous results.

I think the logic behind that PR is quite fair.
But still, such a huge performance degradation would not be the one that we
want.
I performed simple benchmarks and figured out this is because log
collection takes a long time.
(Performed benchmarking with 0 log limit and observed high TPS.)

One more related issue is the log collection with the action concurrency.
AFAIK, logs for each invocation are differentiated by some special string,
so-called "sentinel".
This way does fit well when there is no concurrency(the concurrency limit
is 1).

But if we enable the concurrency, logs are interleaved and the sentinel is
no longer effective as is.
So currently, if we want to enable concurrency, we need to disable log
collection or use different LogStore implementation with external log
collection capability.

Above issues imply the needs for a new way to collect and manage logs.

The current solution depends on container logs, ContainerProxy accesses
directly to the container log file and collect them as an AkkaStream.
Since it introduces disk IO, it's relatively slow.

Without deep consideration, I think one option is to include logs in the
activation response from the action container.
With concurrency, each invocation will keep its own buffered stream to
write logs and create an activation response including it.
Each currency invocation will create its own buffered stream, logs for each
invocation can be segregated.
It happens in the memory only, I expect it would show better performance in
general.

One issue is the log limit. The current maximum log limit is 10MB.
But I have been curious there is the real case that one function generates
10MB logs for each invocation.
If yes, would it be really meaningful for users? I think it would not be
easy to look into it.

If we can reduce the maximum limit to 1MB or less, it might be effective to
collect logs in this way.
Since I did not ponder on it, there might be some side effects.

Please share any idea or feedbacks.

Thanks
Regards
Dominic.

Re: [Discussion] Log collection enhancement

Posted by Rodric Rabbah <ro...@gmail.com>.

Hi Dominic,

(Sorry for responding late.)

If I understand correctly - you're suggesting the action proxy inside the
container capture stdout & stderr and return them as part of the
request/response to the invoker.
In doing this, you avoid the sentinel lag (caused by delayed stream flush
from the container log driver), and hence reduce the critical path on
container reuse or reclamation. For concurrent activations inside the same
container, you're further thinking that each activation can somehow buffer
the logs for that activation and then the logs are disambiguated
accordingly.

I think this is a neat idea if you can make it work. Have you thought about
how you'd capture stdout/stderr across all the runtimes, and dealing with
libraries that might log not just to those stream but might use loggers.

Others who have thought more about integrating logging with the native
platform capabilities might have more to say on this.

-r

On Fri, Dec 21, 2018 at 3:53 AM Dominic Kim <st...@gmail.com> wrote:

> Dear whiskers.
>
> I am starting this thread to discuss the better way to collect action logs.
> After this commit(
>
> https://github.com/apache/incubator-openwhisk/commit/57475367b509fd2d4c14f5678d0c26642c52cc91
> )
> is merged, TPS for blocking call is significantly decreased.
> It's about 1/10 of previous results.
>
> I think the logic behind that PR is quite fair.
> But still, such a huge performance degradation would not be the one that we
> want.
> I performed simple benchmarks and figured out this is because log
> collection takes a long time.
> (Performed benchmarking with 0 log limit and observed high TPS.)
>
> One more related issue is the log collection with the action concurrency.
> AFAIK, logs for each invocation are differentiated by some special string,
> so-called "sentinel".
> This way does fit well when there is no concurrency(the concurrency limit
> is 1).
>
> But if we enable the concurrency, logs are interleaved and the sentinel is
> no longer effective as is.
> So currently, if we want to enable concurrency, we need to disable log
> collection or use different LogStore implementation with external log
> collection capability.
>
> Above issues imply the needs for a new way to collect and manage logs.
>
> The current solution depends on container logs, ContainerProxy accesses
> directly to the container log file and collect them as an AkkaStream.
> Since it introduces disk IO, it's relatively slow.
>
>
> Without deep consideration, I think one option is to include logs in the
> activation response from the action container.
> With concurrency, each invocation will keep its own buffered stream to
> write logs and create an activation response including it.
> Each currency invocation will create its own buffered stream, logs for each
> invocation can be segregated.
> It happens in the memory only, I expect it would show better performance in
> general.
>
> One issue is the log limit. The current maximum log limit is 10MB.
> But I have been curious there is the real case that one function generates
> 10MB logs for each invocation.
> If yes, would it be really meaningful for users? I think it would not be
> easy to look into it.
>
> If we can reduce the maximum limit to 1MB or less, it might be effective to
> collect logs in this way.
> Since I did not ponder on it, there might be some side effects.
>
> Please share any idea or feedbacks.
>
> Thanks
> Regards
> Dominic.
>