You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by Walt Karas <wk...@verizonmedia.com.INVALID> on 2019/06/11 20:21:29 UTC

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?

On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:

>
>
> On Apr 4, 2019, at 8:05 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
>
> Because senility. probably "on" should be a non-pointer of type const
> volatile char, and InkAPI in the core can just keep a pointer to the whole
> structure for updating.
>
> But I'm leaning more to fixing up the capability that already kinda there.
> Which is optimizimg only for the case when all debug output is disabled.
> Especially since maybe zwoop would be on my side for once :o) .
>
>
>
> My bad, whatever Walt said, I take the opposite stance.
>
> — leif
>
> P.s
>   J/K!
>
>
> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> <so...@verizonmedia.com.invalid> wrote:
>
> This could enable a significant speed up for debug tags. One point is that
> when the debug tag string is set, at that point the debug objects could be
> updated to the correct state according to the debug tag string, rather than
> checking it every time a debug message is logged.
>
> I have to ask, why is there a pointer to the enable flag, instead of just
> the enable flag? Your logic doesn't even check that flag, it checks whether
> the pointer is null.
>
> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org> wrote:
>
>
>
> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>
> .INVALID>
>
> wrote:
>
>
> It looks like the TS API already has less granular version of this:
>
> extern int diags_on_for_plugins;
>
> #define TSDEBUG             \
>
> if (diags_on_for_plugins) \
>
> TSDebug
>
>
> A problem with this is if you write:
>
>
> if (flag)
>
> TSDEBUG(yadayada);
>
> else
>
> do_something();
>
>
> then the else will be associated with the if hidden in the macro.  I'd
> prefer to change to something like:
>
>
> extern const volatile char TS_detail_dbg_output_enabled;
>
>
> #define TSFastDbg(_TAG, ...) \
> do { \
> if (TS_detail_dbg_output_enabled) \
>  TSDebug(_TAG, __VA_ARGS__); \
>
> } while(0)
>
>
> Yeh, that kinda sounds like a bug, doesn’t it?
>
>
> The negative is that we'd only see the benefit when all debug output
>
> was
>
> disabled.  But, in any case, that is the main benefit, to avoid impacts
> during normal operation.
>
>
> Right, we should always optimize for the common cases, obviously. When
> there’s a decision to make, where you can either make it “equally”
> expensive always, or more expensive in a rare case, but cheaper in the
> common case, we should likely chose the latter.
>
>
>
>
> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
>
> only
>
> set at startup time.  So we wouldn't be able to enable TSFastDbg using
> traffic_cntl without adding complexity.
>
> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>
> wrote:
>
>
> Add this new structure to apidefs.h:
>
> typedef struct TSFastDbgCntl_
> {
> const char * const tag; // nul-terminated string
> const volatile char * const on; // pointer to 1-byte flag
> }
> TSFastDbgCntl;
>
> Add this new API function, that returns a pointer to an instance of
>
> the
>
> above structure:
>
> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>
> Add this macro, which would be used in place of direct use of
>
> TSDebug().
>
>
> #define TSFastDbg(_FD_CNTL, ...) \
> do { \
> if ((_FD_CNTL)->on) \
>  TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> } while(0)
>
> The first parameter to TSFastDbg() is a pointer that was returned by
> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
> (format string and values matching format specifiers).  The core would
>
> be
>
> responsible for changing the value of the 'on' fields in the
>
> TSFastDbgCntl
>
> instances.  For direct calls to TSDebug(), even if the tag is
>
> disabled,
>
> all
>
> the parameters have to be prepared and pushed onto the stack.  As was
> discussed in IRC, you have to feel guilty about doing converting a
> string_view to a string and outputting it with c_str().  Because the
>
> code
>
> to do this will be executed even if the debug tag is disabled.
> Furthermore, for each call to TSDebug, it's necessary to do an
>
> associative
>
> lookup on the tag string to determine that output for the tag is
>
> disabled.
>
>
> We would want to put equivalent capabilities in Diags.h in the core
>
> for
>
> debug output.  The implementation is non-trivial, but I think it would
>
> take
>
> at most a week including the Au test additions.
>
> I looked at making TSFastDbg() an inline function rather than a macro:
> https://godbolt.org/z/IfVbBk
> The opitimization works well if you compile the code with the inline
> function as C, but the optimization is poor if you compile it as C++.
>
> This
>
> difference exists for both gcc and clang.  It's weird.
>
>
>
>
>
>
>
>
>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Leif Hedstrom <zw...@apache.org>.

> On Jun 12, 2019, at 11:00 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> What do you mean by "exposing diags"?
> 
> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
> 
>> I would rather see TSDebug be optimized and exposing diags instead of
>> creating a new type of debug.


For a while now, we’ve talked about making better APIs for conditionally enabling diags per transaction / event basis. Such as, “If client_ip = x.y.z/24 and request contains “?walt=1”, turn on diags for http.*”. There’s some of this in the code already, it was done by Uri if I recall. In addition, Yahoo (Susan?) added some partial support for per-IP diagnostics, but I don’t think we ended up with any APIs that lets you control it in any reasonable way.

I agree with Bryan, improving on TSDebug() / Debug() (core) / TSDebugSpecific() would be preferable. In fact, I wish that TSDebugSpecific() could go away, merged into TSDebug/Debug() and a better API for programmatic control over when diagnostics should be conditionally enabled.

Cheers,

— Leif

>> 
>> -Bryan
>> 
>> 
>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>>> 
>>> I don't believe so.  The issue I'm trying to address is that there is
>> still
>>> significant execution time overhead for debug output calls when
>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>> execution time overhead negligible (if TSFastDbg is used or TSDebug is
>>> changed to be like the proposed TSFastDbg).  Another way of putting it
>> is,
>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>> 
>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
>>> 
>>>> How would this be controlled on the command line when running
>>>> trafic_server -T?
>>>> 
>>>> It looks like it is a departure from how debug tags and controlled now.
>>>> 
>>>> -Bryan
>>>> 
>>>> 
>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>>> wrote:
>>>>> 
>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
>>>>> 
>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>>>> .INVALID>
>>>>>> wrote:
>>>>>> 
>>>>>> Because senility. probably "on" should be a non-pointer of type const
>>>>>> volatile char, and InkAPI in the core can just keep a pointer to the
>>>> whole
>>>>>> structure for updating.
>>>>>> 
>>>>>> But I'm leaning more to fixing up the capability that already kinda
>>>> there.
>>>>>> Which is optimizimg only for the case when all debug output is
>> disabled.
>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>> 
>>>>>> — leif
>>>>>> 
>>>>>> P.s
>>>>>> J/K!
>>>>>> 
>>>>>> 
>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>> This could enable a significant speed up for debug tags. One point is
>>>> that
>>>>>> when the debug tag string is set, at that point the debug objects
>> could
>>>> be
>>>>>> updated to the correct state according to the debug tag string, rather
>>>> than
>>>>>> checking it every time a debug message is logged.
>>>>>> 
>>>>>> I have to ask, why is there a pointer to the enable flag, instead of
>>>> just
>>>>>> the enable flag? Your logic doesn't even check that flag, it checks
>>>> whether
>>>>>> the pointer is null.
>>>>>> 
>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>> 
>>>>>> .INVALID>
>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> It looks like the TS API already has less granular version of this:
>>>>>> 
>>>>>> extern int diags_on_for_plugins;
>>>>>> 
>>>>>> #define TSDEBUG             \
>>>>>> 
>>>>>> if (diags_on_for_plugins) \
>>>>>> 
>>>>>> TSDebug
>>>>>> 
>>>>>> 
>>>>>> A problem with this is if you write:
>>>>>> 
>>>>>> 
>>>>>> if (flag)
>>>>>> 
>>>>>> TSDEBUG(yadayada);
>>>>>> 
>>>>>> else
>>>>>> 
>>>>>> do_something();
>>>>>> 
>>>>>> 
>>>>>> then the else will be associated with the if hidden in the macro.  I'd
>>>>>> prefer to change to something like:
>>>>>> 
>>>>>> 
>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>> 
>>>>>> 
>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>> do { \
>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>> 
>>>>>> } while(0)
>>>>>> 
>>>>>> 
>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>> 
>>>>>> 
>>>>>> The negative is that we'd only see the benefit when all debug output
>>>>>> 
>>>>>> was
>>>>>> 
>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>> impacts
>>>>>> during normal operation.
>>>>>> 
>>>>>> 
>>>>>> Right, we should always optimize for the common cases, obviously. When
>>>>>> there’s a decision to make, where you can either make it “equally”
>>>>>> expensive always, or more expensive in a rare case, but cheaper in the
>>>>>> common case, we should likely chose the latter.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
>>>>>> 
>>>>>> only
>>>>>> 
>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg using
>>>>>> traffic_cntl without adding complexity.
>>>>>> 
>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> Add this new structure to apidefs.h:
>>>>>> 
>>>>>> typedef struct TSFastDbgCntl_
>>>>>> {
>>>>>> const char * const tag; // nul-terminated string
>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>> }
>>>>>> TSFastDbgCntl;
>>>>>> 
>>>>>> Add this new API function, that returns a pointer to an instance of
>>>>>> 
>>>>>> the
>>>>>> 
>>>>>> above structure:
>>>>>> 
>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>> 
>>>>>> Add this macro, which would be used in place of direct use of
>>>>>> 
>>>>>> TSDebug().
>>>>>> 
>>>>>> 
>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>> do { \
>>>>>> if ((_FD_CNTL)->on) \
>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>> } while(0)
>>>>>> 
>>>>>> The first parameter to TSFastDbg() is a pointer that was returned by
>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
>>>>>> (format string and values matching format specifiers).  The core would
>>>>>> 
>>>>>> be
>>>>>> 
>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>> 
>>>>>> TSFastDbgCntl
>>>>>> 
>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>> 
>>>>>> disabled,
>>>>>> 
>>>>>> all
>>>>>> 
>>>>>> the parameters have to be prepared and pushed onto the stack.  As was
>>>>>> discussed in IRC, you have to feel guilty about doing converting a
>>>>>> string_view to a string and outputting it with c_str().  Because the
>>>>>> 
>>>>>> code
>>>>>> 
>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>> 
>>>>>> associative
>>>>>> 
>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>> 
>>>>>> disabled.
>>>>>> 
>>>>>> 
>>>>>> We would want to put equivalent capabilities in Diags.h in the core
>>>>>> 
>>>>>> for
>>>>>> 
>>>>>> debug output.  The implementation is non-trivial, but I think it would
>>>>>> 
>>>>>> take
>>>>>> 
>>>>>> at most a week including the Au test additions.
>>>>>> 
>>>>>> I looked at making TSFastDbg() an inline function rather than a macro:
>>>>>> https://godbolt.org/z/IfVbBk
>>>>>> The opitimization works well if you compile the code with the inline
>>>>>> function as C, but the optimization is poor if you compile it as C++.
>>>>>> 
>>>>>> This
>>>>>> 
>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
That's correct in that the flag that TSDEBUG looks at does not seem to be
getting set properly in all cases by core TS.  Going to drop this since
it's not the small change that I thought.
https://github.com/ywkaras/trafficserver/blob/debug_perf/tests/tools/plugins/test_tsapi.cc#L186
is
available to cut and paste for crude execution time tests.

On Mon, Jun 17, 2019 at 2:46 PM Bryan Call <bc...@apache.org> wrote:

> With TSDEBUG there is no way to updated the global variable after you
> start up, even with traffic_ctl.  A plugin can set the variable directly,
> but that seems hacky.
>
> Debug should already be checking global variables to see if debug is
> enabled first before calling into PCRE.
>
> -Bryan
>
>
> > On Jun 17, 2019, at 12:35 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > It looked like a simple fast change.  Are you saying you can see that
> > TSDEBUG would not start printing debug messages if you enabled the global
> > flag with traffic_cntl ?
> >
> > An alternative would be to have the core functions that print the debug
> > messages check the global flag first before they call PCRE to check if
> the
> > specific tag is enabled.  That would improvement would have effect for
> both
> > core and plugin debug messages.  If wouldn't eliminate the parameter push
> > and function call overhead, but the testing indicates that is much less
> > significant.
> >
> > On Mon, Jun 17, 2019 at 2:01 PM Bryan Call <bc...@apache.org> wrote:
> >
> >> Walt,
> >>
> >> TSDEBUG uses diags_on_for_plugins and that looks to only be set on
> startup
> >> and then just calls TSDebug().  It doesn’t make sense to replace TSDebug
> >> with TSDEBUG.
> >>
> >> I don’t see why you are trying to optimize just the plugin debug.  When
> >> debug is enabled it has a performance impact on the entire system, not
> just
> >> for plugins.
> >>
> >> -Bryan
> >>
> >>
> >>> On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote:
> >>>
> >>>
> >>>
> >>>> On Jun 14, 2019, at 5:18 PM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >> wrote:
> >>>>
> >>>> I got some crude performance numbers for TSDebug() and TSDEBUG()
> >>>> when proxy.config.diags.debug.enabled is 0.  The good news is that
> >>>> TSDebug() looks to take about 8 nanoseconds (the raw data is below).
> >> So,
> >>>> if we take 1000 requests per core per seconds as a ballpark estimate
> of
> >> TS
> >>>> performance, it would take a lot of them to have a significant impact.
> >> But
> >>>> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
> >>>> worthwhile to make the tiny effort to change TSDEBUG() so it won't
> >> swallow
> >>>> a following 'else' clause.  I also think we should rename it to
> >> TSFastDbg().
> >>>
> >>> The numbers sounds good, but why on earth keep the old Debug? Why would
> >> I chose to use a slow API over a fast API? That just makes no sense.
> >>>
> >>> if we do this, just change TSDebug().  If it’s incompatible, we
> document
> >> it with appropriate warnings.
> >>>
> >>> Does any of this affect / improve the core Debug() statements? Those
> are
> >> probably 10x - 100x more critical to get fast, because they are
> >> *everywhere*.
> >>>
> >>> — leif
> >>>
> >>>>
> >>>> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().
> >> But I
> >>>> think the very small benefit is outweighed by the potential risk.  TS
> >> seems
> >>>> to be rife with race conditions, so a change that will alter the
> length
> >> of
> >>>> lots of code paths could create yet more instability.
> >>>>
> >>>> Here is the raw data (6 tests, 3 repetitions).  Done with
> >>>> https://github.com/apache/trafficserver/pull/5638 .  I don't
> >> understand why
> >>>> it's not more consistent.  If anyone knows a better way to time this,
> >>>> please let me know.
> >>>>
> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=7670000 system=20000
> >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=260000 system=0
> >>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >>>> microseconds user=8030000 system=50000
> >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >>>> microseconds user=280000 system=0
> >>>> Performance for TSDebug() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=8830000 system=30000
> >>>> Performance for TSDEBUG() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=270000 system=0
> >>>>
> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=7520000 system=50000
> >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=260000 system=0
> >>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >>>> microseconds user=8040000 system=30000
> >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >>>> microseconds user=260000 system=0
> >>>> Performance for TSDebug() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=8450000 system=50000
> >>>> Performance for TSDEBUG() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=260000 system=0
> >>>>
> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=7470000 system=80000
> >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >>>> microseconds user=260000 system=0
> >>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >>>> microseconds user=8120000 system=50000
> >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >>>> microseconds user=270000 system=0
> >>>> Performance for TSDebug() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=8500000 system=60000
> >>>> Performance for TSDEBUG() with volatile parameters, 1000000000
> >> repetitions,
> >>>> microseconds user=270000 system=0
> >>>>
> >>>> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com>
> >> wrote:
> >>>>
> >>>>> Since I'm still relatively new to this application, there are limits
> to
> >>>>> what I can take on on my own initiative.  I don't see how anything
> I've
> >>>>> proposed would interfere with any higher-priority work.
> >>>>>
> >>>>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org>
> >> wrote:
> >>>>>
> >>>>>> Also, I’d like to reiterate my call to arms on v9.0.0:
> >>>>>>
> >>>>>> Please, everyone lets focus on the things that are critical to ATS
> >>>>>> v9.0.0 readiness!
> >>>>>>
> >>>>>>
> >>>>>> This includes:
> >>>>>>
> >>>>>>      1) Fixing all the critical bugs (and there’s a lot of them)
> >>>>>>
> >>>>>>      2) Testing “master"
> >>>>>>
> >>>>>>      3) Doing the incompatible changes that needs to be done before
> >> we
> >>>>>> can release
> >>>>>>
> >>>>>>      4) Removing features that we do not want to support for the
> next
> >>>>>> 3+ years
> >>>>>>
> >>>>>>      5) Make sure that v9.0.0 has the functionality that you
> >>>>>> *absolutely* need to be able
> >>>>>>      to use it
> >>>>>>
> >>>>>>
> >>>>>> I kinda feel that several of these recent proposal do not fit into
> >> this,
> >>>>>> and in fact, are a fair amount of distraction. I urge everyone to
> not
> >> go
> >>>>>> down this route until we’ve got v9.0.0 stable and functional, and
> >> instead
> >>>>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for
> >> example fits
> >>>>>> into any of those 5 points above.
> >>>>>>
> >>>>>> My $.01.
> >>>>>>
> >>>>>> — Leif
> >>>>>>
> >>>>>> P.s
> >>>>>> https://github.com/apache/trafficserver/issues/5457
> >>>>>>
> >>>>>>
> >>>>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wkaras@verizonmedia.com
> >> .INVALID>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> I would guess the performance increase would be well under 1%.  But
> >>>>>> it's a
> >>>>>>> fairly simple change.  The biggest performance increase I think
> would
> >>>>>> come
> >>>>>>> from avoiding a PCRE match, to see if the specific debug tag was
> >>>>>> enabled.
> >>>>>>> For some TSDebug calls, there can be many arguments to push, and
> >>>>>> perhaps a
> >>>>>>> fair amount of calculation to generate the correct arguments.
> >>>>>>>
> >>>>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org>
> >> wrote:
> >>>>>>>
> >>>>>>>> I am surprised you are seeing this as a performance issue.
> Looking
> >> at
> >>>>>> the
> >>>>>>>> code it looks like 1 function call and a couple pointer
> >> dereferences.
> >>>>>>>>
> >>>>>>>> If you wanted to speed up TSDebug and make the code harder to
> >> maintain
> >>>>>> you
> >>>>>>>> could create a structure that looks like like DiagsConfig in ts.h
> >> and
> >>>>>> then
> >>>>>>>> set a pointer used in the API to look at the gobal DiagsConfig.
> >> Then
> >>>>>> you
> >>>>>>>> could have a macro that would do basically what on() does.  This
> >> would
> >>>>>>>> bypass a function call and would have 1 more pointer dereference
> >> than
> >>>>>> your
> >>>>>>>> example.
> >>>>>>>>
> >>>>>>>> -Bryan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <
> wkaras@verizonmedia.com
> >>>>>> .INVALID>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> What do you mean by "exposing diags"?
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org>
> >> wrote:
> >>>>>>>>>
> >>>>>>>>>> I would rather see TSDebug be optimized and exposing diags
> >> instead of
> >>>>>>>>>> creating a new type of debug.
> >>>>>>>>>>
> >>>>>>>>>> -Bryan
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <
> wkaras@verizonmedia.com
> >>>>>>>> .INVALID>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> I don't believe so.  The issue I'm trying to address is that
> >> there
> >>>>>> is
> >>>>>>>>>> still
> >>>>>>>>>>> significant execution time overhead for debug output calls when
> >>>>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make
> >> the
> >>>>>>>>>>> execution time overhead negligible (if TSFastDbg is used or
> >> TSDebug
> >>>>>> is
> >>>>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of
> >> putting
> >>>>>> it
> >>>>>>>>>> is,
> >>>>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
> >>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> How would this be controlled on the command line when running
> >>>>>>>>>>>> trafic_server -T?
> >>>>>>>>>>>>
> >>>>>>>>>>>> It looks like it is a departure from how debug tags and
> >> controlled
> >>>>>>>> now.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Bryan
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <
> >> wkaras@verizonmedia.com
> >>>>>>>>>> .INVALID>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed
> >> TSFastDbg
> >>>>>> ?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <
> zwoop@apache.org
> >>>
> >>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <
> >> wkaras@verizonmedia.com
> >>>>>>>>>>>> .INVALID>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of
> >> type
> >>>>>>>> const
> >>>>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a
> pointer
> >> to
> >>>>>> the
> >>>>>>>>>>>> whole
> >>>>>>>>>>>>>> structure for updating.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> But I'm leaning more to fixing up the capability that
> already
> >>>>>> kinda
> >>>>>>>>>>>> there.
> >>>>>>>>>>>>>> Which is optimizimg only for the case when all debug output
> is
> >>>>>>>>>> disabled.
> >>>>>>>>>>>>>> Especially since maybe zwoop would be on my side for once
> :o)
> >> .
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> — leif
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> P.s
> >>>>>>>>>>>>>> J/K!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >>>>>>>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This could enable a significant speed up for debug tags. One
> >>>>>> point
> >>>>>>>> is
> >>>>>>>>>>>> that
> >>>>>>>>>>>>>> when the debug tag string is set, at that point the debug
> >> objects
> >>>>>>>>>> could
> >>>>>>>>>>>> be
> >>>>>>>>>>>>>> updated to the correct state according to the debug tag
> >> string,
> >>>>>>>> rather
> >>>>>>>>>>>> than
> >>>>>>>>>>>>>> checking it every time a debug message is logged.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
> >>>>>> instead of
> >>>>>>>>>>>> just
> >>>>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
> >>>>>> checks
> >>>>>>>>>>>> whether
> >>>>>>>>>>>>>> the pointer is null.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <
> >> zwoop@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <
> >> wkaras@verizonmedia.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> .INVALID>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It looks like the TS API already has less granular version
> of
> >>>>>> this:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> extern int diags_on_for_plugins;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> #define TSDEBUG             \
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> if (diags_on_for_plugins) \
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TSDebug
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> A problem with this is if you write:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> if (flag)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TSDEBUG(yadayada);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> else
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> do_something();
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> then the else will be associated with the if hidden in the
> >> macro.
> >>>>>>>> I'd
> >>>>>>>>>>>>>> prefer to change to something like:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
> >>>>>>>>>>>>>> do { \
> >>>>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
> >>>>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> } while(0)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The negative is that we'd only see the benefit when all
> debug
> >>>>>> output
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> was
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to
> >> avoid
> >>>>>>>>>> impacts
> >>>>>>>>>>>>>> during normal operation.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Right, we should always optimize for the common cases,
> >> obviously.
> >>>>>>>> When
> >>>>>>>>>>>>>> there’s a decision to make, where you can either make it
> >>>>>> “equally”
> >>>>>>>>>>>>>> expensive always, or more expensive in a rare case, but
> >> cheaper
> >>>>>> in
> >>>>>>>> the
> >>>>>>>>>>>>>> common case, we should likely chose the latter.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug
> enable
> >>>>>> flag
> >>>>>>>> is
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> only
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable
> >> TSFastDbg
> >>>>>>>> using
> >>>>>>>>>>>>>> traffic_cntl without adding complexity.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
> >>>>>> wkaras@verizonmedia.com>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Add this new structure to apidefs.h:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> typedef struct TSFastDbgCntl_
> >>>>>>>>>>>>>> {
> >>>>>>>>>>>>>> const char * const tag; // nul-terminated string
> >>>>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
> >>>>>>>>>>>>>> }
> >>>>>>>>>>>>>> TSFastDbgCntl;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Add this new API function, that returns a pointer to an
> >> instance
> >>>>>> of
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> above structure:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Add this macro, which would be used in place of direct use
> of
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TSDebug().
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
> >>>>>>>>>>>>>> do { \
> >>>>>>>>>>>>>> if ((_FD_CNTL)->on) \
> >>>>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >>>>>>>>>>>>>> } while(0)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
> >>>>>> returned by
> >>>>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
> >>>>>>>> parameters
> >>>>>>>>>>>>>> (format string and values matching format specifiers).  The
> >> core
> >>>>>>>> would
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TSFastDbgCntl
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag
> is
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> disabled,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> all
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> the parameters have to be prepared and pushed onto the
> >> stack.  As
> >>>>>>>> was
> >>>>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing
> >> converting
> >>>>>> a
> >>>>>>>>>>>>>> string_view to a string and outputting it with c_str().
> >> Because
> >>>>>> the
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> code
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> to do this will be executed even if the debug tag is
> disabled.
> >>>>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do
> an
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> associative
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> lookup on the tag string to determine that output for the
> tag
> >> is
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> disabled.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in
> the
> >>>>>> core
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> debug output.  The implementation is non-trivial, but I
> think
> >> it
> >>>>>>>> would
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> take
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> at most a week including the Au test additions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather
> than
> >> a
> >>>>>>>> macro:
> >>>>>>>>>>>>>> https://godbolt.org/z/IfVbBk
> >>>>>>>>>>>>>> The opitimization works well if you compile the code with
> the
> >>>>>> inline
> >>>>>>>>>>>>>> function as C, but the optimization is poor if you compile
> it
> >> as
> >>>>>>>> C++.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>
> >>
> >>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Bryan Call <bc...@apache.org>.
With TSDEBUG there is no way to updated the global variable after you start up, even with traffic_ctl.  A plugin can set the variable directly, but that seems hacky.

Debug should already be checking global variables to see if debug is enabled first before calling into PCRE.

-Bryan


> On Jun 17, 2019, at 12:35 PM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> It looked like a simple fast change.  Are you saying you can see that
> TSDEBUG would not start printing debug messages if you enabled the global
> flag with traffic_cntl ?
> 
> An alternative would be to have the core functions that print the debug
> messages check the global flag first before they call PCRE to check if the
> specific tag is enabled.  That would improvement would have effect for both
> core and plugin debug messages.  If wouldn't eliminate the parameter push
> and function call overhead, but the testing indicates that is much less
> significant.
> 
> On Mon, Jun 17, 2019 at 2:01 PM Bryan Call <bc...@apache.org> wrote:
> 
>> Walt,
>> 
>> TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup
>> and then just calls TSDebug().  It doesn’t make sense to replace TSDebug
>> with TSDEBUG.
>> 
>> I don’t see why you are trying to optimize just the plugin debug.  When
>> debug is enabled it has a performance impact on the entire system, not just
>> for plugins.
>> 
>> -Bryan
>> 
>> 
>>> On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote:
>>> 
>>> 
>>> 
>>>> On Jun 14, 2019, at 5:18 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>>>> 
>>>> I got some crude performance numbers for TSDebug() and TSDEBUG()
>>>> when proxy.config.diags.debug.enabled is 0.  The good news is that
>>>> TSDebug() looks to take about 8 nanoseconds (the raw data is below).
>> So,
>>>> if we take 1000 requests per core per seconds as a ballpark estimate of
>> TS
>>>> performance, it would take a lot of them to have a significant impact.
>> But
>>>> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
>>>> worthwhile to make the tiny effort to change TSDEBUG() so it won't
>> swallow
>>>> a following 'else' clause.  I also think we should rename it to
>> TSFastDbg().
>>> 
>>> The numbers sounds good, but why on earth keep the old Debug? Why would
>> I chose to use a slow API over a fast API? That just makes no sense.
>>> 
>>> if we do this, just change TSDebug().  If it’s incompatible, we document
>> it with appropriate warnings.
>>> 
>>> Does any of this affect / improve the core Debug() statements? Those are
>> probably 10x - 100x more critical to get fast, because they are
>> *everywhere*.
>>> 
>>> — leif
>>> 
>>>> 
>>>> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().
>> But I
>>>> think the very small benefit is outweighed by the potential risk.  TS
>> seems
>>>> to be rife with race conditions, so a change that will alter the length
>> of
>>>> lots of code paths could create yet more instability.
>>>> 
>>>> Here is the raw data (6 tests, 3 repetitions).  Done with
>>>> https://github.com/apache/trafficserver/pull/5638 .  I don't
>> understand why
>>>> it's not more consistent.  If anyone knows a better way to time this,
>>>> please let me know.
>>>> 
>>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>>>> microseconds user=7670000 system=20000
>>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>>>> microseconds user=260000 system=0
>>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>>>> microseconds user=8030000 system=50000
>>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>>>> microseconds user=280000 system=0
>>>> Performance for TSDebug() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=8830000 system=30000
>>>> Performance for TSDEBUG() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=270000 system=0
>>>> 
>>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>>>> microseconds user=7520000 system=50000
>>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>>>> microseconds user=260000 system=0
>>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>>>> microseconds user=8040000 system=30000
>>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>>>> microseconds user=260000 system=0
>>>> Performance for TSDebug() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=8450000 system=50000
>>>> Performance for TSDEBUG() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=260000 system=0
>>>> 
>>>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>>>> microseconds user=7470000 system=80000
>>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>>>> microseconds user=260000 system=0
>>>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>>>> microseconds user=8120000 system=50000
>>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>>>> microseconds user=270000 system=0
>>>> Performance for TSDebug() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=8500000 system=60000
>>>> Performance for TSDEBUG() with volatile parameters, 1000000000
>> repetitions,
>>>> microseconds user=270000 system=0
>>>> 
>>>> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com>
>> wrote:
>>>> 
>>>>> Since I'm still relatively new to this application, there are limits to
>>>>> what I can take on on my own initiative.  I don't see how anything I've
>>>>> proposed would interfere with any higher-priority work.
>>>>> 
>>>>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org>
>> wrote:
>>>>> 
>>>>>> Also, I’d like to reiterate my call to arms on v9.0.0:
>>>>>> 
>>>>>> Please, everyone lets focus on the things that are critical to ATS
>>>>>> v9.0.0 readiness!
>>>>>> 
>>>>>> 
>>>>>> This includes:
>>>>>> 
>>>>>>      1) Fixing all the critical bugs (and there’s a lot of them)
>>>>>> 
>>>>>>      2) Testing “master"
>>>>>> 
>>>>>>      3) Doing the incompatible changes that needs to be done before
>> we
>>>>>> can release
>>>>>> 
>>>>>>      4) Removing features that we do not want to support for the next
>>>>>> 3+ years
>>>>>> 
>>>>>>      5) Make sure that v9.0.0 has the functionality that you
>>>>>> *absolutely* need to be able
>>>>>>      to use it
>>>>>> 
>>>>>> 
>>>>>> I kinda feel that several of these recent proposal do not fit into
>> this,
>>>>>> and in fact, are a fair amount of distraction. I urge everyone to not
>> go
>>>>>> down this route until we’ve got v9.0.0 stable and functional, and
>> instead
>>>>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for
>> example fits
>>>>>> into any of those 5 points above.
>>>>>> 
>>>>>> My $.01.
>>>>>> 
>>>>>> — Leif
>>>>>> 
>>>>>> P.s
>>>>>> https://github.com/apache/trafficserver/issues/5457
>>>>>> 
>>>>>> 
>>>>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>>>>> wrote:
>>>>>>> 
>>>>>>> I would guess the performance increase would be well under 1%.  But
>>>>>> it's a
>>>>>>> fairly simple change.  The biggest performance increase I think would
>>>>>> come
>>>>>>> from avoiding a PCRE match, to see if the specific debug tag was
>>>>>> enabled.
>>>>>>> For some TSDebug calls, there can be many arguments to push, and
>>>>>> perhaps a
>>>>>>> fair amount of calculation to generate the correct arguments.
>>>>>>> 
>>>>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org>
>> wrote:
>>>>>>> 
>>>>>>>> I am surprised you are seeing this as a performance issue.  Looking
>> at
>>>>>> the
>>>>>>>> code it looks like 1 function call and a couple pointer
>> dereferences.
>>>>>>>> 
>>>>>>>> If you wanted to speed up TSDebug and make the code harder to
>> maintain
>>>>>> you
>>>>>>>> could create a structure that looks like like DiagsConfig in ts.h
>> and
>>>>>> then
>>>>>>>> set a pointer used in the API to look at the gobal DiagsConfig.
>> Then
>>>>>> you
>>>>>>>> could have a macro that would do basically what on() does.  This
>> would
>>>>>>>> bypass a function call and would have 1 more pointer dereference
>> than
>>>>>> your
>>>>>>>> example.
>>>>>>>> 
>>>>>>>> -Bryan
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>> .INVALID>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> What do you mean by "exposing diags"?
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org>
>> wrote:
>>>>>>>>> 
>>>>>>>>>> I would rather see TSDebug be optimized and exposing diags
>> instead of
>>>>>>>>>> creating a new type of debug.
>>>>>>>>>> 
>>>>>>>>>> -Bryan
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>> .INVALID>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> I don't believe so.  The issue I'm trying to address is that
>> there
>>>>>> is
>>>>>>>>>> still
>>>>>>>>>>> significant execution time overhead for debug output calls when
>>>>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make
>> the
>>>>>>>>>>> execution time overhead negligible (if TSFastDbg is used or
>> TSDebug
>>>>>> is
>>>>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of
>> putting
>>>>>> it
>>>>>>>>>> is,
>>>>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> How would this be controlled on the command line when running
>>>>>>>>>>>> trafic_server -T?
>>>>>>>>>>>> 
>>>>>>>>>>>> It looks like it is a departure from how debug tags and
>> controlled
>>>>>>>> now.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Bryan
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <
>> wkaras@verizonmedia.com
>>>>>>>>>> .INVALID>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed
>> TSFastDbg
>>>>>> ?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zwoop@apache.org
>>> 
>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <
>> wkaras@verizonmedia.com
>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of
>> type
>>>>>>>> const
>>>>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer
>> to
>>>>>> the
>>>>>>>>>>>> whole
>>>>>>>>>>>>>> structure for updating.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> But I'm leaning more to fixing up the capability that already
>>>>>> kinda
>>>>>>>>>>>> there.
>>>>>>>>>>>>>> Which is optimizimg only for the case when all debug output is
>>>>>>>>>> disabled.
>>>>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o)
>> .
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> — leif
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> P.s
>>>>>>>>>>>>>> J/K!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>>>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This could enable a significant speed up for debug tags. One
>>>>>> point
>>>>>>>> is
>>>>>>>>>>>> that
>>>>>>>>>>>>>> when the debug tag string is set, at that point the debug
>> objects
>>>>>>>>>> could
>>>>>>>>>>>> be
>>>>>>>>>>>>>> updated to the correct state according to the debug tag
>> string,
>>>>>>>> rather
>>>>>>>>>>>> than
>>>>>>>>>>>>>> checking it every time a debug message is logged.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
>>>>>> instead of
>>>>>>>>>>>> just
>>>>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
>>>>>> checks
>>>>>>>>>>>> whether
>>>>>>>>>>>>>> the pointer is null.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <
>> zwoop@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <
>> wkaras@verizonmedia.com
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It looks like the TS API already has less granular version of
>>>>>> this:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> extern int diags_on_for_plugins;
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> #define TSDEBUG             \
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> if (diags_on_for_plugins) \
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TSDebug
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> A problem with this is if you write:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> if (flag)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TSDEBUG(yadayada);
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> else
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> do_something();
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> then the else will be associated with the if hidden in the
>> macro.
>>>>>>>> I'd
>>>>>>>>>>>>>> prefer to change to something like:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>>>>>>>>>> do { \
>>>>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> } while(0)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The negative is that we'd only see the benefit when all debug
>>>>>> output
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> was
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to
>> avoid
>>>>>>>>>> impacts
>>>>>>>>>>>>>> during normal operation.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Right, we should always optimize for the common cases,
>> obviously.
>>>>>>>> When
>>>>>>>>>>>>>> there’s a decision to make, where you can either make it
>>>>>> “equally”
>>>>>>>>>>>>>> expensive always, or more expensive in a rare case, but
>> cheaper
>>>>>> in
>>>>>>>> the
>>>>>>>>>>>>>> common case, we should likely chose the latter.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
>>>>>> flag
>>>>>>>> is
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> only
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable
>> TSFastDbg
>>>>>>>> using
>>>>>>>>>>>>>> traffic_cntl without adding complexity.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
>>>>>> wkaras@verizonmedia.com>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Add this new structure to apidefs.h:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> typedef struct TSFastDbgCntl_
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>> const char * const tag; // nul-terminated string
>>>>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> TSFastDbgCntl;
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Add this new API function, that returns a pointer to an
>> instance
>>>>>> of
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> above structure:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Add this macro, which would be used in place of direct use of
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TSDebug().
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>>>>>>>>>> do { \
>>>>>>>>>>>>>> if ((_FD_CNTL)->on) \
>>>>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>>>>>>>>>> } while(0)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
>>>>>> returned by
>>>>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>>>>>>>> parameters
>>>>>>>>>>>>>> (format string and values matching format specifiers).  The
>> core
>>>>>>>> would
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> be
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TSFastDbgCntl
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> disabled,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> the parameters have to be prepared and pushed onto the
>> stack.  As
>>>>>>>> was
>>>>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing
>> converting
>>>>>> a
>>>>>>>>>>>>>> string_view to a string and outputting it with c_str().
>> Because
>>>>>> the
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> associative
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> lookup on the tag string to determine that output for the tag
>> is
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the
>>>>>> core
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> debug output.  The implementation is non-trivial, but I think
>> it
>>>>>>>> would
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> take
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> at most a week including the Au test additions.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than
>> a
>>>>>>>> macro:
>>>>>>>>>>>>>> https://godbolt.org/z/IfVbBk
>>>>>>>>>>>>>> The opitimization works well if you compile the code with the
>>>>>> inline
>>>>>>>>>>>>>> function as C, but the optimization is poor if you compile it
>> as
>>>>>>>> C++.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>> 
>> 
>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
It looked like a simple fast change.  Are you saying you can see that
TSDEBUG would not start printing debug messages if you enabled the global
flag with traffic_cntl ?

An alternative would be to have the core functions that print the debug
messages check the global flag first before they call PCRE to check if the
specific tag is enabled.  That would improvement would have effect for both
core and plugin debug messages.  If wouldn't eliminate the parameter push
and function call overhead, but the testing indicates that is much less
significant.

On Mon, Jun 17, 2019 at 2:01 PM Bryan Call <bc...@apache.org> wrote:

> Walt,
>
> TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup
> and then just calls TSDebug().  It doesn’t make sense to replace TSDebug
> with TSDEBUG.
>
> I don’t see why you are trying to optimize just the plugin debug.  When
> debug is enabled it has a performance impact on the entire system, not just
> for plugins.
>
> -Bryan
>
>
> > On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote:
> >
> >
> >
> >> On Jun 14, 2019, at 5:18 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >>
> >> I got some crude performance numbers for TSDebug() and TSDEBUG()
> >> when proxy.config.diags.debug.enabled is 0.  The good news is that
> >> TSDebug() looks to take about 8 nanoseconds (the raw data is below).
> So,
> >> if we take 1000 requests per core per seconds as a ballpark estimate of
> TS
> >> performance, it would take a lot of them to have a significant impact.
> But
> >> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
> >> worthwhile to make the tiny effort to change TSDEBUG() so it won't
> swallow
> >> a following 'else' clause.  I also think we should rename it to
> TSFastDbg().
> >
> > The numbers sounds good, but why on earth keep the old Debug? Why would
> I chose to use a slow API over a fast API? That just makes no sense.
> >
> > if we do this, just change TSDebug().  If it’s incompatible, we document
> it with appropriate warnings.
> >
> > Does any of this affect / improve the core Debug() statements? Those are
> probably 10x - 100x more critical to get fast, because they are
> *everywhere*.
> >
> > — leif
> >
> >>
> >> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().
> But I
> >> think the very small benefit is outweighed by the potential risk.  TS
> seems
> >> to be rife with race conditions, so a change that will alter the length
> of
> >> lots of code paths could create yet more instability.
> >>
> >> Here is the raw data (6 tests, 3 repetitions).  Done with
> >> https://github.com/apache/trafficserver/pull/5638 .  I don't
> understand why
> >> it's not more consistent.  If anyone knows a better way to time this,
> >> please let me know.
> >>
> >> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >> microseconds user=7670000 system=20000
> >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >> microseconds user=260000 system=0
> >> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >> microseconds user=8030000 system=50000
> >> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >> microseconds user=280000 system=0
> >> Performance for TSDebug() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=8830000 system=30000
> >> Performance for TSDEBUG() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=270000 system=0
> >>
> >> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >> microseconds user=7520000 system=50000
> >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >> microseconds user=260000 system=0
> >> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >> microseconds user=8040000 system=30000
> >> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >> microseconds user=260000 system=0
> >> Performance for TSDebug() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=8450000 system=50000
> >> Performance for TSDEBUG() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=260000 system=0
> >>
> >> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> >> microseconds user=7470000 system=80000
> >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> >> microseconds user=260000 system=0
> >> Performance for TSDebug() with parameters, 1000000000 repetitions,
> >> microseconds user=8120000 system=50000
> >> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> >> microseconds user=270000 system=0
> >> Performance for TSDebug() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=8500000 system=60000
> >> Performance for TSDEBUG() with volatile parameters, 1000000000
> repetitions,
> >> microseconds user=270000 system=0
> >>
> >> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com>
> wrote:
> >>
> >>> Since I'm still relatively new to this application, there are limits to
> >>> what I can take on on my own initiative.  I don't see how anything I've
> >>> proposed would interfere with any higher-priority work.
> >>>
> >>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org>
> wrote:
> >>>
> >>>> Also, I’d like to reiterate my call to arms on v9.0.0:
> >>>>
> >>>>  Please, everyone lets focus on the things that are critical to ATS
> >>>> v9.0.0 readiness!
> >>>>
> >>>>
> >>>> This includes:
> >>>>
> >>>>       1) Fixing all the critical bugs (and there’s a lot of them)
> >>>>
> >>>>       2) Testing “master"
> >>>>
> >>>>       3) Doing the incompatible changes that needs to be done before
> we
> >>>> can release
> >>>>
> >>>>       4) Removing features that we do not want to support for the next
> >>>> 3+ years
> >>>>
> >>>>       5) Make sure that v9.0.0 has the functionality that you
> >>>> *absolutely* need to be able
> >>>>       to use it
> >>>>
> >>>>
> >>>> I kinda feel that several of these recent proposal do not fit into
> this,
> >>>> and in fact, are a fair amount of distraction. I urge everyone to not
> go
> >>>> down this route until we’ve got v9.0.0 stable and functional, and
> instead
> >>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for
> example fits
> >>>> into any of those 5 points above.
> >>>>
> >>>> My $.01.
> >>>>
> >>>> — Leif
> >>>>
> >>>> P.s
> >>>>  https://github.com/apache/trafficserver/issues/5457
> >>>>
> >>>>
> >>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >>>> wrote:
> >>>>>
> >>>>> I would guess the performance increase would be well under 1%.  But
> >>>> it's a
> >>>>> fairly simple change.  The biggest performance increase I think would
> >>>> come
> >>>>> from avoiding a PCRE match, to see if the specific debug tag was
> >>>> enabled.
> >>>>> For some TSDebug calls, there can be many arguments to push, and
> >>>> perhaps a
> >>>>> fair amount of calculation to generate the correct arguments.
> >>>>>
> >>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org>
> wrote:
> >>>>>
> >>>>>> I am surprised you are seeing this as a performance issue.  Looking
> at
> >>>> the
> >>>>>> code it looks like 1 function call and a couple pointer
> dereferences.
> >>>>>>
> >>>>>> If you wanted to speed up TSDebug and make the code harder to
> maintain
> >>>> you
> >>>>>> could create a structure that looks like like DiagsConfig in ts.h
> and
> >>>> then
> >>>>>> set a pointer used in the API to look at the gobal DiagsConfig.
> Then
> >>>> you
> >>>>>> could have a macro that would do basically what on() does.  This
> would
> >>>>>> bypass a function call and would have 1 more pointer dereference
> than
> >>>> your
> >>>>>> example.
> >>>>>>
> >>>>>> -Bryan
> >>>>>>
> >>>>>>
> >>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
> >>>> .INVALID>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> What do you mean by "exposing diags"?
> >>>>>>>
> >>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org>
> wrote:
> >>>>>>>
> >>>>>>>> I would rather see TSDebug be optimized and exposing diags
> instead of
> >>>>>>>> creating a new type of debug.
> >>>>>>>>
> >>>>>>>> -Bryan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
> >>>>>> .INVALID>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I don't believe so.  The issue I'm trying to address is that
> there
> >>>> is
> >>>>>>>> still
> >>>>>>>>> significant execution time overhead for debug output calls when
> >>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make
> the
> >>>>>>>>> execution time overhead negligible (if TSFastDbg is used or
> TSDebug
> >>>> is
> >>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of
> putting
> >>>> it
> >>>>>>>> is,
> >>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
> >>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> How would this be controlled on the command line when running
> >>>>>>>>>> trafic_server -T?
> >>>>>>>>>>
> >>>>>>>>>> It looks like it is a departure from how debug tags and
> controlled
> >>>>>> now.
> >>>>>>>>>>
> >>>>>>>>>> -Bryan
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <
> wkaras@verizonmedia.com
> >>>>>>>> .INVALID>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed
> TSFastDbg
> >>>> ?
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zwoop@apache.org
> >
> >>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <
> wkaras@verizonmedia.com
> >>>>>>>>>> .INVALID>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of
> type
> >>>>>> const
> >>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer
> to
> >>>> the
> >>>>>>>>>> whole
> >>>>>>>>>>>> structure for updating.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But I'm leaning more to fixing up the capability that already
> >>>> kinda
> >>>>>>>>>> there.
> >>>>>>>>>>>> Which is optimizimg only for the case when all debug output is
> >>>>>>>> disabled.
> >>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o)
> .
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
> >>>>>>>>>>>>
> >>>>>>>>>>>> — leif
> >>>>>>>>>>>>
> >>>>>>>>>>>> P.s
> >>>>>>>>>>>> J/K!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >>>>>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> This could enable a significant speed up for debug tags. One
> >>>> point
> >>>>>> is
> >>>>>>>>>> that
> >>>>>>>>>>>> when the debug tag string is set, at that point the debug
> objects
> >>>>>>>> could
> >>>>>>>>>> be
> >>>>>>>>>>>> updated to the correct state according to the debug tag
> string,
> >>>>>> rather
> >>>>>>>>>> than
> >>>>>>>>>>>> checking it every time a debug message is logged.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
> >>>> instead of
> >>>>>>>>>> just
> >>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
> >>>> checks
> >>>>>>>>>> whether
> >>>>>>>>>>>> the pointer is null.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <
> zwoop@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <
> wkaras@verizonmedia.com
> >>>>>>>>>>>>
> >>>>>>>>>>>> .INVALID>
> >>>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> It looks like the TS API already has less granular version of
> >>>> this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> extern int diags_on_for_plugins;
> >>>>>>>>>>>>
> >>>>>>>>>>>> #define TSDEBUG             \
> >>>>>>>>>>>>
> >>>>>>>>>>>> if (diags_on_for_plugins) \
> >>>>>>>>>>>>
> >>>>>>>>>>>> TSDebug
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> A problem with this is if you write:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> if (flag)
> >>>>>>>>>>>>
> >>>>>>>>>>>> TSDEBUG(yadayada);
> >>>>>>>>>>>>
> >>>>>>>>>>>> else
> >>>>>>>>>>>>
> >>>>>>>>>>>> do_something();
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> then the else will be associated with the if hidden in the
> macro.
> >>>>>> I'd
> >>>>>>>>>>>> prefer to change to something like:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
> >>>>>>>>>>>> do { \
> >>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
> >>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
> >>>>>>>>>>>>
> >>>>>>>>>>>> } while(0)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> The negative is that we'd only see the benefit when all debug
> >>>> output
> >>>>>>>>>>>>
> >>>>>>>>>>>> was
> >>>>>>>>>>>>
> >>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to
> avoid
> >>>>>>>> impacts
> >>>>>>>>>>>> during normal operation.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Right, we should always optimize for the common cases,
> obviously.
> >>>>>> When
> >>>>>>>>>>>> there’s a decision to make, where you can either make it
> >>>> “equally”
> >>>>>>>>>>>> expensive always, or more expensive in a rare case, but
> cheaper
> >>>> in
> >>>>>> the
> >>>>>>>>>>>> common case, we should likely chose the latter.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
> >>>> flag
> >>>>>> is
> >>>>>>>>>>>>
> >>>>>>>>>>>> only
> >>>>>>>>>>>>
> >>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable
> TSFastDbg
> >>>>>> using
> >>>>>>>>>>>> traffic_cntl without adding complexity.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
> >>>> wkaras@verizonmedia.com>
> >>>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Add this new structure to apidefs.h:
> >>>>>>>>>>>>
> >>>>>>>>>>>> typedef struct TSFastDbgCntl_
> >>>>>>>>>>>> {
> >>>>>>>>>>>> const char * const tag; // nul-terminated string
> >>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
> >>>>>>>>>>>> }
> >>>>>>>>>>>> TSFastDbgCntl;
> >>>>>>>>>>>>
> >>>>>>>>>>>> Add this new API function, that returns a pointer to an
> instance
> >>>> of
> >>>>>>>>>>>>
> >>>>>>>>>>>> the
> >>>>>>>>>>>>
> >>>>>>>>>>>> above structure:
> >>>>>>>>>>>>
> >>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>>>>>>>>>>>
> >>>>>>>>>>>> Add this macro, which would be used in place of direct use of
> >>>>>>>>>>>>
> >>>>>>>>>>>> TSDebug().
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
> >>>>>>>>>>>> do { \
> >>>>>>>>>>>> if ((_FD_CNTL)->on) \
> >>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >>>>>>>>>>>> } while(0)
> >>>>>>>>>>>>
> >>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
> >>>> returned by
> >>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
> >>>>>> parameters
> >>>>>>>>>>>> (format string and values matching format specifiers).  The
> core
> >>>>>> would
> >>>>>>>>>>>>
> >>>>>>>>>>>> be
> >>>>>>>>>>>>
> >>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
> >>>>>>>>>>>>
> >>>>>>>>>>>> TSFastDbgCntl
> >>>>>>>>>>>>
> >>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
> >>>>>>>>>>>>
> >>>>>>>>>>>> disabled,
> >>>>>>>>>>>>
> >>>>>>>>>>>> all
> >>>>>>>>>>>>
> >>>>>>>>>>>> the parameters have to be prepared and pushed onto the
> stack.  As
> >>>>>> was
> >>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing
> converting
> >>>> a
> >>>>>>>>>>>> string_view to a string and outputting it with c_str().
> Because
> >>>> the
> >>>>>>>>>>>>
> >>>>>>>>>>>> code
> >>>>>>>>>>>>
> >>>>>>>>>>>> to do this will be executed even if the debug tag is disabled.
> >>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
> >>>>>>>>>>>>
> >>>>>>>>>>>> associative
> >>>>>>>>>>>>
> >>>>>>>>>>>> lookup on the tag string to determine that output for the tag
> is
> >>>>>>>>>>>>
> >>>>>>>>>>>> disabled.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the
> >>>> core
> >>>>>>>>>>>>
> >>>>>>>>>>>> for
> >>>>>>>>>>>>
> >>>>>>>>>>>> debug output.  The implementation is non-trivial, but I think
> it
> >>>>>> would
> >>>>>>>>>>>>
> >>>>>>>>>>>> take
> >>>>>>>>>>>>
> >>>>>>>>>>>> at most a week including the Au test additions.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than
> a
> >>>>>> macro:
> >>>>>>>>>>>> https://godbolt.org/z/IfVbBk
> >>>>>>>>>>>> The opitimization works well if you compile the code with the
> >>>> inline
> >>>>>>>>>>>> function as C, but the optimization is poor if you compile it
> as
> >>>>>> C++.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This
> >>>>>>>>>>>>
> >>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Bryan Call <bc...@apache.org>.
Walt,

TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup and then just calls TSDebug().  It doesn’t make sense to replace TSDebug with TSDEBUG.

I don’t see why you are trying to optimize just the plugin debug.  When debug is enabled it has a performance impact on the entire system, not just for plugins.

-Bryan


> On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote:
> 
> 
> 
>> On Jun 14, 2019, at 5:18 PM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
>> 
>> I got some crude performance numbers for TSDebug() and TSDEBUG()
>> when proxy.config.diags.debug.enabled is 0.  The good news is that
>> TSDebug() looks to take about 8 nanoseconds (the raw data is below).  So,
>> if we take 1000 requests per core per seconds as a ballpark estimate of TS
>> performance, it would take a lot of them to have a significant impact.  But
>> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
>> worthwhile to make the tiny effort to change TSDEBUG() so it won't swallow
>> a following 'else' clause.  I also think we should rename it to TSFastDbg().
> 
> The numbers sounds good, but why on earth keep the old Debug? Why would I chose to use a slow API over a fast API? That just makes no sense.
> 
> if we do this, just change TSDebug().  If it’s incompatible, we document it with appropriate warnings.
> 
> Does any of this affect / improve the core Debug() statements? Those are probably 10x - 100x more critical to get fast, because they are *everywhere*.
> 
> — leif
> 
>> 
>> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().  But I
>> think the very small benefit is outweighed by the potential risk.  TS seems
>> to be rife with race conditions, so a change that will alter the length of
>> lots of code paths could create yet more instability.
>> 
>> Here is the raw data (6 tests, 3 repetitions).  Done with
>> https://github.com/apache/trafficserver/pull/5638 .  I don't understand why
>> it's not more consistent.  If anyone knows a better way to time this,
>> please let me know.
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7670000 system=20000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8030000 system=50000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=280000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8830000 system=30000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7520000 system=50000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8040000 system=30000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8450000 system=50000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7470000 system=80000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8120000 system=50000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8500000 system=60000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> 
>> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com> wrote:
>> 
>>> Since I'm still relatively new to this application, there are limits to
>>> what I can take on on my own initiative.  I don't see how anything I've
>>> proposed would interfere with any higher-priority work.
>>> 
>>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote:
>>> 
>>>> Also, I’d like to reiterate my call to arms on v9.0.0:
>>>> 
>>>>  Please, everyone lets focus on the things that are critical to ATS
>>>> v9.0.0 readiness!
>>>> 
>>>> 
>>>> This includes:
>>>> 
>>>>       1) Fixing all the critical bugs (and there’s a lot of them)
>>>> 
>>>>       2) Testing “master"
>>>> 
>>>>       3) Doing the incompatible changes that needs to be done before we
>>>> can release
>>>> 
>>>>       4) Removing features that we do not want to support for the next
>>>> 3+ years
>>>> 
>>>>       5) Make sure that v9.0.0 has the functionality that you
>>>> *absolutely* need to be able
>>>>       to use it
>>>> 
>>>> 
>>>> I kinda feel that several of these recent proposal do not fit into this,
>>>> and in fact, are a fair amount of distraction. I urge everyone to not go
>>>> down this route until we’ve got v9.0.0 stable and functional, and instead
>>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits
>>>> into any of those 5 points above.
>>>> 
>>>> My $.01.
>>>> 
>>>> — Leif
>>>> 
>>>> P.s
>>>>  https://github.com/apache/trafficserver/issues/5457
>>>> 
>>>> 
>>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>>> wrote:
>>>>> 
>>>>> I would guess the performance increase would be well under 1%.  But
>>>> it's a
>>>>> fairly simple change.  The biggest performance increase I think would
>>>> come
>>>>> from avoiding a PCRE match, to see if the specific debug tag was
>>>> enabled.
>>>>> For some TSDebug calls, there can be many arguments to push, and
>>>> perhaps a
>>>>> fair amount of calculation to generate the correct arguments.
>>>>> 
>>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
>>>>> 
>>>>>> I am surprised you are seeing this as a performance issue.  Looking at
>>>> the
>>>>>> code it looks like 1 function call and a couple pointer dereferences.
>>>>>> 
>>>>>> If you wanted to speed up TSDebug and make the code harder to maintain
>>>> you
>>>>>> could create a structure that looks like like DiagsConfig in ts.h and
>>>> then
>>>>>> set a pointer used in the API to look at the gobal DiagsConfig.  Then
>>>> you
>>>>>> could have a macro that would do basically what on() does.  This would
>>>>>> bypass a function call and would have 1 more pointer dereference than
>>>> your
>>>>>> example.
>>>>>> 
>>>>>> -Bryan
>>>>>> 
>>>>>> 
>>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
>>>> .INVALID>
>>>>>> wrote:
>>>>>>> 
>>>>>>> What do you mean by "exposing diags"?
>>>>>>> 
>>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
>>>>>>> 
>>>>>>>> I would rather see TSDebug be optimized and exposing diags instead of
>>>>>>>> creating a new type of debug.
>>>>>>>> 
>>>>>>>> -Bryan
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>> .INVALID>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> I don't believe so.  The issue I'm trying to address is that there
>>>> is
>>>>>>>> still
>>>>>>>>> significant execution time overhead for debug output calls when
>>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>>>>>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug
>>>> is
>>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of putting
>>>> it
>>>>>>>> is,
>>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> How would this be controlled on the command line when running
>>>>>>>>>> trafic_server -T?
>>>>>>>>>> 
>>>>>>>>>> It looks like it is a departure from how debug tags and controlled
>>>>>> now.
>>>>>>>>>> 
>>>>>>>>>> -Bryan
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>> .INVALID>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg
>>>> ?
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>>>> .INVALID>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of type
>>>>>> const
>>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to
>>>> the
>>>>>>>>>> whole
>>>>>>>>>>>> structure for updating.
>>>>>>>>>>>> 
>>>>>>>>>>>> But I'm leaning more to fixing up the capability that already
>>>> kinda
>>>>>>>>>> there.
>>>>>>>>>>>> Which is optimizimg only for the case when all debug output is
>>>>>>>> disabled.
>>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>>>>>>>> 
>>>>>>>>>>>> — leif
>>>>>>>>>>>> 
>>>>>>>>>>>> P.s
>>>>>>>>>>>> J/K!
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> This could enable a significant speed up for debug tags. One
>>>> point
>>>>>> is
>>>>>>>>>> that
>>>>>>>>>>>> when the debug tag string is set, at that point the debug objects
>>>>>>>> could
>>>>>>>>>> be
>>>>>>>>>>>> updated to the correct state according to the debug tag string,
>>>>>> rather
>>>>>>>>>> than
>>>>>>>>>>>> checking it every time a debug message is logged.
>>>>>>>>>>>> 
>>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
>>>> instead of
>>>>>>>>>> just
>>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
>>>> checks
>>>>>>>>>> whether
>>>>>>>>>>>> the pointer is null.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>>>>>> 
>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> It looks like the TS API already has less granular version of
>>>> this:
>>>>>>>>>>>> 
>>>>>>>>>>>> extern int diags_on_for_plugins;
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSDEBUG             \
>>>>>>>>>>>> 
>>>>>>>>>>>> if (diags_on_for_plugins) \
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDebug
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> A problem with this is if you write:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> if (flag)
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDEBUG(yadayada);
>>>>>>>>>>>> 
>>>>>>>>>>>> else
>>>>>>>>>>>> 
>>>>>>>>>>>> do_something();
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> then the else will be associated with the if hidden in the macro.
>>>>>> I'd
>>>>>>>>>>>> prefer to change to something like:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>>>>>>>> do { \
>>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>>>>>>>> 
>>>>>>>>>>>> } while(0)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> The negative is that we'd only see the benefit when all debug
>>>> output
>>>>>>>>>>>> 
>>>>>>>>>>>> was
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>>>>>>>> impacts
>>>>>>>>>>>> during normal operation.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Right, we should always optimize for the common cases, obviously.
>>>>>> When
>>>>>>>>>>>> there’s a decision to make, where you can either make it
>>>> “equally”
>>>>>>>>>>>> expensive always, or more expensive in a rare case, but cheaper
>>>> in
>>>>>> the
>>>>>>>>>>>> common case, we should likely chose the latter.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
>>>> flag
>>>>>> is
>>>>>>>>>>>> 
>>>>>>>>>>>> only
>>>>>>>>>>>> 
>>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
>>>>>> using
>>>>>>>>>>>> traffic_cntl without adding complexity.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
>>>> wkaras@verizonmedia.com>
>>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this new structure to apidefs.h:
>>>>>>>>>>>> 
>>>>>>>>>>>> typedef struct TSFastDbgCntl_
>>>>>>>>>>>> {
>>>>>>>>>>>> const char * const tag; // nul-terminated string
>>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>>>>>>>> }
>>>>>>>>>>>> TSFastDbgCntl;
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this new API function, that returns a pointer to an instance
>>>> of
>>>>>>>>>>>> 
>>>>>>>>>>>> the
>>>>>>>>>>>> 
>>>>>>>>>>>> above structure:
>>>>>>>>>>>> 
>>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this macro, which would be used in place of direct use of
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDebug().
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>>>>>>>> do { \
>>>>>>>>>>>> if ((_FD_CNTL)->on) \
>>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>>>>>>>> } while(0)
>>>>>>>>>>>> 
>>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
>>>> returned by
>>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>>>>>> parameters
>>>>>>>>>>>> (format string and values matching format specifiers).  The core
>>>>>> would
>>>>>>>>>>>> 
>>>>>>>>>>>> be
>>>>>>>>>>>> 
>>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>>>>>>>> 
>>>>>>>>>>>> TSFastDbgCntl
>>>>>>>>>>>> 
>>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled,
>>>>>>>>>>>> 
>>>>>>>>>>>> all
>>>>>>>>>>>> 
>>>>>>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
>>>>>> was
>>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing converting
>>>> a
>>>>>>>>>>>> string_view to a string and outputting it with c_str().  Because
>>>> the
>>>>>>>>>>>> 
>>>>>>>>>>>> code
>>>>>>>>>>>> 
>>>>>>>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>>>>>>>> 
>>>>>>>>>>>> associative
>>>>>>>>>>>> 
>>>>>>>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the
>>>> core
>>>>>>>>>>>> 
>>>>>>>>>>>> for
>>>>>>>>>>>> 
>>>>>>>>>>>> debug output.  The implementation is non-trivial, but I think it
>>>>>> would
>>>>>>>>>>>> 
>>>>>>>>>>>> take
>>>>>>>>>>>> 
>>>>>>>>>>>> at most a week including the Au test additions.
>>>>>>>>>>>> 
>>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than a
>>>>>> macro:
>>>>>>>>>>>> https://godbolt.org/z/IfVbBk
>>>>>>>>>>>> The opitimization works well if you compile the code with the
>>>> inline
>>>>>>>>>>>> function as C, but the optimization is poor if you compile it as
>>>>>> C++.
>>>>>>>>>>>> 
>>>>>>>>>>>> This
>>>>>>>>>>>> 
>>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Leif Hedstrom <zw...@apache.org>.

> On Jun 14, 2019, at 5:18 PM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> I got some crude performance numbers for TSDebug() and TSDEBUG()
> when proxy.config.diags.debug.enabled is 0.  The good news is that
> TSDebug() looks to take about 8 nanoseconds (the raw data is below).  So,
> if we take 1000 requests per core per seconds as a ballpark estimate of TS
> performance, it would take a lot of them to have a significant impact.  But
> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
> worthwhile to make the tiny effort to change TSDEBUG() so it won't swallow
> a following 'else' clause.  I also think we should rename it to TSFastDbg().

The numbers sounds good, but why on earth keep the old Debug? Why would I chose to use a slow API over a fast API? That just makes no sense.

if we do this, just change TSDebug().  If it’s incompatible, we document it with appropriate warnings.

Does any of this affect / improve the core Debug() statements? Those are probably 10x - 100x more critical to get fast, because they are *everywhere*.

— leif

> 
> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().  But I
> think the very small benefit is outweighed by the potential risk.  TS seems
> to be rife with race conditions, so a change that will alter the length of
> lots of code paths could create yet more instability.
> 
> Here is the raw data (6 tests, 3 repetitions).  Done with
> https://github.com/apache/trafficserver/pull/5638 .  I don't understand why
> it's not more consistent.  If anyone knows a better way to time this,
> please let me know.
> 
> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> microseconds user=7670000 system=20000
> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> microseconds user=260000 system=0
> Performance for TSDebug() with parameters, 1000000000 repetitions,
> microseconds user=8030000 system=50000
> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> microseconds user=280000 system=0
> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
> microseconds user=8830000 system=30000
> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
> microseconds user=270000 system=0
> 
> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> microseconds user=7520000 system=50000
> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> microseconds user=260000 system=0
> Performance for TSDebug() with parameters, 1000000000 repetitions,
> microseconds user=8040000 system=30000
> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> microseconds user=260000 system=0
> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
> microseconds user=8450000 system=50000
> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
> microseconds user=260000 system=0
> 
> Performance for TSDebug() with no parameters, 1000000000 repetitions,
> microseconds user=7470000 system=80000
> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
> microseconds user=260000 system=0
> Performance for TSDebug() with parameters, 1000000000 repetitions,
> microseconds user=8120000 system=50000
> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
> microseconds user=270000 system=0
> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
> microseconds user=8500000 system=60000
> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
> microseconds user=270000 system=0
> 
> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com> wrote:
> 
>> Since I'm still relatively new to this application, there are limits to
>> what I can take on on my own initiative.  I don't see how anything I've
>> proposed would interfere with any higher-priority work.
>> 
>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote:
>> 
>>> Also, I’d like to reiterate my call to arms on v9.0.0:
>>> 
>>>   Please, everyone lets focus on the things that are critical to ATS
>>> v9.0.0 readiness!
>>> 
>>> 
>>> This includes:
>>> 
>>>        1) Fixing all the critical bugs (and there’s a lot of them)
>>> 
>>>        2) Testing “master"
>>> 
>>>        3) Doing the incompatible changes that needs to be done before we
>>> can release
>>> 
>>>        4) Removing features that we do not want to support for the next
>>> 3+ years
>>> 
>>>        5) Make sure that v9.0.0 has the functionality that you
>>> *absolutely* need to be able
>>>        to use it
>>> 
>>> 
>>> I kinda feel that several of these recent proposal do not fit into this,
>>> and in fact, are a fair amount of distraction. I urge everyone to not go
>>> down this route until we’ve got v9.0.0 stable and functional, and instead
>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits
>>> into any of those 5 points above.
>>> 
>>> My $.01.
>>> 
>>> — Leif
>>> 
>>> P.s
>>>   https://github.com/apache/trafficserver/issues/5457
>>> 
>>> 
>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>> wrote:
>>>> 
>>>> I would guess the performance increase would be well under 1%.  But
>>> it's a
>>>> fairly simple change.  The biggest performance increase I think would
>>> come
>>>> from avoiding a PCRE match, to see if the specific debug tag was
>>> enabled.
>>>> For some TSDebug calls, there can be many arguments to push, and
>>> perhaps a
>>>> fair amount of calculation to generate the correct arguments.
>>>> 
>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
>>>> 
>>>>> I am surprised you are seeing this as a performance issue.  Looking at
>>> the
>>>>> code it looks like 1 function call and a couple pointer dereferences.
>>>>> 
>>>>> If you wanted to speed up TSDebug and make the code harder to maintain
>>> you
>>>>> could create a structure that looks like like DiagsConfig in ts.h and
>>> then
>>>>> set a pointer used in the API to look at the gobal DiagsConfig.  Then
>>> you
>>>>> could have a macro that would do basically what on() does.  This would
>>>>> bypass a function call and would have 1 more pointer dereference than
>>> your
>>>>> example.
>>>>> 
>>>>> -Bryan
>>>>> 
>>>>> 
>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
>>> .INVALID>
>>>>> wrote:
>>>>>> 
>>>>>> What do you mean by "exposing diags"?
>>>>>> 
>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
>>>>>> 
>>>>>>> I would rather see TSDebug be optimized and exposing diags instead of
>>>>>>> creating a new type of debug.
>>>>>>> 
>>>>>>> -Bryan
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
>>>>> .INVALID>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> I don't believe so.  The issue I'm trying to address is that there
>>> is
>>>>>>> still
>>>>>>>> significant execution time overhead for debug output calls when
>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>>>>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug
>>> is
>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of putting
>>> it
>>>>>>> is,
>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>>>>>>> 
>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
>>> wrote:
>>>>>>>> 
>>>>>>>>> How would this be controlled on the command line when running
>>>>>>>>> trafic_server -T?
>>>>>>>>> 
>>>>>>>>> It looks like it is a departure from how debug tags and controlled
>>>>> now.
>>>>>>>>> 
>>>>>>>>> -Bryan
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>>>>>>> .INVALID>
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg
>>> ?
>>>>>>>>>> 
>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>>> .INVALID>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of type
>>>>> const
>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to
>>> the
>>>>>>>>> whole
>>>>>>>>>>> structure for updating.
>>>>>>>>>>> 
>>>>>>>>>>> But I'm leaning more to fixing up the capability that already
>>> kinda
>>>>>>>>> there.
>>>>>>>>>>> Which is optimizimg only for the case when all debug output is
>>>>>>> disabled.
>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>>>>>>> 
>>>>>>>>>>> — leif
>>>>>>>>>>> 
>>>>>>>>>>> P.s
>>>>>>>>>>> J/K!
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> This could enable a significant speed up for debug tags. One
>>> point
>>>>> is
>>>>>>>>> that
>>>>>>>>>>> when the debug tag string is set, at that point the debug objects
>>>>>>> could
>>>>>>>>> be
>>>>>>>>>>> updated to the correct state according to the debug tag string,
>>>>> rather
>>>>>>>>> than
>>>>>>>>>>> checking it every time a debug message is logged.
>>>>>>>>>>> 
>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
>>> instead of
>>>>>>>>> just
>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
>>> checks
>>>>>>>>> whether
>>>>>>>>>>> the pointer is null.
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>>>>> 
>>>>>>>>>>> .INVALID>
>>>>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> It looks like the TS API already has less granular version of
>>> this:
>>>>>>>>>>> 
>>>>>>>>>>> extern int diags_on_for_plugins;
>>>>>>>>>>> 
>>>>>>>>>>> #define TSDEBUG             \
>>>>>>>>>>> 
>>>>>>>>>>> if (diags_on_for_plugins) \
>>>>>>>>>>> 
>>>>>>>>>>> TSDebug
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> A problem with this is if you write:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> if (flag)
>>>>>>>>>>> 
>>>>>>>>>>> TSDEBUG(yadayada);
>>>>>>>>>>> 
>>>>>>>>>>> else
>>>>>>>>>>> 
>>>>>>>>>>> do_something();
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> then the else will be associated with the if hidden in the macro.
>>>>> I'd
>>>>>>>>>>> prefer to change to something like:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>>>>>>> do { \
>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>>>>>>> 
>>>>>>>>>>> } while(0)
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> The negative is that we'd only see the benefit when all debug
>>> output
>>>>>>>>>>> 
>>>>>>>>>>> was
>>>>>>>>>>> 
>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>>>>>>> impacts
>>>>>>>>>>> during normal operation.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Right, we should always optimize for the common cases, obviously.
>>>>> When
>>>>>>>>>>> there’s a decision to make, where you can either make it
>>> “equally”
>>>>>>>>>>> expensive always, or more expensive in a rare case, but cheaper
>>> in
>>>>> the
>>>>>>>>>>> common case, we should likely chose the latter.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
>>> flag
>>>>> is
>>>>>>>>>>> 
>>>>>>>>>>> only
>>>>>>>>>>> 
>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
>>>>> using
>>>>>>>>>>> traffic_cntl without adding complexity.
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
>>> wkaras@verizonmedia.com>
>>>>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Add this new structure to apidefs.h:
>>>>>>>>>>> 
>>>>>>>>>>> typedef struct TSFastDbgCntl_
>>>>>>>>>>> {
>>>>>>>>>>> const char * const tag; // nul-terminated string
>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>>>>>>> }
>>>>>>>>>>> TSFastDbgCntl;
>>>>>>>>>>> 
>>>>>>>>>>> Add this new API function, that returns a pointer to an instance
>>> of
>>>>>>>>>>> 
>>>>>>>>>>> the
>>>>>>>>>>> 
>>>>>>>>>>> above structure:
>>>>>>>>>>> 
>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>>>>>>> 
>>>>>>>>>>> Add this macro, which would be used in place of direct use of
>>>>>>>>>>> 
>>>>>>>>>>> TSDebug().
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>>>>>>> do { \
>>>>>>>>>>> if ((_FD_CNTL)->on) \
>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>>>>>>> } while(0)
>>>>>>>>>>> 
>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
>>> returned by
>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>>>>> parameters
>>>>>>>>>>> (format string and values matching format specifiers).  The core
>>>>> would
>>>>>>>>>>> 
>>>>>>>>>>> be
>>>>>>>>>>> 
>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>>>>>>> 
>>>>>>>>>>> TSFastDbgCntl
>>>>>>>>>>> 
>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>>>>>>> 
>>>>>>>>>>> disabled,
>>>>>>>>>>> 
>>>>>>>>>>> all
>>>>>>>>>>> 
>>>>>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
>>>>> was
>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing converting
>>> a
>>>>>>>>>>> string_view to a string and outputting it with c_str().  Because
>>> the
>>>>>>>>>>> 
>>>>>>>>>>> code
>>>>>>>>>>> 
>>>>>>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>>>>>>> 
>>>>>>>>>>> associative
>>>>>>>>>>> 
>>>>>>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>>>>>>> 
>>>>>>>>>>> disabled.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the
>>> core
>>>>>>>>>>> 
>>>>>>>>>>> for
>>>>>>>>>>> 
>>>>>>>>>>> debug output.  The implementation is non-trivial, but I think it
>>>>> would
>>>>>>>>>>> 
>>>>>>>>>>> take
>>>>>>>>>>> 
>>>>>>>>>>> at most a week including the Au test additions.
>>>>>>>>>>> 
>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than a
>>>>> macro:
>>>>>>>>>>> https://godbolt.org/z/IfVbBk
>>>>>>>>>>> The opitimization works well if you compile the code with the
>>> inline
>>>>>>>>>>> function as C, but the optimization is poor if you compile it as
>>>>> C++.
>>>>>>>>>>> 
>>>>>>>>>>> This
>>>>>>>>>>> 
>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
I got some crude performance numbers for TSDebug() and TSDEBUG()
when proxy.config.diags.debug.enabled is 0.  The good news is that
TSDebug() looks to take about 8 nanoseconds (the raw data is below).  So,
if we take 1000 requests per core per seconds as a ballpark estimate of TS
performance, it would take a lot of them to have a significant impact.  But
TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
worthwhile to make the tiny effort to change TSDEBUG() so it won't swallow
a following 'else' clause.  I also think we should rename it to TSFastDbg().

We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().  But I
think the very small benefit is outweighed by the potential risk.  TS seems
to be rife with race conditions, so a change that will alter the length of
lots of code paths could create yet more instability.

Here is the raw data (6 tests, 3 repetitions).  Done with
https://github.com/apache/trafficserver/pull/5638 .  I don't understand why
it's not more consistent.  If anyone knows a better way to time this,
please let me know.

Performance for TSDebug() with no parameters, 1000000000 repetitions,
microseconds user=7670000 system=20000
Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
microseconds user=260000 system=0
Performance for TSDebug() with parameters, 1000000000 repetitions,
microseconds user=8030000 system=50000
Performance for TSDEBUG() with parameters, 1000000000 repetitions,
microseconds user=280000 system=0
Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
microseconds user=8830000 system=30000
Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
microseconds user=270000 system=0

Performance for TSDebug() with no parameters, 1000000000 repetitions,
microseconds user=7520000 system=50000
Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
microseconds user=260000 system=0
Performance for TSDebug() with parameters, 1000000000 repetitions,
microseconds user=8040000 system=30000
Performance for TSDEBUG() with parameters, 1000000000 repetitions,
microseconds user=260000 system=0
Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
microseconds user=8450000 system=50000
Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
microseconds user=260000 system=0

Performance for TSDebug() with no parameters, 1000000000 repetitions,
microseconds user=7470000 system=80000
Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
microseconds user=260000 system=0
Performance for TSDebug() with parameters, 1000000000 repetitions,
microseconds user=8120000 system=50000
Performance for TSDEBUG() with parameters, 1000000000 repetitions,
microseconds user=270000 system=0
Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
microseconds user=8500000 system=60000
Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
microseconds user=270000 system=0

On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wk...@verizonmedia.com> wrote:

> Since I'm still relatively new to this application, there are limits to
> what I can take on on my own initiative.  I don't see how anything I've
> proposed would interfere with any higher-priority work.
>
> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote:
>
>> Also, I’d like to reiterate my call to arms on v9.0.0:
>>
>>    Please, everyone lets focus on the things that are critical to ATS
>> v9.0.0 readiness!
>>
>>
>> This includes:
>>
>>         1) Fixing all the critical bugs (and there’s a lot of them)
>>
>>         2) Testing “master"
>>
>>         3) Doing the incompatible changes that needs to be done before we
>> can release
>>
>>         4) Removing features that we do not want to support for the next
>> 3+ years
>>
>>         5) Make sure that v9.0.0 has the functionality that you
>> *absolutely* need to be able
>>         to use it
>>
>>
>> I kinda feel that several of these recent proposal do not fit into this,
>> and in fact, are a fair amount of distraction. I urge everyone to not go
>> down this route until we’ve got v9.0.0 stable and functional, and instead
>> focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits
>> into any of those 5 points above.
>>
>> My $.01.
>>
>> — Leif
>>
>> P.s
>>    https://github.com/apache/trafficserver/issues/5457
>>
>>
>> > On Jun 12, 2019, at 11:30 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>> >
>> > I would guess the performance increase would be well under 1%.  But
>> it's a
>> > fairly simple change.  The biggest performance increase I think would
>> come
>> > from avoiding a PCRE match, to see if the specific debug tag was
>> enabled.
>> > For some TSDebug calls, there can be many arguments to push, and
>> perhaps a
>> > fair amount of calculation to generate the correct arguments.
>> >
>> > On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
>> >
>> >> I am surprised you are seeing this as a performance issue.  Looking at
>> the
>> >> code it looks like 1 function call and a couple pointer dereferences.
>> >>
>> >> If you wanted to speed up TSDebug and make the code harder to maintain
>> you
>> >> could create a structure that looks like like DiagsConfig in ts.h and
>> then
>> >> set a pointer used in the API to look at the gobal DiagsConfig.  Then
>> you
>> >> could have a macro that would do basically what on() does.  This would
>> >> bypass a function call and would have 1 more pointer dereference than
>> your
>> >> example.
>> >>
>> >> -Bryan
>> >>
>> >>
>> >>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>> >> wrote:
>> >>>
>> >>> What do you mean by "exposing diags"?
>> >>>
>> >>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
>> >>>
>> >>>> I would rather see TSDebug be optimized and exposing diags instead of
>> >>>> creating a new type of debug.
>> >>>>
>> >>>> -Bryan
>> >>>>
>> >>>>
>> >>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
>> >> .INVALID>
>> >>>> wrote:
>> >>>>>
>> >>>>> I don't believe so.  The issue I'm trying to address is that there
>> is
>> >>>> still
>> >>>>> significant execution time overhead for debug output calls when
>> >>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>> >>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug
>> is
>> >>>>> changed to be like the proposed TSFastDbg).  Another way of putting
>> it
>> >>>> is,
>> >>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>> >>>>>
>> >>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
>> wrote:
>> >>>>>
>> >>>>>> How would this be controlled on the command line when running
>> >>>>>> trafic_server -T?
>> >>>>>>
>> >>>>>> It looks like it is a departure from how debug tags and controlled
>> >> now.
>> >>>>>>
>> >>>>>> -Bryan
>> >>>>>>
>> >>>>>>
>> >>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>> >>>> .INVALID>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg
>> ?
>> >>>>>>>
>> >>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
>> >> wrote:
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>> >>>>>> .INVALID>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> Because senility. probably "on" should be a non-pointer of type
>> >> const
>> >>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to
>> the
>> >>>>>> whole
>> >>>>>>>> structure for updating.
>> >>>>>>>>
>> >>>>>>>> But I'm leaning more to fixing up the capability that already
>> kinda
>> >>>>>> there.
>> >>>>>>>> Which is optimizimg only for the case when all debug output is
>> >>>> disabled.
>> >>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>> >>>>>>>>
>> >>>>>>>> — leif
>> >>>>>>>>
>> >>>>>>>> P.s
>> >>>>>>>> J/K!
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>> >>>>>>>> <so...@verizonmedia.com.invalid> wrote:
>> >>>>>>>>
>> >>>>>>>> This could enable a significant speed up for debug tags. One
>> point
>> >> is
>> >>>>>> that
>> >>>>>>>> when the debug tag string is set, at that point the debug objects
>> >>>> could
>> >>>>>> be
>> >>>>>>>> updated to the correct state according to the debug tag string,
>> >> rather
>> >>>>>> than
>> >>>>>>>> checking it every time a debug message is logged.
>> >>>>>>>>
>> >>>>>>>> I have to ask, why is there a pointer to the enable flag,
>> instead of
>> >>>>>> just
>> >>>>>>>> the enable flag? Your logic doesn't even check that flag, it
>> checks
>> >>>>>> whether
>> >>>>>>>> the pointer is null.
>> >>>>>>>>
>> >>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>> >>>> wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>> >>>>>>>>
>> >>>>>>>> .INVALID>
>> >>>>>>>>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> It looks like the TS API already has less granular version of
>> this:
>> >>>>>>>>
>> >>>>>>>> extern int diags_on_for_plugins;
>> >>>>>>>>
>> >>>>>>>> #define TSDEBUG             \
>> >>>>>>>>
>> >>>>>>>> if (diags_on_for_plugins) \
>> >>>>>>>>
>> >>>>>>>> TSDebug
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> A problem with this is if you write:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> if (flag)
>> >>>>>>>>
>> >>>>>>>> TSDEBUG(yadayada);
>> >>>>>>>>
>> >>>>>>>> else
>> >>>>>>>>
>> >>>>>>>> do_something();
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> then the else will be associated with the if hidden in the macro.
>> >> I'd
>> >>>>>>>> prefer to change to something like:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> #define TSFastDbg(_TAG, ...) \
>> >>>>>>>> do { \
>> >>>>>>>> if (TS_detail_dbg_output_enabled) \
>> >>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>> >>>>>>>>
>> >>>>>>>> } while(0)
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> The negative is that we'd only see the benefit when all debug
>> output
>> >>>>>>>>
>> >>>>>>>> was
>> >>>>>>>>
>> >>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>> >>>> impacts
>> >>>>>>>> during normal operation.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Right, we should always optimize for the common cases, obviously.
>> >> When
>> >>>>>>>> there’s a decision to make, where you can either make it
>> “equally”
>> >>>>>>>> expensive always, or more expensive in a rare case, but cheaper
>> in
>> >> the
>> >>>>>>>> common case, we should likely chose the latter.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
>> flag
>> >> is
>> >>>>>>>>
>> >>>>>>>> only
>> >>>>>>>>
>> >>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
>> >> using
>> >>>>>>>> traffic_cntl without adding complexity.
>> >>>>>>>>
>> >>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
>> wkaras@verizonmedia.com>
>> >>>>>>>>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Add this new structure to apidefs.h:
>> >>>>>>>>
>> >>>>>>>> typedef struct TSFastDbgCntl_
>> >>>>>>>> {
>> >>>>>>>> const char * const tag; // nul-terminated string
>> >>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>> >>>>>>>> }
>> >>>>>>>> TSFastDbgCntl;
>> >>>>>>>>
>> >>>>>>>> Add this new API function, that returns a pointer to an instance
>> of
>> >>>>>>>>
>> >>>>>>>> the
>> >>>>>>>>
>> >>>>>>>> above structure:
>> >>>>>>>>
>> >>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>> >>>>>>>>
>> >>>>>>>> Add this macro, which would be used in place of direct use of
>> >>>>>>>>
>> >>>>>>>> TSDebug().
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>> >>>>>>>> do { \
>> >>>>>>>> if ((_FD_CNTL)->on) \
>> >>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>> >>>>>>>> } while(0)
>> >>>>>>>>
>> >>>>>>>> The first parameter to TSFastDbg() is a pointer that was
>> returned by
>> >>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>> >> parameters
>> >>>>>>>> (format string and values matching format specifiers).  The core
>> >> would
>> >>>>>>>>
>> >>>>>>>> be
>> >>>>>>>>
>> >>>>>>>> responsible for changing the value of the 'on' fields in the
>> >>>>>>>>
>> >>>>>>>> TSFastDbgCntl
>> >>>>>>>>
>> >>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>> >>>>>>>>
>> >>>>>>>> disabled,
>> >>>>>>>>
>> >>>>>>>> all
>> >>>>>>>>
>> >>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
>> >> was
>> >>>>>>>> discussed in IRC, you have to feel guilty about doing converting
>> a
>> >>>>>>>> string_view to a string and outputting it with c_str().  Because
>> the
>> >>>>>>>>
>> >>>>>>>> code
>> >>>>>>>>
>> >>>>>>>> to do this will be executed even if the debug tag is disabled.
>> >>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>> >>>>>>>>
>> >>>>>>>> associative
>> >>>>>>>>
>> >>>>>>>> lookup on the tag string to determine that output for the tag is
>> >>>>>>>>
>> >>>>>>>> disabled.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> We would want to put equivalent capabilities in Diags.h in the
>> core
>> >>>>>>>>
>> >>>>>>>> for
>> >>>>>>>>
>> >>>>>>>> debug output.  The implementation is non-trivial, but I think it
>> >> would
>> >>>>>>>>
>> >>>>>>>> take
>> >>>>>>>>
>> >>>>>>>> at most a week including the Au test additions.
>> >>>>>>>>
>> >>>>>>>> I looked at making TSFastDbg() an inline function rather than a
>> >> macro:
>> >>>>>>>> https://godbolt.org/z/IfVbBk
>> >>>>>>>> The opitimization works well if you compile the code with the
>> inline
>> >>>>>>>> function as C, but the optimization is poor if you compile it as
>> >> C++.
>> >>>>>>>>
>> >>>>>>>> This
>> >>>>>>>>
>> >>>>>>>> difference exists for both gcc and clang.  It's weird.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
Since I'm still relatively new to this application, there are limits to
what I can take on on my own initiative.  I don't see how anything I've
proposed would interfere with any higher-priority work.

On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote:

> Also, I’d like to reiterate my call to arms on v9.0.0:
>
>    Please, everyone lets focus on the things that are critical to ATS
> v9.0.0 readiness!
>
>
> This includes:
>
>         1) Fixing all the critical bugs (and there’s a lot of them)
>
>         2) Testing “master"
>
>         3) Doing the incompatible changes that needs to be done before we
> can release
>
>         4) Removing features that we do not want to support for the next
> 3+ years
>
>         5) Make sure that v9.0.0 has the functionality that you
> *absolutely* need to be able
>         to use it
>
>
> I kinda feel that several of these recent proposal do not fit into this,
> and in fact, are a fair amount of distraction. I urge everyone to not go
> down this route until we’ve got v9.0.0 stable and functional, and instead
> focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits
> into any of those 5 points above.
>
> My $.01.
>
> — Leif
>
> P.s
>    https://github.com/apache/trafficserver/issues/5457
>
>
> > On Jun 12, 2019, at 11:30 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > I would guess the performance increase would be well under 1%.  But it's
> a
> > fairly simple change.  The biggest performance increase I think would
> come
> > from avoiding a PCRE match, to see if the specific debug tag was enabled.
> > For some TSDebug calls, there can be many arguments to push, and perhaps
> a
> > fair amount of calculation to generate the correct arguments.
> >
> > On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
> >
> >> I am surprised you are seeing this as a performance issue.  Looking at
> the
> >> code it looks like 1 function call and a couple pointer dereferences.
> >>
> >> If you wanted to speed up TSDebug and make the code harder to maintain
> you
> >> could create a structure that looks like like DiagsConfig in ts.h and
> then
> >> set a pointer used in the API to look at the gobal DiagsConfig.  Then
> you
> >> could have a macro that would do basically what on() does.  This would
> >> bypass a function call and would have 1 more pointer dereference than
> your
> >> example.
> >>
> >> -Bryan
> >>
> >>
> >>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >> wrote:
> >>>
> >>> What do you mean by "exposing diags"?
> >>>
> >>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
> >>>
> >>>> I would rather see TSDebug be optimized and exposing diags instead of
> >>>> creating a new type of debug.
> >>>>
> >>>> -Bryan
> >>>>
> >>>>
> >>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
> >> .INVALID>
> >>>> wrote:
> >>>>>
> >>>>> I don't believe so.  The issue I'm trying to address is that there is
> >>>> still
> >>>>> significant execution time overhead for debug output calls when
> >>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
> >>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug
> is
> >>>>> changed to be like the proposed TSFastDbg).  Another way of putting
> it
> >>>> is,
> >>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
> >>>>>
> >>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
> wrote:
> >>>>>
> >>>>>> How would this be controlled on the command line when running
> >>>>>> trafic_server -T?
> >>>>>>
> >>>>>> It looks like it is a departure from how debug tags and controlled
> >> now.
> >>>>>>
> >>>>>> -Bryan
> >>>>>>
> >>>>>>
> >>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
> >>>> .INVALID>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
> >>>>>>>
> >>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
> >> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
> >>>>>> .INVALID>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Because senility. probably "on" should be a non-pointer of type
> >> const
> >>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to
> the
> >>>>>> whole
> >>>>>>>> structure for updating.
> >>>>>>>>
> >>>>>>>> But I'm leaning more to fixing up the capability that already
> kinda
> >>>>>> there.
> >>>>>>>> Which is optimizimg only for the case when all debug output is
> >>>> disabled.
> >>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> My bad, whatever Walt said, I take the opposite stance.
> >>>>>>>>
> >>>>>>>> — leif
> >>>>>>>>
> >>>>>>>> P.s
> >>>>>>>> J/K!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >>>>>>>> <so...@verizonmedia.com.invalid> wrote:
> >>>>>>>>
> >>>>>>>> This could enable a significant speed up for debug tags. One point
> >> is
> >>>>>> that
> >>>>>>>> when the debug tag string is set, at that point the debug objects
> >>>> could
> >>>>>> be
> >>>>>>>> updated to the correct state according to the debug tag string,
> >> rather
> >>>>>> than
> >>>>>>>> checking it every time a debug message is logged.
> >>>>>>>>
> >>>>>>>> I have to ask, why is there a pointer to the enable flag, instead
> of
> >>>>>> just
> >>>>>>>> the enable flag? Your logic doesn't even check that flag, it
> checks
> >>>>>> whether
> >>>>>>>> the pointer is null.
> >>>>>>>>
> >>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
> >>>>>>>>
> >>>>>>>> .INVALID>
> >>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> It looks like the TS API already has less granular version of
> this:
> >>>>>>>>
> >>>>>>>> extern int diags_on_for_plugins;
> >>>>>>>>
> >>>>>>>> #define TSDEBUG             \
> >>>>>>>>
> >>>>>>>> if (diags_on_for_plugins) \
> >>>>>>>>
> >>>>>>>> TSDebug
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> A problem with this is if you write:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> if (flag)
> >>>>>>>>
> >>>>>>>> TSDEBUG(yadayada);
> >>>>>>>>
> >>>>>>>> else
> >>>>>>>>
> >>>>>>>> do_something();
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> then the else will be associated with the if hidden in the macro.
> >> I'd
> >>>>>>>> prefer to change to something like:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> #define TSFastDbg(_TAG, ...) \
> >>>>>>>> do { \
> >>>>>>>> if (TS_detail_dbg_output_enabled) \
> >>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
> >>>>>>>>
> >>>>>>>> } while(0)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> The negative is that we'd only see the benefit when all debug
> output
> >>>>>>>>
> >>>>>>>> was
> >>>>>>>>
> >>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
> >>>> impacts
> >>>>>>>> during normal operation.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Right, we should always optimize for the common cases, obviously.
> >> When
> >>>>>>>> there’s a decision to make, where you can either make it “equally”
> >>>>>>>> expensive always, or more expensive in a rare case, but cheaper in
> >> the
> >>>>>>>> common case, we should likely chose the latter.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
> flag
> >> is
> >>>>>>>>
> >>>>>>>> only
> >>>>>>>>
> >>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
> >> using
> >>>>>>>> traffic_cntl without adding complexity.
> >>>>>>>>
> >>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
> wkaras@verizonmedia.com>
> >>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Add this new structure to apidefs.h:
> >>>>>>>>
> >>>>>>>> typedef struct TSFastDbgCntl_
> >>>>>>>> {
> >>>>>>>> const char * const tag; // nul-terminated string
> >>>>>>>> const volatile char * const on; // pointer to 1-byte flag
> >>>>>>>> }
> >>>>>>>> TSFastDbgCntl;
> >>>>>>>>
> >>>>>>>> Add this new API function, that returns a pointer to an instance
> of
> >>>>>>>>
> >>>>>>>> the
> >>>>>>>>
> >>>>>>>> above structure:
> >>>>>>>>
> >>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>>>>>>>
> >>>>>>>> Add this macro, which would be used in place of direct use of
> >>>>>>>>
> >>>>>>>> TSDebug().
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
> >>>>>>>> do { \
> >>>>>>>> if ((_FD_CNTL)->on) \
> >>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >>>>>>>> } while(0)
> >>>>>>>>
> >>>>>>>> The first parameter to TSFastDbg() is a pointer that was returned
> by
> >>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
> >> parameters
> >>>>>>>> (format string and values matching format specifiers).  The core
> >> would
> >>>>>>>>
> >>>>>>>> be
> >>>>>>>>
> >>>>>>>> responsible for changing the value of the 'on' fields in the
> >>>>>>>>
> >>>>>>>> TSFastDbgCntl
> >>>>>>>>
> >>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
> >>>>>>>>
> >>>>>>>> disabled,
> >>>>>>>>
> >>>>>>>> all
> >>>>>>>>
> >>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
> >> was
> >>>>>>>> discussed in IRC, you have to feel guilty about doing converting a
> >>>>>>>> string_view to a string and outputting it with c_str().  Because
> the
> >>>>>>>>
> >>>>>>>> code
> >>>>>>>>
> >>>>>>>> to do this will be executed even if the debug tag is disabled.
> >>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
> >>>>>>>>
> >>>>>>>> associative
> >>>>>>>>
> >>>>>>>> lookup on the tag string to determine that output for the tag is
> >>>>>>>>
> >>>>>>>> disabled.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> We would want to put equivalent capabilities in Diags.h in the
> core
> >>>>>>>>
> >>>>>>>> for
> >>>>>>>>
> >>>>>>>> debug output.  The implementation is non-trivial, but I think it
> >> would
> >>>>>>>>
> >>>>>>>> take
> >>>>>>>>
> >>>>>>>> at most a week including the Au test additions.
> >>>>>>>>
> >>>>>>>> I looked at making TSFastDbg() an inline function rather than a
> >> macro:
> >>>>>>>> https://godbolt.org/z/IfVbBk
> >>>>>>>> The opitimization works well if you compile the code with the
> inline
> >>>>>>>> function as C, but the optimization is poor if you compile it as
> >> C++.
> >>>>>>>>
> >>>>>>>> This
> >>>>>>>>
> >>>>>>>> difference exists for both gcc and clang.  It's weird.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Leif Hedstrom <zw...@apache.org>.
Also, I’d like to reiterate my call to arms on v9.0.0:

   Please, everyone lets focus on the things that are critical to ATS v9.0.0 readiness!


This includes:

	1) Fixing all the critical bugs (and there’s a lot of them)

	2) Testing “master"

	3) Doing the incompatible changes that needs to be done before we can release

	4) Removing features that we do not want to support for the next 3+ years

	5) Make sure that v9.0.0 has the functionality that you *absolutely* need to be able 
	to use it


I kinda feel that several of these recent proposal do not fit into this, and in fact, are a fair amount of distraction. I urge everyone to not go down this route until we’ve got v9.0.0 stable and functional, and instead focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits into any of those 5 points above.

My $.01.

— Leif

P.s
   https://github.com/apache/trafficserver/issues/5457


> On Jun 12, 2019, at 11:30 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> I would guess the performance increase would be well under 1%.  But it's a
> fairly simple change.  The biggest performance increase I think would come
> from avoiding a PCRE match, to see if the specific debug tag was enabled.
> For some TSDebug calls, there can be many arguments to push, and perhaps a
> fair amount of calculation to generate the correct arguments.
> 
> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
> 
>> I am surprised you are seeing this as a performance issue.  Looking at the
>> code it looks like 1 function call and a couple pointer dereferences.
>> 
>> If you wanted to speed up TSDebug and make the code harder to maintain you
>> could create a structure that looks like like DiagsConfig in ts.h and then
>> set a pointer used in the API to look at the gobal DiagsConfig.  Then you
>> could have a macro that would do basically what on() does.  This would
>> bypass a function call and would have 1 more pointer dereference than your
>> example.
>> 
>> -Bryan
>> 
>> 
>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>>> 
>>> What do you mean by "exposing diags"?
>>> 
>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
>>> 
>>>> I would rather see TSDebug be optimized and exposing diags instead of
>>>> creating a new type of debug.
>>>> 
>>>> -Bryan
>>>> 
>>>> 
>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>>> wrote:
>>>>> 
>>>>> I don't believe so.  The issue I'm trying to address is that there is
>>>> still
>>>>> significant execution time overhead for debug output calls when
>>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug is
>>>>> changed to be like the proposed TSFastDbg).  Another way of putting it
>>>> is,
>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>>>> 
>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
>>>>> 
>>>>>> How would this be controlled on the command line when running
>>>>>> trafic_server -T?
>>>>>> 
>>>>>> It looks like it is a departure from how debug tags and controlled
>> now.
>>>>>> 
>>>>>> -Bryan
>>>>>> 
>>>>>> 
>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>>>> .INVALID>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
>>>>>>> 
>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
>> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>> .INVALID>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Because senility. probably "on" should be a non-pointer of type
>> const
>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to the
>>>>>> whole
>>>>>>>> structure for updating.
>>>>>>>> 
>>>>>>>> But I'm leaning more to fixing up the capability that already kinda
>>>>>> there.
>>>>>>>> Which is optimizimg only for the case when all debug output is
>>>> disabled.
>>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>>>> 
>>>>>>>> — leif
>>>>>>>> 
>>>>>>>> P.s
>>>>>>>> J/K!
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>>>> 
>>>>>>>> This could enable a significant speed up for debug tags. One point
>> is
>>>>>> that
>>>>>>>> when the debug tag string is set, at that point the debug objects
>>>> could
>>>>>> be
>>>>>>>> updated to the correct state according to the debug tag string,
>> rather
>>>>>> than
>>>>>>>> checking it every time a debug message is logged.
>>>>>>>> 
>>>>>>>> I have to ask, why is there a pointer to the enable flag, instead of
>>>>>> just
>>>>>>>> the enable flag? Your logic doesn't even check that flag, it checks
>>>>>> whether
>>>>>>>> the pointer is null.
>>>>>>>> 
>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>>>> 
>>>>>>>> .INVALID>
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> It looks like the TS API already has less granular version of this:
>>>>>>>> 
>>>>>>>> extern int diags_on_for_plugins;
>>>>>>>> 
>>>>>>>> #define TSDEBUG             \
>>>>>>>> 
>>>>>>>> if (diags_on_for_plugins) \
>>>>>>>> 
>>>>>>>> TSDebug
>>>>>>>> 
>>>>>>>> 
>>>>>>>> A problem with this is if you write:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> if (flag)
>>>>>>>> 
>>>>>>>> TSDEBUG(yadayada);
>>>>>>>> 
>>>>>>>> else
>>>>>>>> 
>>>>>>>> do_something();
>>>>>>>> 
>>>>>>>> 
>>>>>>>> then the else will be associated with the if hidden in the macro.
>> I'd
>>>>>>>> prefer to change to something like:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>>>> 
>>>>>>>> 
>>>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>>>> do { \
>>>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>>>> 
>>>>>>>> } while(0)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> The negative is that we'd only see the benefit when all debug output
>>>>>>>> 
>>>>>>>> was
>>>>>>>> 
>>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>>>> impacts
>>>>>>>> during normal operation.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Right, we should always optimize for the common cases, obviously.
>> When
>>>>>>>> there’s a decision to make, where you can either make it “equally”
>>>>>>>> expensive always, or more expensive in a rare case, but cheaper in
>> the
>>>>>>>> common case, we should likely chose the latter.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag
>> is
>>>>>>>> 
>>>>>>>> only
>>>>>>>> 
>>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
>> using
>>>>>>>> traffic_cntl without adding complexity.
>>>>>>>> 
>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Add this new structure to apidefs.h:
>>>>>>>> 
>>>>>>>> typedef struct TSFastDbgCntl_
>>>>>>>> {
>>>>>>>> const char * const tag; // nul-terminated string
>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>>>> }
>>>>>>>> TSFastDbgCntl;
>>>>>>>> 
>>>>>>>> Add this new API function, that returns a pointer to an instance of
>>>>>>>> 
>>>>>>>> the
>>>>>>>> 
>>>>>>>> above structure:
>>>>>>>> 
>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>>>> 
>>>>>>>> Add this macro, which would be used in place of direct use of
>>>>>>>> 
>>>>>>>> TSDebug().
>>>>>>>> 
>>>>>>>> 
>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>>>> do { \
>>>>>>>> if ((_FD_CNTL)->on) \
>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>>>> } while(0)
>>>>>>>> 
>>>>>>>> The first parameter to TSFastDbg() is a pointer that was returned by
>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>> parameters
>>>>>>>> (format string and values matching format specifiers).  The core
>> would
>>>>>>>> 
>>>>>>>> be
>>>>>>>> 
>>>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>>>> 
>>>>>>>> TSFastDbgCntl
>>>>>>>> 
>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>>>> 
>>>>>>>> disabled,
>>>>>>>> 
>>>>>>>> all
>>>>>>>> 
>>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
>> was
>>>>>>>> discussed in IRC, you have to feel guilty about doing converting a
>>>>>>>> string_view to a string and outputting it with c_str().  Because the
>>>>>>>> 
>>>>>>>> code
>>>>>>>> 
>>>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>>>> 
>>>>>>>> associative
>>>>>>>> 
>>>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>>>> 
>>>>>>>> disabled.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> We would want to put equivalent capabilities in Diags.h in the core
>>>>>>>> 
>>>>>>>> for
>>>>>>>> 
>>>>>>>> debug output.  The implementation is non-trivial, but I think it
>> would
>>>>>>>> 
>>>>>>>> take
>>>>>>>> 
>>>>>>>> at most a week including the Au test additions.
>>>>>>>> 
>>>>>>>> I looked at making TSFastDbg() an inline function rather than a
>> macro:
>>>>>>>> https://godbolt.org/z/IfVbBk
>>>>>>>> The opitimization works well if you compile the code with the inline
>>>>>>>> function as C, but the optimization is poor if you compile it as
>> C++.
>>>>>>>> 
>>>>>>>> This
>>>>>>>> 
>>>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
I would guess the performance increase would be well under 1%.  But it's a
fairly simple change.  The biggest performance increase I think would come
from avoiding a PCRE match, to see if the specific debug tag was enabled.
For some TSDebug calls, there can be many arguments to push, and perhaps a
fair amount of calculation to generate the correct arguments.

On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:

> I am surprised you are seeing this as a performance issue.  Looking at the
> code it looks like 1 function call and a couple pointer dereferences.
>
> If you wanted to speed up TSDebug and make the code harder to maintain you
> could create a structure that looks like like DiagsConfig in ts.h and then
> set a pointer used in the API to look at the gobal DiagsConfig.  Then you
> could have a macro that would do basically what on() does.  This would
> bypass a function call and would have 1 more pointer dereference than your
> example.
>
> -Bryan
>
>
> > On Jun 12, 2019, at 10:00 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > What do you mean by "exposing diags"?
> >
> > On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
> >
> >> I would rather see TSDebug be optimized and exposing diags instead of
> >> creating a new type of debug.
> >>
> >> -Bryan
> >>
> >>
> >>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >> wrote:
> >>>
> >>> I don't believe so.  The issue I'm trying to address is that there is
> >> still
> >>> significant execution time overhead for debug output calls when
> >>> proxy.config.diags.debug.enabled is 0.  This change would make the
> >>> execution time overhead negligible (if TSFastDbg is used or TSDebug is
> >>> changed to be like the proposed TSFastDbg).  Another way of putting it
> >> is,
> >>> I want to clean up the existing TSDEBUG alternative to TSDebug.
> >>>
> >>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
> >>>
> >>>> How would this be controlled on the command line when running
> >>>> trafic_server -T?
> >>>>
> >>>> It looks like it is a departure from how debug tags and controlled
> now.
> >>>>
> >>>> -Bryan
> >>>>
> >>>>
> >>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
> >> .INVALID>
> >>>> wrote:
> >>>>>
> >>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
> >>>>>
> >>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
> wrote:
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
> >>>> .INVALID>
> >>>>>> wrote:
> >>>>>>
> >>>>>> Because senility. probably "on" should be a non-pointer of type
> const
> >>>>>> volatile char, and InkAPI in the core can just keep a pointer to the
> >>>> whole
> >>>>>> structure for updating.
> >>>>>>
> >>>>>> But I'm leaning more to fixing up the capability that already kinda
> >>>> there.
> >>>>>> Which is optimizimg only for the case when all debug output is
> >> disabled.
> >>>>>> Especially since maybe zwoop would be on my side for once :o) .
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> My bad, whatever Walt said, I take the opposite stance.
> >>>>>>
> >>>>>> — leif
> >>>>>>
> >>>>>> P.s
> >>>>>> J/K!
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >>>>>> <so...@verizonmedia.com.invalid> wrote:
> >>>>>>
> >>>>>> This could enable a significant speed up for debug tags. One point
> is
> >>>> that
> >>>>>> when the debug tag string is set, at that point the debug objects
> >> could
> >>>> be
> >>>>>> updated to the correct state according to the debug tag string,
> rather
> >>>> than
> >>>>>> checking it every time a debug message is logged.
> >>>>>>
> >>>>>> I have to ask, why is there a pointer to the enable flag, instead of
> >>>> just
> >>>>>> the enable flag? Your logic doesn't even check that flag, it checks
> >>>> whether
> >>>>>> the pointer is null.
> >>>>>>
> >>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
> >> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
> >>>>>>
> >>>>>> .INVALID>
> >>>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>> It looks like the TS API already has less granular version of this:
> >>>>>>
> >>>>>> extern int diags_on_for_plugins;
> >>>>>>
> >>>>>> #define TSDEBUG             \
> >>>>>>
> >>>>>> if (diags_on_for_plugins) \
> >>>>>>
> >>>>>> TSDebug
> >>>>>>
> >>>>>>
> >>>>>> A problem with this is if you write:
> >>>>>>
> >>>>>>
> >>>>>> if (flag)
> >>>>>>
> >>>>>> TSDEBUG(yadayada);
> >>>>>>
> >>>>>> else
> >>>>>>
> >>>>>> do_something();
> >>>>>>
> >>>>>>
> >>>>>> then the else will be associated with the if hidden in the macro.
> I'd
> >>>>>> prefer to change to something like:
> >>>>>>
> >>>>>>
> >>>>>> extern const volatile char TS_detail_dbg_output_enabled;
> >>>>>>
> >>>>>>
> >>>>>> #define TSFastDbg(_TAG, ...) \
> >>>>>> do { \
> >>>>>> if (TS_detail_dbg_output_enabled) \
> >>>>>> TSDebug(_TAG, __VA_ARGS__); \
> >>>>>>
> >>>>>> } while(0)
> >>>>>>
> >>>>>>
> >>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
> >>>>>>
> >>>>>>
> >>>>>> The negative is that we'd only see the benefit when all debug output
> >>>>>>
> >>>>>> was
> >>>>>>
> >>>>>> disabled.  But, in any case, that is the main benefit, to avoid
> >> impacts
> >>>>>> during normal operation.
> >>>>>>
> >>>>>>
> >>>>>> Right, we should always optimize for the common cases, obviously.
> When
> >>>>>> there’s a decision to make, where you can either make it “equally”
> >>>>>> expensive always, or more expensive in a rare case, but cheaper in
> the
> >>>>>> common case, we should likely chose the latter.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag
> is
> >>>>>>
> >>>>>> only
> >>>>>>
> >>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
> using
> >>>>>> traffic_cntl without adding complexity.
> >>>>>>
> >>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
> >>>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Add this new structure to apidefs.h:
> >>>>>>
> >>>>>> typedef struct TSFastDbgCntl_
> >>>>>> {
> >>>>>> const char * const tag; // nul-terminated string
> >>>>>> const volatile char * const on; // pointer to 1-byte flag
> >>>>>> }
> >>>>>> TSFastDbgCntl;
> >>>>>>
> >>>>>> Add this new API function, that returns a pointer to an instance of
> >>>>>>
> >>>>>> the
> >>>>>>
> >>>>>> above structure:
> >>>>>>
> >>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>>>>>
> >>>>>> Add this macro, which would be used in place of direct use of
> >>>>>>
> >>>>>> TSDebug().
> >>>>>>
> >>>>>>
> >>>>>> #define TSFastDbg(_FD_CNTL, ...) \
> >>>>>> do { \
> >>>>>> if ((_FD_CNTL)->on) \
> >>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >>>>>> } while(0)
> >>>>>>
> >>>>>> The first parameter to TSFastDbg() is a pointer that was returned by
> >>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
> parameters
> >>>>>> (format string and values matching format specifiers).  The core
> would
> >>>>>>
> >>>>>> be
> >>>>>>
> >>>>>> responsible for changing the value of the 'on' fields in the
> >>>>>>
> >>>>>> TSFastDbgCntl
> >>>>>>
> >>>>>> instances.  For direct calls to TSDebug(), even if the tag is
> >>>>>>
> >>>>>> disabled,
> >>>>>>
> >>>>>> all
> >>>>>>
> >>>>>> the parameters have to be prepared and pushed onto the stack.  As
> was
> >>>>>> discussed in IRC, you have to feel guilty about doing converting a
> >>>>>> string_view to a string and outputting it with c_str().  Because the
> >>>>>>
> >>>>>> code
> >>>>>>
> >>>>>> to do this will be executed even if the debug tag is disabled.
> >>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
> >>>>>>
> >>>>>> associative
> >>>>>>
> >>>>>> lookup on the tag string to determine that output for the tag is
> >>>>>>
> >>>>>> disabled.
> >>>>>>
> >>>>>>
> >>>>>> We would want to put equivalent capabilities in Diags.h in the core
> >>>>>>
> >>>>>> for
> >>>>>>
> >>>>>> debug output.  The implementation is non-trivial, but I think it
> would
> >>>>>>
> >>>>>> take
> >>>>>>
> >>>>>> at most a week including the Au test additions.
> >>>>>>
> >>>>>> I looked at making TSFastDbg() an inline function rather than a
> macro:
> >>>>>> https://godbolt.org/z/IfVbBk
> >>>>>> The opitimization works well if you compile the code with the inline
> >>>>>> function as C, but the optimization is poor if you compile it as
> C++.
> >>>>>>
> >>>>>> This
> >>>>>>
> >>>>>> difference exists for both gcc and clang.  It's weird.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Bryan Call <bc...@apache.org>.
I am surprised you are seeing this as a performance issue.  Looking at the code it looks like 1 function call and a couple pointer dereferences.

If you wanted to speed up TSDebug and make the code harder to maintain you could create a structure that looks like like DiagsConfig in ts.h and then set a pointer used in the API to look at the gobal DiagsConfig.  Then you could have a macro that would do basically what on() does.  This would bypass a function call and would have 1 more pointer dereference than your example.

-Bryan


> On Jun 12, 2019, at 10:00 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> What do you mean by "exposing diags"?
> 
> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
> 
>> I would rather see TSDebug be optimized and exposing diags instead of
>> creating a new type of debug.
>> 
>> -Bryan
>> 
>> 
>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>>> 
>>> I don't believe so.  The issue I'm trying to address is that there is
>> still
>>> significant execution time overhead for debug output calls when
>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>> execution time overhead negligible (if TSFastDbg is used or TSDebug is
>>> changed to be like the proposed TSFastDbg).  Another way of putting it
>> is,
>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>> 
>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
>>> 
>>>> How would this be controlled on the command line when running
>>>> trafic_server -T?
>>>> 
>>>> It looks like it is a departure from how debug tags and controlled now.
>>>> 
>>>> -Bryan
>>>> 
>>>> 
>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>>> wrote:
>>>>> 
>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
>>>>> 
>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>>>> .INVALID>
>>>>>> wrote:
>>>>>> 
>>>>>> Because senility. probably "on" should be a non-pointer of type const
>>>>>> volatile char, and InkAPI in the core can just keep a pointer to the
>>>> whole
>>>>>> structure for updating.
>>>>>> 
>>>>>> But I'm leaning more to fixing up the capability that already kinda
>>>> there.
>>>>>> Which is optimizimg only for the case when all debug output is
>> disabled.
>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>> 
>>>>>> — leif
>>>>>> 
>>>>>> P.s
>>>>>> J/K!
>>>>>> 
>>>>>> 
>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>> <so...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>> This could enable a significant speed up for debug tags. One point is
>>>> that
>>>>>> when the debug tag string is set, at that point the debug objects
>> could
>>>> be
>>>>>> updated to the correct state according to the debug tag string, rather
>>>> than
>>>>>> checking it every time a debug message is logged.
>>>>>> 
>>>>>> I have to ask, why is there a pointer to the enable flag, instead of
>>>> just
>>>>>> the enable flag? Your logic doesn't even check that flag, it checks
>>>> whether
>>>>>> the pointer is null.
>>>>>> 
>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>>>> 
>>>>>> .INVALID>
>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> It looks like the TS API already has less granular version of this:
>>>>>> 
>>>>>> extern int diags_on_for_plugins;
>>>>>> 
>>>>>> #define TSDEBUG             \
>>>>>> 
>>>>>> if (diags_on_for_plugins) \
>>>>>> 
>>>>>> TSDebug
>>>>>> 
>>>>>> 
>>>>>> A problem with this is if you write:
>>>>>> 
>>>>>> 
>>>>>> if (flag)
>>>>>> 
>>>>>> TSDEBUG(yadayada);
>>>>>> 
>>>>>> else
>>>>>> 
>>>>>> do_something();
>>>>>> 
>>>>>> 
>>>>>> then the else will be associated with the if hidden in the macro.  I'd
>>>>>> prefer to change to something like:
>>>>>> 
>>>>>> 
>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>> 
>>>>>> 
>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>> do { \
>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>> 
>>>>>> } while(0)
>>>>>> 
>>>>>> 
>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>> 
>>>>>> 
>>>>>> The negative is that we'd only see the benefit when all debug output
>>>>>> 
>>>>>> was
>>>>>> 
>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>> impacts
>>>>>> during normal operation.
>>>>>> 
>>>>>> 
>>>>>> Right, we should always optimize for the common cases, obviously. When
>>>>>> there’s a decision to make, where you can either make it “equally”
>>>>>> expensive always, or more expensive in a rare case, but cheaper in the
>>>>>> common case, we should likely chose the latter.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
>>>>>> 
>>>>>> only
>>>>>> 
>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg using
>>>>>> traffic_cntl without adding complexity.
>>>>>> 
>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> Add this new structure to apidefs.h:
>>>>>> 
>>>>>> typedef struct TSFastDbgCntl_
>>>>>> {
>>>>>> const char * const tag; // nul-terminated string
>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>> }
>>>>>> TSFastDbgCntl;
>>>>>> 
>>>>>> Add this new API function, that returns a pointer to an instance of
>>>>>> 
>>>>>> the
>>>>>> 
>>>>>> above structure:
>>>>>> 
>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>> 
>>>>>> Add this macro, which would be used in place of direct use of
>>>>>> 
>>>>>> TSDebug().
>>>>>> 
>>>>>> 
>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>> do { \
>>>>>> if ((_FD_CNTL)->on) \
>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>> } while(0)
>>>>>> 
>>>>>> The first parameter to TSFastDbg() is a pointer that was returned by
>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
>>>>>> (format string and values matching format specifiers).  The core would
>>>>>> 
>>>>>> be
>>>>>> 
>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>> 
>>>>>> TSFastDbgCntl
>>>>>> 
>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>> 
>>>>>> disabled,
>>>>>> 
>>>>>> all
>>>>>> 
>>>>>> the parameters have to be prepared and pushed onto the stack.  As was
>>>>>> discussed in IRC, you have to feel guilty about doing converting a
>>>>>> string_view to a string and outputting it with c_str().  Because the
>>>>>> 
>>>>>> code
>>>>>> 
>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>> 
>>>>>> associative
>>>>>> 
>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>> 
>>>>>> disabled.
>>>>>> 
>>>>>> 
>>>>>> We would want to put equivalent capabilities in Diags.h in the core
>>>>>> 
>>>>>> for
>>>>>> 
>>>>>> debug output.  The implementation is non-trivial, but I think it would
>>>>>> 
>>>>>> take
>>>>>> 
>>>>>> at most a week including the Au test additions.
>>>>>> 
>>>>>> I looked at making TSFastDbg() an inline function rather than a macro:
>>>>>> https://godbolt.org/z/IfVbBk
>>>>>> The opitimization works well if you compile the code with the inline
>>>>>> function as C, but the optimization is poor if you compile it as C++.
>>>>>> 
>>>>>> This
>>>>>> 
>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
What do you mean by "exposing diags"?

On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:

> I would rather see TSDebug be optimized and exposing diags instead of
> creating a new type of debug.
>
> -Bryan
>
>
> > On Jun 12, 2019, at 9:44 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > I don't believe so.  The issue I'm trying to address is that there is
> still
> > significant execution time overhead for debug output calls when
> > proxy.config.diags.debug.enabled is 0.  This change would make the
> > execution time overhead negligible (if TSFastDbg is used or TSDebug is
> > changed to be like the proposed TSFastDbg).  Another way of putting it
> is,
> > I want to clean up the existing TSDEBUG alternative to TSDebug.
> >
> > On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
> >
> >> How would this be controlled on the command line when running
> >> trafic_server -T?
> >>
> >> It looks like it is a departure from how debug tags and controlled now.
> >>
> >> -Bryan
> >>
> >>
> >>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >> wrote:
> >>>
> >>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
> >>>
> >>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
> >>>
> >>>>
> >>>>
> >>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
> >> .INVALID>
> >>>> wrote:
> >>>>
> >>>> Because senility. probably "on" should be a non-pointer of type const
> >>>> volatile char, and InkAPI in the core can just keep a pointer to the
> >> whole
> >>>> structure for updating.
> >>>>
> >>>> But I'm leaning more to fixing up the capability that already kinda
> >> there.
> >>>> Which is optimizimg only for the case when all debug output is
> disabled.
> >>>> Especially since maybe zwoop would be on my side for once :o) .
> >>>>
> >>>>
> >>>>
> >>>> My bad, whatever Walt said, I take the opposite stance.
> >>>>
> >>>> — leif
> >>>>
> >>>> P.s
> >>>> J/K!
> >>>>
> >>>>
> >>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >>>> <so...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>> This could enable a significant speed up for debug tags. One point is
> >> that
> >>>> when the debug tag string is set, at that point the debug objects
> could
> >> be
> >>>> updated to the correct state according to the debug tag string, rather
> >> than
> >>>> checking it every time a debug message is logged.
> >>>>
> >>>> I have to ask, why is there a pointer to the enable flag, instead of
> >> just
> >>>> the enable flag? Your logic doesn't even check that flag, it checks
> >> whether
> >>>> the pointer is null.
> >>>>
> >>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
> >>>>
> >>>> .INVALID>
> >>>>
> >>>> wrote:
> >>>>
> >>>>
> >>>> It looks like the TS API already has less granular version of this:
> >>>>
> >>>> extern int diags_on_for_plugins;
> >>>>
> >>>> #define TSDEBUG             \
> >>>>
> >>>> if (diags_on_for_plugins) \
> >>>>
> >>>> TSDebug
> >>>>
> >>>>
> >>>> A problem with this is if you write:
> >>>>
> >>>>
> >>>> if (flag)
> >>>>
> >>>> TSDEBUG(yadayada);
> >>>>
> >>>> else
> >>>>
> >>>> do_something();
> >>>>
> >>>>
> >>>> then the else will be associated with the if hidden in the macro.  I'd
> >>>> prefer to change to something like:
> >>>>
> >>>>
> >>>> extern const volatile char TS_detail_dbg_output_enabled;
> >>>>
> >>>>
> >>>> #define TSFastDbg(_TAG, ...) \
> >>>> do { \
> >>>> if (TS_detail_dbg_output_enabled) \
> >>>> TSDebug(_TAG, __VA_ARGS__); \
> >>>>
> >>>> } while(0)
> >>>>
> >>>>
> >>>> Yeh, that kinda sounds like a bug, doesn’t it?
> >>>>
> >>>>
> >>>> The negative is that we'd only see the benefit when all debug output
> >>>>
> >>>> was
> >>>>
> >>>> disabled.  But, in any case, that is the main benefit, to avoid
> impacts
> >>>> during normal operation.
> >>>>
> >>>>
> >>>> Right, we should always optimize for the common cases, obviously. When
> >>>> there’s a decision to make, where you can either make it “equally”
> >>>> expensive always, or more expensive in a rare case, but cheaper in the
> >>>> common case, we should likely chose the latter.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
> >>>>
> >>>> only
> >>>>
> >>>> set at startup time.  So we wouldn't be able to enable TSFastDbg using
> >>>> traffic_cntl without adding complexity.
> >>>>
> >>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
> >>>>
> >>>> wrote:
> >>>>
> >>>>
> >>>> Add this new structure to apidefs.h:
> >>>>
> >>>> typedef struct TSFastDbgCntl_
> >>>> {
> >>>> const char * const tag; // nul-terminated string
> >>>> const volatile char * const on; // pointer to 1-byte flag
> >>>> }
> >>>> TSFastDbgCntl;
> >>>>
> >>>> Add this new API function, that returns a pointer to an instance of
> >>>>
> >>>> the
> >>>>
> >>>> above structure:
> >>>>
> >>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>>>
> >>>> Add this macro, which would be used in place of direct use of
> >>>>
> >>>> TSDebug().
> >>>>
> >>>>
> >>>> #define TSFastDbg(_FD_CNTL, ...) \
> >>>> do { \
> >>>> if ((_FD_CNTL)->on) \
> >>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >>>> } while(0)
> >>>>
> >>>> The first parameter to TSFastDbg() is a pointer that was returned by
> >>>> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
> >>>> (format string and values matching format specifiers).  The core would
> >>>>
> >>>> be
> >>>>
> >>>> responsible for changing the value of the 'on' fields in the
> >>>>
> >>>> TSFastDbgCntl
> >>>>
> >>>> instances.  For direct calls to TSDebug(), even if the tag is
> >>>>
> >>>> disabled,
> >>>>
> >>>> all
> >>>>
> >>>> the parameters have to be prepared and pushed onto the stack.  As was
> >>>> discussed in IRC, you have to feel guilty about doing converting a
> >>>> string_view to a string and outputting it with c_str().  Because the
> >>>>
> >>>> code
> >>>>
> >>>> to do this will be executed even if the debug tag is disabled.
> >>>> Furthermore, for each call to TSDebug, it's necessary to do an
> >>>>
> >>>> associative
> >>>>
> >>>> lookup on the tag string to determine that output for the tag is
> >>>>
> >>>> disabled.
> >>>>
> >>>>
> >>>> We would want to put equivalent capabilities in Diags.h in the core
> >>>>
> >>>> for
> >>>>
> >>>> debug output.  The implementation is non-trivial, but I think it would
> >>>>
> >>>> take
> >>>>
> >>>> at most a week including the Au test additions.
> >>>>
> >>>> I looked at making TSFastDbg() an inline function rather than a macro:
> >>>> https://godbolt.org/z/IfVbBk
> >>>> The opitimization works well if you compile the code with the inline
> >>>> function as C, but the optimization is poor if you compile it as C++.
> >>>>
> >>>> This
> >>>>
> >>>> difference exists for both gcc and clang.  It's weird.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Bryan Call <bc...@apache.org>.
I would rather see TSDebug be optimized and exposing diags instead of creating a new type of debug.

-Bryan


> On Jun 12, 2019, at 9:44 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> I don't believe so.  The issue I'm trying to address is that there is still
> significant execution time overhead for debug output calls when
> proxy.config.diags.debug.enabled is 0.  This change would make the
> execution time overhead negligible (if TSFastDbg is used or TSDebug is
> changed to be like the proposed TSFastDbg).  Another way of putting it is,
> I want to clean up the existing TSDEBUG alternative to TSDebug.
> 
> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:
> 
>> How would this be controlled on the command line when running
>> trafic_server -T?
>> 
>> It looks like it is a departure from how debug tags and controlled now.
>> 
>> -Bryan
>> 
>> 
>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>>> 
>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
>>> 
>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
>>> 
>>>> 
>>>> 
>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>>> wrote:
>>>> 
>>>> Because senility. probably "on" should be a non-pointer of type const
>>>> volatile char, and InkAPI in the core can just keep a pointer to the
>> whole
>>>> structure for updating.
>>>> 
>>>> But I'm leaning more to fixing up the capability that already kinda
>> there.
>>>> Which is optimizimg only for the case when all debug output is disabled.
>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>> 
>>>> 
>>>> 
>>>> My bad, whatever Walt said, I take the opposite stance.
>>>> 
>>>> — leif
>>>> 
>>>> P.s
>>>> J/K!
>>>> 
>>>> 
>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>> <so...@verizonmedia.com.invalid> wrote:
>>>> 
>>>> This could enable a significant speed up for debug tags. One point is
>> that
>>>> when the debug tag string is set, at that point the debug objects could
>> be
>>>> updated to the correct state according to the debug tag string, rather
>> than
>>>> checking it every time a debug message is logged.
>>>> 
>>>> I have to ask, why is there a pointer to the enable flag, instead of
>> just
>>>> the enable flag? Your logic doesn't even check that flag, it checks
>> whether
>>>> the pointer is null.
>>>> 
>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>>>> 
>>>> .INVALID>
>>>> 
>>>> wrote:
>>>> 
>>>> 
>>>> It looks like the TS API already has less granular version of this:
>>>> 
>>>> extern int diags_on_for_plugins;
>>>> 
>>>> #define TSDEBUG             \
>>>> 
>>>> if (diags_on_for_plugins) \
>>>> 
>>>> TSDebug
>>>> 
>>>> 
>>>> A problem with this is if you write:
>>>> 
>>>> 
>>>> if (flag)
>>>> 
>>>> TSDEBUG(yadayada);
>>>> 
>>>> else
>>>> 
>>>> do_something();
>>>> 
>>>> 
>>>> then the else will be associated with the if hidden in the macro.  I'd
>>>> prefer to change to something like:
>>>> 
>>>> 
>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>> 
>>>> 
>>>> #define TSFastDbg(_TAG, ...) \
>>>> do { \
>>>> if (TS_detail_dbg_output_enabled) \
>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>> 
>>>> } while(0)
>>>> 
>>>> 
>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>> 
>>>> 
>>>> The negative is that we'd only see the benefit when all debug output
>>>> 
>>>> was
>>>> 
>>>> disabled.  But, in any case, that is the main benefit, to avoid impacts
>>>> during normal operation.
>>>> 
>>>> 
>>>> Right, we should always optimize for the common cases, obviously. When
>>>> there’s a decision to make, where you can either make it “equally”
>>>> expensive always, or more expensive in a rare case, but cheaper in the
>>>> common case, we should likely chose the latter.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
>>>> 
>>>> only
>>>> 
>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg using
>>>> traffic_cntl without adding complexity.
>>>> 
>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>>>> 
>>>> wrote:
>>>> 
>>>> 
>>>> Add this new structure to apidefs.h:
>>>> 
>>>> typedef struct TSFastDbgCntl_
>>>> {
>>>> const char * const tag; // nul-terminated string
>>>> const volatile char * const on; // pointer to 1-byte flag
>>>> }
>>>> TSFastDbgCntl;
>>>> 
>>>> Add this new API function, that returns a pointer to an instance of
>>>> 
>>>> the
>>>> 
>>>> above structure:
>>>> 
>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>> 
>>>> Add this macro, which would be used in place of direct use of
>>>> 
>>>> TSDebug().
>>>> 
>>>> 
>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>> do { \
>>>> if ((_FD_CNTL)->on) \
>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>> } while(0)
>>>> 
>>>> The first parameter to TSFastDbg() is a pointer that was returned by
>>>> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
>>>> (format string and values matching format specifiers).  The core would
>>>> 
>>>> be
>>>> 
>>>> responsible for changing the value of the 'on' fields in the
>>>> 
>>>> TSFastDbgCntl
>>>> 
>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>> 
>>>> disabled,
>>>> 
>>>> all
>>>> 
>>>> the parameters have to be prepared and pushed onto the stack.  As was
>>>> discussed in IRC, you have to feel guilty about doing converting a
>>>> string_view to a string and outputting it with c_str().  Because the
>>>> 
>>>> code
>>>> 
>>>> to do this will be executed even if the debug tag is disabled.
>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>> 
>>>> associative
>>>> 
>>>> lookup on the tag string to determine that output for the tag is
>>>> 
>>>> disabled.
>>>> 
>>>> 
>>>> We would want to put equivalent capabilities in Diags.h in the core
>>>> 
>>>> for
>>>> 
>>>> debug output.  The implementation is non-trivial, but I think it would
>>>> 
>>>> take
>>>> 
>>>> at most a week including the Au test additions.
>>>> 
>>>> I looked at making TSFastDbg() an inline function rather than a macro:
>>>> https://godbolt.org/z/IfVbBk
>>>> The opitimization works well if you compile the code with the inline
>>>> function as C, but the optimization is poor if you compile it as C++.
>>>> 
>>>> This
>>>> 
>>>> difference exists for both gcc and clang.  It's weird.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 


Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
I don't believe so.  The issue I'm trying to address is that there is still
significant execution time overhead for debug output calls when
proxy.config.diags.debug.enabled is 0.  This change would make the
execution time overhead negligible (if TSFastDbg is used or TSDebug is
changed to be like the proposed TSFastDbg).  Another way of putting it is,
I want to clean up the existing TSDEBUG alternative to TSDebug.

On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> wrote:

> How would this be controlled on the command line when running
> trafic_server -T?
>
> It looks like it is a departure from how debug tags and controlled now.
>
> -Bryan
>
>
> > On Jun 11, 2019, at 1:21 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
> >
> > On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
> >
> >>
> >>
> >> On Apr 4, 2019, at 8:05 AM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >> wrote:
> >>
> >> Because senility. probably "on" should be a non-pointer of type const
> >> volatile char, and InkAPI in the core can just keep a pointer to the
> whole
> >> structure for updating.
> >>
> >> But I'm leaning more to fixing up the capability that already kinda
> there.
> >> Which is optimizimg only for the case when all debug output is disabled.
> >> Especially since maybe zwoop would be on my side for once :o) .
> >>
> >>
> >>
> >> My bad, whatever Walt said, I take the opposite stance.
> >>
> >> — leif
> >>
> >> P.s
> >>  J/K!
> >>
> >>
> >> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
> >> <so...@verizonmedia.com.invalid> wrote:
> >>
> >> This could enable a significant speed up for debug tags. One point is
> that
> >> when the debug tag string is set, at that point the debug objects could
> be
> >> updated to the correct state according to the debug tag string, rather
> than
> >> checking it every time a debug message is logged.
> >>
> >> I have to ask, why is there a pointer to the enable flag, instead of
> just
> >> the enable flag? Your logic doesn't even check that flag, it checks
> whether
> >> the pointer is null.
> >>
> >> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org> wrote:
> >>
> >>
> >>
> >> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
> >>
> >> .INVALID>
> >>
> >> wrote:
> >>
> >>
> >> It looks like the TS API already has less granular version of this:
> >>
> >> extern int diags_on_for_plugins;
> >>
> >> #define TSDEBUG             \
> >>
> >> if (diags_on_for_plugins) \
> >>
> >> TSDebug
> >>
> >>
> >> A problem with this is if you write:
> >>
> >>
> >> if (flag)
> >>
> >> TSDEBUG(yadayada);
> >>
> >> else
> >>
> >> do_something();
> >>
> >>
> >> then the else will be associated with the if hidden in the macro.  I'd
> >> prefer to change to something like:
> >>
> >>
> >> extern const volatile char TS_detail_dbg_output_enabled;
> >>
> >>
> >> #define TSFastDbg(_TAG, ...) \
> >> do { \
> >> if (TS_detail_dbg_output_enabled) \
> >> TSDebug(_TAG, __VA_ARGS__); \
> >>
> >> } while(0)
> >>
> >>
> >> Yeh, that kinda sounds like a bug, doesn’t it?
> >>
> >>
> >> The negative is that we'd only see the benefit when all debug output
> >>
> >> was
> >>
> >> disabled.  But, in any case, that is the main benefit, to avoid impacts
> >> during normal operation.
> >>
> >>
> >> Right, we should always optimize for the common cases, obviously. When
> >> there’s a decision to make, where you can either make it “equally”
> >> expensive always, or more expensive in a rare case, but cheaper in the
> >> common case, we should likely chose the latter.
> >>
> >>
> >>
> >>
> >> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
> >>
> >> only
> >>
> >> set at startup time.  So we wouldn't be able to enable TSFastDbg using
> >> traffic_cntl without adding complexity.
> >>
> >> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
> >>
> >> wrote:
> >>
> >>
> >> Add this new structure to apidefs.h:
> >>
> >> typedef struct TSFastDbgCntl_
> >> {
> >> const char * const tag; // nul-terminated string
> >> const volatile char * const on; // pointer to 1-byte flag
> >> }
> >> TSFastDbgCntl;
> >>
> >> Add this new API function, that returns a pointer to an instance of
> >>
> >> the
> >>
> >> above structure:
> >>
> >> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
> >>
> >> Add this macro, which would be used in place of direct use of
> >>
> >> TSDebug().
> >>
> >>
> >> #define TSFastDbg(_FD_CNTL, ...) \
> >> do { \
> >> if ((_FD_CNTL)->on) \
> >> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
> >> } while(0)
> >>
> >> The first parameter to TSFastDbg() is a pointer that was returned by
> >> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
> >> (format string and values matching format specifiers).  The core would
> >>
> >> be
> >>
> >> responsible for changing the value of the 'on' fields in the
> >>
> >> TSFastDbgCntl
> >>
> >> instances.  For direct calls to TSDebug(), even if the tag is
> >>
> >> disabled,
> >>
> >> all
> >>
> >> the parameters have to be prepared and pushed onto the stack.  As was
> >> discussed in IRC, you have to feel guilty about doing converting a
> >> string_view to a string and outputting it with c_str().  Because the
> >>
> >> code
> >>
> >> to do this will be executed even if the debug tag is disabled.
> >> Furthermore, for each call to TSDebug, it's necessary to do an
> >>
> >> associative
> >>
> >> lookup on the tag string to determine that output for the tag is
> >>
> >> disabled.
> >>
> >>
> >> We would want to put equivalent capabilities in Diags.h in the core
> >>
> >> for
> >>
> >> debug output.  The implementation is non-trivial, but I think it would
> >>
> >> take
> >>
> >> at most a week including the Au test additions.
> >>
> >> I looked at making TSFastDbg() an inline function rather than a macro:
> >> https://godbolt.org/z/IfVbBk
> >> The opitimization works well if you compile the code with the inline
> >> function as C, but the optimization is poor if you compile it as C++.
> >>
> >> This
> >>
> >> difference exists for both gcc and clang.  It's weird.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
>
>

Re: TS API change proposal: Reduce performance impact of TSDebug calls when the tag is disabled

Posted by Bryan Call <bc...@apache.org>.
How would this be controlled on the command line when running trafic_server -T?

It looks like it is a departure from how debug tags and controlled now.

-Bryan


> On Jun 11, 2019, at 1:21 PM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> Any +1's for simply replacing TSDebug with the proposed TSFastDbg ?
> 
> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> wrote:
> 
>> 
>> 
>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>> 
>> Because senility. probably "on" should be a non-pointer of type const
>> volatile char, and InkAPI in the core can just keep a pointer to the whole
>> structure for updating.
>> 
>> But I'm leaning more to fixing up the capability that already kinda there.
>> Which is optimizimg only for the case when all debug output is disabled.
>> Especially since maybe zwoop would be on my side for once :o) .
>> 
>> 
>> 
>> My bad, whatever Walt said, I take the opposite stance.
>> 
>> — leif
>> 
>> P.s
>>  J/K!
>> 
>> 
>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>> <so...@verizonmedia.com.invalid> wrote:
>> 
>> This could enable a significant speed up for debug tags. One point is that
>> when the debug tag string is set, at that point the debug objects could be
>> updated to the correct state according to the debug tag string, rather than
>> checking it every time a debug message is logged.
>> 
>> I have to ask, why is there a pointer to the enable flag, instead of just
>> the enable flag? Your logic doesn't even check that flag, it checks whether
>> the pointer is null.
>> 
>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org> wrote:
>> 
>> 
>> 
>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wkaras@verizonmedia.com
>> 
>> .INVALID>
>> 
>> wrote:
>> 
>> 
>> It looks like the TS API already has less granular version of this:
>> 
>> extern int diags_on_for_plugins;
>> 
>> #define TSDEBUG             \
>> 
>> if (diags_on_for_plugins) \
>> 
>> TSDebug
>> 
>> 
>> A problem with this is if you write:
>> 
>> 
>> if (flag)
>> 
>> TSDEBUG(yadayada);
>> 
>> else
>> 
>> do_something();
>> 
>> 
>> then the else will be associated with the if hidden in the macro.  I'd
>> prefer to change to something like:
>> 
>> 
>> extern const volatile char TS_detail_dbg_output_enabled;
>> 
>> 
>> #define TSFastDbg(_TAG, ...) \
>> do { \
>> if (TS_detail_dbg_output_enabled) \
>> TSDebug(_TAG, __VA_ARGS__); \
>> 
>> } while(0)
>> 
>> 
>> Yeh, that kinda sounds like a bug, doesn’t it?
>> 
>> 
>> The negative is that we'd only see the benefit when all debug output
>> 
>> was
>> 
>> disabled.  But, in any case, that is the main benefit, to avoid impacts
>> during normal operation.
>> 
>> 
>> Right, we should always optimize for the common cases, obviously. When
>> there’s a decision to make, where you can either make it “equally”
>> expensive always, or more expensive in a rare case, but cheaper in the
>> common case, we should likely chose the latter.
>> 
>> 
>> 
>> 
>> Looking in src/tscore/Diags.cc, it looks like the debug enable flag is
>> 
>> only
>> 
>> set at startup time.  So we wouldn't be able to enable TSFastDbg using
>> traffic_cntl without adding complexity.
>> 
>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <wk...@verizonmedia.com>
>> 
>> wrote:
>> 
>> 
>> Add this new structure to apidefs.h:
>> 
>> typedef struct TSFastDbgCntl_
>> {
>> const char * const tag; // nul-terminated string
>> const volatile char * const on; // pointer to 1-byte flag
>> }
>> TSFastDbgCntl;
>> 
>> Add this new API function, that returns a pointer to an instance of
>> 
>> the
>> 
>> above structure:
>> 
>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>> 
>> Add this macro, which would be used in place of direct use of
>> 
>> TSDebug().
>> 
>> 
>> #define TSFastDbg(_FD_CNTL, ...) \
>> do { \
>> if ((_FD_CNTL)->on) \
>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>> } while(0)
>> 
>> The first parameter to TSFastDbg() is a pointer that was returned by
>> TSCreateFastDbgCntl().  The remaining parameters are printf parameters
>> (format string and values matching format specifiers).  The core would
>> 
>> be
>> 
>> responsible for changing the value of the 'on' fields in the
>> 
>> TSFastDbgCntl
>> 
>> instances.  For direct calls to TSDebug(), even if the tag is
>> 
>> disabled,
>> 
>> all
>> 
>> the parameters have to be prepared and pushed onto the stack.  As was
>> discussed in IRC, you have to feel guilty about doing converting a
>> string_view to a string and outputting it with c_str().  Because the
>> 
>> code
>> 
>> to do this will be executed even if the debug tag is disabled.
>> Furthermore, for each call to TSDebug, it's necessary to do an
>> 
>> associative
>> 
>> lookup on the tag string to determine that output for the tag is
>> 
>> disabled.
>> 
>> 
>> We would want to put equivalent capabilities in Diags.h in the core
>> 
>> for
>> 
>> debug output.  The implementation is non-trivial, but I think it would
>> 
>> take
>> 
>> at most a week including the Au test additions.
>> 
>> I looked at making TSFastDbg() an inline function rather than a macro:
>> https://godbolt.org/z/IfVbBk
>> The opitimization works well if you compile the code with the inline
>> function as C, but the optimization is poor if you compile it as C++.
>> 
>> This
>> 
>> difference exists for both gcc and clang.  It's weird.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>