Posted to dev@avalon.apache.org by Marcus Crafter <cr...@fztig938.bank.dresdner.net> on 2002/02/25 12:57:44 UTC

[RT] Profiler class thoughts

Hi All,

	Hope all is well.

	Over the weekend I spent some time playing around with the
	Profiler interfaces in scratchpad. I've come up with an initial
	version of the code, but during implementation it got me
	thinking about future ideas.

	I had a couple of thoughts, and I'd be interested in any comments, as
	they require changes to the current API.
	
	Bear with me, hopefully this is clear :)

	The current API has the Profiler class in control. It does the
	sampling, and pushes out the data to the reporting implementation.
	This model has some disadvantages:

	1. Data is pushed out to the reporting engine, which means we can't
	write any on-demand reporting classes (eg. UpdatableSwingProfileReport,
	ServletProfileReport, etc). :(

	2. If we enhance Profiler to support multiple reports, all reports
	will receive sample data at the same time interval. :(

	3. For large-scale applications where sampling time is not negligible,
	samples and the timestamp value can be inaccurate when printed. :(

	4. The reporting object must be initialized with the profilable object
	names and profile point names before any sampling takes place so
	it knows how to label information.

	This means it requires some voodoo code to dynamically add
	Profilables to the system without restarting the Profiler. :(

	On an implementation level, it means the reporting engine must
	maintain a second (internal) copy/reference of each Profilable's
	name and its profile point names (the first copies are in the
	Profilable objects themselves), and maintain relationship hierarchies
	between any child profilable objects.

	(I found this to be a real mess to implement as ProfileReport.addGroup()
	does not distinguish between profile point names and profilable object
	names)

	We can remove all these limitations and reduce implementation
	complexity by changing the way sampling takes place.

	Instead of implementing a push model where the profiler samples
	data and pushes it out to the reporting implementation, we could
	consider a pull model, where the reporting implementation does
	the sampling directly.

	This fixes all of the above listed disadvantages, and also
	makes implementation much easier.
	
	At the interface level, we would have to change ProfileReport to be
	something like:

	interface ProfileReport 
	{
		void startReport();
		void endReport();
		void addProfilable( Profilable profilable );
		void sample();
	}

	Where 'startReport' would signal the reporting engine that it can
	begin sampling.
	
	Since the reporting class does the sampling, it can be made
	configurable with sampling times (and/or support on-demand
	sampling), etc.
	
	Since it maintains references directly to each profilable, supporting
	dynamically added Profilables is easy, and the report also doesn't
	need to maintain any group/subgroup names/references.

	Also, since sampling is done by the reporting class when the
	data is needed, it's more likely to be accurate in large-scale
	environments.
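
	To make the pull model above concrete, here is a minimal sketch of
	a report that does its own sampling. It assumes a simplified
	Profilable that exposes its name and the current value of each
	profile point; that shape, and the Sample and InMemoryProfileReport
	names, are illustrative assumptions and not the scratchpad API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the scratchpad Profilable; the real interface may differ.
interface Profilable {
    String getName();

    // Current value of every profile point, keyed by profile point name.
    Map<String, Integer> getProfilePoints();
}

// The interface as proposed in this message.
interface ProfileReport {
    void startReport();
    void endReport();
    void addProfilable(Profilable profilable);
    void sample();
}

// One pulled value, stamped at the moment it was read (illustrative type).
final class Sample {
    final long timestamp;
    final String label;
    final int value;

    Sample(long timestamp, String label, int value) {
        this.timestamp = timestamp;
        this.label = label;
        this.value = value;
    }
}

// Illustrative report that pulls values on demand and keeps them in memory.
// Not thread-safe; a real report would need to guard its lists.
class InMemoryProfileReport implements ProfileReport {
    private final List<Profilable> profilables = new ArrayList<>();
    private final List<Sample> samples = new ArrayList<>();
    private boolean started;

    public void startReport() { started = true; }

    public void endReport() { started = false; }

    // Profilables can be added at any time, before or after startReport().
    public void addProfilable(Profilable profilable) { profilables.add(profilable); }

    public void sample() {
        if (!started) {
            return; // ignore sampling requests outside startReport()/endReport()
        }
        for (Profilable p : profilables) {
            for (Map.Entry<String, Integer> e : p.getProfilePoints().entrySet()) {
                // Labels come straight from the Profilable, so the report
                // keeps no separate copy of group or profile point names.
                samples.add(new Sample(System.currentTimeMillis(),
                        p.getName() + "." + e.getKey(), e.getValue()));
            }
        }
    }

    List<Sample> getSamples() { return samples; }
}
```

	An on-demand report (a servlet or Swing view, say) would call
	sample() when it needs fresh data, while a periodic report would
	call it from its own timer at whatever interval it is configured
	with.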

	To support multiple reports we'd need to change Profiler so it
	had:

	interface Profiler
	{
		void addReport( ProfileReport report );
	}

	Instead of just 'report(...)', and perhaps consider some
	'remove' methods.
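
	As a rough sketch of how a Profiler with multiple reports might
	look, the code below adds a removeReport method and an addProfilable
	method next to the proposed addReport. Both extra methods, the
	SimpleProfiler class, and the one-method Profilable are assumptions
	made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Profilable; the scratchpad interface may differ.
interface Profilable {
    String getName();
}

// The report interface as proposed in this message.
interface ProfileReport {
    void startReport();
    void endReport();
    void addProfilable(Profilable profilable);
    void sample();
}

// addReport is the proposed method; removeReport is an assumed counterpart.
interface Profiler {
    void addReport(ProfileReport report);
    void removeReport(ProfileReport report);
}

// Illustrative registry. Not thread-safe; shown single-threaded for brevity.
class SimpleProfiler implements Profiler {
    private final List<ProfileReport> reports = new ArrayList<>();
    private final List<Profilable> profilables = new ArrayList<>();

    public void addReport(ProfileReport report) {
        // A report registered late still learns about every known Profilable.
        for (Profilable p : profilables) {
            report.addProfilable(p);
        }
        reports.add(report);
        report.startReport();
    }

    public void removeReport(ProfileReport report) {
        // End the report only if it was actually registered.
        if (reports.remove(report)) {
            report.endReport();
        }
    }

    // Assumed helper: a Profilable added at runtime is handed to all
    // registered reports, so no Profiler restart is needed.
    public void addProfilable(Profilable profilable) {
        profilables.add(profilable);
        for (ProfileReport r : reports) {
            r.addProfilable(profilable);
        }
    }

    int reportCount() { return reports.size(); }
}
```

	The Profiler never calls sample() here; each report decides for
	itself when to sample, so different reports are free to use
	different intervals.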

	Hopefully that's understandable, please feel free to ask any
	questions.

	Thoughts, comments ?

	Cheers,

	Marcus

-- 
        .....
     ,,$$$$$$$$$,      Marcus Crafter
    ;$'      '$$$$:    Computer Systems Engineer
    $:         $$$$:   ManageSoft GmbH
     $       o_)$$$:   82-84 Mainzer Landstrasse
     ;$,    _/\ &&:'   60327 Frankfurt Germany
       '     /( &&&
           \_&&&&'
          &&&&.
    &&&&&&&:

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>

Re: [RT] Profiler class thoughts

Posted by Berin Loritsch <bl...@apache.org>.

:) At Marcus' prodding, I will add in my thoughts now.

The first thing I want to say is that I started the Profiler interfaces
a while back when I had more time to devote to Avalon.  They are based
on Matt Welsh's work with the Sandstorm server.  I cleaned up the API,
and added the ability to have multiple profile points within an object.
Furthermore, I intended for the system to be hierarchical in nature so
that the containers could manage what gets reported, and what doesn't.

Marcus brought up some good points in that instrumenting your code is
more the concern of the developer or administrator.  When you are tuning
your system, you want to be able to zero in on the information you
really want.

Therefore, the choice of what gets reported is ultimately up to the
Report itself.  Also, the starting and stopping of profiling should
be done by the report object.  In some cases, you want to start right
away, and collect information for the life of the development run.  In
other cases, you only want certain time slices.

Furthermore, the Profiler (or better named Instrumentor) should be the
liaison that collects all the Profilables and profile points.  The
Report should be able to communicate with the Instrumentor (or Profiler)
to determine which of the profile points we want to examine.
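
One way to picture that liaison role is sketched below.  Every name in
it (the Instrumentor class, register, pointNames, read, SelectiveReport)
is a hypothetical choice for illustration, and the "profilable.point"
naming scheme is an assumption as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntSupplier;

// Hypothetical liaison: collects profile points and lets reports discover them.
class Instrumentor {
    // Qualified name ("profilable.point") mapped to a supplier of the current value.
    private final Map<String, IntSupplier> points = new LinkedHashMap<>();

    void register(String profilable, String point, IntSupplier currentValue) {
        points.put(profilable + "." + point, currentValue);
    }

    // Names of all registered profile points, in registration order.
    List<String> pointNames() {
        return new ArrayList<>(points.keySet());
    }

    int read(String qualifiedName) {
        IntSupplier supplier = points.get(qualifiedName);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown profile point: " + qualifiedName);
        }
        return supplier.getAsInt();
    }
}

// Hypothetical report that chooses which points it wants to examine.
class SelectiveReport {
    private final Instrumentor instrumentor;
    private final List<String> selected = new ArrayList<>();

    SelectiveReport(Instrumentor instrumentor) {
        this.instrumentor = instrumentor;
    }

    // Select every currently registered point whose name starts with the prefix.
    void select(String prefix) {
        for (String name : instrumentor.pointNames()) {
            if (name.startsWith(prefix) && !selected.contains(name)) {
                selected.add(name);
            }
        }
    }

    // Pull the current value of each selected point.
    Map<String, Integer> sample() {
        Map<String, Integer> values = new LinkedHashMap<>();
        for (String name : selected) {
            values.put(name, instrumentor.read(name));
        }
        return values;
    }
}
```

The report decides what gets examined and when, while the Instrumentor
only knows what is available.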

If you keep the spirit of what I just laid out intact, I would be happy
to let you guys loose on the "profiler" code.  Just keep us apprised
as to your progress.


-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin



Re: [RT] Profiler class thoughts

Posted by Leif Mortenson <le...@silveregg.co.jp>.


Marcus Crafter wrote:

>Hi All,
>
>	Hope all is well.
>
>	Over the weekend I spent some time playing around with the
>	Profiler interfaces in scratchpad. I've come up with an initial
>	version of the code, but during implementation it got me
>	thinking about future ideas.
>
>	I had a couple of thoughts, and I'd be interested in any comments, as
>	they require changes to the current API.
>	
>	Bear with me, hopefully this is clear :)
>
>	The current API has the Profiler class in control. It does the
>	sampling, and pushes out the data to the reporting implementation.
>	This model has some disadvantages:
>
>	1. Data is pushed out to the reporting engine, which means we can't
>	write any on-demand reporting classes (eg. UpdatableSwingProfileReport,
>	ServletProfileReport, etc). :(
>
>
>
>	2. If we enhance Profiler to support multiple reports, all reports
>	will receive sample data at the same time interval. :(
>
>	3. For large scale applications where sampling time is not negligible,
>	samples and the timestamp value can be inaccurate when printed. :(
>
>	4. The reporting object must be initialized with the profilable object
>	names and profile point names before any sampling takes place so
>	it knows how to label information.
>
>	This means it requires some voodoo code to dynamically add
>	Profilables to the system without restarting the Profiler. :(
>
>	On an implementation level, it means the reporting engine must
>	maintain a second (internal) copy/reference of each Profilable's
>	names and their profile points names (the first copies are in the
>	Profilable objects themselves), and maintain relationship hierarchies
>	between any child profilable objects.
>
>	(I found this to be a real mess to implement as ProfileReport.addGroup()
>	does not distinguish between profile point names and profilable object
>	names)
>
>	We can remove all these limitations and reduce implementation
>	complexity by changing the way sampling takes place.
>
>	Instead of implementing a push model where the profiler samples
>	data and pushes it out to the reporting implementation, we could
>	consider a pull model, where the reporting implementation does
>	the sampling directly.
>
Here are some of my ideas on how to do this.  Some of them are in line
with what you are making, and others are not.  :-)

Our old product includes some simple profiling that was always enabled
to keep track of things like memory usage, connection counts, request
counts, etc.  These profile points could then be requested by any
report at any time in the future.

What I would really love to see the Profiler do is the following:  When
any component that implements Profilable is initialized, it will go
through and register any profile points that it has available.  Then
the ProfileManager would look for these points in a configuration like
the following:

-----------------
<profiler>
    <categories>
        <category name="history-stats">
            <data key="min" type="min">
                <history type="full-data"/>
            </data>
            <data key="max" type="max">
                <history type="full-data"/>
            </data>
            <data key="average" type="average">
                <history type="full-data"/>
            </data>
        </category>

        <category name="history-count">
            <data key="count" type="count">
                <history type="full-data"/>
            </data>
        </category>

        <category name="debug">
            <data key="count" type="count"/>
        </category>
    </categories>

    <history-types>
        <history-type name="full-data">
            <!-- 1 second interval for 10 minutes -->
            <storage interval="1000" size="600"/>
            <!-- 1 minute interval for 1 hour -->
            <storage interval="60000" size="3600000"/>
            <!-- 1 hour interval for 1 day -->
            <storage interval="3600000" size="86400000"/>
        </history-type>
    </history-types>
</profiler>

<components>
    <my-server>
        <profile-point name="memory" category="standard"
                       history-type="history-stats"/>
        <profile-point name="connections" category="standard"
                       history-type="history-count"/>
        <profile-point name="getObjCalls" category="debug"/>
    </my-server>
</components>
-----------------

This configuration defines a my-server component which has 3 profile
points.  The first two, memory and connections, are configured to be
"pull" datapoints because they specify history.  The third point,
getObjCalls, does not.  Its profile information would not be collected
until a report needs the data.

The profile points should be used like the following to avoid
unnecessary CPU usage.  (This is modeled after the logger code.)

---
if (m_getObjCallsProfilePoint.isListening()) {
    m_getObjCallsProfilePoint.increment();
}
---
or
---
m_connectionCount++;
if (m_connectionsProfilePoint.isListening()) {
    m_connectionsProfilePoint.setPoint(m_connectionCount);
}
---

The isListening method for any profile point would be false whenever
nobody was listening.

The history data would be implemented as a report so it would cause the
isListening method to return true.  By doing it this way, the component
would not need to worry about whether or not history was being used.

A report could either be registered as a listener or the ProfileManager
could be queried for a specific data set using history if available.
If the profile point does not maintain history, then the data set would
just be empty.
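
A minimal sketch of a profile point along these lines follows.  The
isListening, increment and setPoint names come from the snippets above;
the listener interface, the add/remove methods and the HistoryListener
class are assumptions made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback through which reports receive values.
interface ProfilePointListener {
    void valueChanged(String pointName, int value);
}

class ProfilePoint {
    private final String name;
    private final List<ProfilePointListener> listeners = new CopyOnWriteArrayList<>();
    private int value;

    ProfilePoint(String name) {
        this.name = name;
    }

    void addListener(ProfilePointListener listener) { listeners.add(listener); }

    void removeListener(ProfilePointListener listener) { listeners.remove(listener); }

    // Cheap check so components can skip profiling work when nobody listens.
    boolean isListening() { return !listeners.isEmpty(); }

    synchronized void increment() { setPoint(value + 1); }

    synchronized void setPoint(int newValue) {
        value = newValue;
        for (ProfilePointListener listener : listeners) {
            listener.valueChanged(name, newValue);
        }
    }
}

// History kept by an ordinary listener: registering it makes isListening()
// return true, so the component needs no history-specific code.
class HistoryListener implements ProfilePointListener {
    private final int capacity;
    private final Deque<Integer> values = new ArrayDeque<>();

    HistoryListener(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    public synchronized void valueChanged(String pointName, int value) {
        if (values.size() == capacity) {
            values.removeFirst(); // drop the oldest value once the buffer is full
        }
        values.addLast(value);
    }

    // Copy of the retained values, oldest first.
    synchronized List<Integer> snapshot() { return new ArrayList<>(values); }
}
```

This sketch bounds history by a fixed number of values only; the
interval-based storage in the configuration above is not modeled.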

It's getting late, so I may not be making too much sense.  I'll go back
and look over what you have checked in and get my thoughts straight in
the morning :-)

Cheers,
Leif



