Posted to dev@tomcat.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2012/08/17 00:44:25 UTC

Re-factoring TLD parsing

All,

The first item in the TOMCAT-NEXT.txt is this:

 1. Refactor the TLD parsing. TLDs are currently parsed twice. Once by
    Catalina looking for listeners and once by Jasper.

I had a conversation in Vancouver with David Blevins about the scourge
of JAR-scanning in general (in that case, we were discussing
annotation-processing) and I suggested that a generic JAR scanner could
be built that would simply scan JARs and emit events like "found
annotation" or "found JAR" or "found file" or whatever.

I don't know enough about how Catalina and Jasper each handle these
things, but would such a scanning component be helpful to them? If both
components (Catalina and Jasper, other components) could register event
handlers with the scanner, the number of times each JAR would be scanned
would be limited to 1.

Perhaps this strategy could be extended to TLD processing as well: the
scanner could be configured to send TLD-specific notifications (or maybe
just have a TLD-specific listener registered with the scanner that
generates events like "read TLD component" or what have you).
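The single-scan idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `JarEntryListener`, `SinglePassJarScanner` and `TldCollector` are hypothetical names invented for this sketch and are not existing Catalina or Jasper classes. The only real API used is `java.util.jar.JarFile`. Each JAR is opened once and every registered listener sees every entry name.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical callback interface; not part of Tomcat. */
interface JarEntryListener {
    void entryFound(Path jar, String entryName);
}

/** Hypothetical scanner: opens each JAR once and notifies all listeners. */
final class SinglePassJarScanner {
    private final List<JarEntryListener> listeners = new ArrayList<>();

    void register(JarEntryListener listener) {
        listeners.add(listener);
    }

    void scan(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // One pass over the JAR serves every interested component.
                for (JarEntryListener listener : listeners) {
                    listener.entryFound(jar, name);
                }
            }
        }
    }
}

/** Hypothetical TLD-specific listener: keeps only *.tld entries under META-INF/. */
final class TldCollector implements JarEntryListener {
    private final List<String> found = new ArrayList<>();

    @Override
    public void entryFound(Path jar, String entryName) {
        if (entryName.startsWith("META-INF/") && entryName.endsWith(".tld")) {
            found.add(entryName);
        }
    }

    List<String> found() {
        return found;
    }
}
```

In this sketch a component such as Jasper would register a `TldCollector`, while another component could register its own listener (for example one that only looks at `.class` entries) before `scan` is called, so the JAR is read once for both.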

I guess the question is whether each component (Catalina, Jasper,
whatever add-ons might be interested in similar information) can cleanly
register their interest in receiving notification of these kinds of
events. Is there a convenient place where components hosted by Catalina
could naturally register for these kinds of events (or even request that
a JAR scan occur)?

Is this kind of thing overkill, or does it sound like it would be a
truly useful utility?

Thanks,
-chris

Re: Re-factoring TLD parsing

Posted by Mark Thomas <ma...@apache.org>.


Christopher Schultz <ch...@christopherschultz.net> wrote:

>All,
>
>The first item in the TOMCAT-NEXT.txt is this:
>
> 1. Refactor the TLD parsing. TLDs are currently parsed twice. Once by
>    Catalina looking for listeners and once by Jasper.
>
>I had a conversation in Vancouver with David Blevins about the scourge
>of JAR-scanning in general (in that case, we were discussing
>annotation-processing) and I suggested that a generic JAR scanner could
>be built that would simply scan JARs and emit events like "found
>annotation" or "found JAR" or "found file" or whatever.

There is work in Commons on a class scanning component.

>I don't know enough about how Catalina and Jasper each handle these
>things, but would such a scanning component be helpful to them?

Possibly.

Your e-mail landed in my inbox just as I was starting to think about BZ 53714. In light of the recent clarifications from the Servlet EG and that BZ, I think it is worth reviewing:
- how we identify JARs (or JAR-like directories)
- what we might need to scan for in each JAR (web fragments, SCIs, annotations, TLDs, ???)
- the options available to the user for skipping the scanning of JARs known not to contain things of interest (both spec-defined and Tomcat-specific)
- any ordering constraints
- options for caching results (particularly of JARs in the common or shared class loaders)
- performance considerations (e.g. it might be faster to scan once for everything even if we suspect we might be able to skip some aspects of the scan)
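As an illustration of the "skip JARs known not to contain things of interest" item in the list above, a name-based filter might look something like the sketch below. `JarSkipFilter` is a hypothetical class made up for this example, and the comma-separated pattern syntax with `*` as the only wildcard is an assumption of the sketch, not a description of how Tomcat's own skip configuration is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical filter: decides by file name whether a JAR should be scanned. */
final class JarSkipFilter {
    private final List<Pattern> skipPatterns = new ArrayList<>();

    /** @param commaSeparatedPatterns e.g. "commons-*.jar, servlet-api.jar" */
    JarSkipFilter(String commaSeparatedPatterns) {
        for (String raw : commaSeparatedPatterns.split(",")) {
            String glob = raw.trim();
            if (glob.isEmpty()) {
                continue;
            }
            // Quote the whole pattern literally, then re-open the quote
            // around each '*' so that it matches any run of characters.
            String regex = Pattern.quote(glob).replace("*", "\\E.*\\Q");
            skipPatterns.add(Pattern.compile(regex));
        }
    }

    boolean shouldScan(String jarFileName) {
        for (Pattern pattern : skipPatterns) {
            if (pattern.matcher(jarFileName).matches()) {
                return false; // listed as "nothing of interest here"
            }
        }
        return true;
    }
}
```

A scanner would consult `shouldScan` before opening each JAR, which keeps the skip decision cheap compared with any of the scan types listed above.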

> If both
>components (Catalina and Jasper, other components) could register event
>handlers with the scanner, the number of times each JAR would be scanned
>would be limited to 1.

Yes, with the caveat that it depends on the type of scan. Checking for an SCI or web-fragment is a lot faster than scanning all classes for annotations. TLDs are a little slower than checking for an SCI but not much.

>Perhaps this strategy could be extended to TLD processing as well: the
>scanner could be configured to send TLD-specific notifications (or maybe
>just have a TLD-specific listener registered with the scanner that
>generates events like "read TLD component" or what have you).
>
>I guess the question is whether each component (Catalina, Jasper,
>whatever add-ons might be interested in similar information) can cleanly
>register their interest in receiving notification of these kinds of
>events. Is there a convenient place where components hosted by Catalina
>could naturally register for these kinds of events (or even request that
>a JAR scan occur)?

The benefits of refactoring the TLD scanning are mainly that we currently do it twice and we don't do it fully in terms of tracking dependencies. There is clear scope for improvement here.

>Is this kind of thing overkill, or does it sound like it would be a
>truly useful utility?

Potentially, but we need to get a better grip on all the requirements. Given the refactoring aspects, this will probably remain an 8.0.x solution, although BZ 53714 needs solving for 7.0.x as well. For backwards compatibility reasons I think it makes sense to figure out the 8.0.x solution and then back-port it.

An added complication is the resources refactoring I am working on. That will have an impact on JarScanner although I'm not sure exactly what at this point.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Re-factoring TLD parsing

Posted by sebb <se...@gmail.com>.

On 17 August 2012 05:00, Christopher Schultz
<ch...@christopherschultz.net> wrote:
> Sebb,
>
> On 8/16/12 7:11 PM, sebb wrote:
>> On 16 August 2012 23:44, Christopher Schultz
>> <ch...@christopherschultz.net> wrote:
>>>
>>> I had a conversation in Vancouver with David Blevins about the scourge
>>> of JAR-scanning in general (in that case, we were discussing
>>> annotation-processing) and I suggested that a generic JAR scanner could
>>> be built that would simply scan JARs and emit events like "found
>>> annotation" or "found JAR" or "found file" or whatever.
>>
>> What about timing?
>> If the components start up independently, there could be issues with
>> knowing when all interested parties have declared themselves and when
>> it is safe to start the scan. Equally, one component may require the
>> information before another registers.
>
> Obviously, timing can be an issue. If the scan is already done and the
> events haven't been stored somewhere, the scan needs to be re-done. At
> that point, it's no worse than the situation as it stands, today. The
> components that can benefit from this idea can, while others will not
> suffer any worse than they already do.

Provided that the scan only does what is needed by that component.
If the scanner performs multiple types of scan each time, some of that
work will be wasted.

So the components would need to be able to register for specific scan types.
The scanner would also need to keep track of which scan types have
outstanding requests.

Obviously if the scanner can cache the info, this would not apply.
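A rough sketch of the bookkeeping described above, assuming a hypothetical `ScanType` enum and `TypedScanCoordinator` class (neither exists in Tomcat): components request specific scan types, the coordinator tracks which types are still outstanding, and cached results mean a type is never scanned twice. The actual scanning work is passed in as a function so the sketch stays independent of how JARs are read.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical scan types, taken from the kinds of scan mentioned in this thread. */
enum ScanType { WEB_FRAGMENT, SCI, ANNOTATIONS, TLD }

/** Hypothetical coordinator: runs only requested scan types and caches results. */
final class TypedScanCoordinator {
    private final Function<ScanType, List<String>> scanWork;
    private final EnumSet<ScanType> outstanding = EnumSet.noneOf(ScanType.class);
    private final Map<ScanType, List<String>> cache = new EnumMap<>(ScanType.class);

    TypedScanCoordinator(Function<ScanType, List<String>> scanWork) {
        this.scanWork = scanWork;
    }

    /** Registers interest in a scan type; already-cached types are ignored. */
    void request(ScanType type) {
        if (!cache.containsKey(type)) {
            outstanding.add(type);
        }
    }

    /** Returns a copy of the scan types that were requested but not yet run. */
    Set<ScanType> outstanding() {
        return EnumSet.copyOf(outstanding);
    }

    /** Performs only the scan types somebody actually asked for. */
    void runOutstanding() {
        for (ScanType type : outstanding) {
            cache.put(type, scanWork.apply(type));
        }
        outstanding.clear();
    }

    /** Serves a late request from the cache, scanning only if necessary. */
    List<String> results(ScanType type) {
        if (!cache.containsKey(type)) {
            request(type);
            runOutstanding();
        }
        return cache.get(type);
    }
}
```

With this shape, a component that registers late still gets an answer: from the cache if its scan type has already run, or through a scan limited to what is outstanding otherwise.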

> -chris
>


Re: Re-factoring TLD parsing

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Sebb,

On 8/16/12 7:11 PM, sebb wrote:
> On 16 August 2012 23:44, Christopher Schultz
> <ch...@christopherschultz.net> wrote:
>>
>> I had a conversation in Vancouver with David Blevins about the scourge
>> of JAR-scanning in general (in that case, we were discussing
>> annotation-processing) and I suggested that a generic JAR scanner could
>> be built that would simply scan JARs and emit events like "found
>> annotation" or "found JAR" or "found file" or whatever.
> 
> What about timing?
> If the components start up independently, there could be issues with
> knowing when all interested parties have declared themselves and when
> it is safe to start the scan. Equally, one component may require the
> information before another registers.

Obviously, timing can be an issue. If the scan is already done and the
events haven't been stored somewhere, the scan needs to be re-done. At
that point, it's no worse than the situation as it stands, today. The
components that can benefit from this idea can, while others will not
suffer any worse than they already do.
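To illustrate the "events stored somewhere" case mentioned above, here is a minimal sketch of a log that records scan events and replays them to listeners that register after the scan has happened. `ReplayingEventLog` is a hypothetical name for this example only, and it ignores thread safety and memory limits, both of which a real implementation would have to consider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical event log: late listeners get a replay instead of a re-scan. */
final class ReplayingEventLog<E> {
    private final List<E> recorded = new ArrayList<>();
    private final List<Consumer<E>> listeners = new ArrayList<>();

    /** Registers a listener and first replays everything already recorded. */
    void register(Consumer<E> listener) {
        for (E event : recorded) {
            listener.accept(event);
        }
        listeners.add(listener);
    }

    /** Records the event and delivers it to all current listeners. */
    void publish(E event) {
        recorded.add(event);
        for (Consumer<E> listener : listeners) {
            listener.accept(event);
        }
    }
}
```

If the events are not stored like this, a late registrant forces a second scan, which is the "no worse than today" fallback described above.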

-chris

Re: Re-factoring TLD parsing

Posted by sebb <se...@gmail.com>.

On 16 August 2012 23:44, Christopher Schultz
<ch...@christopherschultz.net> wrote:
> All,
>
> The first item in the TOMCAT-NEXT.txt is this:
>
>  1. Refactor the TLD parsing. TLDs are currently parsed twice. Once by
>     Catalina looking for listeners and once by Jasper.
>
> I had a conversation in Vancouver with David Blevins about the scourge
> of JAR-scanning in general (in that case, we were discussing
> annotation-processing) and I suggested that a generic JAR scanner could
> be built that would simply scan JARs and emit events like "found
> annotation" or "found JAR" or "found file" or whatever.
>
> I don't know enough about how Catalina and Jasper each handle these
> things, but would such a scanning component be helpful to them? If both
> components (Catalina and Jasper, other components) could register event
> handlers with the scanner, the number of times each JAR would be scanned
> would be limited to 1.
>
> Perhaps this strategy could be extended to TLD processing as well: the
> scanner could be configured to send TLD-specific notifications (or maybe
> just have a TLD-specific listener registered with the scanner that
> generates events like "read TLD component" or what have you).
>
> I guess the question is whether each component (Catalina, Jasper,
> whatever add-ons might be interested in similar information) can cleanly
> register their interest in receiving notification of these kinds of
> events. Is there a convenient place where components hosted by Catalina
> could naturally register for these kinds of events (or even request that
> a JAR scan occur)?

What about timing?
If the components start up independently, there could be issues with
knowing when all interested parties have declared themselves and when
it is safe to start the scan. Equally, one component may require the
information before another registers.

> Is this kind of thing overkill, or does it sound like it would be a
> truly useful utility?
>
> Thanks,
> -chris
>
