Posted to dev@tomcat.apache.org by Stuart Roebuck <st...@adolos.co.uk> on 2001/07/04 13:04:28 UTC

Re: Poor Tomcat 3.2.2 performance on Win32 with class reloading enabled

I've been hitting against getCanonicalPath myself recently in the context 
of Cocoon 2 and Tomcat 4 under Mac OS X.

I've taken a look through some of the code of Tomcat and it appears that 
there are opportunities to cut down on calls to getCanonicalPath, but I'm 
conscious that I really am not at all familiar with the code and the 
possible underlying issues.

Here are some examples of bits of code in the current Tomcat source which
appear open to optimization.  Perhaps you could comment on whether such
optimizations would work:

StandardContext.java:

>         if (directory.exists() && directory.canRead() &&
>             directory.isDirectory()) {
>             String filenames[] = directory.list();
>             for (int i = 0; i < filenames.length; i++) {
>                 if (!filenames[i].endsWith(".jar"))
>                     continue;
>                 File file = new File(directory, filenames[i]);
>                 try {
>                     URL url = new URL("file", null,
>                                       file.getCanonicalPath());
>                     newLoader.addRepository(url.toString());
>                 } catch (IOException e) {
>                     throw new IllegalArgumentException(e.toString());
>                 }
>             }
>         }

Couldn't we get the Canonical path of the directory and then assume that 
the absolute path of the filenames returned would be Canonical too, 
thereby taking getCanonicalPath out of the loop?
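
A minimal sketch of that idea, not a tested patch.  The class and method
names (JarRepositoryLister, jarUrls) are hypothetical and are not part of
Tomcat; the sketch assumes the names returned by list() can be trusted
as-is once the directory itself has been canonicalized:

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tomcat.
class JarRepositoryLister {

    // Builds file: URLs for the .jar entries of a directory, canonicalizing
    // the directory once instead of canonicalizing every entry.
    static List<String> jarUrls(File directory) throws IOException {
        List<String> urls = new ArrayList<>();
        if (!directory.isDirectory() || !directory.canRead()) {
            return urls;
        }
        // The only call that canonicalizes anything.
        File canonicalDir = directory.getCanonicalFile();
        String[] filenames = canonicalDir.list();
        if (filenames == null) {
            return urls; // list() returns null on an I/O error
        }
        for (String name : filenames) {
            if (!name.endsWith(".jar")) {
                continue;
            }
            // ASSUMPTION: canonical directory + listed name is treated as
            // canonical. An entry that is itself a symbolic link is NOT
            // resolved here, unlike with per-file getCanonicalPath().
            File file = new File(canonicalDir, name);
            urls.add(new URL("file", null, file.getPath()).toString());
        }
        return urls;
    }
}
```

The saving is one canonicalization per directory rather than one per jar.
Whether the symbolic link caveat matters would need the kind of review the
surrounding code deserves.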

> 	// Create this directory if necessary
> 	File dir = new File(workDir);
> 	if (!dir.isAbsolute()) {
>             File catalinaHome =
>                 new File(System.getProperty("catalina.home"));
>             String catalinaHomePath = null;
>             try {
>                 catalinaHomePath = catalinaHome.getCanonicalPath();
>                 dir = new File(catalinaHomePath, workDir);
>             } catch (IOException e) {
>             }
>         }

Isn't getAbsolutePath sufficient here?

Similar issues exist in Bootstrap.java.

HostConfig:

>                 // Deploy the application in this directory
>                 if (debug >= 1)
>                     log(sm.getString("hostConfig.deployDir", files[i]));
>                 try {
>                     URL url = new URL("file", null,
>                                       dir.getCanonicalPath());
>                     ((Deployer) host).install(contextPath, url);
>                 } catch (Throwable t) {
>                     log(sm.getString("hostConfig.deployDir.error",
>                                      files[i]),
>                         t);
>                 }

dir is derived using the list method on the appBase variable (not shown in
the listing), which is canonical.  Do we need to use getCanonicalPath here,
or can we just use getAbsolutePath?

Finally, how about having some form of cached getCanonicalPath method
which returns results from a cache for repeated requests for the same file
path?  I tried replacing the actual getCanonicalPath method in the main
Java classes with one backed by a simple unbounded Hashtable, and the
performance improvement running Tomcat and Cocoon was very visible.
Clearly the hash table would need to be time and size limited, but are
there any other fundamental issues with this idea?
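
For concreteness, a rough sketch of what a time and size limited cache
could look like.  This is not the code that was patched into java.io.File;
the class name CanonicalPathCache and its parameters are hypothetical, and
it relies on LinkedHashMap's access order for least-recently-used eviction:

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded, expiring cache; not Tomcat or JDK code.
class CanonicalPathCache {

    // A cached result plus the time after which it must be recomputed.
    private static final class CachedPath {
        final String canonicalPath;
        final long expiresAtMillis;

        CachedPath(String canonicalPath, long expiresAtMillis) {
            this.canonicalPath = canonicalPath;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<String, CachedPath> entries;

    CanonicalPathCache(final int maxEntries, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // Access-ordered map: the eldest entry is the least recently used.
        this.entries = new LinkedHashMap<String, CachedPath>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedPath> eldest) {
                return size() > maxEntries; // size limit
            }
        };
    }

    // Returns a cached canonical path, recomputing it if missing or expired.
    synchronized String getCanonicalPath(File file) throws IOException {
        String key = file.getAbsolutePath();
        long now = System.currentTimeMillis();
        CachedPath cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.canonicalPath; // time limit not yet reached
        }
        String canonical = file.getCanonicalPath();
        entries.put(key, new CachedPath(canonical, now + ttlMillis));
        return canonical;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

One issue any such cache has to accept: for up to ttlMillis after a
symbolic link or file name changes on disk, the cache keeps returning the
old answer, which matters wherever the canonical path is used as a
security check.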

Stuart.


On Friday, June 15, 2001, at 10:52  pm, cmanolache@yahoo.com wrote:

>
> Thanks for the report - if you can think of a patch we can use it for
> 3.2.3. The reloading system has been almost completely rewritten for 3.3,
> with far fewer accesses to the file system.
>
> Have you tried with a different VM?  (I don't think we can avoid
> getCanonicalPath; it is a very useful call and is very important for
> security too.)
>
> Costin
>
>
>
>
>
> On Fri, 15 Jun 2001, Brett M. Bergquist wrote:
>
>> I just upgraded our application from Tomcat 3.1 to Tomcat 3.2.2 and
>> noticed a dramatic slowdown and increased CPU utilization.  This is on
>> WinNT 4.0 SP5.  To figure out what was happening, I fired up OptimizeIT
>> and took a look.  I found that 68% of the time was being spent in
>> "org.apache.tomcat.core.ServletWrapper.handleReload", and of the time
>> within this method, 51% was being spent in
>> "java.io.File.getCanonicalPath", and of the time spent within that
>> method, all of it was spent within "java.io.Win32FileSystem.canonicalize".
>>
>> My question is: is this normal, and what in the hell is taking so long in
>> "java.io.Win32FileSystem.canonicalize"?  Once I disabled class reloading,
>> my performance and CPU utilization went back to what they were with
>> 3.1.
>>
>> Thanks for any input.
>>
>> Brett M. Bergquist
>> Canoga Perkins Corp.
>>
>


-------------------------------------------------------------------------
Stuart Roebuck, BSc, MBA        Tel.: 0131 228 4853 / Fax.: 0870 054 8322
Managing Director
ADOLOS                                           <http://www.adolos.com/>

Re: Poor Tomcat 3.2.2 performance on Win32 with class reloading enabled

Posted by "Craig R. McClanahan" <cr...@apache.org>.
On Mon, 9 Jul 2001, Stuart Roebuck wrote:

> Remy & Craig,
> 
> Thanks for your responses...
> 
> On Saturday, July 7, 2001, at 07:28  am, Remy Maucherat wrote:
> 
> >> The cases you've described above all happen only once (at startup
> >> time).  Assuming that we could safely optimize these cases, would it
> >> really make a significant difference?
> >
> > It wouldn't.
> >
> > While you were away, I did some profiling of the startup (since people
> > complained), and it turns out that the time is mostly spent in :
> > - Crimson, because it's used in validating mode
> > - introspecting and doing various things in the XML mapper (no surprise)
> 
> The examples I gave were just some of the getCanonicalPath calls that I
> located in the source.  I acknowledge that I have been unable to identify
> which ones take up significant resources, though all the ones I listed
> were at least in loops.
> 
> I have had problems profiling Tomcat and Cocoon, but in those threads I
> have been able to profile under Mac OS X, calls to getCanonicalPath (i.e.
> the native UnixFileSystem.canonicalize) whilst running Tomcat 4 and Cocoon
> II constitute a *very* significant proportion of runtime.  I know that others
> have noted that this is also true under Windows, so this does not appear 
> to be specific to Mac OS X, though the 'cost' of getCanonicalPath under 
> Mac OS X may be proportionally higher than on other operating systems due 
> to certain underlying filesystem issues.
> 

One commonality, IIRC - both Windows and MacOS are case insensitive.

The servlet spec (at least in 2.3) mandates case sensitive comparisons for
web app resources.  The only way we've discovered to do that is to rely on
getCanonicalPath().
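
As an illustration only, one way such a check can be written is to compare
the path as requested with its canonical form and reject any mismatch.
The names below (CaseSensitiveLookup, lookup, docBase) are hypothetical;
this is a sketch of the technique, not Tomcat's actual Resources code:

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of a case-sensitive lookup; not Tomcat's implementation.
class CaseSensitiveLookup {

    // Returns the file only if it exists and the requested spelling is
    // exactly the canonical one; otherwise returns null.
    static File lookup(File docBase, String requestPath) throws IOException {
        // Canonicalize the base once so that only the request part can differ.
        File base = docBase.getCanonicalFile();
        File requested = new File(base, requestPath);
        if (!requested.exists()) {
            return null;
        }
        String absolute = requested.getAbsolutePath();
        String canonical = requested.getCanonicalPath();
        // On a case insensitive filesystem "INDEX.HTML" may exist as
        // "index.html"; the canonical path then differs from the request.
        // The same comparison also rejects "." and ".." segments.
        return canonical.equals(absolute) ? requested : null;
    }
}
```

The cost is one getCanonicalPath() call per lookup (the base could be
canonicalized once and kept), which is exactly the call being discussed
in this thread.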

> >> To influence the performance of Cocoon, we'd want to look at the 
> >> Resources
> >> implementation.  It's been worked on, but I would certainly not say we've
> >> really optimized it yet.  And reducing the number of calls to
> >> getCanonicalPath() sounds like a good strategy -- as long as it can be
> >> done safely.
> >
> > getCanonicalPath is called only under Windows (for case sensitivity
> > checking).
> > More profiling showed that the resources were fast enough.
> >
> > Remy
> 
> Remy - could you explain what you mean here?  getCanonicalPath calls
> within Tomcat are not, as far as I can see, conditional upon the runtime
> operating system.  Both the Windows filing systems and HFS / HFS+ (under
> Mac OS / Mac OS X) are case insensitive, the latter being a BSD Unix
> derivative.  I would have thought that calls to getCanonicalPath would be
> required, not just for case sensitivity reasons, but also to deal with the
> assortment of equivalent path descriptors, e.g. the use of "..", ".",
> "~user" and the effects of filesystem softlinks, hardlinks, aliases,
> shortcuts, or whathaveyou.  If getCanonicalPath is being used as a case
> insensitive string comparator, then I am sure there are less costly
> alternatives.
> 
> When you say that profiling indicated that resources were fast enough, do
> you mean that getCanonicalPath is not a significant bottleneck on any
> platform?  I'm not doubting this with respect to Tomcat 4, as the main
> issues I have run across may be largely Cocoon II specific.  Unfortunately,
> profiling problems currently mean that I'm going round the houses a
> little to pin this one down.  However, as I indicated in my earlier email,
> I have tested substituting the java.io.File.getCanonicalPath() method with
> a caching version and found a very visible speedup, which confirms that
> getCanonicalPath is an issue somewhere in this Tomcat 4 / Cocoon II
> combination.
> 
> Stuart.
> 
> 
> -------------------------------------------------------------------------
> Stuart Roebuck                                  stuart.roebuck@adolos.com
> Lead Developer                               Java, XML, MacOS X, XP, etc.
> ADOLOS                                           <http://www.adolos.com/>
> 
Craig



Re: Poor Tomcat 3.2.2 performance on Win32 with class reloading enabled

Posted by Stuart Roebuck <st...@adolos.co.uk>.
Remy & Craig,

Thanks for your responses...

On Saturday, July 7, 2001, at 07:28  am, Remy Maucherat wrote:

>> The cases you've described above all happen only once (at startup
>> time).  Assuming that we could safely optimize these cases, would it
>> really make a significant difference?
>
> It wouldn't.
>
> While you were away, I did some profiling of the startup (since people
> complained), and it turns out that the time is mostly spent in :
> - Crimson, because it's used in validating mode
> - introspecting and doing various things in the XML mapper (no surprise)

The examples I gave were just some of the getCanonicalPath calls that I
located in the source.  I acknowledge that I have been unable to identify
which ones take up significant resources, though all the ones I listed
were at least in loops.

I have had problems profiling Tomcat and Cocoon, but in those threads I
have been able to profile under Mac OS X, calls to getCanonicalPath (i.e.
the native UnixFileSystem.canonicalize) whilst running Tomcat 4 and Cocoon
II constitute a *very* significant proportion of runtime.  I know that others
have noted that this is also true under Windows, so this does not appear 
to be specific to Mac OS X, though the 'cost' of getCanonicalPath under 
Mac OS X may be proportionally higher than on other operating systems due 
to certain underlying filesystem issues.

>> To influence the performance of Cocoon, we'd want to look at the 
>> Resources
>> implementation.  It's been worked on, but I would certainly not say we've
>> really optimized it yet.  And reducing the number of calls to
>> getCanonicalPath() sounds like a good strategy -- as long as it can be
>> done safely.
>
> getCanonicalPath is called only under Windows (for case sensitivity
> checking).
> More profiling showed that the resources were fast enough.
>
> Remy

Remy - could you explain what you mean here?  getCanonicalPath calls
within Tomcat are not, as far as I can see, conditional upon the runtime
operating system.  Both the Windows filing systems and HFS / HFS+ (under
Mac OS / Mac OS X) are case insensitive, the latter being a BSD Unix
derivative.  I would have thought that calls to getCanonicalPath would be
required, not just for case sensitivity reasons, but also to deal with the
assortment of equivalent path descriptors, e.g. the use of "..", ".",
"~user" and the effects of filesystem softlinks, hardlinks, aliases,
shortcuts, or whathaveyou.  If getCanonicalPath is being used as a case
insensitive string comparator, then I am sure there are less costly
alternatives.
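
To make the distinction concrete, here is a small sketch contrasting a
purely lexical clean-up with a real canonicalization.  The class name
PathForms is hypothetical, and java.nio.file is used purely for
illustration; it is an assumption of this sketch and is only available in
later Java releases than the ones discussed in this thread:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;

// Hypothetical illustration of the different "forms" of a path.
class PathForms {

    // Absolute only: prefixes the working directory, keeps "." and "..".
    static String absolute(String path) {
        return new File(path).getAbsolutePath();
    }

    // Purely lexical: collapses "." and ".." without touching the
    // filesystem, so it cannot see symbolic links or on-disk case.
    static String lexical(String path) {
        return Paths.get(path).normalize().toString();
    }

    // Consults the filesystem: also resolves symbolic links and, on case
    // insensitive filesystems, may return the on-disk spelling.
    static String canonical(String path) throws IOException {
        return new File(path).getCanonicalPath();
    }
}
```

Only the last form pays the filesystem cost.  Note that "~user" is expanded
by shells, not by java.io.File, so none of these three forms treats it
specially.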

When you say that profiling indicated that resources were fast enough, do
you mean that getCanonicalPath is not a significant bottleneck on any
platform?  I'm not doubting this with respect to Tomcat 4, as the main
issues I have run across may be largely Cocoon II specific.  Unfortunately,
profiling problems currently mean that I'm going round the houses a
little to pin this one down.  However, as I indicated in my earlier email,
I have tested substituting the java.io.File.getCanonicalPath() method with
a caching version and found a very visible speedup, which confirms that
getCanonicalPath is an issue somewhere in this Tomcat 4 / Cocoon II
combination.

Stuart.


-------------------------------------------------------------------------
Stuart Roebuck                                  stuart.roebuck@adolos.com
Lead Developer                               Java, XML, MacOS X, XP, etc.
ADOLOS                                           <http://www.adolos.com/>

Re: Poor Tomcat 3.2.2 performance on Win32 with class reloading enabled

Posted by Remy Maucherat <re...@apache.org>.
> The cases you've described above all happen only once (at startup
> time).  Assuming that we could safely optimize these cases, would it
> really make a significant difference?

It wouldn't.

While you were away, I did some profiling of the startup (since people
complained), and it turns out that the time is mostly spent in:
- Crimson, because it's used in validating mode
- introspecting and doing various things in the XML mapper (no surprise)

> To influence the performance of Cocoon, we'd want to look at the Resources
> implementation.  It's been worked on, but I would certainly not say we've
> really optimized it yet.  And reducing the number of calls to
> getCanonicalPath() sounds like a good strategy -- as long as it can be
> done safely.

getCanonicalPath is called only under Windows (for case sensitivity
checking).
More profiling showed that the resources were fast enough.

Remy


Re: Poor Tomcat 3.2.2 performance on Win32 with class reloading enabled

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Wed, 4 Jul 2001, Stuart Roebuck wrote:

> I've been hitting against getCanonicalPath myself recently in the context 
> of Cocoon 2 and Tomcat 4 under Mac OS X.
> 
> I've taken a look through some of the code of Tomcat and it appears that 
> there are opportunities to cut down on calls to getCanonicalPath, but I'm 
> conscious that I really am not at all familiar with the code and the 
> possible underlying issues.
> 

One thing to be *very* cautious about here is security.  You have to make
absolutely sure, for example, that resource references are case sensitive
(even on case insensitive platforms), and that malicious users can't use
things like "../../../../../other/place" to access things outside of the
web app.
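
To illustrate the second point, a minimal sketch of a containment check
built on getCanonicalPath().  The names WebAppPathGuard and isInsideWebApp
are hypothetical and this is not how Tomcat itself is written; it only
shows the shape of the check:

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of a path traversal guard; not Tomcat code.
class WebAppPathGuard {

    // True if relativePath, resolved against webAppRoot, still lies inside
    // (or is) the web app root once "..", "." and links are resolved.
    static boolean isInsideWebApp(File webAppRoot, String relativePath)
            throws IOException {
        String root = webAppRoot.getCanonicalPath();
        String target = new File(webAppRoot, relativePath).getCanonicalPath();
        // Compare against root plus a separator so that "/apps/foo-evil"
        // is not mistaken for something inside "/apps/foo".
        String prefix = root.endsWith(File.separator) ? root : root + File.separator;
        return target.equals(root) || target.startsWith(prefix);
    }
}
```

With a root of, say, /webapps/demo, a request for "index.html" passes,
while "../../../../../other/place" canonicalizes to a location outside the
root and is rejected.  Replacing getCanonicalPath() with getAbsolutePath()
here would leave the ".." segments in place and defeat the check.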

> Here are some examples of bits of code in the current Tomcat source which 
> appear open to optimization, perhaps you could comment on whether such 
> optimizations would work:
> 
> StandardContext.java:
> 
> >         if (directory.exists() && directory.canRead() &&
> >             directory.isDirectory()) {
> >             String filenames[] = directory.list();
> >             for (int i = 0; i < filenames.length; i++) {
> >                 if (!filenames[i].endsWith(".jar"))
> >                     continue;
> >                 File file = new File(directory, filenames[i]);
> >                 try {
> >                     URL url = new URL("file", null,
> >                                       file.getCanonicalPath());
> >                     newLoader.addRepository(url.toString());
> >                 } catch (IOException e) {
> >                     throw new IllegalArgumentException(e.toString());
> >                 }
> >             }
> >         }
> 
> Couldn't we get the Canonical path of the directory and then assume that 
> the absolute path of the filenames returned would be Canonical too, 
> thereby taking getCanonicalPath out of the loop?
> 

Depends on where the list of filenames comes from.  In this case, it might
be feasible (since they come from the operating system and not the web
app), but a thorough security review would be important first.

> > 	// Create this directory if necessary
> > 	File dir = new File(workDir);
> > 	if (!dir.isAbsolute()) {
> >             File catalinaHome =
> >                 new File(System.getProperty("catalina.home"));
> >             String catalinaHomePath = null;
> >             try {
> >                 catalinaHomePath = catalinaHome.getCanonicalPath();
> >                 dir = new File(catalinaHomePath, workDir);
> >             } catch (IOException e) {
> >             }
> >         }
> 
> Isn't getAbsolutePath sufficient here?
> 

IIRC, getAbsolutePath caused problems here on Windows but worked
correctly on Solaris and Linux.  Such is life when you've got to be
portable across multiple JVMs.

> Similar issues exist in Bootstrap.java.
> 
> HostConfig:
> 
> >                 // Deploy the application in this directory
> >                 if (debug >= 1)
> >                     log(sm.getString("hostConfig.deployDir", files[i]));
> >                 try {
> >                     URL url = new URL("file", null,
> >                                       dir.getCanonicalPath());
> >                     ((Deployer) host).install(contextPath, url);
> >                 } catch (Throwable t) {
> >                     log(sm.getString("hostConfig.deployDir.error",
> >                                      files[i]),
> >                         t);
> >                 }
> 
> dir is derived using the list method on the appBase variable (not shown in
> the listing), which is canonical.  Do we need to use getCanonicalPath here,
> or can we just use getAbsolutePath?
> 
> Finally, how about having some form of cached getCanonicalPath method
> which returns results from a cache for repeated requests for the same file
> path?  I tried replacing the actual getCanonicalPath method in the main
> Java classes with one backed by a simple unbounded Hashtable, and the
> performance improvement running Tomcat and Cocoon was very visible.
> Clearly the hash table would need to be time and size limited, but are
> there any other fundamental issues with this idea?
> 

The cases you've described above all happen only once (at startup
time).  Assuming that we could safely optimize these cases, would it
really make a significant difference?

To influence the performance of Cocoon, we'd want to look at the Resources
implementation.  It's been worked on, but I would certainly not say we've
really optimized it yet.  And reducing the number of calls to
getCanonicalPath() sounds like a good strategy -- as long as it can be
done safely.

> Stuart.
> 

Craig