Posted to users@tomcat.apache.org by neal <ne...@yahoo.com> on 2002/12/11 23:48:23 UTC

Tomcat - a search engine liability?!?!

I've been working on Search engine optimization this week and I've come to a
couple of conclusions and developed a couple of questions.

First a conclusion:
1. Google does NOT index any servlet, framework class, or cgi file.  If it
doesn't end in "jsp" forget about it.  My own framework ending in .mdlx and
servlets without extensions appear to be S.O.L. as well.  My guess is that
Struts and similar frameworks are in the same boat.  I determined this by
downloading the Google toolbar and watching the page rank for various pages.
Pages with a grayed-out rank are not indexed.  ASP and JSP were the only
dynamic extensions I consistently saw being indexed.

And now a question (and possible other detriment):
2.  Tomcat standalone automatically redirects (HTTP 302) to index.html or
whatever default file is otherwise specified in the welcome pages node of
the server.xml file.  Potentially BIG problem! Search engines HATE 302s!!!!
And if a link points to http://www.xyz.com, whereas requesting that page
from Tomcat goes to http://www.xyz.com/index.html, the link is likely not
included by most search engines.  This is huge for anyone who realizes that
inbound links are significant to search engine placement.  But how does
this get turned off?  The only advice I've gotten is to write a servlet
that parses these requests and forwards each one to the right file.  But
again, I refer to the problem listed above - servlets do NOT get indexed by
the engine that handles 80% of search traffic.

For anyone out there listening, any chance this auto-redirect 'feature'
might get an "off" switch in future revisions?  Or does anyone know of
a way around that redirect issue right now?

Thanks.
Neal Cabage


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Tomcat - a search engine liability?!?!

Posted by "Noel J. Bergman" <no...@devtech.com>.
> ASP and JSP were the only dynamic extensions I consistently
> saw being indexed.

> Google does NOT index any servlet, framework class, or cgi file.

I haven't reviewed your facts for accuracy, so take this with a grain of
salt.  But *IF* the world according to Google is as you say it is, and I
needed to use some funky extension, I would consider using mod_rewrite to
rewrite request URLs, and a filter to rewrite URLs in the response data
stream.

This is similar to an issue recently raised in a thread "static url
routing".  In your case, the browser might see
http://host/mypath/mdlx/page.html and you would want tomcat to see
/mypath/page.mdlx.
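
To make that mapping concrete, here is a rough Java sketch of the two
directions as plain string functions.  It assumes the extension shows up as
the path segment right before the file name, as in the example above.  The
class and method names (UrlMapper, toInternalPath, toExternalPath) are made
up for illustration and are not part of Tomcat or mod_rewrite.

```java
final class UrlMapper {
    // Hypothetical helper: /mypath/mdlx/page.html -> /mypath/page.mdlx
    // Returns the input unchanged when it does not fit that shape.
    static String toInternalPath(String externalPath, String ext) {
        String marker = "/" + ext + "/";
        int i = externalPath.lastIndexOf(marker);
        if (i < 0 || !externalPath.endsWith(".html")) {
            return externalPath;
        }
        String name = externalPath.substring(
                i + marker.length(), externalPath.length() - ".html".length());
        if (name.isEmpty() || name.indexOf('/') >= 0) {
            return externalPath;
        }
        return externalPath.substring(0, i) + "/" + name + "." + ext;
    }

    // Hypothetical inverse, for links written into the response:
    // /mypath/page.mdlx -> /mypath/mdlx/page.html
    // Assumes ext contains no '/' character.
    static String toExternalPath(String internalPath, String ext) {
        String suffix = "." + ext;
        int slash = internalPath.lastIndexOf('/');
        if (slash < 0 || !internalPath.endsWith(suffix)) {
            return internalPath;
        }
        String name = internalPath.substring(
                slash + 1, internalPath.length() - suffix.length());
        if (name.isEmpty()) {
            return internalPath;
        }
        return internalPath.substring(0, slash) + "/" + ext + "/" + name + ".html";
    }

    public static void main(String[] args) {
        // prints /mypath/page.mdlx
        System.out.println(toInternalPath("/mypath/mdlx/page.html", "mdlx"));
        // prints /mypath/mdlx/page.html
        System.out.println(toExternalPath("/mypath/page.mdlx", "mdlx"));
    }
}
```

Passing the extension in explicitly keeps ordinary .html pages (for
example /docs/index.html) from being rewritten by accident.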

Actually, I would always rewrite request URLs, but only rewrite the response
data stream for a search engine like Google.  Waste of cycles otherwise, and
I'd want to eliminate the rewrite when search engines are more RESTful.

> Tomcat standalone automatically redirects (http 302) to [welcome file]

I assume that you mean '/' -- 302 --> '/index.jsp', as in your example of
"http://www.xyz.com [goes to] http://www.xyz.com/index.html"?  IIRC, you can
eliminate the round trip to the browser by rewriting the URL, e.g.,

RewriteRule ^(.*)/$  $1/index.jsp -- or whatever you want to use.
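
If there is no Apache in front of Tomcat, the same idea could probably be
done inside the webapp.  A minimal sketch of the path logic, assuming a
single welcome file name; the class name WelcomeFileRewrite and its method
are hypothetical, not something Tomcat ships:

```java
final class WelcomeFileRewrite {
    // Mirrors the RewriteRule above: a path ending in '/' gets the
    // welcome file appended; anything else is returned unchanged.
    static String rewrite(String path, String welcomeFile) {
        return path.endsWith("/") ? path + welcomeFile : path;
    }

    public static void main(String[] args) {
        // prints /index.jsp
        System.out.println(rewrite("/", "index.jsp"));
        // prints /docs/index.jsp
        System.out.println(rewrite("/docs/", "index.jsp"));
        // In a servlet Filter (Servlet 2.3 and later), the result would be
        // handed to request.getRequestDispatcher(path).forward(request,
        // response), which is a server-side forward and sends no 302.
    }
}
```

Whether a filter gets a chance to run before the container's own welcome
file handling may depend on the Tomcat version, so treat this as something
to test rather than a guarantee.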

	--- Noel

