Posted to dev@tomcat.apache.org by Klaus Sonnenleiter <kl...@sonnenleiter.com> on 2001/04/13 16:40:14 UTC

%3F Problem

This may be a bit obscure, but I'm trying to get Tomcat to respond to a 
request that arrives with an encoded URL in the form [URL]%3F[Parameters]. 
It looks like "http://myhost:myport/mycontext/servlet/myservlet%3Fx=y". If I 
do the equivalent with an Apache http server (for verifying that I'm trying 
the right thing), I get the error message indicating that Apache was 
looking for the correct URL followed by a question mark and the name-value 
pairs of the parameter list. In Tomcat, however, the %3F does not get 
replaced and the error message indicates that Tomcat is looking for a 
servlet class called "myservlet%3Fx=y" which does not exist on my system.

It looks like somebody must have attempted to fix this since the b3 version 
does a correct replacement if the %3F shows up right behind the end of the 
context (/mycontext/servlet%3F). I would volunteer to attempt a fix, if 
someone could point me to the right files - I looked at StandardWrapper and 
StandardClassLoader, and I can get the class loaded by cutting the name 
before the %, but then I lose the parameter list.

Any hints?

TIA


Re: %3F Problem

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:

> Craig,
> 
> since I'm not really familiar with what the standard says, I can't comment 
> on that. But I can only tell you what I observed in other HTTP servers and 
> it appears that most convert a %3F into a question mark some time before 
> sending the request to the classloader or to the filter that looks for the 
> file. My current problem is actually limited to a specific area and I think 
> taking a calculated risk and deviating slightly from the standard (if 
> that's what it is), would not be the worst of all options.
> 

Remy has pointed out the RFCs for interpreting URIs.

> However, I would be interested in finding out what exactly the desired 
> behavior for the final version of Tomcat 4.0 is. 

The controlling authority for Tomcat 4.0 behavior is the Servlet 2.3 and
JSP 1.2 specifications (both in "Proposed Final Draft" status) at:

  http://java.sun.com/products/servlet/download.html
  http://java.sun.com/products/jsp/download.html

Both specs contain references to other specs they rely on as
well.  Basically, the spec defines what a web application developer can
assume is portable across servlet containers.

There are lots of features in Tomcat 4.0 that go beyond the spec, which
are basically defined by the code contributions that are made.  But the
specifics of the behavior you are talking about here seem to be pretty
clearly defined in the relevant specifications.

> Speaking of which, is 
> there a target release date yet?
> 

It doesn't make much sense to have Tomcat 4.0 go final until the specs do
(there have in fact been some minor changes -- and some clarifications are
still being discussed in the JSR-053 expert group that produced these
specs even as I type).  Therefore, we'll continue to release "betas", with
increasing functionality and performance along with bug fixes, until then.

> Klaus Sonnenleiter
> 

Craig


Re: %3F Problem

Posted by Klaus Sonnenleiter <kl...@sonnenleiter.com>.
Craig,

since I'm not really familiar with what the standard says, I can't comment 
on that. But I can only tell you what I observed in other HTTP servers and 
it appears that most convert a %3F into a question mark some time before 
sending the request to the classloader or to the filter that looks for the 
file. My current problem is actually limited to a specific area and I think 
taking a calculated risk and deviating slightly from the standard (if 
that's what it is), would not be the worst of all options.

However, I would be interested in finding out what exactly the desired 
behavior for the final version of Tomcat 4.0 is. Speaking of which, is 
there a target release date yet?

Klaus Sonnenleiter

At 10:47 AM 4/13/01 -0700, you wrote:


>On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
>
> > Craig,
> >
> > I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a
> > call to setRequestURI before it sets the instance variable requestURI? It
> > seems like this gets called the moment a request is made - this way, the
> > encoded characters could be transformed to their unencoded equivalents
> > before the parameter list is parsed and the classloader gets called.
> >
> > Klaus
> >
>
>The key thing to remember is a spec requirement that
>request.getRequestURI() must return the original request URI *without*
>decoding.  The values returned by request.getServletPath() and
>request.getPathInfo(), on the other hand, are decoded first.  Therefore,
>if you manipulate the request URI value in setRequestURI(), we'd need to
>make sure that we save an unmanipulated version somewhere as well.
>
>The deeper issue, though, is how portable what you are
>proposing would be (across servlet containers).  As I understand it, you
>would like the %3f character to be interpreted as a "?" character so that
>the stuff after it is understood as part of the query string.  That seems
>(to me) a questionable practice -- the reason you would use a %3f encoding
>in the first place is so that you could treat a question mark as a regular
>data character, instead of being a significant delimiter.  If you decode
>first and then find that the "?" is significant, how would you ever
>include a question mark as part of the data value for a query string
>parameter (for example)?
>
>NOTE:  There also needs to be a little more work in this area with respect
>to path parameters (;xxx stuff, which is how the session id is
>transmitted).  This is being discussed in the expert group, and will
>probably require some minor changes in this area of Tomcat 4.
>
>Craig
>
>
>
>
> > At 09:34 AM 4/13/01 -0700, you wrote:
> >
> >
> > >On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
> > >
> > > > > Oops, I guess I should have mentioned that I'm using the 4.0 version. Do
> > > > you happen to know where the RequestImpl or equivalent class is in
> > > > catalina? (I checked org.apache.catalina.core.* without success).
> > > >
> > >
> > >The base class is org.apache.catalina.connector.HttpRequestBase.  The 1.1
> > >connector subclasses this as
> > >org.apache.catalina.connector.http.HttpRequestImpl.
> > >
> > >Craig
> >
> >


Re: %3F Problem

Posted by Klaus Sonnenleiter <kl...@sonnenleiter.com>.
>
>So, is there any way to intercept the first call to the URI parser, 
>determine whether this is one of my previously encoded URIs and replace 
>the escaped character if it is?
Never mind, I just answered that for myself (must have been half asleep 
when I asked <g>).


Re: %3F Problem

Posted by Klaus Sonnenleiter <kl...@sonnenleiter.com>.
Remy, Craig,

Yes, you're right. I read the specs and apparently the TC way of doing 
things is precisely the way it's written in the standard. However, that 
still doesn't fix my problem except if I want to carry along my hacked 
version forever.

Here's what I'm trying to achieve: I currently have Tomcat proxy requests 
to underlying applications. When proxying applets however, I'm running into 
trouble since I need to pass parameters to the proxy from the URI which in 
this case is embedded in an <APPLET> tag and gets cut at the question mark 
by the browser unless it's escaped. A properly behaving Tomcat will not be 
able to find the right servlet.

So, is there any way to intercept the first call to the URI parser, 
determine whether this is one of my previously encoded URIs and replace the 
escaped character if it is?
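
One way to picture the interception being asked about is a small helper that runs on the raw request line before any path mapping happens. This is only a sketch under stated assumptions: `EncodedQueryRewriter`, `promoteEncodedQuery` and the `proxyPrefix` argument are hypothetical names, not part of Tomcat, and the rewrite deliberately deviates from what RFC 2396 and the servlet spec prescribe. It only touches URIs under the proxy's own prefix and leaves alone anything that already carries a literal "?".

```java
final class EncodedQueryRewriter {

    private EncodedQueryRewriter() {}

    /**
     * Hypothetical helper: if rawUri belongs to the proxy (starts with
     * proxyPrefix) and has no literal '?', turn the first %3F / %3f into '?'.
     * Any other URI is returned unchanged.
     */
    static String promoteEncodedQuery(String rawUri, String proxyPrefix) {
        if (rawUri == null || !rawUri.startsWith(proxyPrefix)) {
            return rawUri; // not one of the proxy's URIs
        }
        if (rawUri.indexOf('?') >= 0) {
            return rawUri; // a real query delimiter is already present
        }
        int at = firstEncodedQuestionMark(rawUri);
        if (at < 0) {
            return rawUri; // nothing to rewrite
        }
        // "%3F" is three characters long; replace it with a single '?'
        return rawUri.substring(0, at) + "?" + rawUri.substring(at + 3);
    }

    /** Index of the first "%3F" or "%3f", or -1 if neither occurs. */
    private static int firstEncodedQuestionMark(String s) {
        int upper = s.indexOf("%3F");
        int lower = s.indexOf("%3f");
        if (upper < 0) {
            return lower;
        }
        if (lower < 0) {
            return upper;
        }
        return Math.min(upper, lower);
    }

    public static void main(String[] args) {
        String prefix = "/mycontext/servlet/"; // assumed proxy prefix
        // prints /mycontext/servlet/myservlet?x=y
        System.out.println(
            promoteEncodedQuery("/mycontext/servlet/myservlet%3Fx=y", prefix));
    }
}
```

Whatever code calls such a helper would still need to keep the untouched URI around, since request.getRequestURI() has to return the original, undecoded value.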

Klaus

At 10:55 AM 4/13/01 -0700, you wrote:
> > On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
> >
> > > Craig,
> > >
> > > I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a
> > > > call to setRequestURI before it sets the instance variable requestURI? It
> > > > seems like this gets called the moment a request is made - this way, the
> > > encoded characters could be transformed to their unencoded equivalents
> > > before the parameter list is parsed and the classloader gets called.
> > >
> > > Klaus
> > >
> >
> > The key thing to remember is a spec requirement that
> > request.getRequestURI() must return the original request URI *without*
> > decoding.  The values returned by request.getServletPath() and
> > request.getPathInfo(), on the other hand, are decoded first.  Therefore,
> > if you manipulate the request URI value in setRequestURI(), we'd need to
> > make sure that we save an unmanipulated version somewhere as well.
> >
> > The deeper issue, though, is how portable what you are
> > proposing would be (across servlet containers).  As I understand it, you
> > would like the %3f character to be interpreted as a "?" character so that
> > the stuff after it is understood as part of the query string.  That seems
> > (to me) a questionable practice -- the reason you would use a %3f encoding
> > in the first place is so that you could treat a question mark as a regular
> > data character, instead of being a significant delimiter.  If you decode
> > first and then find that the "?" is significant, how would you ever
> > include a question mark as part of the data value for a query string
> > parameter (for example)?
> >
> > NOTE:  There also needs to be a little more work in this area with respect
> > to path parameters (;xxx stuff, which is how the session id is
> > transmitted).  This is being discussed in the expert group, and will
> > probably require some minor changes in this area of Tomcat 4.
>
>'?' shouldn't be encoded in the first place as it's a reserved character
>(just like you should never encode '/' in the path). If it's encoded, I
>don't think it should be interpreted as the delimiter for the query section
>of the URL.
>So IMO the current TC behavior is the right one.
>
>The RFC for URIs is http://www.ietf.org/rfc/rfc2396.txt
>
>Remy


Re: %3F Problem

Posted by Remy Maucherat <re...@apache.org>.
> On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
>
> > Craig,
> >
> > I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a
> > call to setRequestURI before it sets the instance variable requestURI? It
> > seems like this gets called the moment a request is made - this way, the
> > encoded characters could be transformed to their unencoded equivalents
> > before the parameter list is parsed and the classloader gets called.
> >
> > Klaus
> >
>
> The key thing to remember is a spec requirement that
> request.getRequestURI() must return the original request URI *without*
> decoding.  The values returned by request.getServletPath() and
> request.getPathInfo(), on the other hand, are decoded first.  Therefore,
> if you manipulate the request URI value in setRequestURI(), we'd need to
> make sure that we save an unmanipulated version somewhere as well.
>
> The deeper issue, though, is how portable what you are
> proposing would be (across servlet containers).  As I understand it, you
> would like the %3f character to be interpreted as a "?" character so that
> the stuff after it is understood as part of the query string.  That seems
> (to me) a questionable practice -- the reason you would use a %3f encoding
> in the first place is so that you could treat a question mark as a regular
> data character, instead of being a significant delimiter.  If you decode
> first and then find that the "?" is significant, how would you ever
> include a question mark as part of the data value for a query string
> parameter (for example)?
>
> NOTE:  There also needs to be a little more work in this area with respect
> to path parameters (;xxx stuff, which is how the session id is
> transmitted).  This is being discussed in the expert group, and will
> probably require some minor changes in this area of Tomcat 4.

'?' shouldn't be encoded in the first place as it's a reserved character
(just like you should never encode '/' in the path). If it's encoded, I
don't think it should be interpreted as the delimiter for the query section
of the URL.
So IMO the current TC behavior is the right one.

The RFC for URIs is http://www.ietf.org/rfc/rfc2396.txt
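
To illustrate why an escaped question mark is meant to stay data, here is a small sketch using only java.net.URLEncoder and java.net.URLDecoder (the Charset overloads assume Java 10 or later). The class and method names are made up for the example; the parameter name "q" and value "what?" are arbitrary.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QuestionMarkAsData {

    private QuestionMarkAsData() {}

    /** Builds a single name=value pair with both parts percent-encoded. */
    static String buildPair(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
            + "="
            + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Decodes the value part of a single name=value pair. */
    static String decodeValue(String pair) {
        String rawValue = pair.substring(pair.indexOf('=') + 1);
        return URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String pair = buildPair("q", "what?");
        System.out.println(pair);              // prints q=what%3F
        System.out.println(decodeValue(pair)); // prints what?
    }
}
```

If a server turned every %3F back into a delimiter before parsing, the value "what?" could no longer be sent at all, which is the reason the escape exists.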

Remy


Re: %3F Problem

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:

> Craig,
> 
> I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a 
> call to setRequestURI before it sets the instance variable requestURI? It 
> seems like this gets called the moment a request is made - this way, the 
> encoded characters could be transformed to their unencoded equivalents 
> before the parameter list is parsed and the classloader gets called.
> 
> Klaus
> 

The key thing to remember is a spec requirement that
request.getRequestURI() must return the original request URI *without*
decoding.  The values returned by request.getServletPath() and
request.getPathInfo(), on the other hand, are decoded first.  Therefore,
if you manipulate the request URI value in setRequestURI(), we'd need to
make sure that we save an unmanipulated version somewhere as well.
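
The raw-versus-decoded distinction can be seen outside any servlet container with java.net.URI, which keeps both forms of the path. This is a sketch of the idea only, not of how Tomcat implements getRequestURI() or getServletPath(); the class name is hypothetical and port 8080 is assumed in place of "myport".

```java
import java.net.URI;

final class RawVersusDecodedPath {

    private RawVersusDecodedPath() {}

    /** Path exactly as written in the URL, escapes left intact. */
    static String rawPath(String url) {
        return URI.create(url).getRawPath();
    }

    /** Path with %xx escapes decoded. */
    static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    /** Query component, or null when the URL has no literal '?'. */
    static String query(String url) {
        return URI.create(url).getQuery();
    }

    public static void main(String[] args) {
        String url = "http://myhost:8080/mycontext/servlet/myservlet%3Fx=y";
        System.out.println(rawPath(url));     // prints /mycontext/servlet/myservlet%3Fx=y
        System.out.println(decodedPath(url)); // prints /mycontext/servlet/myservlet?x=y
        System.out.println(query(url));       // prints null
    }
}
```

The decoded path contains a "?" character, yet the URI still has no query component: the escape was resolved only after the URL had been split into its parts.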

The deeper issue, though, is how portable what you are
proposing would be (across servlet containers).  As I understand it, you
would like the %3f character to be interpreted as a "?" character so that
the stuff after it is understood as part of the query string.  That seems
(to me) a questionable practice -- the reason you would use a %3f encoding
in the first place is so that you could treat a question mark as a regular
data character, instead of being a significant delimiter.  If you decode
first and then find that the "?" is significant, how would you ever
include a question mark as part of the data value for a query string
parameter (for example)?

NOTE:  There also needs to be a little more work in this area with respect
to path parameters (;xxx stuff, which is how the session id is
transmitted).  This is being discussed in the expert group, and will
probably require some minor changes in this area of Tomcat 4.

Craig




> At 09:34 AM 4/13/01 -0700, you wrote:
> 
> 
> >On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
> >
> > > Oops, I guess I should have mentioned that I'm using the 4.0 version. Do
> > > you happen to know where the RequestImpl or equivalent class is in
> > > catalina? (I checked org.apache.catalina.core.* without success).
> > >
> >
> >The base class is org.apache.catalina.connector.HttpRequestBase.  The 1.1
> >connector subclasses this as
> >org.apache.catalina.connector.http.HttpRequestImpl.
> >
> >Craig
> 
> 


Re: %3F Problem

Posted by Klaus Sonnenleiter <kl...@sonnenleiter.com>.
Craig,

I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a 
call to setRequestURI before it sets the instance variable requestURI? It 
seems like this gets called the moment a request is made - this way, the 
encoded characters could be transformed to their unencoded equivalents 
before the parameter list is parsed and the classloader gets called.

Klaus

At 09:34 AM 4/13/01 -0700, you wrote:


>On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
>
> > Oops, I guess I should have mentioned that I'm using the 4.0 version. Do
> > you happen to know where the RequestImpl or equivalent class is in
> > catalina? (I checked org.apache.catalina.core.* without success).
> >
>
>The base class is org.apache.catalina.connector.HttpRequestBase.  The 1.1
>connector subclasses this as
>org.apache.catalina.connector.http.HttpRequestImpl.
>
>Craig


Re: %3F Problem

Posted by "Craig R. McClanahan" <cr...@apache.org>.

On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:

> Oops, I guess I should have mentioned that I'm using the 4.0 version. Do 
> you happen to know where the RequestImpl or equivalent class is in 
> catalina? (I checked org.apache.catalina.core.* without success).
> 

The base class is org.apache.catalina.connector.HttpRequestBase.  The 1.1
connector subclasses this as
org.apache.catalina.connector.http.HttpRequestImpl.

Craig


Re: %3F Problem

Posted by Klaus Sonnenleiter <kl...@sonnenleiter.com>.
Oops, I guess I should have mentioned that I'm using the 4.0 version. Do 
you happen to know where the RequestImpl or equivalent class is in 
catalina? (I checked org.apache.catalina.core.* without success).

At 05:09 PM 4/13/01 +0200, you wrote:
>I found a problem related to this: decoding characters outside ISO
>Latin 1 from the %xx URL format.
>
>URL decoding is handled here:
>in Tomcat:
>  org.apache.tomcat.core.RequestImpl.handleParameters();
>
>... this uses some methods from the same class, but the same thing is done by:
>
>in the servlet spec:
>  javax.servlet.http.HttpUtils.parseQueryString()
>
>I think it is better to use one implementation of URL parsing.
>(I posted the code in the "Support for different Charsets" thread)
>
>Hi
>
>Jan Fnukal
>
>e-mail:   jfnukal@efcon.cz
>tel: +420-5-4142 5628
>
>EfCon a.s.
>Jaselska 25
>611 57 Brno
>Czech Republic
>www.efcon.cz
>
>
>----- Original Message -----
>From: Klaus Sonnenleiter <kl...@sonnenleiter.com>
>To: <to...@jakarta.apache.org>
>Sent: Friday, April 13, 2001 4:40 PM
>Subject: %3F Problem
>
>
> > This may be a bit obscure, but I'm trying to get Tomcat to respond to a
> > request that arrives with an encoded URL in the form [URL]%3F[Parameters].
> > It looks like "http://myhost:myport/mycontext/servlet/myservlet%3Fx=y" If I
> > do the equivalent with an Apache http server (for verifying that I'm trying
> > the right thing), I get the error message indicating that Apache was
> > looking for the correct URL followed by a question mark and the name-value
> > pairs of the parameter list. In Tomcat, however, the %3F does not get
> > replaced and the error message indicates that Tomcat is looking for a
> > servlet class called "myservlet%3Fx=y" which does not exist on my system.
> >
> > It looks like somebody must have attempted to fix this since the b3 version
> > does a correct replacement if the %3F shows up right behind the end of the
> > context (/mycontext/servlet%3F). I would volunteer to attempt a fix, if
> > someone could point me to the right files - I looked at StandardWrapper and
> > StandardClassLoader, and I can get the class loaded by cutting the name
> > before the %, but then I lose the parameter list.
> >
> > Any hints?
> >
> > TIA
> >


Re: %3F Problem

Posted by Jan Fnukal <jf...@efcon.cz>.
I found a problem related to this: decoding characters outside ISO
Latin 1 from the %xx URL format.

URL decoding is handled here:
in Tomcat:
 org.apache.tomcat.core.RequestImpl.handleParameters();

... this uses some methods from the same class, but the same thing is done by:

in the servlet spec:
 javax.servlet.http.HttpUtils.parseQueryString()

I think it is better to use one implementation of URL parsing.
(I posted the code in the "Support for different Charsets" thread)
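
A small sketch of why the charset matters when decoding %xx escapes: the same escaped bytes give different text depending on the charset assumed by the decoder. It uses java.net.URLDecoder with the Charset overload (Java 10 or later is assumed) rather than either of the two parsers named above, and the class name is hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PercentDecodingCharsets {

    private PercentDecodingCharsets() {}

    /** Decodes %xx escapes, interpreting the resulting bytes in the given charset. */
    static String decode(String escaped, Charset charset) {
        return URLDecoder.decode(escaped, charset);
    }

    public static void main(String[] args) {
        // %C3%A9 is the UTF-8 byte sequence for U+00E9 (e with acute accent).
        String escaped = "caf%C3%A9";

        // Decoded as UTF-8: the two bytes form one character, 4 chars in total.
        System.out.println(decode(escaped, StandardCharsets.UTF_8).length());      // prints 4

        // Decoded as ISO Latin 1: each byte becomes its own character, 5 chars.
        System.out.println(decode(escaped, StandardCharsets.ISO_8859_1).length()); // prints 5
    }
}
```

Two parsers that assume different charsets would therefore disagree on the same request, which is one argument for funnelling all decoding through a single implementation.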

Hi

Jan Fnukal

e-mail:   jfnukal@efcon.cz
tel: +420-5-4142 5628

EfCon a.s.
Jaselska 25
611 57 Brno
Czech Republic
www.efcon.cz


----- Original Message -----
From: Klaus Sonnenleiter <kl...@sonnenleiter.com>
To: <to...@jakarta.apache.org>
Sent: Friday, April 13, 2001 4:40 PM
Subject: %3F Problem


> This may be a bit obscure, but I'm trying to get Tomcat to respond to a
> request that arrives with an encoded URL in the form [URL]%3F[Parameters].
> It looks like "http://myhost:myport/mycontext/servlet/myservlet%3Fx=y" If I
> do the equivalent with an Apache http server (for verifying that I'm trying
> the right thing), I get the error message indicating that Apache was
> looking for the correct URL followed by a question mark and the name-value
> pairs of the parameter list. In Tomcat, however, the %3F does not get
> replaced and the error message indicates that Tomcat is looking for a
> servlet class called "myservlet%3Fx=y" which does not exist on my system.
>
> It looks like somebody must have attempted to fix this since the b3 version
> does a correct replacement if the %3F shows up right behind the end of the
> context (/mycontext/servlet%3F). I would volunteer to attempt a fix, if
> someone could point me to the right files - I looked at StandardWrapper and
> StandardClassLoader, and I can get the class loaded by cutting the name
> before the %, but then I lose the parameter list.
>
> Any hints?
>
> TIA
>