Posted to dev@commons.apache.org by Matthew Albright <ma...@yahoo.com> on 2002/01/18 01:45:51 UTC

HttpClient: Basic Authentication & acting more like a browser

I've run into an interesting little shortcoming with HttpClient, and
would like to know if someone is working on a solution.

When a browser goes to a site that requires Basic Authentication, it
will make a request (GET /), get a 401 response with a realm
attached, and prompt the user for a username and password for that
realm.  After getting the username/password, it fills in the
appropriate header with that information, and makes the request
again, gets the 200 response, and presumably all the data for the
page.

Now, on subsequent visits to the same site, it appears (from server
logs) that the browser fills in the Authorization header for each
request, in a kind of pre-emptive strike.  That way, the browser and
server don't have to do the request-401-auth-200 dance every single
time.
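
As an illustration of the header the browser fills in, here is a minimal
sketch of the Basic scheme. It is not HttpClient code: the class name
BasicAuthHeader and the sample credentials are made up for the example, and
java.util.Base64 is a much later JDK addition than this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of HttpClient.
final class BasicAuthHeader {

    // Builds the Authorization header value for the Basic scheme:
    // "Basic " followed by base64(username ":" password).
    static String value(String username, String password) {
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        // Sending this header on the first request avoids the 401 round trip.
        System.out.println("Authorization: " + value("Aladdin", "open sesame"));
    }
}
```

A client that already knows the credentials can attach this value to every
request for the protected site instead of waiting for the challenge.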

It appears that HttpClient doesn't do this process exactly like a
browser.  Every single Method that is execute()'d will send out the
initial request without Authentication, get the 401, and then send
the request again with authentication information attached.

This is not normally a problem, or even noticeable.

My problem is that I'm doing an HTTP POST with a good amount of data.
This data is being sent down the wire twice for every
Method.execute() that I'm calling.  I'm on DSL, so it's not too bad,
but for large payloads or slow connections, it's IMHO unacceptable.

I'm considering modifying HttpState and HttpMethodBase to act more
like a browser... make HttpState store not only the realm, but also
the URL of the site that requires authentication, and then make
HttpMethodBase fill in that information if it matches for every
request after the first successful one.

Is anyone else working on this, or have any input on a specific
approach to this problem?  Any things I should look out for?  Does
anyone have any idea what I'm talking about?  ;)

And, once I've finished this, and gotten it working, what should I do
with it?  Just send a patch to this list?

Thanks for reading this far.

matt


__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: HttpClient: Basic Authentication & acting more like a browser

Posted by "Paul C. Bryan" <em...@pbryan.net>.
Matthew Albright wrote:

> How about a middle ground?  I was thinking maybe storing the URL of
> the place that required authentication, and then preemptively sending
> the authorization for every request that is "underneath" that place
> in the hierarchy...


In your example of initially authenticating with 
http://www.foo.bar/secure/index.html, how will you handle the following 
case?

> YES: http://www.foo.bar/secure/upload.cgi
> YES: http://www.foo.bar/secure/dir/index.html
> NO:  http://www.foo.bar/index.html
> NO:  https://www.foo.bar/secure/index.html


???: http://www.foo.bar/secure/../insecure/index.html
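
To make the "???" case concrete, here is a small sketch showing why a raw
prefix comparison is not enough and how normalizing the URL first changes
the answer. The class and method names are hypothetical;
java.net.URI.normalize() is the standard JDK call.

```java
import java.net.URI;

// Hypothetical illustration; not part of HttpClient.
final class DotDotCheck {

    // Prefix stored after the 401 for http://www.foo.bar/secure/index.html.
    static final String PROTECTED_PREFIX = "http://www.foo.bar/secure/";

    // Compares the raw URL text against the stored prefix.
    static boolean naiveMatch(String url) {
        return url.startsWith(PROTECTED_PREFIX);
    }

    // Collapses "." and ".." path segments before comparing.
    static boolean normalizedMatch(String url) {
        return URI.create(url).normalize().toString().startsWith(PROTECTED_PREFIX);
    }

    public static void main(String[] args) {
        String tricky = "http://www.foo.bar/secure/../insecure/index.html";
        System.out.println("naive:      " + naiveMatch(tricky));
        System.out.println("normalized: " + normalizedMatch(tricky));
    }
}
```

The naive check would send credentials to /insecure/, while the normalized
check would not.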

Also, would a number of URL fragments be cached if 401s are returned 
with the same realm?

If a realm is returned with a 401 with a "simpler" path (e.g. 
http://www.foo.bar/), will you simply simplify the path in the cache or 
just add it to the cache? (I know, this is just an implementation issue 
and probably doesn't belong in this design stage).

> One other thing to think about is nested authorization... I don't
> know what apache does if you have a .htaccess file in /secure, and
> then another one with a different realm in /secure/dir ... does it
> require both username/password combos, or just the innermost one? 
> HttpClient could handle either case, I suspect, with careful coding.


I believe (someone should confirm) that the browser only stores one 
realm associated with a host and port. If another realm is cited in a 
401 response with that host and port, that new realm will overwrite the 
old realm.

Paul




Re: HttpClient: Basic Authentication & acting more like a browser

Posted by Matthew Albright <ma...@yahoo.com>.
--- "Paul C. Bryan" <em...@pbryan.net> wrote:
[...]
> Storing the URL, as you suggest is overly restrictive, and would 
> potentially require caching every separate URL that has required 
> authentication for a particular realm.
> 
> Most browsers appear to send credentials for every subsequent
> request to the same host and port. This can be a bit risky
> in my opinion, as the user's credentials could be supplied
> to unrelated services on the same host.
[...]

How about a middle ground?  I was thinking maybe storing the URL of
the place that required authentication, and then preemptively sending
the authorization for every request that is "underneath" that place
in the hierarchy...

Example:

We get a 401 for http://www.foo.bar/secure/index.html.  Store, along
with the realm, "http://www.foo.bar/secure/"... and then any
subsequent request whose URL starts with that prefix gets preemptively
authorized.

YES: http://www.foo.bar/secure/upload.cgi
YES: http://www.foo.bar/secure/dir/index.html
NO:  http://www.foo.bar/index.html
NO:  https://www.foo.bar/secure/index.html
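
The prefix rule above can be sketched in a few lines. This is only an
illustration under stated assumptions: the class AuthScope and its methods
are hypothetical, and the challenged URL is assumed to contain a path.

```java
// Hypothetical illustration of the prefix rule; not part of HttpClient.
final class AuthScope {

    // Derives the stored prefix from the URL that drew the 401 by cutting
    // everything after the last '/'. Assumes the URL contains a path.
    static String scopeOf(String challengedUrl) {
        return challengedUrl.substring(0, challengedUrl.lastIndexOf('/') + 1);
    }

    // A request is preemptively authorized when its URL starts with the prefix.
    static boolean covers(String scope, String requestUrl) {
        return requestUrl.startsWith(scope);
    }

    public static void main(String[] args) {
        String scope = scopeOf("http://www.foo.bar/secure/index.html");
        String[] urls = {
            "http://www.foo.bar/secure/upload.cgi",
            "http://www.foo.bar/secure/dir/index.html",
            "http://www.foo.bar/index.html",
            "https://www.foo.bar/secure/index.html"
        };
        for (String url : urls) {
            System.out.println((covers(scope, url) ? "YES: " : "NO:  ") + url);
        }
    }
}
```

Because the scheme is part of the prefix, the https URL does not match even
though its host and path do.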

One other thing to think about is nested authorization... I don't
know what Apache does if you have a .htaccess file in /secure, and
then another one with a different realm in /secure/dir ... does it
require both username/password combos, or just the innermost one? 
HttpClient could handle either case, I suspect, with careful coding.

matt




RE: HttpClient: Basic Authentication & acting more like a browser

Posted by Donnie Hale <do...@haleonline.net>.
A couple of thoughts here. FWIW, I spent 2-1/2 years early in the life of
the web writing custom security realms, etc., for use across the web.

First, I thought the HTTP spec dictated when and under what criteria to use
what has been dubbed the "preemptive strike" behavior. I don't have it handy
to double-check. Typically, the browsers make any first request to a site
(server+port) ***in that run of the browser process*** without any
credentials. After getting the 401 challenge, supplying the entered
credentials, and then getting a subsequent 200, the browser caches them for
the duration of that run. Of course the recent browsers are providing an
option to store those names and passwords and use them during subsequent
browser invocations w/o prompting.

Second, and this may be common knowledge by now, but digest authentication
isn't necessarily more secure than basic. It certainly prevents you from
learning the password, which could be useful if the user happens to visit
other sites and uses that password. However, the "digested" password is all
the credentials you need to get to the site in question. So sniffing the
network can get you that value and you can impersonate that individual with
those credentials.

It's really unfortunate that a secure, challenge-response authentication
mechanism hasn't become standard for HTTP. Five or six years ago, all the hype
was about digital certificates and how they were the answer to everyone's
prayers. Our security guru was adamant then that they'd never catch on for
client authentication, and so far he's been proven right. Waiting for that
nirvana has, I'm afraid, prevented the smart folks at W3C or elsewhere from
moving us to a ubiquitous standard.

FWIW...

Donnie


> -----Original Message-----
> From: Paul C. Bryan [mailto:email@pbryan.net]
> Sent: Thursday, January 17, 2002 9:29 PM
> To: Jakarta Commons Developers List
> Subject: Re: HttpClient: Basic Authentication & acting more like a
> browser
>
> [...]


Re: HttpClient: Basic Authentication & acting more like a browser

Posted by "Paul C. Bryan" <em...@pbryan.net>.
Hi Matthew:

I've noticed this behavior too. The question, in my opinion, is how to 
decide when to invoke the "preemptive strike" behavior. You don't know 
until a 401 has been returned exactly what realm is required for 
authentication for a particular request.

Storing the URL, as you suggest, is overly restrictive, and would 
potentially require caching every separate URL that has required 
authentication for a particular realm.

Most browsers appear to send credentials for every subsequent request to 
the same host and port. This can be a bit risky in my opinion, as the 
user's credentials could be supplied to unrelated services on the same host.

Some of this risk is mitigated in browser implementations by the fact 
that credentials can be sent using digest. Unfortunately, only basic 
authentication is provided in HttpClient so far, causing credentials to 
always be sent virtually in plaintext (actually, base64).
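
To show how little protection base64 gives, here is a minimal sketch that
recovers the username and password from a captured Basic header value. The
class name and sample value are made up for the example; it uses the JDK's
java.util.Base64, not anything from HttpClient.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration; not part of HttpClient.
final class BasicAuthDecode {

    // Reverses the Basic encoding: strips the scheme name and base64-decodes
    // the rest, yielding "username:password" with no key or secret needed.
    static String credentials(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(credentials("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="));
    }
}
```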

If we decided to implement digest authentication in HttpClient, I'd be 
supportive of taking the approach of storing a collection of hosts and 
ports with the realm and credentials and send the credentials 
unconditionally to the same hosts and ports in subsequent requests.

If the host doesn't support digest, we'd continue to send with basic. At 
least we provided the opportunity to use a more secure authentication 
method with that server, so if the credentials are sent to an 
undesirable service, too bad.

My initial idea of the behavior:

Attempt to find realm by host and port in HttpState (in new host/port 
map). Not found. Make request without credentials. Receive 401 response 
with realm. Look up credentials by realm returned with 401. Found. Store 
realm for host and port in host/port map. Future attempts to find realm 
by host and port will result in the stored realm.
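
The flow described above could look roughly like the following sketch. It is
not the actual HttpState API: the class RealmCache, its method names, and the
"user:password" string representation are all assumptions made for the
example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a host/port-to-realm map; not the real HttpState.
final class RealmCache {
    private final Map<String, String> realmByHostPort = new HashMap<>();
    private final Map<String, String> credentialsByRealm = new HashMap<>();

    // Registers credentials for a realm, as the caller would do up front.
    void addCredentials(String realm, String userColonPassword) {
        credentialsByRealm.put(realm, userColonPassword);
    }

    // Called when a 401 names a realm. Remembers the realm for this host and
    // port if credentials exist; a later realm for the same key overwrites it.
    boolean onChallenge(String host, int port, String realm) {
        if (!credentialsByRealm.containsKey(realm)) {
            return false;
        }
        realmByHostPort.put(host + ":" + port, realm);
        return true;
    }

    // Returns credentials to send preemptively, or null if none are known yet.
    String preemptiveCredentials(String host, int port) {
        String realm = realmByHostPort.get(host + ":" + port);
        return realm == null ? null : credentialsByRealm.get(realm);
    }

    public static void main(String[] args) {
        RealmCache cache = new RealmCache();
        cache.addCredentials("Secure Area", "matt:secret");
        System.out.println(cache.preemptiveCredentials("www.foo.bar", 80));
        cache.onChallenge("www.foo.bar", 80, "Secure Area");
        System.out.println(cache.preemptiveCredentials("www.foo.bar", 80));
    }
}
```

Only the first request to a given host and port pays for the 401 round trip;
later requests find the stored realm and can attach credentials immediately.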

That's my thinking thus far. Thoughts? Anyone?

Yours truly,

Paul C. Bryan <em...@pbryan.net>
http://pbryan.net/




Re: HttpClient: Basic Authentication & acting more like a browser

Posted by David Bullock <db...@lisasoft.com>.
I didn't say there was anything wrong with caching credentials as well.
This is indeed useful.

But there is still a situation where you have to do two large submits
the first time the credentials are challenged, one of which is wasted.

In some situations, this could be unacceptable.

It could actually be eliminated altogether (for servers that support it,
as you say), using the HTTP 100 response.

cheers,
David


On Mon, 28 Jan 2002, Ian Kallen <ia...@covalent.net> wrote:

>
> I don't think he was asking about the spec's facilities for request
> requirement discovery, I think he was suggesting that the httpclient's
> state mechanism should behave more like a conventional browser's.  There's
> no reason not to do it; even if you could count on all of your servers to
> implement the response code 100 heuristics, there's nothing wrong with
> the standard browser behavior.
>
> On Tue, 29 Jan 2002, David Bullock wrote:
> > [...]
>
> cheers,
> -Ian
>
>

-- 
----
David Bullock -  dbullock@lisasoft.com   +61 4 0290 1228
http://www.lisasoft.com/ (Architect) http://www.auug.org.au/ (President, SA)
http://www.ajug.org.au/sajug/sa.html (Activist) Sun Cert'd Programmer for Java 2

"The key ingredients of success are a crystal-clear goal, a realistic attack plan to achieve that goal, and consistent, daily action to reach that goal."
Steve Maguire, "Debugging the Development Process".




Re: HttpClient: Basic Authentication & acting more like a browser

Posted by Ian Kallen <ia...@covalent.net>.
I don't think he was asking about the spec's facilities for request
requirement discovery; I think he was suggesting that HttpClient's
state mechanism should behave more like a conventional browser's.  There's
no reason not to do it; even if you could count on all of your servers to
implement the response code 100 heuristics, there's nothing wrong with
the standard browser behavior.

On Tue, 29 Jan 2002, David Bullock wrote:
> I was reading in the HTTP 1.1 spec just today, and found that there is a
> way for the client to see if the server is willing to respond to the
> HTTP request headers before sending the message body.
> 
> If the server is, then it gives the response right away.  If not, then
> it asks the client to continue ( HTTP response code 100 ) with the
> message body.  This allows for things like security credential exchanges,
> redirects, etc, to be performed without incurring unnecessary network
> overhead.
> 
> This is probably exactly what you're after to limit transmitting a large
> body twice.
> 
> Details are at:
> 
>   ftp://ftp.isi.edu/in-notes/rfc2616.txt
> 
> Section 8.2.3
> 
> 
> cheers,
> David.
> 
> 
> On Thu, 17 Jan 2002, Matthew Albright wrote:
> 
> > [...]
> 
> 

cheers,
-Ian

-- 
Ian Kallen <ia...@covalent.net> | AIM/yahoo: iankallen




Re: HttpClient: Basic Authentication & acting more like a browser

Posted by David Bullock <db...@lisasoft.com>.
I was reading in the HTTP 1.1 spec just today, and found that there is a
way for the client to see if the server is willing to respond to the
HTTP request headers before sending the message body.

If the server is, then it gives the response right away.  If not, then
it asks the client to continue ( HTTP response code 100 ) with the
message body.  This allows for things like security credential exchanges,
redirects, etc, to be performed without incurring unnecessary network
overhead.

This is probably exactly what you're after to limit transmitting a large
body twice.

Details are at:

  ftp://ftp.isi.edu/in-notes/rfc2616.txt

Section 8.2.3
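
As a rough sketch of the exchange that section describes: the client sends
only the request head with an Expect: 100-continue header, then decides
whether to transmit the body based on the first status the server returns.
The class and method names below are hypothetical, and a real client should
not wait indefinitely for the interim response.

```java
// Hypothetical illustration of the Expect/100-continue handshake;
// not part of HttpClient.
final class ExpectContinue {

    // Request head for a POST that asks the server to vet the headers
    // before the client commits to sending the body.
    static String requestHead(String path, String host, long contentLength) {
        return "POST " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Length: " + contentLength + "\r\n"
             + "Expect: 100-continue\r\n"
             + "\r\n";
    }

    // 100 means "go ahead with the body". Any final status (for example a
    // 401 challenge) arrives before the body has been transmitted.
    static boolean shouldSendBody(int firstStatus) {
        return firstStatus == 100;
    }

    public static void main(String[] args) {
        System.out.print(requestHead("/secure/upload.cgi", "www.foo.bar", 1048576L));
        System.out.println("send body after 100? " + shouldSendBody(100));
        System.out.println("send body after 401? " + shouldSendBody(401));
    }
}
```

With this handshake, a 401 costs only one extra set of headers rather than a
second copy of the payload.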


cheers,
David.


On Thu, 17 Jan 2002, Matthew Albright wrote:

> [...]

-- 
----
David Bullock -  dbullock@lisasoft.com   +61 4 0290 1228
http://www.lisasoft.com/ (Architect) http://www.auug.org.au/ (President, SA)
http://www.ajug.org.au/sajug/sa.html (Activist) Sun Cert'd Programmer for Java 2

"The key ingredients of success are a crystal-clear goal, a realistic attack plan to achieve that goal, and consistent, daily action to reach that goal."
Steve Maguire, "Debugging the Development Process".


