Posted to dev@apr.apache.org by Justin Erenkrantz <je...@ebuilt.com> on 2001/05/23 07:09:41 UTC

More migrations from httpd to apr-util and md5 in apr

Well, now that the uri stuff is in apr-util, the biggest one left
(IMHO) is util_date.c.  My plan is to come up with a similar series of
patches for apr_date.h.  Should a new directory be used (date)?  I also
have some additional formats that don't quite fit the HTTP RFC, but that
I saw in the creation of mod_mbox (various degrees of RFC 822
compliance) - so I might add another function (apr_parseRFCdate?) which 
attempts to parse a wider range than what the HTTP RFC specifies - this 
might be useful for some other applications, I think.
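
A rough sketch of what such a lenient parser could look like, purely for
illustration.  The name parse_lenient_date, the struct simple_date, and the
two-digit-year pivot at 70 are all assumptions made up for this example;
none of them come from util_date.c or mod_mbox.  It accepts an optional
day-of-week, optional seconds, and two- or four-digit years, and it ignores
the time zone entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical broken-down date; not an APR type. */
struct simple_date { int year, month, day, hour, min, sec; };

/* Hypothetical lenient parser for RFC 822-style dates such as
 * "Wed, 23 May 2001 07:09:41 GMT" or "23 May 01 07:09".
 * Returns 1 on success, 0 on failure.  The time zone is ignored. */
int parse_lenient_date(const char *s, struct simple_date *out)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char mon[4];
    const char *comma, *m;
    int n;

    assert(s != NULL && out != NULL);

    /* The day-of-week ("Wed,") is optional; skip it when present. */
    comma = strchr(s, ',');
    if (comma != NULL)
        s = comma + 1;

    /* Seconds are optional, so default them to zero. */
    out->sec = 0;
    n = sscanf(s, " %d %3s %d %d:%d:%d", &out->day, mon, &out->year,
               &out->hour, &out->min, &out->sec);
    if (n < 5)
        return 0;

    /* Look the month up in the packed table; a match only counts
     * when it starts on a three-character boundary. */
    m = strstr(months, mon);
    if (strlen(mon) != 3 || m == NULL || (m - months) % 3 != 0)
        return 0;
    out->month = (int)(m - months) / 3 + 1;

    /* Assumed pivot for two-digit years: 00-69 -> 20xx, 70-99 -> 19xx. */
    if (out->year < 0)
        return 0;
    if (out->year < 70)
        out->year += 2000;
    else if (out->year < 100)
        out->year += 1900;

    /* Basic range checks; days-per-month is not validated here. */
    if (out->day < 1 || out->day > 31 || out->hour < 0 || out->hour > 23 ||
        out->min < 0 || out->min > 59 || out->sec < 0 || out->sec > 60)
        return 0;
    return 1;
}
```

A real version would presumably return an apr time value and handle time
zones; this only shows the "accept more than the HTTP RFC allows" idea.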

Also, the other potential candidate I identified earlier was util_md5.c.
But, it seems that util_md5.c relies on apr/passwd/apr_md5.c.  I tried to 
track down usage of apr_md5 within apr itself, and the only usage I could 
find was in apr/misc/unix/getuuid.c which only uses apr_md5 when 
APR_HAS_RANDOM is not defined (typically /dev/random doesn't exist - e.g.
Solaris doesn't have it, or truerand isn't available).  This seems odd.  
It just md5s the current date/hostname/pid to produce a "random" string 
of the requested length.  I *guess* it's random enough.

Now, I'm not sure what the dividing line is between apr and apr-util, 
but my gut tells me that md5 hashing doesn't belong in apr itself.  I 
think a better (IMHO) solution (if I interpret the use of md5 in getuuid.c 
correctly) is to introduce some PRNG into apr and move md5 into apr-util.  
This also has the side effect of guaranteeing that there is always some 
PRNG available.  As I see it, this PRNG would be used as a fallback 
when /dev/random and friends aren't available.

I bet there are some free PRNGs (BSD-license of course) flying around 
that could be used, or we could try to come up with one of our own.  I'd 
rather not reinvent the wheel (PRNGs aren't trivial), but hey, I'm game.  
Maybe a better solution to the PRNG is to try to leverage rand() somehow,
and coerce it to fit the apr_generate_random_bytes prototype.  Take
rand() % CHAR_MAX for each byte.  That is simplistic and wouldn't be
that efficient.  We'd also have to determine whether any PRNG is good
enough for inclusion with APR for when it is used as a fallback.  I
don't think including a sub-par PRNG is a good idea (any idea what the
definition for a sub-par PRNG is?  <G>).  You'd also have to deal with 
seeding the rand() function (in apr_initialize?).
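
To make the rand() idea concrete, here is a minimal sketch.  The names
fallback_random_bytes and fallback_seed are hypothetical, and the buffer-plus-
length shape is only assumed to resemble apr_generate_random_bytes.  One
deliberate change from the description above: rand() % CHAR_MAX can never
produce every byte value (CHAR_MAX is 127 or 255, so the result tops out at
126 or 254), so the sketch masks eight of the higher-order bits instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-time seeding hook, e.g. something that could be
 * called during library initialization.  Not part of APR. */
void fallback_seed(unsigned int seed)
{
    srand(seed);
}

/* Hypothetical fallback: fill buf with `length` bytes taken from rand().
 * This is NOT cryptographically strong; it is only a sketch of a
 * last-resort path for when no better entropy source is available. */
void fallback_random_bytes(unsigned char *buf, size_t length)
{
    size_t i;

    assert(buf != NULL || length == 0);

    for (i = 0; i < length; i++) {
        /* RAND_MAX is guaranteed to be at least 32767 (15 bits), so
         * bits 7..14 always exist.  The low-order bits of some rand()
         * implementations are the weakest, hence the shift. */
        buf[i] = (unsigned char)((rand() >> 7) & 0xFF);
    }
}
```

One call to rand() per byte is wasteful, as noted, and the output is only as
good as the platform's rand().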

Thoughts?  -- justin


Re: More migrations from httpd to apr-util and md5 in apr

Posted by Greg Stein <gs...@lyra.org>.
On Wed, May 23, 2001 at 04:54:23PM -0700, Justin Erenkrantz wrote:
> Thanks.  Okay, I'll take a pass at date migration over the long weekend
> (misc is fine) unless someone says not to between now and Friday.  I'll 
> take a look at the mod_dav_fs dates - I'll see how compatible it is with 
> what I had for mod_mbox.  I may try and merge them, or if it is too 
> different, I'll commit what I had for mod_mbox and you can then look at 
> the mod_dav_fs stuff and decide what to do.

Oh, it is simply two ways to format a date/time into a buffer. I haven't
even looked lately to see if a similar function is now available in the
Apache code. It seems like I remember somebody adding some new date
formatting functions a while back.

> I might get around to the md5 stuff,

Cool. I've always planned to do it, too, but just haven't got there yet...

> but that may require also yanking out
> uuid into apr-util as well.

No can do. The UUID stuff is in APR because we use a Win32 specific function
on that platform. Thus, it belongs in APR. It is even possible that some
platforms may want to improve what kind of info they grab for their random
seed (i.e. on Unix, we use pid, time, hostname; BeOS or OS/2 or whatever may
have more data they want to use). Heck, somebody may want to use the
Ethernet MAC address, like the UUID specification states.

> As you said, that's not a bad idea - 
> hopefully, it isn't used anywhere within APR itself.  I'll see if I have
> enough time to do this right this weekend.  Baby steps - I want to break
> as little as possible at each step.

The smallest step would be to use a hash function over the UUID seed data,
rather than MD5.

> If I eventually see a PRNG that I like (suitable license, of course), 
> I'll include it in APR.  Only use it when /dev/random et al aren't 
> defined - does that sound reasonable?

You bet!

> I'll post appropriate patches to new-httpd to migrate their calls, but
> it'll need someone with commit access on that end to check those in.

Not a problem there. That is relatively straightforward, once the
functionality "appears" within APR(UTIL).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: More migrations from httpd to apr-util and md5 in apr

Posted by Justin Erenkrantz <je...@ebuilt.com>.
Thanks.  Okay, I'll take a pass at date migration over the long weekend
(misc is fine) unless someone says not to between now and Friday.  I'll 
take a look at the mod_dav_fs dates - I'll see how compatible it is with 
what I had for mod_mbox.  I may try and merge them, or if it is too 
different, I'll commit what I had for mod_mbox and you can then look at 
the mod_dav_fs stuff and decide what to do.

I might get around to the md5 stuff, but that may require also yanking out
uuid into apr-util as well.  As you said, that's not a bad idea - 
hopefully, it isn't used anywhere within APR itself.  I'll see if I have
enough time to do this right this weekend.  Baby steps - I want to break
as little as possible at each step.

If I eventually see a PRNG that I like (suitable license, of course), 
I'll include it in APR.  Only use it when /dev/random et al aren't 
defined - does that sound reasonable?

I'll post appropriate patches to new-httpd to migrate their calls, but
it'll need someone with commit access on that end to check those in.
-- justin


Re: More migrations from httpd to apr-util and md5 in apr

Posted by Greg Stein <gs...@lyra.org>.
On Tue, May 22, 2001 at 10:09:41PM -0700, Justin Erenkrantz wrote:
> Well, now that the uri stuff is in apr-util, the biggest one left
> (IMHO) is util_date.c.  My plan is to come up with a similar series of
> patches for apr_date.h.  Should a new directory be used (date)?

We are going to have a misc/ directory which will contain version checking
stuff. I'd be okay with it in there (+1), but am also happy with a new
directory (-0). The other directories have "bulk" or potential for it, so
they make some sense. But "date"? I can't see much going in there.

> I also
> have some additional formats that don't quite fit the HTTP RFC, but that
> I saw in the creation of mod_mbox (various degrees of RFC 822
> compliance) - so I might add another function (apr_parseRFCdate?) which 
> attempts to parse a wider range than what the HTTP RFC specifies - this 
> might be useful for some other applications, I think.

mod_dav_fs also has some date formatting functions in dav/fs/repos.c that
I'd like to see migrated.

> Also, the other potential candidate I identified earlier was util_md5.c.
> But, it seems that util_md5.c relies on apr/passwd/apr_md5.c.  I tried to 
> track down usage of apr_md5 within apr itself, and the only usage I could 
> find was in apr/misc/unix/getuuid.c which only uses apr_md5 when 
> APR_HAS_RANDOM is not defined (typically /dev/random doesn't exist - e.g.
> Solaris doesn't have it, or truerand isn't available).  This seems odd.  
> It just md5s the current date/hostname/pid to produce a "random" string 
> of the requested length.  I *guess* it's random enough.

MD5 is actually very good at creating random-looking data. A one-bit change
in the source produces unpredictable changes in the output. And it isn't
reversible. The technique works well.

However, that was simply a "crap. don't have nice random stuff. let's do
<this>" approach. The UUID stuff doesn't *truly* need a cryptographic random
number, but "good" numbers are needed. (I forget the exact details; would
need to read the UUID spec again)

> Now, I'm not sure what the dividing line is between apr and apr-util, 
> but my gut tells me that md5 hashing doesn't belong in apr itself.  I

Your gut is correct :-)  The plan has been to move the MD5 code from APR to
APRUTIL/crypto/. We also have the SHA1 cryptographic hashing function in
that directory right now.

(note that apr_getpass needs to move also, once we export an apr_crypt()
 function)

> think a better (IMHO) solution (if I interpret the use of md5 in getuuid.c 
> correctly) is to introduce some PRNG into apr and move md5 into apr-util.  
> This also has the side effect of guaranteeing that there is always some 
> PRNG available.  As I see it, this PRNG would be used as a fallback 
> when /dev/random and friends aren't available.

Well, you could skip the PRNG and just use "decent" random stuff in
getuuid.c. Or go whole hog. Your call. I'm perfectly fine with a portable
PRNG in APR.

> I bet there are some free PRNGs (BSD-license of course) flying around 
> that could be used, or we could try to come up with one of our own.  I'd 
> rather not reinvent the wheel (PRNGs aren't trivial), but hey, I'm game.  
> Maybe a better solution to the PRNG is to try to leverage rand() somehow,
> and coerce it to fit the apr_generate_random_bytes prototype.  Take
> rand() % CHAR_MAX for each byte.  That is simplistic and wouldn't be
> that efficient.  We'd also have to determine whether any PRNG is good
> enough for inclusion with APR for when it is used as a fallback.  I
> don't think including a sub-par PRNG is a good idea (any idea what the
> definition for a sub-par PRNG is?  <G>).  You'd also have to deal with 
> seeding the rand() function (in apr_initialize?).
> 
> Thoughts?  -- justin

Your call on the PRNG stuff.

The MD5 can easily move, and you can substitute some other random
constructions. See the notes in:

  http://www.webdav.org/specs/draft-leach-uuids-guids-01.txt

section 4 for why I used the MD5 stuff. Note that it just asks for "suitably
random" information for the node ID, and then suggests that MD5 or SHA1 can
be useful.

For example, just copying the hashing function from apr_hash.c should be
quite fine.

(and all that is irrelevant if you add a PRNG)
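
As a rough illustration of the "plain hash instead of MD5" option, here is
a sketch that folds the Unix seed data (pid, time, hostname) through a
"times 33" style hash and spreads the result over a 6-byte node ID.  The
function names and the byte-spreading step are hypothetical, and the sketch
assumes the apr_hash.c function is of the times-33 variety; it is not
copied from there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Times 33" hash, folded over an arbitrary byte range.  `hash` is the
 * running value, so several inputs can be chained together. */
unsigned int times33(unsigned int hash, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++)
        hash = hash * 33 + p[i];
    return hash;
}

/* Hypothetical helper: derive a 6-byte node ID from the seed data.
 * The caller supplies pid, time, and hostname so the sketch stays
 * portable; on Unix they could come from getpid(), time(), and
 * gethostname(). */
void node_id_from_seed(unsigned char node[6], long pid, long now,
                       const char *hostname)
{
    unsigned int h;
    int i;

    assert(node != NULL && hostname != NULL);

    /* Chain all three seed inputs into one running hash. */
    h = times33(0, &pid, sizeof pid);
    h = times33(h, &now, sizeof now);
    h = times33(h, hostname, strlen(hostname));

    for (i = 0; i < 6; i++) {
        /* Stir in the byte index so each output byte comes from a
         * different hash state, then take bits 8..15. */
        h = times33(h, &i, sizeof i);
        node[i] = (unsigned char)(h >> 8);
    }
}
```

This is nowhere near as well mixed as MD5, so whether it counts as
"suitably random" for the node ID is a judgment call.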

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/