Posted to dev@httpd.apache.org by Paul Bayley <ba...@mac.com> on 2001/08/20 07:08:00 UTC

file attribute questions

Hey,
	Last time I posted here, it was suggested that I look into extending the definition of apr_finfo_t to include some filesystem-specific file attributes, basically to localize Apache 2.x for Darwin. I have a couple of questions before I try this.

* Should I modify srclib/apr/include/apr_file_info.h?

* What is the difference between mtime and ctime? Also, would anybody have any use for creation date (as opposed to modification date)? I don't think anything outside the mac world uses creation dates. 

Once I know which attributes I need to fill out, I can get them using one call of getattrlist(). Those attributes a file system can't fulfill will be null. 

Question: How does Apache write files? What if somebody uploads a file via webdav or something, and I wish to set the correct attributes. Is that even possible? (I don't know what information web browsers will send when uploading a file. I fear the worst in that they will only send the file name)

Re: file attribute questions

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
[Reply sent to dev@apr.apache.org - where this phase of this discussion belongs.
 if you aren't subscribed, it's dev-subscribe@apr.apache.org]

From: "Paul Bayley" <ba...@mac.com>
Sent: Tuesday, August 21, 2001 12:15 AM


> >> * Should I modify srclib/apr/include/apr_file_info.h?
> >
> >That will break binary compatibility.  Fine today, horrible tomorrow.
> >
> >I think we need an extension schema that won't keep breaking.  Care to propose one :-?
> >
> >If you wanted to add stuff that -many- systems support, I don't know that anybody would
> >object (physical size in storage [which maps to the '.size' if unknown], etc.)  Let's finish
> >defining those today, and agree to relegate all future growth into the extension schema.
> 
> UNIX hasn't changed in the last 30 years, I doubt it will in the next 5000.

To cite one specific FS change in Linux: discretionary access lists.  Let's not forget
that Darwin is a Unix, with several extensions.  Then there is IBM's OS families, and
OS2, Windows, NetThat's two.

I've clipped the rest of your Unix rant, it doesn't serve much purpose.  If you want to
contribute something, please start with the Darwin apr_dir_open/dir_read - it will speed
up Apache significantly if you can fill in the blanks as the directory is loaded - we
don't need a call to stat on each file if apr_dir_read can return things such as the
filetype (dir_read _must_ report symlinks for security to be effective), the time and
size values, etc.  I'm rather certain Darwin has an extended opendir/readdir that
returns these things.
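
A minimal sketch of the idea, assuming nothing about APR internals or Darwin-specific calls: on BSD-derived systems and glibc, readdir() exposes a d_type field (an extension that POSIX does not require) which can classify an entry without a per-file stat. All demo_* names are hypothetical. When the file system reports DT_UNKNOWN, the sketch falls back to lstat() rather than stat(), so a symlink is still reported as a symlink.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose d_type and the DT_* constants */
#endif
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical result type; not part of APR. */
typedef struct demo_counts_t {
    int dirs, links, files, others;
    int stat_calls;   /* how often d_type was missing and lstat() was needed */
} demo_counts_t;

/* Classify every entry of dirpath. Returns 0 on success, -1 on error. */
static int demo_classify_dir(const char *dirpath, demo_counts_t *out)
{
    DIR *d;
    struct dirent *e;

    assert(dirpath != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    d = opendir(dirpath);
    if (d == NULL)
        return -1;

    while ((e = readdir(d)) != NULL) {
        unsigned char type = e->d_type;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;

        if (type == DT_UNKNOWN) {
            /* No type from the file system: use lstat(), never stat(),
             * so that a symlink is not silently followed. */
            char path[4096];
            struct stat sb;

            snprintf(path, sizeof(path), "%s/%s", dirpath, e->d_name);
            out->stat_calls++;
            if (lstat(path, &sb) != 0) {
                out->others++;
                continue;
            }
            type = S_ISDIR(sb.st_mode) ? DT_DIR
                 : S_ISLNK(sb.st_mode) ? DT_LNK
                 : S_ISREG(sb.st_mode) ? DT_REG : DT_UNKNOWN;
        }

        if (type == DT_DIR)      out->dirs++;
        else if (type == DT_LNK) out->links++;
        else if (type == DT_REG) out->files++;
        else                     out->others++;
    }
    closedir(d);
    return 0;
}
```

Calling demo_classify_dir() on a directory and then inspecting stat_calls shows how many entries still needed a separate system call on that file system.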

Proceed to the apr_stat/apr_lstat calls, and fill in the .name entity, which I believe
Darwin can report in a single call with the stat info.  That will let us perform real
canonical filename testing.

ITMT, I will proceed to set up more win32-specific info and build the accessor.  I'll
try resolving the ctime vs. mtime debate on a bigger scale, and give you a template
to build Darwin-specific APR_MORE_INFO with.  I will kick this off with a post and
patch for discussion to dev@apr.apache.org, so we can keep moving.

Then I believe we will be ready to look at an apache core extension to mod_mime for
per-platform extra mime and charset info.  I've already got a hint that someone within
IBM would like to see this supported for their extended file systems ;)  That whole
discussion moves back to httpd@apache.org.

But the 'I don't like this about this OS' rants really don't go far unless you care to 
join the bsd or linux kernel lists, where they won't be circular filed, along with any
productive suggestions that were mixed in.  

Bill



Re: file attribute questions

Posted by Paul Bayley <ba...@mac.com>.
> >
>> * Should I modify srclib/apr/include/apr_file_info.h?
>
>That will break binary compatibility.  Fine today, horrible tomorrow.
>
>I think we need an extension schema that won't keep breaking.  Care to propose one :-?
>
>If you wanted to add stuff that -many- systems support, I don't know that anybody would
>object (physical size in storage [which maps to the '.size' if unknown], etc.)  Let's finish
>defining those today, and agree to relegate all future growth into the extension schema.

UNIX hasn't changed in the last 30 years, I doubt it will in the next 5000.

In other words I don't see anybody else using these.

If I were to propose a serious solution, I would start by defining an arbitrary metadata/file attribute standard for vfs, then plead that standard UNIX file management tools like cp use the API. But I don't see that happening.

I don't know enough about Apache to propose specific changes in the API, but I would like to propose that certain assumptions be dropped, such as the assumption that all mime-related declarations are tied only to filename extensions. I don't really think that suggestion will be taken seriously, so I'll just wait until I can implement what I want to, whenever that is.

>If we are adding the content/language stuff, I suggest that goes in the extensible
>mechanism.  Simply add an APR_FINFO_EXTRA bit if there might be something more to ask.

I guess. It's a pity most commonly used filesystems aren't more flexible, so that things like mime type file attributes would be easier to adopt and implement. Of course there would probably also have to be an extension to the ANSI standard library too. The decisions made 30 years ago still haunt me.

> > * What is the difference between mtime and ctime? Also, would anybody have any use for creation date (as opposed to modification date)? I don't think anything outside the mac world uses creation dates.
>
>modified versus created (I think that answers your question.)  And yes, Win32 NTFS
>and some other systems have ctime.  So does unix.

I thought ctime was when a file was last 'changed'. From apr_file_info.h:


    /** The time the file was last accessed */
    apr_time_t atime;
    /** The time the file was last modified */
    apr_time_t mtime;
    /** The time the file was last changed */
    apr_time_t ctime;

'changed' doesn't sound equivalent to 'created'. I'm pretty sure UNIX does not support creation date (since it didn't 30 years ago). Maybe somebody can double-check that. Basically the creation date shouldn't change, not even if the file was copied (thus the distinction).

> > Once I know which attributes I need to fill out, I can get them using one call of getattrlist(). Those attributes a file system can't fulfill will be null.
>
>Essentially, if you can grab them on your first apr_file_info_get or apr_stat call,
>then do so, but stash them in the more info pointer.  It can be an abstract (void*)
>member of the structure.  Allocate it on the pool.

If/when this 'more' info pointer exists, I may need some source code to get me going. I'm not yet familiar with Apache memory management.

> > Question: How does Apache write files? What if somebody uploads a file via webdav or something, and I wish to set the correct attributes. Is that even possible? (I don't know what information web browsers will send when uploading a file. I fear the worst in that they will only send the file name)
>
>mod_dav_fs extended the 'vanilla' DAV protocol to the file system.  If they upload
>by something other than webdav, I'm afraid it's that module/cgi author's job to
>deal with it.
>
>If we get mod_dav_fs right, it will be a fine example for the rest of those authors :)
>

So does webdav provide hooks to include more file attribute information? The problem is Apple is probably going to transition to webdav and I don't see how that's going to preserve important data like creation date when the standard http protocol doesn't.

Re: file attribute questions

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Tue, Aug 21, 2001 at 09:07:49AM -0500, William A. Rowe, Jr. wrote:
> Just out of curiosity, what apps
> would ever rely on ctime?

Hmm. You're right. ISTR that mail user agents were a candidate, but
I _think_ they only monitor mtime vs. atime to see if someone else
accessed the mailbox.

As it's extremely difficult to manipulate the ctime field, and very
difficult to set it to arbitrary values (which you could do with
the other two, using utime(2); BTW: A side effect of utime()
is that the ctime field is updated to the current time),
it can be used to monitor things like link creation,
permission modes' change etc. 

One well known program doing this is the tripwire program. 

Maybe a backup program could also be interested to see what to scan for
in an incremental backup?
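
The utime(2) side effect described above can be sketched as follows. This is an illustration under stated assumptions, not code from any of the programs mentioned: demo_backdate() is a hypothetical helper that creates a scratch file under /tmp, sets atime and mtime to a caller-chosen value with utime(), and returns the resulting stat data. The expectation is that mtime and atime take the requested value while ctime reflects the moment of the utime() call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict compiler modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>
#include <utime.h>

/* Hypothetical helper: create a scratch file, backdate its atime and mtime
 * to 'when' with utime(), and store the resulting stat data in *sb.
 * Returns 0 on success, -1 on failure. The scratch file is removed. */
static int demo_backdate(time_t when, struct stat *sb)
{
    char path[] = "/tmp/demo_utime_XXXXXX";
    struct utimbuf times;
    int rc = -1;
    int fd;

    assert(sb != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    times.actime = when;    /* arbitrary access time */
    times.modtime = when;   /* arbitrary modification time */

    /* utime() itself changes the inode, so the system stamps ctime
     * with the current time; the caller has no say in that value. */
    if (utime(path, &times) == 0 && stat(path, sb) == 0)
        rc = 0;

    close(fd);
    unlink(path);
    return rc;
}
```

After a successful call with a timestamp in the past, sb->st_mtime equals that timestamp while sb->st_ctime is later, which is the discrepancy an integrity checker can look for.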

   Martin
-- 
<Ma...@Fujitsu-Siemens.com>         |     Fujitsu Siemens
Fon: +49-89-636-46021, FAX: +49-89-636-47655 | 81730  Munich,  Germany

Re: file attribute questions

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Martin Kraemer" <Ma...@Fujitsu-Siemens.com>
Sent: Tuesday, August 21, 2001 1:36 AM


> On Mon, Aug 20, 2001 at 12:34:37AM -0500, William A. Rowe, Jr. wrote:
> > > * What is the difference between mtime and ctime? Also, would anybody have any use for creation date (as opposed to modification date)? I don't think anything outside the mac world uses creation dates.
> >
> > modified versus created (I think that answers your question.)  And yes, Win32 NTFS
> > and some other systems have ctime.  So does unix.
>
> Ehmmm.... As a nonWin32 enthusiast may I please interrupt here...

[enthusiast ;-?]

> It *may* be true that the windows API mis-interpreted the commonly
> accepted unix semantics for what the ctime field stands. In ALL unix
> systems, as well as in Linux, it stands for "time of last change to
> the inode".

Thanks for refocusing my understanding.

> <sarcast>
> I certainly do believe that microsoft had a hard time
> to fit that into the existing DOS and Win9x world, but then hey,
> the correct definition is clear.
> </sarcast>

msvcrt's 'struct stat' aside, MS has always spelled out that this is a file
creation time.  We fell in the trap of propagating their bs from their clib.
This will be fixed this week.

Thank you for the detailed picture of unix ctime, I'll probably be trying some
other experiments myself to build on this.  Just out of curiosity, what apps
would ever rely on ctime?

Bill



Re: file attribute questions

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Tue, Aug 21, 2001 at 08:36:33AM +0200, Kraemer, Martin wrote:
> Luckily there are VERY few programs which rely on the correct implementation
> of the semantics of the ctime field.

I failed to give an example for a program which relies on the unix
semantics for ctime.

Let's first recall that the *system* sets the value of the ctime field
whenever *the system* makes a change to the inode. There is no function
to manipulate the st_ctime value and set it to arbitrary values
(unless you consider changing the hardware clock to arbitrary values
an "interface").

Based on that fact, the value of the ctime field cannot be controlled
by a non-super-user, and can be used to monitor changes to a file,
for example:
  - change in number of hard links to the file
  - change in size, or inode allocations,
  - but also, changing of the mtime or atime stamps (e.g. to "hide"
    the malevolent modification of a /usr/sbin/sshd trojan)

And it is this functionality which is used for example by the
well known tripwire program to monitor the integrity of important
system files. A ctime change on a system file CAN point to trouble.

   Martin
-- 
<Ma...@Fujitsu-Siemens.com>    |       Fujitsu Siemens
       <ma...@apache.org>              |   81730  Munich,  Germany

Re: file attribute questions

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Mon, Aug 20, 2001 at 12:34:37AM -0500, William A. Rowe, Jr. wrote:
> > * What is the difference between mtime and ctime? Also, would anybody have any use for creation date (as opposed to modification
> date)? I don't think anything outside the mac world uses creation dates.
> 
> modified versus created (I think that answers your question.)  And yes, Win32 NTFS
> and some other systems have ctime.  So does unix.

Ehmmm.... As a nonWin32 enthusiast may I please interrupt here...

It *may* be true that the windows API mis-interpreted the commonly
accepted unix semantics for what the ctime field stands. In ALL unix
systems, as well as in Linux, it stands for "time of last change to
the inode".
<sarcast>
I certainly do believe that microsoft had a hard time
to fit that into the existing DOS and Win9x world, but then hey,
the correct definition is clear.
</sarcast>

To illustrate the intricacies of "a change to the inode", let me show you
what actually happens to the three time stamps mtime, ctime, atime
when you do a set of operations on the file.

Here's a script which does that (on FreeBSD, there's a "ls -T" flag which
shows the time stamp to the second.)

--snip--
#!/bin/sh
set -x
rm -f the.file another.name
date | tee the.file
ls -Tl the.file
ls -Tlc the.file
ls -Tlu the.file
sleep 3
touch the.file
ls -Tl the.file
ls -Tlc the.file
ls -Tlu the.file
sleep 3
ln the.file another.name
ls -Tl the.file
ls -Tlc the.file
ls -Tlu the.file
sleep 3
cat another.name
ls -Tl the.file
ls -Tlc the.file
ls -Tlu the.file
sleep 3
date >>another.name
ls -Tl the.file
ls -Tlc the.file
ls -Tlu the.file
--snip--

And here's the output:
---start---
+ rm -f the.file another.name
+ date
+ tee the.file
Di  21 Aug 2001 08:26:34 CEST
+ ls -Tl the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:34 2001 the.file
+ ls -Tlc the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:34 2001 the.file
+ ls -Tlu the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:34 2001 the.file
+ sleep 3
+ touch the.file
+ ls -Tl the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ ls -Tlc the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ ls -Tlu the.file
-rw-r-----  1 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ sleep 3
+ ln the.file another.name
+ ls -Tl the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ ls -Tlc the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:40 2001 the.file
+ ls -Tlu the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ sleep 3
+ cat another.name
Di  21 Aug 2001 08:26:34 CEST
+ ls -Tl the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:37 2001 the.file
+ ls -Tlc the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:40 2001 the.file
+ ls -Tlu the.file
-rw-r-----  2 martin  kraemer  30 21 Aug 08:26:43 2001 the.file
+ sleep 3
+ date
+ ls -Tl the.file
-rw-r-----  2 martin  kraemer  60 21 Aug 08:26:46 2001 the.file
+ ls -Tlc the.file
-rw-r-----  2 martin  kraemer  60 21 Aug 08:26:46 2001 the.file
+ ls -Tlu the.file
-rw-r-----  2 martin  kraemer  60 21 Aug 08:26:43 2001 the.file
---end---

Explanation:
* date | tee  -> the file is created and filled with its creation time.
  (...three seconds later):
* touch: the mtime field is updated to the current time (note that
  this also updates atime & ctime).
  (...three seconds later):
* a hard link is created from the file (so to speak, an alias name is
  created). Because the "link count" in the inode increases from 1 to 2,
  this means that a "change to the inode" occurred, and it changes the
  ctime (and the ctime ONLY). 
  At this point, people who implement or interpret "ctime" as "creation
  time" will have a hard time explaining why the.file was created NOW,
  but last modified THREE SECONDS AGO....
  Note that the two files the.file and another.name share the same inode,
  it exists only once. All file operations on another.name which modify the
  inode will be visible for the.file as well.
  (...another three seconds later):
* cat accesses the file under its alias name another.name (this access
  is identical, in all respects, to an access of the original the.file;
  the file's contents is its creation time stamp), so the atime field
  is updated. We now have three different times on the file.
  (...another three seconds later):
* date appends to "another.name" (again, this access is identical, in
  all respects, to an append to the original the.file). Note that the
  mtime changed (of course, because we made a modification NOW), but
  also the ctime changed (because the byte count of the file is
  information stored in the inode, so the inode changed), but the
  access time, atime, did NOT change.
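
The hard-link step of the explanation above can also be checked from C. The following is a minimal sketch, assuming a writable /tmp that supports hard links; demo_link_effect() is a hypothetical helper, and the pause is only there because time_t has one-second resolution.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict compiler modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a scratch file, wait pause_secs seconds,
 * add a hard link to it, and record the inode state before and after
 * the link() call. Returns 0 on success, -1 on failure. Both names
 * are removed again before returning. */
static int demo_link_effect(unsigned pause_secs,
                            struct stat *before, struct stat *after)
{
    char path[] = "/tmp/demo_link_XXXXXX";
    char alias[sizeof(path) + 8];
    int rc = -1;
    int fd;

    assert(before != NULL && after != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    snprintf(alias, sizeof(alias), "%s.alias", path);

    if (stat(path, before) == 0) {
        sleep(pause_secs);              /* let the clock move on */
        if (link(path, alias) == 0) {   /* link count goes from 1 to 2 */
            if (stat(path, after) == 0)
                rc = 0;
            unlink(alias);
        }
    }
    close(fd);
    unlink(path);
    return rc;
}
```

With a pause of a couple of seconds, the after snapshot should show st_nlink raised to 2 and a later st_ctime, while st_mtime is the same as in the before snapshot.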

Luckily there are VERY few programs which rely on the correct implementation
of the semantics of the ctime field.

Hope this helps to clarify things,

   Martin
-- 
<Ma...@Fujitsu-Siemens.com>    |       Fujitsu Siemens
       <ma...@apache.org>              |   81730  Munich,  Germany

Re: file attribute questions

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Paul Bayley" <ba...@mac.com>
Sent: Monday, August 20, 2001 12:08 AM


> Hey,
> Last time I posted here it was suggested that I look into extending the definition of apr_finfo_t to include some filesystem-specific file attributes. Basically localize Apache 2.x for Darwin. I have a couple questions before I try this.
>
> * Should I modify srclib/apr/include/apr_file_info.h?

That will break binary compatibility.  Fine today, horrible tomorrow.

I think we need an extension schema that won't keep breaking.  Care to propose one :-?

If you wanted to add stuff that -many- systems support, I don't know that anybody would
object (physical size in storage [which maps to the '.size' if unknown], etc.)  Let's finish
defining those today, and agree to relegate all future growth into the extension schema.

If we are adding the content/language stuff, I suggest that goes in the extensible
mechanism.  Simply add an APR_FINFO_EXTRA bit if there might be something more to ask.

> * What is the difference between mtime and ctime? Also, would anybody have any use for creation date (as opposed to modification date)? I don't think anything outside the mac world uses creation dates.

modified versus created (I think that answers your question.)  And yes, Win32 NTFS
and some other systems have ctime.  So does unix.

> Once I know which attributes I need to fill out, I can get them using one call of getattrlist(). Those attributes a file system can't fulfill will be null.

Essentially, if you can grab them on your first apr_file_info_get or apr_stat call,
then do so, but stash them in the more info pointer.  It can be an abstract (void*)
member of the structure.  Allocate it on the pool.

Win32 can't get more info without extra system calls, which we wouldn't do up front.
We would write the same accessor for the extension stuff, only it must perform
the system calls when the user calls that accessor.

I guess you will just dig into that allocation and return a pointer to whatever
interests the user.
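
A rough sketch of that shape, with loud caveats: none of the names below exist in APR, the flag values are made up, and malloc() stands in for a pool allocation so the example is self-contained. The point is only that the public structure keeps a fixed layout (a validity bitmask plus one opaque pointer), while the platform-private block behind the pointer can grow without breaking binary compatibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in APR. */
#define DEMO_FINFO_SIZE  0x01u   /* the portable 'size' field is valid */
#define DEMO_FINFO_EXTRA 0x02u   /* platform data hangs off 'more' */

typedef struct demo_finfo_t {
    unsigned int valid;   /* bitmask of the fields that were filled in */
    long long size;       /* portable field with a fixed layout */
    void *more;           /* opaque; only platform code knows its shape */
} demo_finfo_t;

/* Platform-private layout. Callers never see this type, so fields can
 * be added later without touching demo_finfo_t. */
typedef struct demo_darwin_more_t {
    long long create_time;   /* seconds since the epoch */
    char type_code[5];       /* four-character type code plus NUL */
} demo_darwin_more_t;

/* Fill in the extra block during the first stat-like call.
 * Real code would allocate from the caller's pool instead of malloc(). */
static int demo_fill_more(demo_finfo_t *finfo, long long create_time,
                          const char *type_code)
{
    demo_darwin_more_t *m;

    assert(finfo != NULL && type_code != NULL);
    m = malloc(sizeof(*m));
    if (m == NULL)
        return -1;
    m->create_time = create_time;
    strncpy(m->type_code, type_code, 4);
    m->type_code[4] = '\0';
    finfo->more = m;
    finfo->valid |= DEMO_FINFO_EXTRA;
    return 0;
}

/* Accessor: digs into the private block and hands back one item.
 * Returns 0 on success, -1 if no extra information was gathered. */
static int demo_more_create_time(const demo_finfo_t *finfo, long long *out)
{
    assert(finfo != NULL && out != NULL);
    if (!(finfo->valid & DEMO_FINFO_EXTRA) || finfo->more == NULL)
        return -1;
    *out = ((const demo_darwin_more_t *)finfo->more)->create_time;
    return 0;
}
```

On a platform that needs extra system calls to obtain this data, the same accessor signature could perform them lazily on first use instead of reading a block that was filled in up front.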

> Question: How does Apache write files? What if somebody uploads a file via webdav or something, and I wish to set the correct attributes. Is that even possible? (I don't know what information web browsers will send when uploading a file. I fear the worst in that they will only send the file name)

mod_dav_fs extended the 'vanilla' DAV protocol to the file system.  If they upload
by something other than webdav, I'm afraid it's that module/cgi author's job to
deal with it.

If we get mod_dav_fs right, it will be a fine example for the rest of those authors :)

Bill