You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Jeff Trawick <tr...@bellsouth.net> on 2000/06/14 23:25:05 UTC

APR file i/o questions

Should the read-type functions return APR_EOF at end-of-file on all
platforms?

  (Plenty of folks seem to agree: Ryan, Jeff, Brian, James)

Should ap_ungetc() work on all files, buffered or not?

  (Jeff thinks so.)

Should ap_fgets() on Unix have to check for '\r' *and* '\n' (i.e.,
think about '\r' at all)?

  (Jeff thinks not.  fgets() has managed without it forever)

same question for OS/2 and Win32..

  (Jeff thinks it should deal with '\r' and '\n' specially.)

*If* ap_fgets() on a platform checks for '\r' *and* '\n', what should
it do?

  (Jeff thinks "\r\n" should collapse to "\n".  The current Unix
  ap_fgets() in CVS is weird.  If the file has "Mary\r\n", subsequent
  calls to ap_fgets() will return "Mary\r" and "\n".  Not cool.)

What about locking within the buffered I/O support?

  (Jeff thinks that by default there should be no locking, but that a
  flag on ap_open() (APR_MT) can turn it on.  This flag would be
  ignored (not failed) if !APR_HAS_THREADS.  Even in a multithreaded
  app it seems uncommon for more than one thread to access the same
  file at once, so this is a reasonable default.

  Something like the following defined internal to readwrite.c should
  suffice for locking when needed:

   #ifdef APR_HAS_THREADS
   #define lock_buffer(f) \
       if ((f)->mt) ap_lock((f)->thlock);
   #define unlock_buffer(f) \
       if ((f)->mt) ap_unlock((f)->thlock);
   #else
   #define lock_buffer(f)
   #define unlock_buffer(f)
   #endif

What about Jeff's patch to Unix readwrite.c he posted earlier?

  (Jeff is 70% sure he should change it to call ap_read() from
  ap_fgets() but he will wait until there is a chance for some answers
  to these questions before playing more in the code.  Hopefully the
  next time he is coding there he can handle the resolution of more of
  the questions above.)

-- 
Jeff Trawick | trawick@ibm.net | PGP public key at web site:
     http://www.geocities.com/SiliconValley/Park/9289/
          Born in Roswell... married an alien...

Re: APR file i/o questions

Posted by dean gaudet <dg...@arctic.org>.

On Tue, 20 Jun 2000, Bill Stoddard wrote:

> > On Fri, 16 Jun 2000, Manoj Kasichainula wrote:
> >
> > > > Should ap_ungetc() work on all files, buffered or not?
> > >
> > > I'm not a big fan of the whole "ungetc" concept, though we use it a
> > > decent amount in our code now. +1 on removing all ungetc's if it's not
> > > too hard; no opinion on the actual question :)
> >
> > ungetc exists because most of the stuff folks scan for fall into the
> > category of LALR(1) languages.
> >
> > you wouldn't ever use ungetc with non-buffered i/o... 'cause ungetc is all
> > about scanning text, not bulk data movement.
> >
> 
> We should maintain ungetc on bufferd i/o. How should the call behave?  Should we be able
> to call ap_etc() multiple times (implementation would move the buf pointer back once per
> call) or should ap_ungetc() only be called once (implementation would save the character
> in thefile->ungetchar, as in the Unix implementation today).

ungetc() is defined to work for only one character of lookahead.  from
single unix:

	One byte of push-back is guaranteed. If ungetc() is called too
	many times on the same stream without an intervening read or
	file-positioning operation on that stream, the operation may fail.

-dean


Re: APR file i/o questions

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
> On Fri, 16 Jun 2000, Manoj Kasichainula wrote:
>
> > > Should ap_ungetc() work on all files, buffered or not?
> >
> > I'm not a big fan of the whole "ungetc" concept, though we use it a
> > decent amount in our code now. +1 on removing all ungetc's if it's not
> > too hard; no opinion on the actual question :)
>
> ungetc exists because most of the stuff folks scan for fall into the
> category of LALR(1) languages.
>
> you wouldn't ever use ungetc with non-buffered i/o... 'cause ungetc is all
> about scanning text, not bulk data movement.
>

We should maintain ungetc on bufferd i/o. How should the call behave?  Should we be able
to call ap_etc() multiple times (implementation would move the buf pointer back once per
call) or should ap_ungetc() only be called once (implementation would save the character
in thefile->ungetchar, as in the Unix implementation today).

Bill


Re: APR file i/o questions

Posted by dean gaudet <dg...@arctic.org>.

On Fri, 16 Jun 2000, Manoj Kasichainula wrote:

> > Should ap_ungetc() work on all files, buffered or not?
> 
> I'm not a big fan of the whole "ungetc" concept, though we use it a
> decent amount in our code now. +1 on removing all ungetc's if it's not
> too hard; no opinion on the actual question :)

ungetc exists because most of the stuff folks scan for fall into the
category of LALR(1) languages.

you wouldn't ever use ungetc with non-buffered i/o... 'cause ungetc is all
about scanning text, not bulk data movement.

-dean


Re: APR file i/o questions

Posted by Manoj Kasichainula <ma...@io.com>.
On Wed, Jun 14, 2000 at 05:25:05PM -0400, Jeff Trawick wrote:
> Should the read-type functions return APR_EOF at end-of-file on all
> platforms?

+1

> Should ap_ungetc() work on all files, buffered or not?

I'm not a big fan of the whole "ungetc" concept, though we use it a
decent amount in our code now. +1 on removing all ungetc's if it's not
too hard; no opinion on the actual question :)

> Should ap_fgets() on Unix have to check for '\r' *and* '\n' (i.e.,
> think about '\r' at all)?

Naah.

> same question for OS/2 and Win32..

This is what ASCII vs. binary mode in ANSI C FILE *'s is for. Blecch.
I don't want to even *think* about that possibility.

> What about locking within the buffered I/O support?
> 
>   (Jeff thinks that by default there should be no locking, but that a
>   flag on ap_open() (APR_MT) can turn it on.

I'm pondering the possibility of an IOL that does nothing but locking
and liking it so far.


Re: APR file i/o questions

Posted by Brian Havard <br...@kheldar.apana.org.au>.
On 16 Jun 2000 10:36:43 -0000, Jeff Trawick \(httpd\) wrote:

>> From: "Brian Havard" <br...@kheldar.apana.org.au>
>> 
>> On Wed, 14 Jun 2000 17:25:05 -0400, Jeff Trawick wrote:
>> 
>> >Should ap_ungetc() work on all files, buffered or not?
>> >
>> >  (Jeff thinks so.)
>> 
>> Yes, though I'm not sure it's needed in its current form. What we actually
>> want is the ability to back up 1 char, IE seek -1. We don't need to be able
>> to specify what the ungot character is.
>
>That may be what the app, but it is faster if the app specifies the
>ungot character (no need to make a syscall to seek).
>
>By the way... does anybody want ap_seek() to work on buffered files?
>I guess for read we can discard the buffer contents first; for write
>we can flush first; I don't think it *has* to be supported though.

These two are related. Yes, ap_seek() should work on buffered files (I
haven't tested the unix version but it looks ok) and it should be very
cheap (no syscall) if the new position is within the buffered data. That's
why doing a -1 seek instead of an ungetc is not an expensive option.


>>  ap_status_t ap_getc(char *ch, ap_file_t *thefile)
>>  {
>
>ap_read() has to take care of *all* these details (ungetchar, eof_hit,
>setting APR_EOF if eof); ap_getc() just has to be a teeny tiny
>wrapper.

Hmmm, yeah, it can be cleaned up further.



>>      for (i = beg_idx; i < len; i++) {
>> -        rv = read(thefile->filedes, &str[i], 1); 
>> +        rv = 1;
>> +        status = ap_read(thefile, str+i, &rv);
>>          if (rv == 0) {
>>              thefile->eof_hit = TRUE;
>
>ap_read() has to do this... why do it again?

Well, actually only the buffered section does thefile->eof_hit = TRUE;
I was just making the basics work, not cleaning it all up....



>>  	    if (used_unget) {
>> @@ -388,7 +393,7 @@
>>              return APR_EOF;
>>          }
>>          else if (rv != 1) {
>> -            return errno;
>> +            return status;
>
>ap_read() does this...
>
>>          }
>>          if (str[i] == '\n' || str[i] == '\r') {
>>              break;
>
>In other words, if we're gonna call ap_read() from ap_getc() and
>ap_fgets(), then collapse to the minimum and don't do any work twice.
>
>I'll post another version after a while.

Okay, go for it.....

-- 
 ______________________________________________________________________________
 |  Brian Havard                 |  "He is not the messiah!                   |
 |  brianh@kheldar.apana.org.au  |  He's a very naughty boy!" - Life of Brian |
 ------------------------------------------------------------------------------


Re: APR file i/o questions

Posted by "Jeff Trawick (httpd)" <tr...@ibm.net>.
> From: "Brian Havard" <br...@kheldar.apana.org.au>
> Date: Fri, 16 Jun 2000 19:05:08 +1000 (EST)
> 
> On Wed, 14 Jun 2000 17:25:05 -0400, Jeff Trawick wrote:
> 
> >Should ap_ungetc() work on all files, buffered or not?
> >
> >  (Jeff thinks so.)
> 
> Yes, though I'm not sure it's needed in its current form. What we actually
> want is the ability to back up 1 char, IE seek -1. We don't need to be able
> to specify what the ungot character is.

That may be what the app, but it is faster if the app specifies the
ungot character (no need to make a syscall to seek).

By the way... does anybody want ap_seek() to work on buffered files?
I guess for read we can discard the buffer contents first; for write
we can flush first; I don't think it *has* to be supported though.

> >What about Jeff's patch to Unix readwrite.c he posted earlier?
> >
> >  (Jeff is 70% sure he should change it to call ap_read() from
> >  ap_fgets() but he will wait until there is a chance for some answers
> >  to these questions before playing more in the code.  Hopefully the
> >  next time he is coding there he can handle the resolution of more of
> >  the questions above.)
> 
> I've got it working with just this (yes, I have a linux box too):
> 
> 
> Index: file_io/unix/open.c
> ===================================================================
> RCS file: /home/cvs/apache-2.0/src/lib/apr/file_io/unix/open.c,v
> retrieving revision 1.54
> diff -u -w -r1.54 open.c
> --- file_io/unix/open.c	2000/06/15 17:39:16	1.54
> +++ file_io/unix/open.c	2000/06/16 08:40:59
> @@ -112,6 +112,10 @@
>  
>      (*new)->buffered = (flag & APR_BUFFERED) > 0;
>  
> +    if ((*new)->buffered) {
> +        (*new)->buffer = ap_palloc(cont, APR_FILE_BUFSIZE);
> +    }
> +
>      if (flag & APR_CREATE) {
>          oflags |= O_CREAT; 
>  	if (flag & APR_EXCL) {
> Index: file_io/unix/readwrite.c
> ===================================================================
> RCS file: /home/cvs/apache-2.0/src/lib/apr/file_io/unix/readwrite.c,v
> retrieving revision 1.54
> diff -u -w -r1.54 readwrite.c
> --- file_io/unix/readwrite.c	2000/06/14 02:58:34	1.54
> +++ file_io/unix/readwrite.c	2000/06/16 08:41:01
> @@ -301,6 +301,7 @@
>  
>  ap_status_t ap_getc(char *ch, ap_file_t *thefile)
>  {

ap_read() has to take care of *all* these details (ungetchar, eof_hit,
setting APR_EOF if eof); ap_getc() just has to be a teeny tiny
wrapper.

> +    ap_status_t status;
>      ssize_t rv;
>      
>      if (thefile->ungetchar != -1) {
> @@ -308,13 +309,15 @@
>          thefile->ungetchar = -1;
>          return APR_SUCCESS;
>      }
> -    rv = read(thefile->filedes, ch, 1); 
> +
> +    rv = 1;
> +    status = ap_read(thefile, ch, &rv);
>      if (rv == 0) {
>          thefile->eof_hit = TRUE;
>          return APR_EOF;
>      }
>      else if (rv != 1) {
> -        return errno;
> +        return status;
>      }
>      return APR_SUCCESS; 
>  }
> @@ -358,6 +361,7 @@
>  {
>      ssize_t rv;
>      int i, used_unget = FALSE, beg_idx;
> +    ap_status_t status;
>  
>      if (len <= 1) {  /* as per fgets() */
>          return APR_SUCCESS;
> @@ -378,7 +382,8 @@
>      }
>      
>      for (i = beg_idx; i < len; i++) {
> -        rv = read(thefile->filedes, &str[i], 1); 
> +        rv = 1;
> +        status = ap_read(thefile, str+i, &rv);
>          if (rv == 0) {
>              thefile->eof_hit = TRUE;

ap_read() has to do this... why do it again?

>  	    if (used_unget) {
> @@ -388,7 +393,7 @@
>              return APR_EOF;
>          }
>          else if (rv != 1) {
> -            return errno;
> +            return status;

ap_read() does this...

>          }
>          if (str[i] == '\n' || str[i] == '\r') {
>              break;

In other words, if we're gonna call ap_read() from ap_getc() and
ap_fgets(), then collapse to the minimum and don't do any work twice.

I'll post another version after a while.


Re: [PATCH] Re: APR file i/o questions

Posted by Brian Havard <br...@kheldar.apana.org.au>.
On 16 Jun 2000 13:30:13 -0000, Jeff Trawick \(httpd\) wrote:

>Differences between this one and what Brian posted earlier today:
>
>1) this fixes a problem in where ap_open() where a lock is created for
>   non-buffered files
>2) this fixes problems setting rv correctly in the ap_read() buffered 
>   path
>3) since ap_read() works as expected, it is possible to make ap_getc()
>   and ap_fgets() even smaller
>4) ap_fgets() no longer cares about '\r'
>
>ap_fgets() and ap_getc() are copied in their entirety at the end as
>it is hard to piece together the little remaining code from the diff.

Ok, that looks a lot nicer. Might not be 100% the way I would have done it
but that's not necessarily a bad thing.....

-- 
 ______________________________________________________________________________
 |  Brian Havard                 |  "He is not the messiah!                   |
 |  brianh@kheldar.apana.org.au  |  He's a very naughty boy!" - Life of Brian |
 ------------------------------------------------------------------------------


[PATCH] Re: APR file i/o questions

Posted by "Jeff Trawick (httpd)" <tr...@ibm.net>.
Differences between this one and what Brian posted earlier today:

1) this fixes a problem in where ap_open() where a lock is created for
   non-buffered files
2) this fixes problems setting rv correctly in the ap_read() buffered 
   path
3) since ap_read() works as expected, it is possible to make ap_getc()
   and ap_fgets() even smaller
4) ap_fgets() no longer cares about '\r'

ap_fgets() and ap_getc() are copied in their entirety at the end as
it is hard to piece together the little remaining code from the diff.

This doesn't attempt to change the lock design.

This does add more paths where you don't get both bytes read and
non-zero status, status being saved for the next call.  There were
such paths already and this is a familiar paradigm, so I don't have
any guilt with this.

Unix didn't have the ap_fgets() optimization that Greg S. wanted to
keep in the Win32 implementation and I didn't attempt to implement it
for Unix.

Does anybody want to speak up with specific objections or improvements
before I commit?

Thanks,

Jeff

Index: file_io/unix/open.c
===================================================================
RCS file: /cvs/apache/apache-2.0/src/lib/apr/file_io/unix/open.c,v
retrieving revision 1.54
diff -u -r1.54 open.c
--- open.c	2000/06/15 17:39:16	1.54
+++ open.c	2000/06/16 13:12:37
@@ -82,6 +82,9 @@
 ap_status_t ap_open(ap_file_t **new, const char *fname, ap_int32_t flag,  ap_fileperms_t perm, ap_pool_t *cont)
 {
     int oflags = 0;
+#if APR_HAS_THREADS
+    ap_status_t rv;
+#endif
 
     if ((*new) == NULL) {
         (*new) = (ap_file_t *)ap_pcalloc(cont, sizeof(ap_file_t));
@@ -90,10 +93,6 @@
     (*new)->cntxt = cont;
     (*new)->oflags = oflags;
     (*new)->filedes = -1;
-    (*new)->buffer = NULL;
-#if APR_HAS_THREADS
-    ap_create_lock(&((*new)->thlock), APR_MUTEX, APR_INTRAPROCESS, NULL, cont);
-#endif
 
     if ((flag & APR_READ) && (flag & APR_WRITE)) {
         oflags = O_RDWR;
@@ -111,6 +110,20 @@
     (*new)->fname = ap_pstrdup(cont, fname);
 
     (*new)->buffered = (flag & APR_BUFFERED) > 0;
+
+    if ((*new)->buffered) {
+        (*new)->buffer = ap_palloc(cont, APR_FILE_BUFSIZE);
+#if APR_HAS_THREADS
+        rv = ap_create_lock(&((*new)->thlock), APR_MUTEX, APR_INTRAPROCESS, 
+                            NULL, cont);
+        if (rv) {
+            return rv;
+        }
+#endif
+    }
+    else {
+        (*new)->buffer = NULL;
+    }
 
     if (flag & APR_CREATE) {
         oflags |= O_CREAT; 
Index: file_io/unix/readwrite.c
===================================================================
RCS file: /cvs/apache/apache-2.0/src/lib/apr/file_io/unix/readwrite.c,v
retrieving revision 1.54
diff -u -r1.54 readwrite.c
--- readwrite.c	2000/06/14 02:58:34	1.54
+++ readwrite.c	2000/06/16 13:12:37
@@ -98,6 +98,9 @@
 }
 #endif
 
+/* problems: 
+ * 1) ungetchar not used for buffered files
+ */
 ap_status_t ap_read(ap_file_t *thefile, void *buf, ap_ssize_t *nbytes)
 {
     ap_ssize_t rv;
@@ -130,8 +133,13 @@
                 thefile->dataRead = read(thefile->filedes, thefile->buffer, APR_FILE_BUFSIZE);
                 if (thefile->dataRead == 0) {
                     thefile->eof_hit = TRUE;
+                    rv = APR_EOF;
                     break;
                 }
+                else if (thefile->dataRead == -1) {
+                    rv = errno;
+                    break;
+                }
                 thefile->filePtr += thefile->dataRead;
                 thefile->bufpos = 0;
             }
@@ -143,7 +151,10 @@
             size -= blocksize;
         }
 
-        *nbytes = rv == 0 ? pos - (char *)buf : 0;
+        *nbytes = pos - (char *)buf;
+        if (*nbytes) {
+            rv = 0;
+        }
 #if APR_HAS_THREADS
         ap_unlock(thefile->thlock);
 #endif
@@ -301,22 +312,9 @@
 
 ap_status_t ap_getc(char *ch, ap_file_t *thefile)
 {
-    ssize_t rv;
-    
-    if (thefile->ungetchar != -1) {
-        *ch = (char) thefile->ungetchar;
-        thefile->ungetchar = -1;
-        return APR_SUCCESS;
-    }
-    rv = read(thefile->filedes, ch, 1); 
-    if (rv == 0) {
-        thefile->eof_hit = TRUE;
-        return APR_EOF;
-    }
-    else if (rv != 1) {
-        return errno;
-    }
-    return APR_SUCCESS; 
+    ap_ssize_t nbytes = 1;
+
+    return ap_read(thefile, ch, &nbytes);
 }
 
 ap_status_t ap_puts(char *str, ap_file_t *thefile)
@@ -356,59 +354,34 @@
 
 ap_status_t ap_fgets(char *str, int len, ap_file_t *thefile)
 {
-    ssize_t rv;
-    int i, used_unget = FALSE, beg_idx;
-
-    if (len <= 1) {  /* as per fgets() */
+    ap_status_t rv = APR_SUCCESS; /* get rid of gcc warning */
+    ap_ssize_t nbytes;
+    char *final = str + len - 1;
+
+    if (len <= 1) {  
+        /* sort of like fgets(), which returns NULL and stores no bytes 
+         */
         return APR_SUCCESS;
     }
 
-    if (thefile->ungetchar != -1) {
-        str[0] = thefile->ungetchar;
-	used_unget = TRUE;
-	beg_idx = 1;
-	if (str[0] == '\n' || str[0] == '\r') {
-	    thefile->ungetchar = -1;
-	    str[1] = '\0';
-	    return APR_SUCCESS;
-	}
-    } 
-    else {
-        beg_idx = 0;
-    }
-    
-    for (i = beg_idx; i < len; i++) {
-        rv = read(thefile->filedes, &str[i], 1); 
-        if (rv == 0) {
-            thefile->eof_hit = TRUE;
-	    if (used_unget) {
-                thefile->filedes = -1;
-            }
-	    str[i] = '\0';
-            return APR_EOF;
-        }
-        else if (rv != 1) {
-            return errno;
+    while (str < final) { /* leave room for trailing '\0' */
+        nbytes = 1;
+        rv = ap_read(thefile, str, &nbytes);
+        if (rv != APR_SUCCESS) {
+            break;
         }
-        if (str[i] == '\n' || str[i] == '\r') {
+        if (*str == '\n') {
+            ++str;
             break;
         }
+        ++str;
     }
-    if (i < len-1) {
-        str[i+1] = '\0';
-    }
-    return APR_SUCCESS; 
-}
-
-#if 0 /* not currently used */
-static int printf_flush(ap_vformatter_buff_t *vbuff)
-{
-    /* I would love to print this stuff out to the file, but I will
-     * get that working later.  :)  For now, just return.
+    /* We must store a terminating '\0' if we've stored any chars. We can
+     * get away with storing it if we hit an error first. 
      */
-    return -1;
+    *str = '\0'; 
+    return rv;
 }
-#endif
 
 APR_EXPORT(int) ap_fprintf(ap_file_t *fptr, const char *format, ...)
 {
Index: test/testfile.c
===================================================================
RCS file: /cvs/apache/apache-2.0/src/lib/apr/test/testfile.c,v
retrieving revision 1.13
diff -u -r1.13 testfile.c
--- testfile.c	2000/05/24 22:32:44	1.13
+++ testfile.c	2000/06/16 13:12:39
@@ -52,6 +52,7 @@
  * <http://www.apache.org/>.
  */
 
+#include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include "apr_file_io.h"
@@ -65,6 +66,7 @@
 
 int test_filedel(ap_pool_t *);
 int testdirs(ap_pool_t *);
+static void test_read(ap_pool_t *);
 
 int main()
 {
@@ -239,6 +241,7 @@
 
     testdirs(context); 
     test_filedel(context);
+    test_read(context);
 
     return 1;
 }
@@ -382,4 +385,201 @@
     
     return 1; 
 }    
+
+#define TESTREAD_BLKSIZE 1024
+#define APR_BUFFERSIZE   4096 /* This should match APR's buffer size. */
+
+static void create_testread(ap_pool_t *p, const char *fname)
+{
+    ap_file_t *f = NULL;
+    ap_status_t rv;
+    char buf[TESTREAD_BLKSIZE];
+    ap_ssize_t nbytes;
+
+    /* Create a test file with known content.
+     */
+    rv = ap_open(&f, fname, APR_CREATE | APR_WRITE | APR_TRUNCATE, APR_UREAD | APR_UWRITE, p);
+    if (rv) {
+        fprintf(stderr, "ap_open()->%d/%s\n",
+                rv, ap_strerror(rv, buf, sizeof buf));
+        exit(1);
+    }
+    nbytes = 4;
+    rv = ap_write(f, "abc\n", &nbytes);
+    assert(!rv && nbytes == 4);
+    memset(buf, 'a', sizeof buf);
+    nbytes = sizeof buf;
+    rv = ap_write(f, buf, &nbytes);
+    assert(!rv && nbytes == sizeof buf);
+    nbytes = 2;
+    rv = ap_write(f, "\n\n", &nbytes);
+    assert(!rv && nbytes == 2);
+    rv = ap_close(f);
+    assert(!rv);
+}
+
+static char read_one(ap_file_t *f, int expected)
+{
+  char bytes[3];
+  ap_status_t rv;
+  static int counter = 0;
+  ap_ssize_t nbytes;
+
+  counter += 1;
+
+  bytes[0] = bytes[2] = 0x01;
+  if (counter % 2) {
+      rv = ap_getc(bytes + 1, f);
+  }
+  else {
+      nbytes = 1;
+      rv = ap_read(f, bytes + 1, &nbytes);
+      assert(nbytes == 1);
+  }
+  assert(!rv);
+  assert(bytes[0] == 0x01 && bytes[2] == 0x01);
+  if (expected != -1) {
+      assert(bytes[1] == expected);
+  }
+  return bytes[1];
+}
+
+static void test_read_guts(ap_pool_t *p, const char *fname, ap_int32_t extra_flags)
+{
+    ap_file_t *f = NULL;
+    ap_status_t rv;
+    ap_ssize_t nbytes;
+    char buf[1024];
+    int i;
+
+    rv = ap_open(&f, fname, APR_READ | extra_flags, 0, p);
+    assert(!rv);
+    read_one(f, 'a');
+    read_one(f, 'b');
+    if (extra_flags & APR_BUFFERED) {
+        fprintf(stdout, 
+                "\n\t\tskipping ap_ungetc() for APR_BUFFERED... "
+                "doesn't work yet...\n\t\t\t\t ");
+    }
+    else {
+        rv = ap_ungetc('z', f);
+        assert(!rv);
+        rv = ap_ungetc('a', f);
+        assert(!rv); /* we just overwrote the previously-un-got char */
+        read_one(f, 'a');
+    }
+    read_one(f, 'c');
+    read_one(f, '\n');
+    for (i = 0; i < TESTREAD_BLKSIZE; i++) {
+        read_one(f, 'a');
+    }
+    read_one(f, '\n');
+    read_one(f, '\n');
+    rv = ap_getc(buf, f);
+    assert(rv == APR_EOF);
+    rv = ap_close(f);
+    assert(!rv);
+
+    f = NULL;
+    rv = ap_open(&f, fname, APR_READ | extra_flags, 0, p);
+    assert(!rv);
+    rv = ap_fgets(buf, 10, f);
+    assert(!rv);
+    assert(!strcmp(buf, "abc\n"));
+    /* read first 800 of TESTREAD_BLKSIZE 'a's 
+     */
+    rv = ap_fgets(buf, 801, f);
+    assert(!rv);
+    assert(strlen(buf) == 800);
+    for (i = 0; i < 800; i++) {
+        assert(buf[i] == 'a');
+    }
+    /* read rest of the 'a's and the first newline
+     */
+    rv = ap_fgets(buf, sizeof buf, f);
+    assert(!rv);
+    assert(strlen(buf) == TESTREAD_BLKSIZE - 800 + 1);
+    for (i = 0; i < TESTREAD_BLKSIZE - 800; i++) {
+        assert(buf[i] == 'a');
+    }
+    assert(buf[TESTREAD_BLKSIZE - 800] == '\n');
+    /* read the last newline
+     */
+    rv = ap_fgets(buf, sizeof buf, f);
+    assert(!rv);
+    assert(!strcmp(buf, "\n"));
+    /* get APR_EOF
+     */
+    rv = ap_fgets(buf, sizeof buf, f);
+    assert(rv == APR_EOF);
+    /* get APR_EOF with ap_getc
+     */
+    rv = ap_getc(buf, f);
+    assert(rv == APR_EOF);
+    /* get APR_EOF with ap_read
+     */
+    nbytes = sizeof buf;
+    rv = ap_read(f, buf, &nbytes);
+    assert(rv == APR_EOF);
+    rv = ap_close(f);
+    assert(!rv);
+}
+
+static void test_bigread(ap_pool_t *p, const char *fname, ap_int32_t extra_flags)
+{
+    ap_file_t *f = NULL;
+    ap_status_t rv;
+    char buf[APR_BUFFERSIZE * 2];
+    ap_ssize_t nbytes;
+
+    /* Create a test file with known content.
+     */
+    rv = ap_open(&f, fname, APR_CREATE | APR_WRITE | APR_TRUNCATE, APR_UREAD | APR_UWRITE, p);
+    if (rv) {
+        fprintf(stderr, "ap_open()->%d/%s\n",
+                rv, ap_strerror(rv, buf, sizeof buf));
+        exit(1);
+    }
+    nbytes = APR_BUFFERSIZE;
+    memset(buf, 0xFE, nbytes);
+    rv = ap_write(f, buf, &nbytes);
+    assert(!rv && nbytes == APR_BUFFERSIZE);
+    rv = ap_close(f);
+    assert(!rv);
+
+    f = NULL;
+    rv = ap_open(&f, fname, APR_READ | extra_flags, 0, p);
+    assert(!rv);
+    nbytes = sizeof buf;
+    rv = ap_read(f, buf, &nbytes);
+    assert(!rv);
+    assert(nbytes == APR_BUFFERSIZE);
+    rv = ap_close(f);
+    assert(!rv);
+}
+
+static void test_read(ap_pool_t *p)
+{
+    const char *fname = "testread.dat";
+    ap_status_t rv;
+
+    fprintf(stdout, "Testing file read functions.\n");
+
+    create_testread(p, fname);
+    fprintf(stdout, "\tBuffered file tests......");
+    test_read_guts(p, fname, APR_BUFFERED);
+    fprintf(stdout, "OK\n");
+    fprintf(stdout, "\tUnbuffered file tests....");
+    test_read_guts(p, fname, 0);
+    fprintf(stdout, "OK\n");
+    fprintf(stdout, "\tMore buffered file tests......");
+    test_bigread(p, fname, APR_BUFFERED);
+    fprintf(stdout, "OK\n");
+    fprintf(stdout, "\tMore unbuffered file tests......");
+    test_bigread(p, fname, 0);
+    fprintf(stdout, "OK\n");
+    rv = ap_remove_file(fname, p);
+    assert(!rv);
+    fprintf(stdout, "\tAll read tests...........OK\n");
+}
 
ap_getc()

ap_status_t ap_getc(char *ch, ap_file_t *thefile)
{
    ap_ssize_t nbytes = 1;

    return ap_read(thefile, ch, &nbytes);
}

ap_fgets()

ap_status_t ap_fgets(char *str, int len, ap_file_t *thefile)
{
    ap_status_t rv = APR_SUCCESS; /* get rid of gcc warning */
    ap_ssize_t nbytes;
    char *final = str + len - 1;

    if (len <= 1) {
        /* sort of like fgets(), which returns NULL and stores no
        bytes
         */
        return APR_SUCCESS;
    }

    while (str < final) { /* leave room for trailing '\0' */
        nbytes = 1;
        rv = ap_read(thefile, str, &nbytes);
        if (rv != APR_SUCCESS) {
            break;
        }
        if (*str == '\n') {
            ++str;
            break;
        }
        ++str;
    }
    /* We must store a terminating '\0' if we've stored any chars. We
        can
     * get away with storing it if we hit an error first.
     */
    *str = '\0';
    return rv;
}

Re: APR file i/o questions

Posted by Brian Havard <br...@kheldar.apana.org.au>.
On Wed, 14 Jun 2000 17:25:05 -0400, Jeff Trawick wrote:

>Should the read-type functions return APR_EOF at end-of-file on all
>platforms?
>
>  (Plenty of folks seem to agree: Ryan, Jeff, Brian, James)

That's how I see it yes. The way it is, I ask for some data to be read, it
reads nothing yet still returns APR_SUCCESS. To me that's nonsense.

I'm not decided about the case where the whole buffer isn't filled though
I'm leaning towards leaving that the way it is (returning success).



>Should ap_ungetc() work on all files, buffered or not?
>
>  (Jeff thinks so.)

Yes, though I'm not sure it's needed in its current form. What we actually
want is the ability to back up 1 char, IE seek -1. We don't need to be able
to specify what the ungot character is.



>Should ap_fgets() on Unix have to check for '\r' *and* '\n' (i.e.,
>think about '\r' at all)?
>
>  (Jeff thinks not.  fgets() has managed without it forever)

Probably not. I did that for the OS/2 & Win32 code as they'll usually get
CRLF files. Another point though is should the trailing \n be left in the
returned string? I know thats what the standard fgets does but I've always
found that annoying. Anyone else think that?



>What about locking within the buffered I/O support?
>
>  (Jeff thinks that by default there should be no locking, but that a
>  flag on ap_open() (APR_MT) can turn it on.  This flag would be
>  ignored (not failed) if !APR_HAS_THREADS.  Even in a multithreaded
>  app it seems uncommon for more than one thread to access the same
>  file at once, so this is a reasonable default.

Yup, I agree fully.



>What about Jeff's patch to Unix readwrite.c he posted earlier?
>
>  (Jeff is 70% sure he should change it to call ap_read() from
>  ap_fgets() but he will wait until there is a chance for some answers
>  to these questions before playing more in the code.  Hopefully the
>  next time he is coding there he can handle the resolution of more of
>  the questions above.)

I've got it working with just this (yes, I have a linux box too):


Index: file_io/unix/open.c
===================================================================
RCS file: /home/cvs/apache-2.0/src/lib/apr/file_io/unix/open.c,v
retrieving revision 1.54
diff -u -w -r1.54 open.c
--- file_io/unix/open.c	2000/06/15 17:39:16	1.54
+++ file_io/unix/open.c	2000/06/16 08:40:59
@@ -112,6 +112,10 @@
 
     (*new)->buffered = (flag & APR_BUFFERED) > 0;
 
+    if ((*new)->buffered) {
+        (*new)->buffer = ap_palloc(cont, APR_FILE_BUFSIZE);
+    }
+
     if (flag & APR_CREATE) {
         oflags |= O_CREAT; 
 	if (flag & APR_EXCL) {
Index: file_io/unix/readwrite.c
===================================================================
RCS file: /home/cvs/apache-2.0/src/lib/apr/file_io/unix/readwrite.c,v
retrieving revision 1.54
diff -u -w -r1.54 readwrite.c
--- file_io/unix/readwrite.c	2000/06/14 02:58:34	1.54
+++ file_io/unix/readwrite.c	2000/06/16 08:41:01
@@ -301,6 +301,7 @@
 
 ap_status_t ap_getc(char *ch, ap_file_t *thefile)
 {
+    ap_status_t status;
     ssize_t rv;
     
     if (thefile->ungetchar != -1) {
@@ -308,13 +309,15 @@
         thefile->ungetchar = -1;
         return APR_SUCCESS;
     }
-    rv = read(thefile->filedes, ch, 1); 
+
+    rv = 1;
+    status = ap_read(thefile, ch, &rv);
     if (rv == 0) {
         thefile->eof_hit = TRUE;
         return APR_EOF;
     }
     else if (rv != 1) {
-        return errno;
+        return status;
     }
     return APR_SUCCESS; 
 }
@@ -358,6 +361,7 @@
 {
     ssize_t rv;
     int i, used_unget = FALSE, beg_idx;
+    ap_status_t status;
 
     if (len <= 1) {  /* as per fgets() */
         return APR_SUCCESS;
@@ -378,7 +382,8 @@
     }
     
     for (i = beg_idx; i < len; i++) {
-        rv = read(thefile->filedes, &str[i], 1); 
+        rv = 1;
+        status = ap_read(thefile, str+i, &rv);
         if (rv == 0) {
             thefile->eof_hit = TRUE;
 	    if (used_unget) {
@@ -388,7 +393,7 @@
             return APR_EOF;
         }
         else if (rv != 1) {
-            return errno;
+            return status;
         }
         if (str[i] == '\n' || str[i] == '\r') {
             break;

-- 
 ______________________________________________________________________________
 |  Brian Havard                 |  "He is not the messiah!                   |
 |  brianh@kheldar.apana.org.au  |  He's a very naughty boy!" - Life of Brian |
 ------------------------------------------------------------------------------