Posted to dev@apr.apache.org by Robert Norris <ro...@cataclysm.cx> on 2004/03/04 23:51:50 UTC

PATCH: Determine how many bytes are waiting to be read from a socket

Attached is a patch that adds a function apr_socket_pending(). This
allows the caller to find out how many bytes are waiting to be read from
a socket without actually reading those bytes.

I use this to determine if the peer has closed the connection. In that
case, the socket goes readable, and this function reports 0 bytes
waiting.

I've added configure checks for sys/ioctl.h, to get ioctl() and
FIONREAD. These should exist on just about all Unix-ish systems. Just in
case they don't, I've added an alternative implementation that uses
recv() and MSG_PEEK. I assume recv() is available, because functions in
sendrecv.c assume this too.
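
For readers without the attachment, the approach described above could look
roughly like the sketch below. It is not the patch itself: pending_bytes() is
a made-up name, it works on a raw file descriptor rather than an apr_socket_t,
and it assumes FIONREAD is visible through sys/ioctl.h (some systems want
sys/filio.h instead).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>   /* FIONREAD here on Linux and the BSDs (assumption) */
#include <unistd.h>

/*
 * pending_bytes() is a hypothetical name for this sketch; it is not the
 * apr_socket_pending() from the patch. Stores the number of readable
 * bytes in *out and returns 0, or returns -1 with errno set.
 */
static int pending_bytes(int fd, size_t *out)
{
    assert(out != NULL);
#ifdef FIONREAD
    int n = 0;

    /* Ask the kernel how much is queued; nothing is consumed. */
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    *out = (size_t)n;
    return 0;
#else
    /*
     * Fallback: peek into a scratch buffer. This reports at most
     * sizeof(buf) bytes, and on a blocking socket with nothing queued
     * it waits instead of reporting 0.
     */
    char buf[4096];
    ssize_t r = recv(fd, buf, sizeof(buf), MSG_PEEK);

    if (r == -1)
        return -1;
    *out = (size_t)r;
    return 0;
#endif
}
```

Note that a result of 0 from the FIONREAD branch only says that nothing is
queued right now. Whether that means the peer has closed is exactly what the
replies in this thread go on to question.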

Rob.

-- 
Robert Norris                                       GPG: 1024D/FC18E6C2
Email+Jabber: rob@cataclysm.cx                Web: http://cataclysm.cx/

Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Robert Norris <ro...@cataclysm.cx>.
On Thu, Mar 04, 2004 at 11:48:52PM -0600, Scott Lamb wrote:
> >Interesting. I've never actually seen select() and friends do this, but
> >of course there's lots of broken OSes out there.
> 
> Well, they don't even consider it to be broken. I've found the message  
> in question:
> 
> <http://groups.google.com/groups?q=select+hint+group:linux.kernel&hl=en&lr=lang_en&ie=UTF-8&oe=UTF-8&group=linux.kernel&c2coff=1&safe=off&selm=20030609031009%2414ba%40gated-at.bofh.it&rnum=1>

There's some logic in that explanation. I can accept it :)

> >I want a basic I/O event layer that tells the application
> >above it when something interesting happens on a socket (ie something to
> >read, ready to write, closed). If the event layer has to read data to
> >find this out, then this abstraction doesn't work.
> 
> Is distinguishing close really that important? What select() tells you  
> is that you can probably do a read() or write() without getting  
> -1/EWOULDBLOCK. Both of those are true when the socket is closed. I  
> guess I just don't understand the importance of making finer  
> distinctions than that before you actually try to do something.

My requirement basically came out of the way my application is currently
implemented. I get some weird I/O races because I can't properly detect
a socket closing (because I get socket readiness events from a separate
piece of code from where I actually do the I/O). I can't actually remember
the reasoning right now, and I don't have my notes on hand, so I can't
be more specific.

However, I've thought about my design after considering the above
explanation, and I think I've got a better way that gives me both the
abstraction I want and detects socket closure by watching for read()
failing. So I'm happy to retract this patch.

Thanks all for educating me :)

Rob.

-- 
Robert Norris                                       GPG: 1024D/FC18E6C2
Email+Jabber: rob@cataclysm.cx                Web: http://cataclysm.cx/

Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Scott Lamb <sl...@slamb.org>.
On Mar 4, 2004, at 9:17 PM, Robert Norris wrote:

> On Thu, Mar 04, 2004 at 07:39:48PM -0600, Scott Lamb wrote:
>> On Mar 4, 2004, at 6:44 PM, Robert Norris wrote:
>>> The connection is only closed if the socket is readable and there's no
>>> bytes waiting. The results of select()/poll() are undefined unless they
>>> return successfully.
>>
>> I'm saying that I believe select(), poll(), etc. can return
>> successfully with an indication that a socket is available for read()
>> or write() when in fact it is not. In addition to the page I mentioned,
>> there was a discussion about this on linux-kernel a while ago, I
>> believe. They said (from memory) that (a) you should always use
>> non-blocking IO with select and (b) you should not be surprised if a
>> read() or write() returns -1/EWOULDBLOCK, even immediately after a
>> select() that indicates readability/writability for that descriptor. In
>> this situation, I believe a call to return the number of waiting bytes
>> would return 0; there'd be no way to distinguish the EOF from the
>> spurious return until you actually do the read().
>
> Interesting. I've never actually seen select() and friends do this, but
> of course there's lots of broken OSes out there.

Well, they don't even consider it to be broken. I've found the message  
in question:

<http://groups.google.com/groups?q=select+hint+group:linux.kernel&hl=en&lr=lang_en&ie=UTF-8&oe=UTF-8&group=linux.kernel&c2coff=1&safe=off&selm=20030609031009%2414ba%40gated-at.bofh.it&rnum=1>

>
> I'm not familiar with a read on a closed socket returning anything other
> than 0. It doesn't make sense to me that it would return -1/EWOULDBLOCK
> - if it's closed, then a read should never block - it should just return
> 0.

I've never seen that, either.

>
> If select() reports that a socket is readable when it's not, then I
> suppose FIONREAD _might_ return 0 (I honestly don't know),

I would expect it would be similar to what happens if no select() has  
been made and nothing is available. Doesn't it return 0 then?

>  but I would
> still expect recv(MSG_PEEK) to return -1/EWOULDBLOCK (since it still
> does a read). So surely that would be a safe way to determine if there's
> anything in the buffer?

Ahh. I'd forgotten about recv(..., MSG_PEEK). That might work.

> I want a basic I/O event layer that tells the application
> above it when something interesting happens on a socket (ie something to
> read, ready to write, closed). If the event layer has to read data to
> find this out, then this abstraction doesn't work.

Is distinguishing close really that important? What select() tells you  
is that you can probably do a read() or write() without getting  
-1/EWOULDBLOCK. Both of those are true when the socket is closed. I  
guess I just don't understand the importance of making finer  
distinctions than that before you actually try to do something.

Scott


Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Robert Norris <ro...@cataclysm.cx>.
On Thu, Mar 04, 2004 at 07:39:48PM -0600, Scott Lamb wrote:
> On Mar 4, 2004, at 6:44 PM, Robert Norris wrote:
> >The connection is only closed if the socket is readable and there's no
> >bytes waiting. The results of select()/poll() are undefined unless they
> >return successfully.
> 
> I'm saying that I believe select(), poll(), etc. can return 
> successfully with an indication that a socket is available for read() 
> or write() when in fact it is not. In addition to the page I mentioned, 
> there was a discussion about this on linux-kernel a while ago, I 
> believe. They said (from memory) that (a) you should always use 
> non-blocking IO with select and (b) you should not be surprised if a 
> read() or write() returns -1/EWOULDBLOCK, even immediately after a 
> select() that indicates readability/writability for that descriptor. In 
> this situation, I believe a call to return the number of waiting bytes 
> would return 0; there'd be no way to distinguish the EOF from the 
> spurious return until you actually do the read().

Interesting. I've never actually seen select() and friends do this, but
of course there's lots of broken OSes out there.

I'm not familiar with a read on a closed socket returning anything other
than 0. It doesn't make sense to me that it would return -1/EWOULDBLOCK
- if it's closed, then a read should never block - it should just return
0.

If select() reports that a socket is readable when it's not, then I
suppose FIONREAD _might_ return 0 (I honestly don't know), but I would
still expect recv(MSG_PEEK) to return -1/EWOULDBLOCK (since it still
does a read). So surely that would be a safe way to determine if there's
anything in the buffer?

(I hope there is some relatively portable way - if not then there's
probably a lot of I/O multiplexing servers out there that could be
doing the wrong thing).

> >Just doing the read is no good - I'm writing an event notifier and I
> >only want to actually read the data if there is data waiting (otherwise
> >my abstraction breaks).
> 
> Then I believe your abstraction is broken.

Not really. I want a basic I/O event layer that tells the application
above it when something interesting happens on a socket (ie something to
read, ready to write, closed). If the event layer has to read data to
find this out, then this abstraction doesn't work.

But that's specific to my app anyway. The real thing we have to work out
is if there is a reliable way to determine how much data is waiting to
be read (regardless of whether we're blocking or not, or if select() is
being used). It seems at the very least, recv(MSG_PEEK) should do this.
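
As a rough illustration of the recv(MSG_PEEK) idea, the sketch below peeks a
single byte and sorts the result into "data", "EOF" and "nothing yet" without
consuming anything. The names peek_socket(), set_nonblocking() and the
PEEK_* values are invented for this example, and it assumes the descriptor
has been put into non-blocking mode first; on a blocking socket the peek
would simply wait.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum peek_state { PEEK_DATA, PEEK_EOF, PEEK_EMPTY, PEEK_ERROR };

/* Switch fd to non-blocking mode; returns -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Look at the socket without removing any data from its buffer. */
static enum peek_state peek_socket(int fd)
{
    char byte;
    ssize_t r;

    assert(fd >= 0);
    do {
        r = recv(fd, &byte, 1, MSG_PEEK);
    } while (r == -1 && errno == EINTR);   /* retry if interrupted */

    if (r > 0)
        return PEEK_DATA;                  /* at least one byte queued */
    if (r == 0)
        return PEEK_EOF;                   /* peer closed, buffer drained */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return PEEK_EMPTY;                 /* open, but nothing to read */
    return PEEK_ERROR;
}
```

This only answers "is there anything", not "how much"; peeking into a larger
buffer would give a count up to the buffer size.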

Rob.

-- 
Robert Norris                                       GPG: 1024D/FC18E6C2
Email+Jabber: rob@cataclysm.cx                Web: http://cataclysm.cx/

Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Scott Lamb <sl...@slamb.org>.
On Mar 4, 2004, at 6:44 PM, Robert Norris wrote:

> On Thu, Mar 04, 2004 at 06:04:05PM -0600, Scott Lamb wrote:
>> On Mar 4, 2004, at 4:51 PM, Robert Norris wrote:
>>> I use this to determine if the peer has closed the connection. In that
>>> case, the socket goes readable, and this function reports 0 bytes
>>> waiting.
>>
>> I don't think this is correct. select() and similar functions can
>> return spuriously, at least on some platforms[*]. I would imagine this
>> would return 0 in such a case, and the socket would not have closed.
>> Why not just do the read and see if it returns 0?
>
> The connection is only closed if the socket is readable and there's no
> bytes waiting. The results of select()/poll() are undefined unless they
> return successfully.

I'm saying that I believe select(), poll(), etc. can return 
successfully with an indication that a socket is available for read() 
or write() when in fact it is not. In addition to the page I mentioned, 
there was a discussion about this on linux-kernel a while ago, I 
believe. They said (from memory) that (a) you should always use 
non-blocking IO with select and (b) you should not be surprised if a 
read() or write() returns -1/EWOULDBLOCK, even immediately after a 
select() that indicates readability/writability for that descriptor. In 
this situation, I believe a call to return the number of waiting bytes 
would return 0; there'd be no way to distinguish the EOF from the 
spurious return until you actually do the read().
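
To make points (a) and (b) concrete, here is a small sketch of the pattern:
wait with select(), then do a non-blocking read() and let the read result
decide what happened. The names wait_and_read(), set_nonblocking() and the
EV_* values are made up for this example, and it assumes the caller has
already made the descriptor non-blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical event codes for this sketch. */
enum read_event { EV_DATA, EV_CLOSED, EV_NOTHING, EV_ERROR };

/* Switch fd to non-blocking mode; returns -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Wait up to timeout_ms for fd to look readable, then attempt the read.
 * EV_NOTHING covers both a timeout and a select() that reported
 * readability although read() would have blocked.
 */
static enum read_event wait_and_read(int fd, void *buf, size_t cap,
                                     size_t *got, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    ssize_t r;
    int rc;

    assert(got != NULL && cap > 0);
    *got = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc == -1)
        return errno == EINTR ? EV_NOTHING : EV_ERROR;
    if (rc == 0)
        return EV_NOTHING;                 /* timed out */

    r = read(fd, buf, cap);                /* the read is the real test */
    if (r > 0) {
        *got = (size_t)r;
        return EV_DATA;
    }
    if (r == 0)
        return EV_CLOSED;                  /* EOF: peer closed */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return EV_NOTHING;                 /* spurious readiness */
    return EV_ERROR;
}
```

With this shape the event layer does consume data, which is the trade-off
being argued over in this thread.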

> Just doing the read is no good - I'm writing an event notifier and I
> only want to actually read the data if there is data waiting (otherwise
> my abstraction breaks).

Then I believe your abstraction is broken.

Scott


Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Robert Norris <ro...@cataclysm.cx>.
On Thu, Mar 04, 2004 at 06:04:05PM -0600, Scott Lamb wrote:
> On Mar 4, 2004, at 4:51 PM, Robert Norris wrote:
> >I use this to determine if the peer has closed the connection. In that
> >case, the socket goes readable, and this function reports 0 bytes
> >waiting.
> 
> I don't think this is correct. select() and similar functions can 
> return spuriously, at least on some platforms[*]. I would imagine this 
> would return 0 in such a case, and the socket would not have closed. 
> Why not just do the read and see if it returns 0?

The connection is only closed if the socket is readable and there's no
bytes waiting. The results of select()/poll() are undefined unless they
return successfully.

Just doing the read is no good - I'm writing an event notifier and I
only want to actually read the data if there is data waiting (otherwise
my abstraction breaks).

Rob.

-- 
Robert Norris                                       GPG: 1024D/FC18E6C2
Email+Jabber: rob@cataclysm.cx                Web: http://cataclysm.cx/

Re: PATCH: Determine how many bytes are waiting to be read from a socket

Posted by Scott Lamb <sl...@slamb.org>.
On Mar 4, 2004, at 4:51 PM, Robert Norris wrote:

> Attached is a patch that adds a function apr_socket_pending(). This
> allows the caller to find out how many bytes are waiting to be read from
> a socket without actually reading those bytes.
>
> I use this to determine if the peer has closed the connection. In that
> case, the socket goes readable, and this function reports 0 bytes
> waiting.

I don't think this is correct. select() and similar functions can 
return spuriously, at least on some platforms[*]. I would imagine this 
would return 0 in such a case, and the socket would not have closed. 
Why not just do the read and see if it returns 0?

[*] - http://cr.yp.to/docs/unixport.html

Scott