You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@trafficserver.apache.org by "Owens, Steve" <St...@disney.com> on 2013/01/25 23:14:21 UTC

Piercing the veil

 In my attempts to understand the mechanics of TSIOBufferCopy I found the following function definition in InkIOCoreAPI.cc


int64_t

TSIOBufferCopy(TSIOBuffer bufp, TSIOBufferReader readerp, int64_t length, int64_t offset)

{

  sdk_assert(sdk_sanity_check_iocore_structure(bufp) == TS_SUCCESS);

  sdk_assert(sdk_sanity_check_iocore_structure(readerp) == TS_SUCCESS);

  sdk_assert((length >= 0) && (offset >= 0));


  MIOBuffer *b = (MIOBuffer *)bufp;

  IOBufferReader *r = (IOBufferReader *)readerp;


  return b->write(r, length, offset);

}


Note that it is undocumented but apparently the b->write(r, length, offset) does the work.


I believe I found the declaration of that function in I_IOBuffer.h  One thing that sort of stuck out to me in the set of code comments below, was the fact that it says


"Should it be necessary to make a large number of small transfers, it's preferable to use a interface that copies the data rather than sharing blocks to prevent

 a build of blocks on the buffer"


What confuses me is isn't TSIOBufferCopy supposed to be an interface that copies data rather than sharing blocks?



  /**

    Add by data from IOBufferReader r to the this buffer by reference. If

    len is INT64_MAX, all available data on the reader is added. If len is

    less than INT64_MAX, the smaller of len or the amount of data on the

    buffer is added. If offset is greater than zero, than the offset

    bytes of data at the front of the reader are skipped. Bytes skipped

    by offset reduce the number of bytes available on the reader used

    in the amount of data to add computation. write() does not respect

    watermarks or buffer size limits. Users of write must implement

    their own flow control. Returns the number of bytes added. Each

    write() call creates a new IOBufferBlock, even if it is for one

    byte. As such, it's necessary to exercise caution in any code that

    repeatedly transfers data from one buffer to another, especially if

    the data is being read over the network as it may be coming in very

    small chunks. Because deallocation of outstanding buffer blocks is

    recursive, it's possible to overrun the stack if too many blocks

    have been added to the buffer chain. It's imperative that users

    both implement their own flow control to prevent too many bytes

    from becoming outstanding on a buffer that the write() call is

    being used and that care be taken to ensure the transfers are of a

    minimum size. Should it be necessary to make a large number of small

    transfers, it's preferable to use a interface that copies the data

    rather than sharing blocks to prevent a build of blocks on the buffer.


  */

  inkcoreapi int64_t write(IOBufferReader * r, int64_t len = INT64_MAX, int64_t offset = 0);



 Also, the comments say that the return value of write is the actual number of bytes written.  Is it safe to assume that the return value will equal len should offset be 0?   If so, when you are simply trying to copy data from upstream to downstream and you subsequently make a calls such as:


/* Copy the data from the read buffer to the output buffer. */

bytesWritten = TSIOBufferCopy(TSVIOBufferGet(data->output_vio),

TSVIOReaderGet(input_vio), towrite, 0);


/* Tell the read buffer that we have read 'towrite' bytes of data

* and are no longer interested in it.

*/

TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), bytesWritten);  // Should second param be bytesWritten or should it be towrite?


/* Increment the input VIO nDone value to reflect how much

* data we've copied from the upstream and written to the

* downstream.

*/

TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + bytesWritten); // Should Second param be TSVIONDoneGet(input_vio) + bytesWritten or TSVIONDoneGet(input_vio) + towrite?




Re: Piercing the veil

Posted by James Peach <jp...@apache.org>.
On 26/01/2013, at 8:55 AM, "Owens, Steve" <St...@disney.com> wrote:

> James,
> 
> Thank you for clearing that up.  It might be helpful to update the example
> plugins as well as the man pages because for example in the null transform
> plugin we have
> 
> if (towrite > 0) {
>    /* The amount of data left to read needs to be truncated by
>     * the amount of data actually in the read buffer.
>     */
>    avail = TSIOBufferReaderAvail(TSVIOReaderGet(input_vio));
>    TSDebug("null-transform", "\tavail is %" PRId64 "", avail);
>    if (towrite > avail) {
>      towrite = avail;
>    }
> 
>    if (towrite > 0) {
>      /* Copy the data from the read buffer to the output buffer. */
>      TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
> TSVIOReaderGet(input_vio), towrite, 0);
> 
>      /* Tell the read buffer that we have read the data and are no
>       * longer interested in it.
>       */
>      TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), towrite);
> 
>      /* Modify the input VIO to reflect how much data we've
>       * completed.
>       */
>      TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + towrite);
>    }
>  }
> 
> Which does not handle the case where the downstream buffer may be full.

Note that I'm not claiming that this can happen; just that the API contract allows it and that it's best to stick to the letter of the law :)

>  I am glad I asked.
> 
> Cheers,
> 
> Steve Owens
> 
> 
> 
> On 1/25/13 9:29 PM, "James Peach" <jp...@apache.org> wrote:
> 
>> On 25/01/2013, at 2:14 PM, "Owens, Steve" <St...@disney.com> wrote:
>> 
>>> In my attempts to understand the mechanics of TSIOBufferCopy I found
>>> the following function definition in InkIOCoreAPI.cc
>> 
>> TSIOBufferCopy copies data from a TSIOBufferReader into a TSIOBuffer.
>> IMHO it doesn't really matter whether it copies bytes or whether it
>> shares references to internal buffers. It's just efficiently sucking data
>> out of the reader.
>> 
>>> 
>>> int64_t
>>> TSIOBufferCopy(TSIOBuffer bufp, TSIOBufferReader readerp, int64_t
>>> length, int64_t offset)
>>> {
>>>  sdk_assert(sdk_sanity_check_iocore_structure(bufp) == TS_SUCCESS);
>>>  sdk_assert(sdk_sanity_check_iocore_structure(readerp) == TS_SUCCESS);
>>>  sdk_assert((length >= 0) && (offset >= 0));
>>> 
>>>  MIOBuffer *b = (MIOBuffer *)bufp;
>>>  IOBufferReader *r = (IOBufferReader *)readerp;
>>> 
>>>  return b->write(r, length, offset);
>>> }
>>> 
>>> Note that it is undocumented but apparently the b->write(r, length,
>>> offset) does the work.
>>> 
>>> I believe I found the declaration of that function in I_IOBuffer.h  One
>>> thing that sort of stuck out to me in the set of code comments below,
>>> was the fact that it says
>>> 
>>> "Should it be necessary to make a large number of small transfers, it's
>>> preferable to use a interface that copies the data rather than sharing
>>> blocks to prevent
>>> a build of blocks on the buffer"
>>> 
>>> What confuses me is isn't TSIOBufferCopy supposed to be an interface
>>> that copies data rather than sharing blocks?
>> 
>> It's documented (in the v2 docs) as as sharing data blocks, but IMHO that
>> is an implementation detail that doesn't really matter for a plugin. The
>> plugin only cares whether the data is available in the right places.
>> TSIOBufferWrite() would always copy the data from an arbitrary buffer
>> (ie. no sharing).
>> 
>>> 
>>> 
>>> 
>>>  /**
>>>    Add by data from IOBufferReader r to the this buffer by reference.
>>> If
>>>    len is INT64_MAX, all available data on the reader is added. If len
>>> is
>>>    less than INT64_MAX, the smaller of len or the amount of data on the
>>>    buffer is added. If offset is greater than zero, than the offset
>>>    bytes of data at the front of the reader are skipped. Bytes skipped
>>>    by offset reduce the number of bytes available on the reader used
>>>    in the amount of data to add computation. write() does not respect
>>>    watermarks or buffer size limits. Users of write must implement
>>>    their own flow control. Returns the number of bytes added. Each
>>>    write() call creates a new IOBufferBlock, even if it is for one
>>>    byte. As such, it's necessary to exercise caution in any code that
>>>    repeatedly transfers data from one buffer to another, especially if
>>>    the data is being read over the network as it may be coming in very
>>>    small chunks. Because deallocation of outstanding buffer blocks is
>>>    recursive, it's possible to overrun the stack if too many blocks
>>>    have been added to the buffer chain. It's imperative that users
>>>    both implement their own flow control to prevent too many bytes
>>>    from becoming outstanding on a buffer that the write() call is
>>>    being used and that care be taken to ensure the transfers are of a
>>>    minimum size. Should it be necessary to make a large number of small
>>>    transfers, it's preferable to use a interface that copies the data
>>>    rather than sharing blocks to prevent a build of blocks on the
>>> buffer.
>>> 
>>>  */
>>>  inkcoreapi int64_t write(IOBufferReader * r, int64_t len = INT64_MAX,
>>> int64_t offset = 0);
>>> 
>>> 
>>> 
>>> Also, the comments say that the return value of write is the actual
>>> number of bytes written.  Is it safe to assume that the return value
>>> will equal len should offset be 0?
>> 
>> I would not make that assumption. It's probably true, but to be robust
>> you should handle the case where the receiving buffer is full and will
>> only accept a partial write.
>> 
>>> If so, when you are simply trying to copy data from upstream to
>>> downstream and you subsequently make a calls such as:
>>> 
>>> /* Copy the data from the read buffer to the output buffer. */
>>> bytesWritten = TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
>>> 					TSVIOReaderGet(input_vio), towrite, 0);
>>> 
>>> /* Tell the read buffer that we have read 'towrite' bytes of data
>>> * and are no longer interested in it.
>>> */
>>> TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), bytesWritten);  //
>>> Should second param be bytesWritten or should it be towrite?
>> 
>> Definitely bytesWritten. It's more correct and no harder to implement :)
>> 
>>> /* Increment the input VIO nDone value to reflect how much
>>> * data we've copied from the upstream and written to the
>>> * downstream.
>>> */
>>> TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + bytesWritten); //
>>> Should Second param be TSVIONDoneGet(input_vio) + bytesWritten or
>>> TSVIONDoneGet(input_vio) + towrite?
>> 
>> Again, I'd recommend using bytesWritten.
>> 
>> We have been slowly adding man pages in the docs/sdk directory in the
>> source tree. I will add something for the IOBuffer API. I have found the
>> v2 documentation <http://people.apache.org/~zwoop/ats/15_sdk_pg.pdf> very
>> helpful. It's still largely accurate.
>> 
>> J
> 


Re: Piercing the veil

Posted by "Owens, Steve" <St...@disney.com>.
James,

Thank you for clearing that up.  It might be helpful to update the example
plugins as well as the man pages because for example in the null transform
plugin we have

if (towrite > 0) {
    /* The amount of data left to read needs to be truncated by
     * the amount of data actually in the read buffer.
     */
    avail = TSIOBufferReaderAvail(TSVIOReaderGet(input_vio));
    TSDebug("null-transform", "\tavail is %" PRId64 "", avail);
    if (towrite > avail) {
      towrite = avail;
    }

    if (towrite > 0) {
      /* Copy the data from the read buffer to the output buffer. */
      TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
TSVIOReaderGet(input_vio), towrite, 0);

      /* Tell the read buffer that we have read the data and are no
       * longer interested in it.
       */
      TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), towrite);

      /* Modify the input VIO to reflect how much data we've
       * completed.
       */
      TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + towrite);
    }
  }

Which does not handle the case where the downstream buffer may be full.  I
am glad I asked.

Cheers,

Steve Owens



On 1/25/13 9:29 PM, "James Peach" <jp...@apache.org> wrote:

>On 25/01/2013, at 2:14 PM, "Owens, Steve" <St...@disney.com> wrote:
>
>>  In my attempts to understand the mechanics of TSIOBufferCopy I found
>>the following function definition in InkIOCoreAPI.cc
>
>TSIOBufferCopy copies data from a TSIOBufferReader into a TSIOBuffer.
>IMHO it doesn't really matter whether it copies bytes or whether it
>shares references to internal buffers. It's just efficiently sucking data
>out of the reader.
>
>> 
>> int64_t
>> TSIOBufferCopy(TSIOBuffer bufp, TSIOBufferReader readerp, int64_t
>>length, int64_t offset)
>> {
>>   sdk_assert(sdk_sanity_check_iocore_structure(bufp) == TS_SUCCESS);
>>   sdk_assert(sdk_sanity_check_iocore_structure(readerp) == TS_SUCCESS);
>>   sdk_assert((length >= 0) && (offset >= 0));
>> 
>>   MIOBuffer *b = (MIOBuffer *)bufp;
>>   IOBufferReader *r = (IOBufferReader *)readerp;
>> 
>>   return b->write(r, length, offset);
>> }
>> 
>> Note that it is undocumented but apparently the b->write(r, length,
>>offset) does the work.
>> 
>> I believe I found the declaration of that function in I_IOBuffer.h  One
>>thing that sort of stuck out to me in the set of code comments below,
>>was the fact that it says
>> 
>> "Should it be necessary to make a large number of small transfers, it's
>>preferable to use a interface that copies the data rather than sharing
>>blocks to prevent
>>  a build of blocks on the buffer"
>> 
>> What confuses me is isn't TSIOBufferCopy supposed to be an interface
>>that copies data rather than sharing blocks?
>
>It's documented (in the v2 docs) as as sharing data blocks, but IMHO that
>is an implementation detail that doesn't really matter for a plugin. The
>plugin only cares whether the data is available in the right places.
>TSIOBufferWrite() would always copy the data from an arbitrary buffer
>(ie. no sharing).
>
>>  
>> 
>> 
>>   /**
>>     Add by data from IOBufferReader r to the this buffer by reference.
>>If
>>     len is INT64_MAX, all available data on the reader is added. If len
>>is
>>     less than INT64_MAX, the smaller of len or the amount of data on the
>>     buffer is added. If offset is greater than zero, than the offset
>>     bytes of data at the front of the reader are skipped. Bytes skipped
>>     by offset reduce the number of bytes available on the reader used
>>     in the amount of data to add computation. write() does not respect
>>     watermarks or buffer size limits. Users of write must implement
>>     their own flow control. Returns the number of bytes added. Each
>>     write() call creates a new IOBufferBlock, even if it is for one
>>     byte. As such, it's necessary to exercise caution in any code that
>>     repeatedly transfers data from one buffer to another, especially if
>>     the data is being read over the network as it may be coming in very
>>     small chunks. Because deallocation of outstanding buffer blocks is
>>     recursive, it's possible to overrun the stack if too many blocks
>>     have been added to the buffer chain. It's imperative that users
>>     both implement their own flow control to prevent too many bytes
>>     from becoming outstanding on a buffer that the write() call is
>>     being used and that care be taken to ensure the transfers are of a
>>     minimum size. Should it be necessary to make a large number of small
>>     transfers, it's preferable to use a interface that copies the data
>>     rather than sharing blocks to prevent a build of blocks on the
>>buffer.
>> 
>>   */
>>   inkcoreapi int64_t write(IOBufferReader * r, int64_t len = INT64_MAX,
>>int64_t offset = 0);
>> 
>> 
>> 
>>  Also, the comments say that the return value of write is the actual
>>number of bytes written.  Is it safe to assume that the return value
>>will equal len should offset be 0?
>
>I would not make that assumption. It's probably true, but to be robust
>you should handle the case where the receiving buffer is full and will
>only accept a partial write.
>
>> If so, when you are simply trying to copy data from upstream to
>>downstream and you subsequently make a calls such as:
>> 
>> /* Copy the data from the read buffer to the output buffer. */
>> bytesWritten = TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
>> 					TSVIOReaderGet(input_vio), towrite, 0);
>> 
>> /* Tell the read buffer that we have read 'towrite' bytes of data
>> * and are no longer interested in it.
>> */
>> TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), bytesWritten);  //
>>Should second param be bytesWritten or should it be towrite?
>
>Definitely bytesWritten. It's more correct and no harder to implement :)
>
>> /* Increment the input VIO nDone value to reflect how much
>> * data we've copied from the upstream and written to the
>> * downstream.
>> */
>> TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + bytesWritten); //
>>Should Second param be TSVIONDoneGet(input_vio) + bytesWritten or
>>TSVIONDoneGet(input_vio) + towrite?
>
>Again, I'd recommend using bytesWritten.
>
>We have been slowly adding man pages in the docs/sdk directory in the
>source tree. I will add something for the IOBuffer API. I have found the
>v2 documentation <http://people.apache.org/~zwoop/ats/15_sdk_pg.pdf> very
>helpful. It's still largely accurate.
>
>J


Re: Piercing the veil

Posted by James Peach <jp...@apache.org>.
On 25/01/2013, at 2:14 PM, "Owens, Steve" <St...@disney.com> wrote:

>  In my attempts to understand the mechanics of TSIOBufferCopy I found the following function definition in InkIOCoreAPI.cc

TSIOBufferCopy copies data from a TSIOBufferReader into a TSIOBuffer. IMHO it doesn't really matter whether it copies bytes or whether it shares references to internal buffers. It's just efficiently sucking data out of the reader.

> 
> int64_t
> TSIOBufferCopy(TSIOBuffer bufp, TSIOBufferReader readerp, int64_t length, int64_t offset)
> {
>   sdk_assert(sdk_sanity_check_iocore_structure(bufp) == TS_SUCCESS);
>   sdk_assert(sdk_sanity_check_iocore_structure(readerp) == TS_SUCCESS);
>   sdk_assert((length >= 0) && (offset >= 0));
> 
>   MIOBuffer *b = (MIOBuffer *)bufp;
>   IOBufferReader *r = (IOBufferReader *)readerp;
> 
>   return b->write(r, length, offset);
> }
> 
> Note that it is undocumented but apparently the b->write(r, length, offset) does the work.
> 
> I believe I found the declaration of that function in I_IOBuffer.h  One thing that sort of stuck out to me in the set of code comments below, was the fact that it says 
> 
> "Should it be necessary to make a large number of small transfers, it's preferable to use a interface that copies the data rather than sharing blocks to prevent
>  a build of blocks on the buffer"
> 
> What confuses me is isn't TSIOBufferCopy supposed to be an interface that copies data rather than sharing blocks?

It's documented (in the v2 docs) as as sharing data blocks, but IMHO that is an implementation detail that doesn't really matter for a plugin. The plugin only cares whether the data is available in the right places. TSIOBufferWrite() would always copy the data from an arbitrary buffer (ie. no sharing).

>  
> 
> 
>   /**
>     Add by data from IOBufferReader r to the this buffer by reference. If
>     len is INT64_MAX, all available data on the reader is added. If len is
>     less than INT64_MAX, the smaller of len or the amount of data on the
>     buffer is added. If offset is greater than zero, than the offset
>     bytes of data at the front of the reader are skipped. Bytes skipped
>     by offset reduce the number of bytes available on the reader used
>     in the amount of data to add computation. write() does not respect
>     watermarks or buffer size limits. Users of write must implement
>     their own flow control. Returns the number of bytes added. Each
>     write() call creates a new IOBufferBlock, even if it is for one
>     byte. As such, it's necessary to exercise caution in any code that
>     repeatedly transfers data from one buffer to another, especially if
>     the data is being read over the network as it may be coming in very
>     small chunks. Because deallocation of outstanding buffer blocks is
>     recursive, it's possible to overrun the stack if too many blocks
>     have been added to the buffer chain. It's imperative that users
>     both implement their own flow control to prevent too many bytes
>     from becoming outstanding on a buffer that the write() call is
>     being used and that care be taken to ensure the transfers are of a
>     minimum size. Should it be necessary to make a large number of small
>     transfers, it's preferable to use a interface that copies the data
>     rather than sharing blocks to prevent a build of blocks on the buffer.
> 
>   */
>   inkcoreapi int64_t write(IOBufferReader * r, int64_t len = INT64_MAX, int64_t offset = 0);
> 
> 
> 
>  Also, the comments say that the return value of write is the actual number of bytes written.  Is it safe to assume that the return value will equal len should offset be 0?

I would not make that assumption. It's probably true, but to be robust you should handle the case where the receiving buffer is full and will only accept a partial write.

> If so, when you are simply trying to copy data from upstream to downstream and you subsequently make a calls such as:
> 
> /* Copy the data from the read buffer to the output buffer. */
> bytesWritten = TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
> 					TSVIOReaderGet(input_vio), towrite, 0);
> 
> /* Tell the read buffer that we have read 'towrite' bytes of data
> * and are no longer interested in it.
> */
> TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), bytesWritten);  // Should second param be bytesWritten or should it be towrite?

Definitely bytesWritten. It's more correct and no harder to implement :)

> /* Increment the input VIO nDone value to reflect how much
> * data we've copied from the upstream and written to the
> * downstream.
> */
> TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + bytesWritten); // Should Second param be TSVIONDoneGet(input_vio) + bytesWritten or TSVIONDoneGet(input_vio) + towrite?

Again, I'd recommend using bytesWritten.

We have been slowly adding man pages in the docs/sdk directory in the source tree. I will add something for the IOBuffer API. I have found the v2 documentation <http://people.apache.org/~zwoop/ats/15_sdk_pg.pdf> very helpful. It's still largely accurate.

J