You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by Leif Hedstrom <zw...@apache.org> on 2010/12/03 01:47:11 UTC
SDK and string "copies"
Hi,
here's an idea, that could optimize memory allocation pressure
significantly (I think) when writing plugins. Here's the deal; A few
plugin APIs will make copies of the strings from the internal marshal
buffers, before returning them, with the only purpose of making them
NULL terminated. The odd thing, to me at least, is that all these APIs
already take an argument (an int*), which gets set with the length of
the string.
I think it'd make much more sense to consistently make all APIs that
deals with strings like this not be NULL terminated. This would require
no API changes, just that the "semantics" would change. And I think the
requirement to 'release' strings could then go away entirely (since they
are just const char* pointers into the marshal buffers).
Below are a few APIs that would change their "behavior", in that they no
longer would gurantee NULL terminated strings (which is something most
APIs don't anyways).
Thoughts on this?
-- leif
const char * TSUrlSchemeGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlUserGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlPasswordGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlHostGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlPathGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlHttpParamsGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlHttpQueryGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSUrlHttpFragmentGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSMimeHdrFieldNameGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc
field, int *length);
const char *TSHttpHdrMethodGet(TSMBuffer bufp, TSMLoc obj, int *length);
const char *TSHttpHdrReasonGet(TSMBuffer bufp, TSMLoc obj, int *length);
TSReturnCode TSMimeHdrFieldValueStringGet(TSMBuffer bufp, TSMLoc hdr,
TSMLoc field, int idx, const char **value_ptr, int *value_len_ptr);
Re: SDK and string "copies"
Posted by Bryan Call <bc...@yahoo-inc.com>.
+1 - simplifies ts code and increases performance
-Bryan
On 12/02/2010 04:47 PM, Leif Hedstrom wrote:
> Hi,
>
> here's an idea, that could optimize memory allocation pressure
> significantly (I think) when writing plugins. Here's the deal; A few
> plugin APIs will make copies of the strings from the internal marshal
> buffers, before returning them, with the only purpose of making them
> NULL terminated. The odd thing, to me at least, is that all these APIs
> already take an argument (an int*), which gets set with the length of
> the string.
>
> I think it'd make much more sense to consistently make all APIs that
> deals with strings like this not be NULL terminated. This would require
> no API changes, just that the "semantics" would change. And I think the
> requirement to 'release' strings could then go away entirely (since they
> are just const char* pointers into the marshal buffers).
>
> Below are a few APIs that would change their "behavior", in that they no
> longer would gurantee NULL terminated strings (which is something most
> APIs don't anyways).
>
> Thoughts on this?
>
> -- leif
>
> const char * TSUrlSchemeGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlUserGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPasswordGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHostGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPathGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpParamsGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpQueryGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpFragmentGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> const char *TSMimeHdrFieldNameGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc
> field, int *length);
> const char *TSHttpHdrMethodGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSHttpHdrReasonGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> TSReturnCode TSMimeHdrFieldValueStringGet(TSMBuffer bufp, TSMLoc hdr,
> TSMLoc field, int idx, const char **value_ptr, int *value_len_ptr);
>
>
Re: SDK and string "copies"
Posted by Leif Hedstrom <zw...@apache.org>.
On 12/02/2010 08:05 PM, Jason wrote:
> ditto +1, just have to make a point of it in the docs
Absolutely, I will file separate bug(s) for that (doing this change will
definitely eliminate the requirement for ever having to call
TSHandleStringRelease(), which actually simplifies the usage of the APIs
a fair amount.
Fwiw, most of the doxygen docs for this already said the strings aren't
guaranteed to be NULL terminated. And most of the APIs I listed have a
note like this in the SDK doc:
Note: the returned string is not guaranteed to be null-terminated.
That note is obviously a lie, since up until now, they were guaranteed
to be NULL terminated, in a very expensive way (for this handful of
APIs). There's even a big section already in the docs, that says:
No null-terminated strings
In Traffic Server 5.2 (sic) and newer, you cannot assume that the
string data contained in
marshal buffers (data such as URLs and MIME fields) is stored in
null-terminated string
copies. This means that your plugins should always use the length
parameter when
retrieving or manipulating these strings.
I believe this whole nonsense of doing copies of strings on the heap,
and NULL terminate them, was some attempt to make the APIs compatible
with much older versions of Inktomi Traffic Server (which we don't need
to deal with obviously).
Cheers,
-- leif
Re: SDK and string "copies"
Posted by Jason <ja...@gmail.com>.
ditto +1, just have to make a point of it in the docs
On Dec 2, 2010, at 9:23 PM, Eric Balsa wrote:
> I'm +1 for making stuff non-null and returning a pointer straight into
> the MBuf like you propose.
>
> The only problem I see is that plugin writers will get direct access
> to the MBuf and one could cast and change data directly (but these are
> getters!). Before, you were operating on a copy of that data. However,
> I think the benefits (plugins without mallocs) outweigh the
> babysitting of developers who should know better than to cast away
> const.
>
> --Eric
>
> On Thu, Dec 2, 2010 at 4:47 PM, Leif Hedstrom <zw...@apache.org> wrote:
>> Hi,
>>
>> here's an idea, that could optimize memory allocation pressure significantly
>> (I think) when writing plugins. Here's the deal; A few plugin APIs will make
>> copies of the strings from the internal marshal buffers, before returning
>> them, with the only purpose of making them NULL terminated. The odd thing,
>> to me at least, is that all these APIs already take an argument (an int*),
>> which gets set with the length of the string.
>>
>> I think it'd make much more sense to consistently make all APIs that deals
>> with strings like this not be NULL terminated. This would require no API
>> changes, just that the "semantics" would change. And I think the requirement
>> to 'release' strings could then go away entirely (since they are just const
>> char* pointers into the marshal buffers).
>>
>> Below are a few APIs that would change their "behavior", in that they no
>> longer would gurantee NULL terminated strings (which is something most APIs
>> don't anyways).
>>
>> Thoughts on this?
>>
>> -- leif
>>
>> const char * TSUrlSchemeGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlUserGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlPasswordGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlHostGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlPathGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlHttpParamsGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlHttpQueryGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSUrlHttpFragmentGet(TSMBuffer bufp, TSMLoc obj, int *length);
>>
>> const char *TSMimeHdrFieldNameGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc field,
>> int *length);
>> const char *TSHttpHdrMethodGet(TSMBuffer bufp, TSMLoc obj, int *length);
>> const char *TSHttpHdrReasonGet(TSMBuffer bufp, TSMLoc obj, int *length);
>>
>> TSReturnCode TSMimeHdrFieldValueStringGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc
>> field, int idx, const char **value_ptr, int *value_len_ptr);
>>
>>
>>
Re: SDK and string "copies"
Posted by Eric Balsa <er...@apache.org>.
I'm +1 for making stuff non-null and returning a pointer straight into
the MBuf like you propose.
The only problem I see is that plugin writers will get direct access
to the MBuf and one could cast and change data directly (but these are
getters!). Before, you were operating on a copy of that data. However,
I think the benefits (plugins without mallocs) outweigh the
babysitting of developers who should know better than to cast away
const.
--Eric
On Thu, Dec 2, 2010 at 4:47 PM, Leif Hedstrom <zw...@apache.org> wrote:
> Hi,
>
> here's an idea, that could optimize memory allocation pressure significantly
> (I think) when writing plugins. Here's the deal; A few plugin APIs will make
> copies of the strings from the internal marshal buffers, before returning
> them, with the only purpose of making them NULL terminated. The odd thing,
> to me at least, is that all these APIs already take an argument (an int*),
> which gets set with the length of the string.
>
> I think it'd make much more sense to consistently make all APIs that deals
> with strings like this not be NULL terminated. This would require no API
> changes, just that the "semantics" would change. And I think the requirement
> to 'release' strings could then go away entirely (since they are just const
> char* pointers into the marshal buffers).
>
> Below are a few APIs that would change their "behavior", in that they no
> longer would gurantee NULL terminated strings (which is something most APIs
> don't anyways).
>
> Thoughts on this?
>
> -- leif
>
> const char * TSUrlSchemeGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlUserGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPasswordGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHostGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPathGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpParamsGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpQueryGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpFragmentGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> const char *TSMimeHdrFieldNameGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc field,
> int *length);
> const char *TSHttpHdrMethodGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSHttpHdrReasonGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> TSReturnCode TSMimeHdrFieldValueStringGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc
> field, int idx, const char **value_ptr, int *value_len_ptr);
>
>
>
Re: SDK and string "copies"
Posted by Manish Pandey <ma...@gmail.com>.
Seems like an great idea for simplifying memory management. The overhead of
dealing with non-NULL termination should be minor.
Isn't there an equivalent of TSHttpBodyGet(TSMBuffer bufp, TSMLoc obj, int
*length);
Manish
On Thu, Dec 2, 2010 at 4:47 PM, Leif Hedstrom <zw...@apache.org> wrote:
> Hi,
>
> here's an idea, that could optimize memory allocation pressure
> significantly (I think) when writing plugins. Here's the deal; A few plugin
> APIs will make copies of the strings from the internal marshal buffers,
> before returning them, with the only purpose of making them NULL terminated.
> The odd thing, to me at least, is that all these APIs already take an
> argument (an int*), which gets set with the length of the string.
>
> I think it'd make much more sense to consistently make all APIs that deals
> with strings like this not be NULL terminated. This would require no API
> changes, just that the "semantics" would change. And I think the requirement
> to 'release' strings could then go away entirely (since they are just const
> char* pointers into the marshal buffers).
>
> Below are a few APIs that would change their "behavior", in that they no
> longer would gurantee NULL terminated strings (which is something most APIs
> don't anyways).
>
> Thoughts on this?
>
> -- leif
>
> const char * TSUrlSchemeGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlUserGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPasswordGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHostGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlPathGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpParamsGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpQueryGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSUrlHttpFragmentGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> const char *TSMimeHdrFieldNameGet(TSMBuffer bufp, TSMLoc hdr, TSMLoc field,
> int *length);
> const char *TSHttpHdrMethodGet(TSMBuffer bufp, TSMLoc obj, int *length);
> const char *TSHttpHdrReasonGet(TSMBuffer bufp, TSMLoc obj, int *length);
>
> TSReturnCode TSMimeHdrFieldValueStringGet(TSMBuffer bufp, TSMLoc hdr,
> TSMLoc field, int idx, const char **value_ptr, int *value_len_ptr);
>
>
>
Re: SDK and string "copies"
Posted by "Alan M. Carroll" <am...@network-geographics.com>.
+1
Thursday, December 2, 2010, 6:47:11 PM, you wrote:
> I think it'd make much more sense to consistently make all APIs that
> deals with strings like this not be NULL terminated. This would require
> no API changes, just that the "semantics" would change. And I think the
> requirement to 'release' strings could then go away entirely (since they
> are just const char* pointers into the marshal buffers).
> Below are a few APIs that would change their "behavior", in that they no
> longer would gurantee NULL terminated strings (which is something most
> APIs don't anyways).