You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by "Alan M. Carroll" <am...@network-geographics.com> on 2014/08/15 19:23:08 UTC

TSMimeHdrFieldNext

This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.

I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?


Re: TSMimeHdrFieldNext

Posted by Sudheer Vinukonda <su...@yahoo-inc.com.INVALID>.
It may be sometimes useful/required to check (or even modify) every header
in the request. 

For example, one of the use cases we have is to allow only white listed
headers in the response and strip off the others.

Another use case is related to SPDY client connections, where some older
origins don¹t support lower case headers and require each header to be
converted to camel case.

Thanks,

Sudheer

On 8/15/14, 1:55 PM, "James Peach" <jp...@apache.org> wrote:

>On Aug 15, 2014, at 10:23 AM, Alan M. Carroll
><am...@network-geographics.com> wrote:
>
>> This came up yesterday on the IRC. The problem is that every call to
>>TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if
>>you use the function heavily. One suggested approach was to switch the
>>allocator from a global to a per thread.
>> 
>> I think it might be better to add TSMimeHdrFieldHandleNext() which
>>updates the MIME field handle in place. Does this seem like a reasonable
>>API addition?
>
>What's a real use case for iterating over all the headers?
>
>J


Re: 回复: TSMimeHdrFieldNext

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 16, 2014, at 8:17 PM, portl4t.cn@gmail.com wrote:

> I think we have to create FetchSM with all the headers from the client request, should not we?
> Is there any other way to get all the headers from the client request?

Yes, there is a way to “dump” all headers on a VIO, would that not work with the FetchSM? In e.g. background_fetch, we do

    if ((data->vc = TSHttpConnect((sockaddr*)&data->client_ip)) != NULL) {
      TSHttpHdrPrint(data->mbuf, data->hdr_loc, data->req_io_buf);
      // We never send a body with the request. ToDo: Do we ever need to support that ?
      TSIOBufferWrite(data->req_io_buf, "\r\n", 2);



I.e. TSHttpHdrPrint( …);


— Leif

> 
> 
> --  
> portl4t.cn@gmail.com
> 
> 
> 在 2014年8月17日 星期日,上午9:32,James Peach 写道:
> 
>> On Aug 16, 2014, at 5:55 PM, portl4t.cn@gmail.com (mailto:portl4t.cn@gmail.com) wrote:
>> 
>>> I use this api heavily, I think we have to iterate over all headers if we want to create FetchSM.
>> 
>> Why? To serialize the mime headers?
>> 
>>> I use this api in some plugins, such as Combo, ESI, Slice and so on.
>>> Every time if we want to intercept the HttpSM and create several FetchSM to construct the real response, this api will be used.
>>> 
>>> 
>>> --  
>>> portl4t.cn@gmail.com (mailto:portl4t.cn@gmail.com)
>>> 
>>> 
>>> 在 2014年8月16日 星期六,上午4:55,James Peach 写道:
>>> 
>>>> On Aug 15, 2014, at 10:23 AM, Alan M. Carroll <amc@network-geographics.com (mailto:amc@network-geographics.com)> wrote:
>>>> 
>>>>> This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
>>>>> 
>>>>> I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?
>>>> 
>>>> What's a real use case for iterating over all the headers?
>>>> 
>>>> J  
> 


回复: TSMimeHdrFieldNext

Posted by po...@gmail.com.
I think we have to create FetchSM with all the headers from the client request, should not we?
Is there any other way to get all the headers from the client request?


--  
portl4t.cn@gmail.com


在 2014年8月17日 星期日,上午9:32,James Peach 写道:

> On Aug 16, 2014, at 5:55 PM, portl4t.cn@gmail.com (mailto:portl4t.cn@gmail.com) wrote:
>  
> > I use this api heavily, I think we have to iterate over all headers if we want to create FetchSM.
>  
> Why? To serialize the mime headers?
>  
> > I use this api in some plugins, such as Combo, ESI, Slice and so on.
> > Every time if we want to intercept the HttpSM and create several FetchSM to construct the real response, this api will be used.
> >  
> >  
> > --  
> > portl4t.cn@gmail.com (mailto:portl4t.cn@gmail.com)
> >  
> >  
> > 在 2014年8月16日 星期六,上午4:55,James Peach 写道:
> >  
> > > On Aug 15, 2014, at 10:23 AM, Alan M. Carroll <amc@network-geographics.com (mailto:amc@network-geographics.com)> wrote:
> > >  
> > > > This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
> > > >  
> > > > I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?
> > >  
> > > What's a real use case for iterating over all the headers?
> > >  
> > > J  


Re: 回复: TSMimeHdrFieldNext

Posted by James Peach <jp...@apache.org>.
On Aug 16, 2014, at 5:55 PM, portl4t.cn@gmail.com wrote:

> I use this api heavily, I think we have to iterate over all headers if we want to create FetchSM.

Why? To serialize the mime headers?

> I use this api in some plugins, such as Combo, ESI, Slice and so on.
> Every time if we want to intercept the HttpSM and create several FetchSM to construct the real response, this api will be used.
> 
> 
> --  
> portl4t.cn@gmail.com
> 
> 
> 在 2014年8月16日 星期六,上午4:55,James Peach 写道:
> 
>> On Aug 15, 2014, at 10:23 AM, Alan M. Carroll <amc@network-geographics.com (mailto:amc@network-geographics.com)> wrote:
>> 
>>> This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
>>> 
>>> I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?
>> 
>> What's a real use case for iterating over all the headers?
>> 
>> J  
> 


回复: TSMimeHdrFieldNext

Posted by po...@gmail.com.
I use this api heavily, I think we have to iterate over all headers if we want to create FetchSM.
I use this api in some plugins, such as Combo, ESI, Slice and so on.
Every time if we want to intercept the HttpSM and create several FetchSM to construct the real response, this api will be used.


--  
portl4t.cn@gmail.com


在 2014年8月16日 星期六,上午4:55,James Peach 写道:

> On Aug 15, 2014, at 10:23 AM, Alan M. Carroll <amc@network-geographics.com (mailto:amc@network-geographics.com)> wrote:
>  
> > This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
> >  
> > I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?
>  
> What's a real use case for iterating over all the headers?
>  
> J  


Re: TSMimeHdrFieldNext

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 16, 2014, at 7:35 PM, James Peach <jp...@apache.org> wrote:

> On Aug 16, 2014, at 8:29 AM, Brian Geffon <br...@gmail.com> wrote:
> 
>> I'm definitely +1 for functor/iterator semantics, ideally it could be done
>> in such a way as to preserve backwards compatability.
> 
> My initial reaction is against a callback-based API because that's not consistent with the existing API patterns. An iterator is probably more appropriate.


How so? We showed one case where we already do so (looping over all librecords). Maybe we should just fix that instead such that it follows whatever semantics amc had in mind? I’m not too concerned about changing the functors return values here, and we could also come up with a new naming scheme for all such APIs (and deprecate the old one).

If we go the route of making a new version of TSMimeHdrFieldNext() we should consider TSMimeHdrFieldGet() instead. It’s just better in all cases I can think of for looping through this. TSMimeHdrFieldNext() was designed to allow you to look at the header “following” one that you’ve already found, and has some additional expenses (double loop over the headers to start with).

Also, when we do this, we should consider either fixing our current plugins usage of TSMimeHdrFieldNext() (or at least file bugs). There’s some pretty inefficient stuff in a couple of places. At a minimum, they should use TSMimeHdrFieldGet(), more likely use normal getters and TSMimeHdrFieldNextDup().

Ciao,

— Leif


Re: TSMimeHdrFieldNext

Posted by James Peach <jp...@apache.org>.
On Aug 16, 2014, at 8:29 AM, Brian Geffon <br...@gmail.com> wrote:

> I'm definitely +1 for functor/iterator semantics, ideally it could be done
> in such a way as to preserve backwards compatability.

My initial reaction is against a callback-based API because that's not consistent with the existing API patterns. An iterator is probably more appropriate.

> On Aug 16, 2014 10:35 PM, "Leif Hedstrom" <zw...@apache.org> wrote:
> 
>> I have some mixed feeling on this. Such as, is us TSMimeHdrFieldGet a
>> better candidate to improve for looping over all fields ? Or even a new API
>> with functor semantics?
>> 
>> TSMimeHdrFieldGet at least avoids one loop to avoid finding the slotnum
>> 
>>> On Aug 16, 2014, at 5:29 AM, Brian Geffon <br...@gmail.com> wrote:
>>> 
>>> If no one has already started on a fix I can take this.
>>> 
>>> Brian
>>>> On Aug 16, 2014 6:13 AM, "Nick Kew" <ni...@apache.org> wrote:
>>>> 
>>>> 
>>>>> On 15 Aug 2014, at 21:55, James Peach wrote:
>>>>> 
>>>>> What's a real use case for iterating over all the headers?
>>>> 
>>>> Ironbee.  Or any WAF, or WAF framework.
>>>> 
>>>> Or performance analysis tools.
>>>> 
>>>> --
>>>> Nick Kew
>>>> 
>> 


Re: TSMimeHdrFieldNext

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
I think there are two reasonable approaches -

1) Iterator style. In this case we either create a new iterator type or re-use the FieldHandle as an iterator. In this case you get a handle to the first field, and then re-use it for each field. This saves the expense of allocating a new handle for every field.

2) Functor style. In this case we do the equivalent of std::find_if(). It would be similar to TSRecordDump, the differences being it is over the HTTP header fields, and the functor can return a "stop calling me" value to terminate the iteration.

In terms of implementation, the latter should be noticeably easier. While I think option (1) is more flexible, I favor option (2) for ease of implementation because it's mainly clunkier than (1), not really much less powerful. Having looked at the underlying data structures, making option (1) efficient would involve some non-trivial changes to structures I am not sure anybody understands. I think option (2) could be done in a day, on the other hand.


Re: TSMimeHdrFieldNext

Posted by Brian Geffon <br...@gmail.com>.
I'm definitely +1 for functor/iterator semantics, ideally it could be done
in such a way as to preserve backwards compatability.
On Aug 16, 2014 10:35 PM, "Leif Hedstrom" <zw...@apache.org> wrote:

> I have some mixed feeling on this. Such as, is us TSMimeHdrFieldGet a
> better candidate to improve for looping over all fields ? Or even a new API
> with functor semantics?
>
> TSMimeHdrFieldGet at least avoids one loop to avoid finding the slotnum
>
> > On Aug 16, 2014, at 5:29 AM, Brian Geffon <br...@gmail.com> wrote:
> >
> > If no one has already started on a fix I can take this.
> >
> > Brian
> >> On Aug 16, 2014 6:13 AM, "Nick Kew" <ni...@apache.org> wrote:
> >>
> >>
> >>> On 15 Aug 2014, at 21:55, James Peach wrote:
> >>>
> >>> What's a real use case for iterating over all the headers?
> >>
> >> Ironbee.  Or any WAF, or WAF framework.
> >>
> >> Or performance analysis tools.
> >>
> >> --
> >> Nick Kew
> >>
>

Re: TSMimeHdrFieldNext

Posted by Leif Hedstrom <zw...@apache.org>.
I have some mixed feeling on this. Such as, is us TSMimeHdrFieldGet a better candidate to improve for looping over all fields ? Or even a new API with functor semantics?

TSMimeHdrFieldGet at least avoids one loop to avoid finding the slotnum

> On Aug 16, 2014, at 5:29 AM, Brian Geffon <br...@gmail.com> wrote:
> 
> If no one has already started on a fix I can take this.
> 
> Brian
>> On Aug 16, 2014 6:13 AM, "Nick Kew" <ni...@apache.org> wrote:
>> 
>> 
>>> On 15 Aug 2014, at 21:55, James Peach wrote:
>>> 
>>> What's a real use case for iterating over all the headers?
>> 
>> Ironbee.  Or any WAF, or WAF framework.
>> 
>> Or performance analysis tools.
>> 
>> --
>> Nick Kew
>> 

Re: TSMimeHdrFieldNext

Posted by Brian Geffon <br...@gmail.com>.
If no one has already started on a fix I can take this.

Brian
On Aug 16, 2014 6:13 AM, "Nick Kew" <ni...@apache.org> wrote:

>
> On 15 Aug 2014, at 21:55, James Peach wrote:
>
> > What's a real use case for iterating over all the headers?
>
> Ironbee.  Or any WAF, or WAF framework.
>
> Or performance analysis tools.
>
> --
> Nick Kew
>

Re: TSMimeHdrFieldNext

Posted by James Peach <jp...@apache.org>.
On Aug 15, 2014, at 3:13 PM, Nick Kew <ni...@apache.org> wrote:

> 
> On 15 Aug 2014, at 21:55, James Peach wrote:
> 
>> What's a real use case for iterating over all the headers?
> 
> Ironbee.  Or any WAF, or WAF framework.

ah, waf and header whitelisting makes sense

> 
> Or performance analysis tools.
> 
> -- 
> Nick Kew


Re: TSMimeHdrFieldNext

Posted by Nick Kew <ni...@apache.org>.
On 15 Aug 2014, at 21:55, James Peach wrote:

> What's a real use case for iterating over all the headers?

Ironbee.  Or any WAF, or WAF framework.

Or performance analysis tools.

-- 
Nick Kew

Re: TSMimeHdrFieldNext

Posted by James Peach <jp...@apache.org>.
On Aug 15, 2014, at 10:23 AM, Alan M. Carroll <am...@network-geographics.com> wrote:

> This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
> 
> I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?

What's a real use case for iterating over all the headers?

J

Re: TSMimeHdrFieldNext

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 15, 2014, at 12:33 PM, Alan M. Carroll <am...@network-geographics.com> wrote:

> Leif,
> 
>> Is the goal here to iterate over all headers? If so, maybe some sort of iterator functionality would be more appropriate, similar to how we added the iterator (with a callback) for lib records (i.e. TSRecordDump() )? Can that help simplifying / improve such operations?
> 
> If you added TSMimeHdrFieldHandleNext(), then the handle would serve as an iterator. 
> 

But if the goal is to always iterate over all headers, I’m guessing (but not sure?) that it could be done more efficiently with an API that assumes so ? But alas, I don’t know what the use case here is, I’ve yet to see one where iterating over all headers is required (you usually want TSMimeHdrFieldNextDup()).

Ciao,

— Leif


Re: TSMimeHdrFieldNext

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
Leif,

> But if the goal is to always iterate over all headers, I’m guessing (but not sure?) that it could be done more efficiently with an API that assumes so ? But alas, I don’t know what the use case here is, I’ve yet to see one where iterating over all headers is required (you usually want TSMimeHdrFieldNextDup()).

Unless you are going to expose the type to the API (which is generally avoided) you'll need to allocate the iterator. So I don't see any way to do better than re-purposing the handle as an iterator. Fewer types, just as efficient.


Re: TSMimeHdrFieldNext

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
Leif,

> Is the goal here to iterate over all headers? If so, maybe some sort of iterator functionality would be more appropriate, similar to how we added the iterator (with a callback) for lib records (i.e. TSRecordDump() )? Can that help simplifying / improve such operations?

If you added TSMimeHdrFieldHandleNext(), then the handle would serve as an iterator. 


Re: TSMimeHdrFieldNext

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 15, 2014, at 11:23 AM, Alan M. Carroll <am...@network-geographics.com> wrote:

> This came up yesterday on the IRC. The problem is that every call to TSMimeHdrFieldNext allocates a MIME field handle which gets very slow if you use the function heavily. One suggested approach was to switch the allocator from a global to a per thread.
> 
> I think it might be better to add TSMimeHdrFieldHandleNext() which updates the MIME field handle in place. Does this seem like a reasonable API addition?
> 


Hmmm, I don’t think I’ve ever used this API. Reading the comment, it even asks “Why would anyone use this?”… :) Looking at our code, we use it very sparingly, and perhaps even wrongly in a few cases (the typically used API is TSMimeHdrFieldNextDup() ).

Is the goal here to iterate over all headers? If so, maybe some sort of iterator functionality would be more appropriate, similar to how we added the iterator (with a callback) for lib records (i.e. TSRecordDump() )? Can that help simplifying / improve such operations?

Cheers,

— Leif