You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by Walt Karas <wk...@verizonmedia.com.INVALID> on 2019/06/11 01:26:42 UTC

PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

In https://github.com/apache/trafficserver/pull/5217 .
Same as TSHttpTxnEffectiveUrlStringGet(), but make sure scheme and host are
normalized to all lower case letters.

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Leif Hedstrom <zw...@apache.org>.

> On Jun 12, 2019, at 10:47 AM, Bryan Call <bc...@apache.org> wrote:
> 
> The RFC isn’t clear on what can be normalized for generating cache keys.  In practice I have done other methods for normalizing URLs such as sorting query parameters.  I have also lowercased paths knowing that the origin would be doing the same, which I would recommend in this case.
> 


For cache keys, we should have reasonable defaults (and it’s difficult to change those defaults, ever, so we’re pretty much SOL on modifying those). For all other custom needs, I firmly believe it belongs in the cachekey plugin. It can control whatever normalization (case sensitive, query param ordering etc. etc. that you need).

Cheers,

— Leif

> -Bryan
> 
> 
>> On Jun 11, 2019, at 11:24 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
>> 
>> Sorry, premature send, my Mac sucks, fix it zwoop.
>> 
>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>> that the only parts of the URL/URI that the Standards require to be case
>> insensitive are the scheme and the host.  Our plugin compares the URL with
>> a simple string compare.  I think, for the purposes of that plugin, all
>> other parts of the URL are case sensitive.  I'd rather not have to change
>> the plugin to have to deal with each component of the effective URL
>> individually.  Of course, this is probably just a fail safe, I would bet
>> $10 that the widely used browsers normalize the scheme and host to
>> lowercase anyway.
>> 
>> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com> wrote:
>> 
>>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>>> that the only parts of the URL/URI that the Standards requ
>>> 
>>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>>> 
>>>> What are you matching against?  Are you trying to match against the URL
>>>> of a previous request?  Why only normalize the scheme and host and not the
>>>> path, query parameters, or matrix parameters?
>>>> 
>>>> I think the problem is you are not giving details and people are guessing
>>>> at what you are trying to accomplish.
>>>> 
>>>> -Bryan
>>>> 
>>>> 
>>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>>> wrote:
>>>>> 
>>>>> We (Verizon) want to deploy a plugin that matches on URL premap.  With
>>>> the
>>>>> host and scheme normalized, we can do the matching using a simple string
>>>>> compare.  I had put up a PR to simply change the behavior of
>>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>>> reviews.
>>>>> 
>>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>>> <su...@yahoo.com.invalid> wrote:
>>>>> 
>>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that be
>>>> use
>>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>>> done in
>>>>>> the core?
>>>>>> If the use case is a common one that benefits everyone, then there
>>>> might
>>>>>> still be value in supporting it. That's why, curious to understand the
>>>> use
>>>>>> case.
>>>>>>  On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>>> 
>>>>>> The issue is, what is the correct normalization to perform? If that's
>>>>>> non-trivial, there's an argument for embedding that in the API  rather
>>>> than
>>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>>> `realpath` exists.
>>>>>> 
>>>> 
>>>> 
> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Bryan Call <bc...@apache.org>.
I meant wouldn’t in the last sentence. :)

-Bryan


> On Jun 12, 2019, at 9:47 AM, Bryan Call <bc...@apache.org> wrote:
> 
> The RFC isn’t clear on what can be normalized for generating cache keys.  In practice I have done other methods for normalizing URLs such as sorting query parameters.  I have also lowercased paths knowing that the origin would be doing the same, which I would recommend in this case.
> 
> -Bryan
> 
> 
>> On Jun 11, 2019, at 11:24 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
>> 
>> Sorry, premature send, my Mac sucks, fix it zwoop.
>> 
>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>> that the only parts of the URL/URI that the Standards require to be case
>> insensitive are the scheme and the host.  Our plugin compares the URL with
>> a simple string compare.  I think, for the purposes of that plugin, all
>> other parts of the URL are case sensitive.  I'd rather not have to change
>> the plugin to have to deal with each component of the effective URL
>> individually.  Of course, this is probably just a fail safe, I would bet
>> $10 that the widely used browsers normalize the scheme and host to
>> lowercase anyway.
>> 
>> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com> wrote:
>> 
>>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>>> that the only parts of the URL/URI that the Standards requ
>>> 
>>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>>> 
>>>> What are you matching against?  Are you trying to match against the URL
>>>> of a previous request?  Why only normalize the scheme and host and not the
>>>> path, query parameters, or matrix parameters?
>>>> 
>>>> I think the problem is you are not giving details and people are guessing
>>>> at what you are trying to accomplish.
>>>> 
>>>> -Bryan
>>>> 
>>>> 
>>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>>> wrote:
>>>>> 
>>>>> We (Verizon) want to deploy a plugin that matches on URL premap.  With
>>>> the
>>>>> host and scheme normalized, we can do the matching using a simple string
>>>>> compare.  I had put up a PR to simply change the behavior of
>>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>>> reviews.
>>>>> 
>>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>>> <su...@yahoo.com.invalid> wrote:
>>>>> 
>>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that be
>>>> use
>>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>>> done in
>>>>>> the core?
>>>>>> If the use case is a common one that benefits everyone, then there
>>>> might
>>>>>> still be value in supporting it. That's why, curious to understand the
>>>> use
>>>>>> case.
>>>>>>  On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>>> 
>>>>>> The issue is, what is the correct normalization to perform? If that's
>>>>>> non-trivial, there's an argument for embedding that in the API  rather
>>>> than
>>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>>> `realpath` exists.
>>>>>> 
>>>> 
>>>> 
> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Bryan Call <bc...@apache.org>.
The RFC isn’t clear on what can be normalized for generating cache keys.  In practice I have done other methods for normalizing URLs such as sorting query parameters.  I have also lowercased paths knowing that the origin would be doing the same, which I would recommend in this case.

-Bryan


> On Jun 11, 2019, at 11:24 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> Sorry, premature send, my Mac sucks, fix it zwoop.
> 
> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
> that the only parts of the URL/URI that the Standards require to be case
> insensitive are the scheme and the host.  Our plugin compares the URL with
> a simple string compare.  I think, for the purposes of that plugin, all
> other parts of the URL are case sensitive.  I'd rather not have to change
> the plugin to have to deal with each component of the effective URL
> individually.  Of course, this is probably just a fail safe, I would bet
> $10 that the widely used browsers normalize the scheme and host to
> lowercase anyway.
> 
> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com> wrote:
> 
>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>> that the only parts of the URL/URI that the Standards requ
>> 
>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>> 
>>> What are you matching against?  Are you trying to match against the URL
>>> of a previous request?  Why only normalize the scheme and host and not the
>>> path, query parameters, or matrix parameters?
>>> 
>>> I think the problem is you are not giving details and people are guessing
>>> at what you are trying to accomplish.
>>> 
>>> -Bryan
>>> 
>>> 
>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>> wrote:
>>>> 
>>>> We (Verizon) want to deploy a plugin that matches on URL premap.  With
>>> the
>>>> host and scheme normalized, we can do the matching using a simple string
>>>> compare.  I had put up a PR to simply change the behavior of
>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>> reviews.
>>>> 
>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>> <su...@yahoo.com.invalid> wrote:
>>>> 
>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that be
>>> use
>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>> done in
>>>>> the core?
>>>>> If the use case is a common one that benefits everyone, then there
>>> might
>>>>> still be value in supporting it. That's why, curious to understand the
>>> use
>>>>> case.
>>>>>   On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>> 
>>>>> The issue is, what is the correct normalization to perform? If that's
>>>>> non-trivial, there's an argument for embedding that in the API  rather
>>> than
>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>> `realpath` exists.
>>>>> 
>>> 
>>> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Bryan Call <bc...@apache.org>.
When I dived into the code it looks like this is separate code from generating the cache key.

Cache::generate_key > URL::hash_get() > url_CryptoHash_get()

Vs 

TSHttpTxnEffectiveNormalizedUrlStringGet() > HTTPHdr::url_string_get()

-Bryan


> On Jun 12, 2019, at 7:09 AM, Sudheer Vinukonda <su...@yahoo.com.INVALID> wrote:
> 
> That sounds reasonable. However, it does cause a compatibility issue for someone upgrading in that, I think this API might change the cache key and could potentially break cache. 
> 
> Given that, wouldn’t we prefer to break the API in a more obvious fashion (e.g. modify the signature to require a “normalized” param), to let existing plugins that use this API explicitly modify as they need?
> 
> 
>> On Jun 12, 2019, at 6:06 AM, Alan Carroll <so...@verizonmedia.com.invalid> wrote:
>> 
>> Didn't that PR do the normalization at a lower level? I like zwoop's idea,
>> that TSEffectiveUrlGet always return a normalized URL. It's kind of the
>> point of the call, to get the "real" URL from the request. I don't even
>> think that breaks compatibility since that's the correct value to return.
>> 
>> On Tue, Jun 11, 2019 at 3:33 PM Walt Karas <wk...@verizonmedia.com.invalid>
>> wrote:
>> 
>>> That was the original version of the PR, to just change the behavior of the
>>> existing effective URL get function.  It just twisted in the wind for two
>>> months.
>>> 
>>> I think an aversion to long names in a C API is not realistic.  When ya got
>>> no scoping or overloading, it's either long names or very unintuitive ones.
>>> 
>>> 
>>>> On Tue, Jun 11, 2019 at 3:03 PM Leif Hedstrom <zw...@apache.org> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jun 11, 2019, at 12:24 PM, Walt Karas <wkaras@verizonmedia.com
>>> .INVALID>
>>>> wrote:
>>>>> 
>>>>> Sorry, premature send, my Mac sucks, fix it zwoop.
>>>>> 
>>>>> I looked and the IETF specs and discussed it with Dave Thompson.  It
>>>> seems
>>>>> that the only parts of the URL/URI that the Standards require to be
>>> case
>>>>> insensitive are the scheme and the host.  Our plugin compares the URL
>>>> with
>>>>> a simple string compare.  I think, for the purposes of that plugin, all
>>>>> other parts of the URL are case sensitive.  I'd rather not have to
>>> change
>>>>> the plugin to have to deal with each component of the effective URL
>>>>> individually.  Of course, this is probably just a fail safe, I would
>>> bet
>>>>> $10 that the widely used browsers normalize the scheme and host to
>>>>> lowercase anyway.
>>>> 
>>>> 
>>>> Seems we could normalize those two fields in the existing API then. If
>>>> that is an expected behavior (which I think both Walt and amc are
>>>> implying), then why have an option to open up confusion. This would be
>>>> inline with amc’s argument as well, that leaving normalization as an
>>> option
>>>> opens up a can of worm. So just always do the same thing, and everyone is
>>>> happy (i don’t think we need two APIs for this).
>>>> 
>>>> Alternatively, we can change the existing API to take a normalization
>>>> option, and break compatibility. I much prefer either of these options
>>> than
>>>> adding this really convoluted API contraption.
>>>> 
>>>> — Leif
>>>> 
>>>>> 
>>>>> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com>
>>>> wrote:
>>>>> 
>>>>>> I looked and the IETF specs and discussed it with Dave Thompson.  It
>>>> seems
>>>>>> that the only parts of the URL/URI that the Standards requ
>>>>>> 
>>>>>>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>>>>>>> 
>>>>>>> What are you matching against?  Are you trying to match against the
>>> URL
>>>>>>> of a previous request?  Why only normalize the scheme and host and
>>> not
>>>> the
>>>>>>> path, query parameters, or matrix parameters?
>>>>>>> 
>>>>>>> I think the problem is you are not giving details and people are
>>>> guessing
>>>>>>> at what you are trying to accomplish.
>>>>>>> 
>>>>>>> -Bryan
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wkaras@verizonmedia.com
>>>> .INVALID>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> We (Verizon) want to deploy a plugin that matches on URL premap.
>>> With
>>>>>>> the
>>>>>>>> host and scheme normalized, we can do the matching using a simple
>>>> string
>>>>>>>> compare.  I had put up a PR to simply change the behavior of
>>>>>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>>>>>> reviews.
>>>>>>>> 
>>>>>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>>>>>> <su...@yahoo.com.invalid> wrote:
>>>>>>>> 
>>>>>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that
>>> be
>>>>>>> use
>>>>>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>>>>>> done in
>>>>>>>>> the core?
>>>>>>>>> If the use case is a common one that benefits everyone, then there
>>>>>>> might
>>>>>>>>> still be value in supporting it. That's why, curious to understand
>>>> the
>>>>>>> use
>>>>>>>>> case.
>>>>>>>>> On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>>>>>> 
>>>>>>>>> The issue is, what is the correct normalization to perform? If
>>> that's
>>>>>>>>> non-trivial, there's an argument for embedding that in the API
>>>> rather
>>>>>>> than
>>>>>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>>>>>> `realpath` exists.
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>> 
>>>> 
>>> 
> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Sudheer Vinukonda <su...@yahoo.com.INVALID>.
That sounds reasonable. However, it does cause a compatibility issue for someone upgrading in that, I think this API might change the cache key and could potentially break cache. 

Given that, wouldn’t we prefer to break the API in a more obvious fashion (e.g. modify the signature to require a “normalized” param), to let existing plugins that use this API explicitly modify as they need?


> On Jun 12, 2019, at 6:06 AM, Alan Carroll <so...@verizonmedia.com.invalid> wrote:
> 
> Didn't that PR do the normalization at a lower level? I like zwoop's idea,
> that TSEffectiveUrlGet always return a normalized URL. It's kind of the
> point of the call, to get the "real" URL from the request. I don't even
> think that breaks compatibility since that's the correct value to return.
> 
> On Tue, Jun 11, 2019 at 3:33 PM Walt Karas <wk...@verizonmedia.com.invalid>
> wrote:
> 
>> That was the original version of the PR, to just change the behavior of the
>> existing effective URL get function.  It just twisted in the wind for two
>> months.
>> 
>> I think an aversion to long names in a C API is not realistic.  When ya got
>> no scoping or overloading, it's either long names or very unintuitive ones.
>> 
>> 
>>> On Tue, Jun 11, 2019 at 3:03 PM Leif Hedstrom <zw...@apache.org> wrote:
>>> 
>>> 
>>> 
>>>> On Jun 11, 2019, at 12:24 PM, Walt Karas <wkaras@verizonmedia.com
>> .INVALID>
>>> wrote:
>>>> 
>>>> Sorry, premature send, my Mac sucks, fix it zwoop.
>>>> 
>>>> I looked and the IETF specs and discussed it with Dave Thompson.  It
>>> seems
>>>> that the only parts of the URL/URI that the Standards require to be
>> case
>>>> insensitive are the scheme and the host.  Our plugin compares the URL
>>> with
>>>> a simple string compare.  I think, for the purposes of that plugin, all
>>>> other parts of the URL are case sensitive.  I'd rather not have to
>> change
>>>> the plugin to have to deal with each component of the effective URL
>>>> individually.  Of course, this is probably just a fail safe, I would
>> bet
>>>> $10 that the widely used browsers normalize the scheme and host to
>>>> lowercase anyway.
>>> 
>>> 
>>> Seems we could normalize those two fields in the existing API then. If
>>> that is an expected behavior (which I think both Walt and amc are
>>> implying), then why have an option to open up confusion. This would be
>>> inline with amc’s argument as well, that leaving normalization as an
>> option
>>> opens up a can of worm. So just always do the same thing, and everyone is
>>> happy (i don’t think we need two APIs for this).
>>> 
>>> Alternatively, we can change the existing API to take a normalization
>>> option, and break compatibility. I much prefer either of these options
>> than
>>> adding this really convoluted API contraption.
>>> 
>>> — Leif
>>> 
>>>> 
>>>> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com>
>>> wrote:
>>>> 
>>>>> I looked and the IETF specs and discussed it with Dave Thompson.  It
>>> seems
>>>>> that the only parts of the URL/URI that the Standards requ
>>>>> 
>>>>>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>>>>>> 
>>>>>> What are you matching against?  Are you trying to match against the
>> URL
>>>>>> of a previous request?  Why only normalize the scheme and host and
>> not
>>> the
>>>>>> path, query parameters, or matrix parameters?
>>>>>> 
>>>>>> I think the problem is you are not giving details and people are
>>> guessing
>>>>>> at what you are trying to accomplish.
>>>>>> 
>>>>>> -Bryan
>>>>>> 
>>>>>> 
>>>>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wkaras@verizonmedia.com
>>> .INVALID>
>>>>>> wrote:
>>>>>>> 
>>>>>>> We (Verizon) want to deploy a plugin that matches on URL premap.
>> With
>>>>>> the
>>>>>>> host and scheme normalized, we can do the matching using a simple
>>> string
>>>>>>> compare.  I had put up a PR to simply change the behavior of
>>>>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>>>>> reviews.
>>>>>>> 
>>>>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>>>>> <su...@yahoo.com.invalid> wrote:
>>>>>>> 
>>>>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that
>> be
>>>>>> use
>>>>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>>>>> done in
>>>>>>>> the core?
>>>>>>>> If the use case is a common one that benefits everyone, then there
>>>>>> might
>>>>>>>> still be value in supporting it. That's why, curious to understand
>>> the
>>>>>> use
>>>>>>>> case.
>>>>>>>>  On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>>>>> 
>>>>>>>> The issue is, what is the correct normalization to perform? If
>> that's
>>>>>>>> non-trivial, there's an argument for embedding that in the API
>>> rather
>>>>>> than
>>>>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>>>>> `realpath` exists.
>>>>>>>> 
>>>>>> 
>>>>>> 
>>> 
>>> 
>> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Alan Carroll <so...@verizonmedia.com.INVALID>.
Didn't that PR do the normalization at a lower level? I like zwoop's idea,
that TSEffectiveUrlGet always return a normalized URL. It's kind of the
point of the call, to get the "real" URL from the request. I don't even
think that breaks compatibility since that's the correct value to return.

On Tue, Jun 11, 2019 at 3:33 PM Walt Karas <wk...@verizonmedia.com.invalid>
wrote:

> That was the original version of the PR, to just change the behavior of the
> existing effective URL get function.  It just twisted in the wind for two
> months.
>
> I think an aversion to long names in a C API is not realistic.  When ya got
> no scoping or overloading, it's either long names or very unintuitive ones.
>
>
> On Tue, Jun 11, 2019 at 3:03 PM Leif Hedstrom <zw...@apache.org> wrote:
>
> >
> >
> > > On Jun 11, 2019, at 12:24 PM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> > wrote:
> > >
> > > Sorry, premature send, my Mac sucks, fix it zwoop.
> > >
> > > I looked and the IETF specs and discussed it with Dave Thompson.  It
> > seems
> > > that the only parts of the URL/URI that the Standards require to be
> case
> > > insensitive are the scheme and the host.  Our plugin compares the URL
> > with
> > > a simple string compare.  I think, for the purposes of that plugin, all
> > > other parts of the URL are case sensitive.  I'd rather not have to
> change
> > > the plugin to have to deal with each component of the effective URL
> > > individually.  Of course, this is probably just a fail safe, I would
> bet
> > > $10 that the widely used browsers normalize the scheme and host to
> > > lowercase anyway.
> >
> >
> > Seems we could normalize those two fields in the existing API then. If
> > that is an expected behavior (which I think both Walt and amc are
> > implying), then why have an option to open up confusion. This would be
> > inline with amc’s argument as well, that leaving normalization as an
> option
> > opens up a can of worm. So just always do the same thing, and everyone is
> > happy (i don’t think we need two APIs for this).
> >
> > Alternatively, we can change the existing API to take a normalization
> > option, and break compatibility. I much prefer either of these options
> than
> > adding this really convoluted API contraption.
> >
> > — Leif
> >
> > >
> > > On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com>
> > wrote:
> > >
> > >> I looked and the IETF specs and discussed it with Dave Thompson.  It
> > seems
> > >> that the only parts of the URL/URI that the Standards requ
> > >>
> > >> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
> > >>
> > >>> What are you matching against?  Are you trying to match against the
> URL
> > >>> of a previous request?  Why only normalize the scheme and host and
> not
> > the
> > >>> path, query parameters, or matrix parameters?
> > >>>
> > >>> I think the problem is you are not giving details and people are
> > guessing
> > >>> at what you are trying to accomplish.
> > >>>
> > >>> -Bryan
> > >>>
> > >>>
> > >>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wkaras@verizonmedia.com
> > .INVALID>
> > >>> wrote:
> > >>>>
> > >>>> We (Verizon) want to deploy a plugin that matches on URL premap.
> With
> > >>> the
> > >>>> host and scheme normalized, we can do the matching using a simple
> > string
> > >>>> compare.  I had put up a PR to simply change the behavior of
> > >>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
> > >>>> reviews.
> > >>>>
> > >>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
> > >>>> <su...@yahoo.com.invalid> wrote:
> > >>>>
> > >>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that
> be
> > >>> use
> > >>>>> case specific? Which is exactly why it feels like this shouldn't be
> > >>> done in
> > >>>>> the core?
> > >>>>> If the use case is a common one that benefits everyone, then there
> > >>> might
> > >>>>> still be value in supporting it. That's why, curious to understand
> > the
> > >>> use
> > >>>>> case.
> > >>>>>   On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
> > >>>>> <so...@verizonmedia.com.INVALID> wrote:
> > >>>>>
> > >>>>> The issue is, what is the correct normalization to perform? If
> that's
> > >>>>> non-trivial, there's an argument for embedding that in the API
> > rather
> > >>> than
> > >>>>> requiring every plugin to hand roll it. It would be the same reason
> > >>>>> `realpath` exists.
> > >>>>>
> > >>>
> > >>>
> >
> >
>

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
That was the original version of the PR, to just change the behavior of the
existing effective URL get function.  It just twisted in the wind for two
months.

I think an aversion to long names in a C API is not realistic.  When ya got
no scoping or overloading, it's either long names or very unintuitive ones.


On Tue, Jun 11, 2019 at 3:03 PM Leif Hedstrom <zw...@apache.org> wrote:

>
>
> > On Jun 11, 2019, at 12:24 PM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > Sorry, premature send, my Mac sucks, fix it zwoop.
> >
> > I looked and the IETF specs and discussed it with Dave Thompson.  It
> seems
> > that the only parts of the URL/URI that the Standards require to be case
> > insensitive are the scheme and the host.  Our plugin compares the URL
> with
> > a simple string compare.  I think, for the purposes of that plugin, all
> > other parts of the URL are case sensitive.  I'd rather not have to change
> > the plugin to have to deal with each component of the effective URL
> > individually.  Of course, this is probably just a fail safe, I would bet
> > $10 that the widely used browsers normalize the scheme and host to
> > lowercase anyway.
>
>
> Seems we could normalize those two fields in the existing API then. If
> that is an expected behavior (which I think both Walt and amc are
> implying), then why have an option to open up confusion. This would be
> inline with amc’s argument as well, that leaving normalization as an option
> opens up a can of worm. So just always do the same thing, and everyone is
> happy (i don’t think we need two APIs for this).
>
> Alternatively, we can change the existing API to take a normalization
> option, and break compatibility. I much prefer either of these options than
> adding this really convoluted API contraption.
>
> — Leif
>
> >
> > On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com>
> wrote:
> >
> >> I looked and the IETF specs and discussed it with Dave Thompson.  It
> seems
> >> that the only parts of the URL/URI that the Standards requ
> >>
> >> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
> >>
> >>> What are you matching against?  Are you trying to match against the URL
> >>> of a previous request?  Why only normalize the scheme and host and not
> the
> >>> path, query parameters, or matrix parameters?
> >>>
> >>> I think the problem is you are not giving details and people are
> guessing
> >>> at what you are trying to accomplish.
> >>>
> >>> -Bryan
> >>>
> >>>
> >>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wkaras@verizonmedia.com
> .INVALID>
> >>> wrote:
> >>>>
> >>>> We (Verizon) want to deploy a plugin that matches on URL premap.  With
> >>> the
> >>>> host and scheme normalized, we can do the matching using a simple
> string
> >>>> compare.  I had put up a PR to simply change the behavior of
> >>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
> >>>> reviews.
> >>>>
> >>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
> >>>> <su...@yahoo.com.invalid> wrote:
> >>>>
> >>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that be
> >>> use
> >>>>> case specific? Which is exactly why it feels like this shouldn't be
> >>> done in
> >>>>> the core?
> >>>>> If the use case is a common one that benefits everyone, then there
> >>> might
> >>>>> still be value in supporting it. That's why, curious to understand
> the
> >>> use
> >>>>> case.
> >>>>>   On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
> >>>>> <so...@verizonmedia.com.INVALID> wrote:
> >>>>>
> >>>>> The issue is, what is the correct normalization to perform? If that's
> >>>>> non-trivial, there's an argument for embedding that in the API
> rather
> >>> than
> >>>>> requiring every plugin to hand roll it. It would be the same reason
> >>>>> `realpath` exists.
> >>>>>
> >>>
> >>>
>
>

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Leif Hedstrom <zw...@apache.org>.

> On Jun 11, 2019, at 12:24 PM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> Sorry, premature send, my Mac sucks, fix it zwoop.
> 
> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
> that the only parts of the URL/URI that the Standards require to be case
> insensitive are the scheme and the host.  Our plugin compares the URL with
> a simple string compare.  I think, for the purposes of that plugin, all
> other parts of the URL are case sensitive.  I'd rather not have to change
> the plugin to have to deal with each component of the effective URL
> individually.  Of course, this is probably just a fail safe, I would bet
> $10 that the widely used browsers normalize the scheme and host to
> lowercase anyway.


Seems we could normalize those two fields in the existing API then. If that is an expected behavior (which I think both Walt and amc are implying), then why have an option to open up confusion. This would be inline with amc’s argument as well, that leaving normalization as an option opens up a can of worm. So just always do the same thing, and everyone is happy (i don’t think we need two APIs for this).

Alternatively, we can change the existing API to take a normalization option, and break compatibility. I much prefer either of these options than adding this really convoluted API contraption.

— Leif

> 
> On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com> wrote:
> 
>> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
>> that the only parts of the URL/URI that the Standards requ
>> 
>> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>> 
>>> What are you matching against?  Are you trying to match against the URL
>>> of a previous request?  Why only normalize the scheme and host and not the
>>> path, query parameters, or matrix parameters?
>>> 
>>> I think the problem is you are not giving details and people are guessing
>>> at what you are trying to accomplish.
>>> 
>>> -Bryan
>>> 
>>> 
>>>> On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>>> wrote:
>>>> 
>>>> We (Verizon) want to deploy a plugin that matches on URL premap.  With
>>> the
>>>> host and scheme normalized, we can do the matching using a simple string
>>>> compare.  I had put up a PR to simply change the behavior of
>>>> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>>>> reviews.
>>>> 
>>>> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>>>> <su...@yahoo.com.invalid> wrote:
>>>> 
>>>>> Hmm..But, how do you define "correct" normalization? Wouldn't that be
>>> use
>>>>> case specific? Which is exactly why it feels like this shouldn't be
>>> done in
>>>>> the core?
>>>>> If the use case is a common one that benefits everyone, then there
>>> might
>>>>> still be value in supporting it. That's why, curious to understand the
>>> use
>>>>> case.
>>>>>   On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>>>>> <so...@verizonmedia.com.INVALID> wrote:
>>>>> 
>>>>> The issue is, what is the correct normalization to perform? If that's
>>>>> non-trivial, there's an argument for embedding that in the API  rather
>>> than
>>>>> requiring every plugin to hand roll it. It would be the same reason
>>>>> `realpath` exists.
>>>>> 
>>> 
>>> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
Sorry, premature send, my Mac sucks, fix it zwoop.

I looked and the IETF specs and discussed it with Dave Thompson.  It seems
that the only parts of the URL/URI that the Standards require to be case
insensitive are the scheme and the host.  Our plugin compares the URL with
a simple string compare.  I think, for the purposes of that plugin, all
other parts of the URL are case sensitive.  I'd rather not have to change
the plugin to have to deal with each component of the effective URL
individually.  Of course, this is probably just a fail safe, I would bet
$10 that the widely used browsers normalize the scheme and host to
lowercase anyway.

On Tue, Jun 11, 2019 at 1:18 PM Walt Karas <wk...@verizonmedia.com> wrote:

> I looked and the IETF specs and discussed it with Dave Thompson.  It seems
> that the only parts of the URL/URI that the Standards requ
>
> On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:
>
>> What are you matching against?  Are you trying to match against the URL
>> of a previous request?  Why only normalize the scheme and host and not the
>> path, query parameters, or matrix parameters?
>>
>> I think the problem is you are not giving details and people are guessing
>> at what you are trying to accomplish.
>>
>> -Bryan
>>
>>
>> > On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
>> wrote:
>> >
>> > We (Verizon) want to deploy a plugin that matches on URL premap.  With
>> the
>> > host and scheme normalized, we can do the matching using a simple string
>> > compare.  I had put up a PR to simply change the behavior of
>> > TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
>> > reviews.
>> >
>> > On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
>> > <su...@yahoo.com.invalid> wrote:
>> >
>> >> Hmm..But, how do you define "correct" normalization? Wouldn't that be
>> use
>> >> case specific? Which is exactly why it feels like this shouldn't be
>> done in
>> >> the core?
>> >> If the use case is a common one that benefits everyone, then there
>> might
>> >> still be value in supporting it. That's why, curious to understand the
>> use
>> >> case.
>> >>    On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>> >> <so...@verizonmedia.com.INVALID> wrote:
>> >>
>> >> The issue is, what is the correct normalization to perform? If that's
>> >> non-trivial, there's an argument for embedding that in the API  rather
>> than
>> >> requiring every plugin to hand roll it. It would be the same reason
>> >> `realpath` exists.
>> >>
>>
>>

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
I looked and the IETF specs and discussed it with Dave Thompson.  It seems
that the only parts of the URL/URI that the Standards requ

On Tue, Jun 11, 2019 at 12:59 PM Bryan Call <bc...@apache.org> wrote:

> What are you matching against?  Are you trying to match against the URL of
> a previous request?  Why only normalize the scheme and host and not the
> path, query parameters, or matrix parameters?
>
> I think the problem is you are not giving details and people are guessing
> at what you are trying to accomplish.
>
> -Bryan
>
>
> > On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID>
> wrote:
> >
> > We (Verizon) want to deploy a plugin that matches on URL premap.  With
> the
> > host and scheme normalized, we can do the matching using a simple string
> > compare.  I had put up a PR to simply change the behavior of
> > TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
> > reviews.
> >
> > On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
> > <su...@yahoo.com.invalid> wrote:
> >
> >> Hmm..But, how do you define "correct" normalization? Wouldn't that be
> use
> >> case specific? Which is exactly why it feels like this shouldn't be
> done in
> >> the core?
> >> If the use case is a common one that benefits everyone, then there might
> >> still be value in supporting it. That's why, curious to understand the
> use
> >> case.
> >>    On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
> >> <so...@verizonmedia.com.INVALID> wrote:
> >>
> >> The issue is, what is the correct normalization to perform? If that's
> >> non-trivial, there's an argument for embedding that in the API  rather
> than
> >> requiring every plugin to hand roll it. It would be the same reason
> >> `realpath` exists.
> >>
>
>

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Bryan Call <bc...@apache.org>.
What are you matching against?  Are you trying to match against the URL of a previous request?  Why only normalize the scheme and host and not the path, query parameters, or matrix parameters?

I think the problem is you are not giving details and people are guessing at what you are trying to accomplish.

-Bryan


> On Jun 11, 2019, at 10:14 AM, Walt Karas <wk...@verizonmedia.com.INVALID> wrote:
> 
> We (Verizon) want to deploy a plugin that matches on URL premap.  With the
> host and scheme normalized, we can do the matching using a simple string
> compare.  I had put up a PR to simply change the behavior of
> TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
> reviews.
> 
> On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
> <su...@yahoo.com.invalid> wrote:
> 
>> Hmm..But, how do you define "correct" normalization? Wouldn't that be use
>> case specific? Which is exactly why it feels like this shouldn't be done in
>> the core?
>> If the use case is a common one that benefits everyone, then there might
>> still be value in supporting it. That's why, curious to understand the use
>> case.
>>    On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
>> <so...@verizonmedia.com.INVALID> wrote:
>> 
>> The issue is, what is the correct normalization to perform? If that's
>> non-trivial, there's an argument for embedding that in the API  rather than
>> requiring every plugin to hand roll it. It would be the same reason
>> `realpath` exists.
>> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Walt Karas <wk...@verizonmedia.com.INVALID>.
We (Verizon) want to deploy a plugin that matches on URL premap.  With the
host and scheme normalized, we can do the matching using a simple string
compare.  I had put up a PR to simply change the behavior of
TSHttpTxnEffectiveUrlStringGet() but it was pocket vetoed by lack of
reviews.

On Tue, Jun 11, 2019 at 12:03 PM Sudheer Vinukonda
<su...@yahoo.com.invalid> wrote:

>  Hmm..But, how do you define "correct" normalization? Wouldn't that be use
> case specific? Which is exactly why it feels like this shouldn't be done in
> the core?
> If the use case is a common one that benefits everyone, then there might
> still be value in supporting it. That's why, curious to understand the use
> case.
>     On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll
> <so...@verizonmedia.com.INVALID> wrote:
>
>  The issue is, what is the correct normalization to perform? If that's
> non-trivial, there's an argument for embedding that in the API  rather than
> requiring every plugin to hand roll it. It would be the same reason
> `realpath` exists.
>

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Sudheer Vinukonda <su...@yahoo.com.INVALID>.
 Hmm..But, how do you define "correct" normalization? Wouldn't that be use case specific? Which is exactly why it feels like this shouldn't be done in the core?
If the use case is a common one that benefits everyone, then there might still be value in supporting it. That's why, curious to understand the use case.
    On Tuesday, June 11, 2019, 8:49:24 AM PDT, Alan Carroll <so...@verizonmedia.com.INVALID> wrote:  
 
 The issue is, what is the correct normalization to perform? If that's
non-trivial, there's an argument for embedding that in the API  rather than
requiring every plugin to hand roll it. It would be the same reason
`realpath` exists.
  

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Alan Carroll <so...@verizonmedia.com.INVALID>.
The issue is, what is the correct normalization to perform? If that's
non-trivial, there's an argument for embedding that in the API  rather than
requiring every plugin to hand roll it. It would be the same reason
`realpath` exists.

Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Leif Hedstrom <zw...@apache.org>.

> On Jun 10, 2019, at 21:38, Sudheer Vinukonda <su...@yahoo.com.invalid> wrote:
> 
> Can you describe the use cases for needing this API?
> 
> Also, why can’t the caller normalise the url returned by TSHttpTxnEffectiveUrlStringGet()? It appears like an “overload” to support an API just for normalisation alone, unless I’m missing something.

I agree, overloading basic computation into compound APIs is just making things harder to use,  not easier. One could also produce this string from the URL object I think, and do whatever normalization necessary there.

Let’s hear what the use case is though.

— Leif 
> 
> - Sudheer 
> 
>> On Jun 10, 2019, at 6:26 PM, Walt Karas <wk...@verizonmedia.com.invalid> wrote:
>> 
>> TSHttpTxnEffectiveUrlStringGet
> 


Re: PROPOSED new TS API function TSHttpTxnEffectiveNormalizedUrlStringGet()

Posted by Sudheer Vinukonda <su...@yahoo.com.INVALID>.
Can you describe the use cases for needing this API?

Also, why can’t the caller normalise the url returned by TSHttpTxnEffectiveUrlStringGet()? It appears like an “overload” to support an API just for normalisation alone, unless I’m missing something.

- Sudheer 

> On Jun 10, 2019, at 6:26 PM, Walt Karas <wk...@verizonmedia.com.invalid> wrote:
> 
> TSHttpTxnEffectiveUrlStringGet