Posted to user@jmeter.apache.org by Felix Schumacher <fe...@internetallee.de> on 2014/10/04 14:10:34 UTC

Re: Test Script Recorder XML Regex Matching

On 29.09.2014 at 22:32, Philippe Mouawad wrote:
> Hi Felix,
> I agree with sebb, the patch is interesting.
> But it clearly needs to be documented (I think many users don't know about
> this feature, which is really interesting), as does the code; on first
> reading the patch, it wasn't clear to me what was intended.
I have added documentation to the patch and found two other things, which
I changed in the same bug entry.

The random order in which the matchers are applied seems a bit strange, so I
sorted the matchers first by their length (longest first) and, if two
matchers have the same length, then by the name of their keys. So the set
  {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
would be applied in the order ['domain', 'regex', 'server'], since 'domain'
has the longest matcher and 'regex' comes before 'server' alphabetically
(both matchers are the same length).
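
As a rough illustration of this ordering (not the code from the patch, which
is Java), the rule can be sketched in Python. The function name
matcher_order is hypothetical; the sketch only assumes that the variables
are available as a name-to-value mapping.

```python
def matcher_order(variables):
    """Return variable names sorted by value length (longest first), then by name."""
    # Negating the length sorts longer values first; the name breaks ties.
    return sorted(variables, key=lambda name: (-len(variables[name]), name))


variables = {'domain': 'example.com', 'server': 'www', 'regex': 'w.*'}
print(matcher_order(variables))  # ['domain', 'regex', 'server']
```

The tie between 'www' and 'w.*' (both three characters) is resolved purely
by name, which is why 'regex' lands before 'server'.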

If no one objects, I will submit it next week.

Regards
  Felix
>
> Thanks for contributing
> Regards
>
>
> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>
>> On 29 September 2014 15:49, Felix Schumacher
>> <felix.schumacher@internetallee.de> wrote:
>>>
>>> On 29 September 2014 at 12:46:19 CEST, sebb <sebbaz@gmail.com> wrote:
>>>> On 29 September 2014 11:24, Felix Schumacher
>>>> <felix.schumacher@internetallee.de> wrote:
>>>>> On 29.09.2014 at 11:56, sebb wrote:
>>>>>
>>>>>> On 28 September 2014 18:11, Felix Schumacher
>>>>>> <felix.schumacher@internetallee.de> wrote:
>>>>>>> On 22.09.2014 at 11:13, Marijn Wijbenga wrote:
>>>>>>>
>>>>>>> I've attached a JMeter project file and an HTML file that demonstrate
>>>>>>> the issue. In order to reproduce:
>>>>>>> 1. Load up xml-bug-test.jmx in JMeter.
>>>>>>> 2. Start the proxy (recorder).
>>>>>>> 3. Place xml-bug-test.html on a web server somewhere (if on localhost,
>>>>>>>    do not forget to remove localhost from the proxy exclusion, if
>>>>>>>    applicable).
>>>>>>> 4. Navigate with a browser to this file (using the proxy).
>>>>>>> 5. Click both buttons in order.
>>>>>>>
>>>>>>> I could not post to an HTML file, hence the "test 2" button will post
>>>>>>> to Google. The page that loads has an error, but it still records the
>>>>>>> post request, which is what we want to see.
>>>>>>>
>>>>>>> I also discovered that when I was using a "get" request instead (I've
>>>>>>> made that "test 1"), it doesn't match the first character (%). I
>>>>>>> think this is related.
>>>>>>>
>>>>>>> The project has a user defined variable called "TEST" with a value of
>>>>>>> ".*",
>>>>>>> I've ticked the box
>>>>>>>
>>>>>>> To see the results, in the recording controller the last two requests
>>>>>>> contain a parameter with these values:
>>>>>>> Test 1: %${TEST}
>>>>>>> Test 2: <${TEST}>
>>>>>>>
>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>
>>>>>>> In the current implementation the regex will be matched against a
>>>>>>> pattern which looks like
>>>>>>>   \b(YOUR_VALUE)\b
>>>>>>>
>>>>>>> As % and < are boundary (non-word) characters, they are excluded from
>>>>>>> the match of your pattern.
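
The effect of the \b(YOUR_VALUE)\b wrapping can be reproduced with a small
Python sketch. This is only an approximation: JMeter's own regex engine may
differ in details, the helper names wrap and replace_first are hypothetical,
and the two sample strings are assumed stand-ins for the recorded values,
which are not shown in this thread.

```python
import re


def wrap(value):
    # Hypothetical helper mimicking the described \b(YOUR_VALUE)\b wrapping.
    return r'\b(' + value + r')\b'


def replace_first(value, name, text):
    # Replace only the first match with a ${name} reference.
    return re.sub(wrap(value), lambda m: '${' + name + '}', text, count=1)


print(replace_first('.*', 'TEST', '<abc>'))      # <${TEST}>
print(replace_first('.*', 'TEST', '%3Cabc%3E'))  # %${TEST}
```

In both cases the first word boundary sits just after the leading non-word
character, so that character stays outside the match.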
>>>>>>
>>>>>> This is deliberate.
>>>>>> There were problems previously as partial values were being
>>>>>> unexpectedly matched.
>>>>>>
>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>
>>>>> I thought so. Maybe more documentation would have helped, but then
>>>>> again it is regex...
>>>>>>
>>>>>>> I would consider this a bug, or at least the documentation could be a
>>>>>>> bit more concise.
>>>>>>
>>>>>> Patches welcome.
>>>>> A patch was attached :)
>>>> I meant that we would welcome a patch for the documentation.
>>>> Or at least some indication of where the documentation needs to be
>>>> updated to clarify the current behaviour.
>>> I will look into that.
>> Thanks.
>>
>>> What is your opinion on the option to detect parens and modify the
>>> regex behavior?
>>
>> Looks good to me.
>>
>> The parens are very unlikely to have been used in existing tests, so
>> the modified behaviour is unlikely to break anything.
>> But we should document it in the release notes just in case.
>>
>>> Felix
>>>>>>> Attached is a patch against trunk, which checks whether the regex
>>>>>>> starts with '(' and ends with ')' and, if so, uses the regex as given
>>>>>>> instead of building its own version.
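
The check described here can be sketched in Python as follows. This is not
the patch itself (which is Java); build_pattern and replace_first are
hypothetical names, and '<abc>' is an assumed sample value.

```python
import re


def build_pattern(value):
    # Assumption: a value wrapped in parentheses is used verbatim; anything
    # else gets the default word-boundary wrapping.
    if value.startswith('(') and value.endswith(')'):
        return value
    return r'\b(' + value + r')\b'


def replace_first(value, name, text):
    # Replace only the first match with a ${name} reference.
    return re.sub(build_pattern(value), lambda m: '${' + name + '}', text, count=1)


print(replace_first('.*', 'TEST', '<abc>'))    # <${TEST}>
print(replace_first('(.*)', 'TEST', '<abc>'))  # ${TEST}
```

With the parenthesised form the user decides where the match starts and
ends, so the surrounding < and > can be included.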
>>>>>>
>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>> I already did so yesterday, shortly after sending my mail. It is
>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>
>>>>> What is missing from the patch is documentation. If the feature as such
>>>>> is ok, then I would add that to the existing documentation.
>>>>>
>>>>>
>>>>> Regards
>>>>>   Felix
>>>>>>>
>>>>>>>
>>>>>>> Also, see notes below.
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: sebb [mailto:sebbaz@gmail.com]
>>>>>>> Sent: 21 September 2014 01:52
>>>>>>> To: JMeter Users List
>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>
>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have an issue, which might well be a potential bug, where a posted
>>>>>>> value is not being matched by the Test Script Recorder's Regex
>>>>>>> Matching functionality.
>>>>>>>
>>>>>>> The request I'm recording has a post value containing XML (a SAML
>>>>>>> token, to be exact) which I'd like to replace with a variable
>>>>>>> automatically.
>>>>>>>
>>>>>>> What does the value look like?
>>>>>>> Does it have multiple lines?
>>>>>>>
>>>>>>> No, it did not have multiple lines. I did check if this was the case,
>>>>>>> but it wasn't.
>>>>>>>
>>>>>>> For testing purposes I have configured a User Defined Variable (called
>>>>>>> TEST) with a value of "(?s)^.*$"; I've tried "^.*$" and ".*" as well
>>>>>>> (all without double quotes).
>>>>>>>
>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>
>>>>>>> That does not make sense.
>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>> become ${TEST}
>>>>>>>
>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>
>>>>>>> I've tried other expressions as well and I'm able to match anything
>>>>>>> within the <> characters, but not those characters themselves.
>>>>>>>
>>>>>>> Again, that does not make sense.
>>>>>>>
>>>>>>> The weird thing is that inside the outer <> characters there are other
>>>>>>> <> characters that are matched fine. It's just the first and last
>>>>>>> character.
>>>>>>>
>>>>>>> Has anyone else experienced the same thing, or is this a known issue?
>>>>>>>
>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>
>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>
>>>>>>> No, the developers all follow this list.
>>>>>>>
>>>>>>> Great, please see attachment for an example.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
For additional commands, e-mail: user-help@jmeter.apache.org


Re: Test Script Recorder XML Regex Matching

Posted by Felix Schumacher <fe...@internetallee.de>.
On 06.10.2014 at 02:08, sebb wrote:
> On 5 October 2014 15:24, Felix Schumacher
> <fe...@internetallee.de> wrote:
>> On 05.10.2014 at 14:35, sebb wrote:
>>
>>> On 5 October 2014 13:26, Felix Schumacher
>>> <fe...@internetallee.de> wrote:
>>>> On 05.10.2014 at 11:30, sebb wrote:
>>>>
>>>>> On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com>
>>>>> wrote:
>>>>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>>>>>> felix.schumacher@internetallee.de> wrote:
>>>>>>
>>>>>>> On 29.09.2014 at 22:32, Philippe Mouawad wrote:
>>>>>>>
>>>>>>>> Hi Felix,
>>>>>>>> I agree with sebb, the patch is interesting.
>>>>>>>> But it clearly needs to be documented (I think many users don't
>>>>>>>> know about this feature, which is really interesting), as does the
>>>>>>>> code; on first reading the patch, it wasn't clear to me what was
>>>>>>>> intended.
>>>>>>>>
>>>>>>> I have added documentation to the patch and found two other things,
>>>>>>> which I changed in the same bug entry.
>>>>>>>
>>>>>>> The random order in which the matchers are applied seems a bit
>>>>>>> strange, so I sorted the matchers first by their length (longest
>>>>>>> first) and, if two matchers have the same length, then by the name
>>>>>>> of their keys. So the set
>>>>>>>     {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>>>>>>> would be applied in the order ['domain', 'regex', 'server'], since
>>>>>>> 'domain' has the longest matcher and 'regex' comes before 'server'
>>>>>>> alphabetically (both matchers are the same length).
>>>>>>>
>>>>>> Isn't it better to order by longest value or regexp?
>>>>>> www is more specific than w.*
>>>>>> So the order would be:
>>>>>> domain, server, regex
>>>>> Or the code could try to match every variable and select the one that
>>>>> produces the longest match.
>>>>>
>>>>> But rather than try and sort the regexes, which is always going to be
>>>>> tricky to do "correctly" (whatever that means), maybe the user should
>>>>> be given control of the matching order.
>>>>>
>>>>> For example, it is probably possible to match by order of appearance.
>>>>>
>>>>> It would certainly be possible to match the variables in sorted order by
>>>>> name.
>>>>> This would be a bit more awkward to use than changing the order of
>>>>> variable definitions.
>>>> I just wanted to give a simple algorithm for ordering, which I think is
>>>> better than random ordering.
>>>>
>>>> Correctness will be hard to implement when everyone has a different view
>>>> of the correct ordering.
>>>>
>>>> I had thought of giving more control to the user by appending something
>>>> to sort by to the variable names.
>>>>
>>>> For example, extending the above example with variable names ['domain',
>>>> 'server', 'regex'], the names could be changed to ['domain_3',
>>>> 'server_1', 'regex_2'] to impose replacement in the order ['server',
>>>> 'regex', 'domain'].
>>>> But what should we do with the suffix '_\d+'? (A prefix could be used,
>>>> too.)
>>>>
>>>> We could look for a specially named variable like '_regex_order', which
>>>> could hold a comma-separated list of the variable names in the desired
>>>> order.
>>>>
>>>> The longer I think about it, the more I am inclined to take the simple
>>>> ordering algorithm of length and then name. One can always make any regex
>>>> longer by adding useless junk like '(?:WILLNOTBEFOUNDANYWAY)?' and in
>>>> such a way influence the order.
>>> No, length of regex is not useful.
>> But it is easy to do and can be done consistently before trying to match :)
> Just because it is easy does not make it useful.
>
>>> More useful would be sorting by matched string.
>> I will try to do a patch which will do that, but I think it will be more
>> complex.
> Yes, it will be more complex.
> I think it is more likely to be correct, but that's not guaranteed,
> which is why I think the user should have control.
>
>>> Sorting by name is awkward to use, and anyway what about non-regexes
>>> that happen to match the same text?
>> Well, in regex mode every string happens to be a regex. And does sorting
>> by name include using (and possibly stripping off) a prefix or suffix?
> It's not only regex matching that has potential ordering issues.
>
>>>
>>> I don't think it's possible to automatically sort correctly by regex.
>> Well, it is simple to order it correctly if you want it sorted by the
>> current algorithm. But that is obviously not your preferred order.
> I just don't think it's possible to guarantee the correct order automatically.
> The best one can hope for is that it will be right more often than
> not, but there will always be edge cases.
>
> Which is why I don't think it's worth trying to guess what people will need.
> The user needs to be in control.
>
>> As I said, I think any repeatable
>> ordering is better than no order.
> It needs to be predictable (and documented).
> The current order is repeatable (on a given system), but is not easily
> predictable.
>
>>> So we should allow the user to control the search order, as I already
>>> suggested a short while ago.
>> Right, what is your suggestion of means to accomplish that order?
> I already suggested using the order of definition of the variables on
> the test plan.
> That should be possible, and is easier to use than variable renaming -
> which may result in awkward names.
Having looked a bit deeper into this, it seems the order is already
defined :)

In Arguments#getArgumentsAsMap the arguments are added to a LinkedHashMap,
so we already have deterministic behaviour.
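
The effect of an insertion-ordered map can be sketched in Python, whose dict
also keeps insertion order (since Python 3.7). This is only an analogy to
Java's LinkedHashMap, not JMeter code: apply_in_definition_order and the
sample values are made up for illustration, and values are treated as
literal strings rather than regexes to keep the sketch short.

```python
def apply_in_definition_order(variables, text):
    """Replace each value with ${name}, visiting variables in definition order."""
    for name, value in variables.items():  # dict iteration follows insertion order
        text = text.replace(value, '${' + name + '}')
    return text


first = {'server': 'www', 'domain': 'www.example.com'}
second = {'domain': 'www.example.com', 'server': 'www'}
print(apply_in_definition_order(first, 'http://www.example.com/'))
# http://${server}.example.com/
print(apply_in_definition_order(second, 'http://www.example.com/'))
# http://${domain}/
```

Swapping the two definitions changes the result, so the order in which the
variables are defined gives the user direct control.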

Sorry for the noise on that one.

Regards
  Felix
>
>> Would you like it to be another variable with a special name?
>> (I called that one '_regex_order' above).
> Not unless it's not possible to use the definition order.
> It's even more awkward to use than variable renaming.
>
>> What happens to variables that the user failed to mention?
> Hopefully not relevant, but I imagine they would be handled in alpha
> order after the others.
>
>
>> Regards
>>
>>   Felix



Re: Test Script Recorder XML Regex Matching

Posted by sebb <se...@gmail.com>.
On 5 October 2014 15:24, Felix Schumacher
<fe...@internetallee.de> wrote:
> On 05.10.2014 at 14:35, sebb wrote:
>
>> On 5 October 2014 13:26, Felix Schumacher
>> <fe...@internetallee.de> wrote:
>>>
>>> On 05.10.2014 at 11:30, sebb wrote:
>>>
>>>> On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com>
>>>> wrote:
>>>>>
>>>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>>>>> felix.schumacher@internetallee.de> wrote:
>>>>>
>>>>>> On 29.09.2014 at 22:32, Philippe Mouawad wrote:
>>>>>>
>>>>>>> Hi Felix,
>>>>>>> I agree with sebb, the patch is interesting.
>>>>>>> But it clearly needs to be documented (I think many users don't
>>>>>>> know about this feature, which is really interesting), as does the
>>>>>>> code; on first reading the patch, it wasn't clear to me what was
>>>>>>> intended.
>>>>>>>
>>>>>> I have added documentation to the patch and found two other things,
>>>>>> which I changed in the same bug entry.
>>>>>>
>>>>>> The random order in which the matchers are applied seems a bit
>>>>>> strange, so I sorted the matchers first by their length (longest
>>>>>> first) and, if two matchers have the same length, then by the name
>>>>>> of their keys. So the set
>>>>>>    {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>>>>>> would be applied in the order ['domain', 'regex', 'server'], since
>>>>>> 'domain' has the longest matcher and 'regex' comes before 'server'
>>>>>> alphabetically (both matchers are the same length).
>>>>>>
>>>>> Isn't it better to order by longest value or regexp?
>>>>> www is more specific than w.*
>>>>> So the order would be:
>>>>> domain, server, regex
>>>>
>>>> Or the code could try to match every variable and select the one that
>>>> produces the longest match.
>>>>
>>>> But rather than try and sort the regexes, which is always going to be
>>>> tricky to do "correctly" (whatever that means), maybe the user should
>>>> be given control of the matching order.
>>>>
>>>> For example, it is probably possible to match by order of appearance.
>>>>
>>>> It would certainly be possible to match the variables in sorted order by
>>>> name.
>>>> This would be a bit more awkward to use than changing the order of
>>>> variable definitions.
>>>
>>> I just wanted to give a simple algorithm for ordering, which I think is
>>> better than random ordering.
>>>
>>> Correctness will be hard to implement when everyone has a different view
>>> of the correct ordering.
>>>
>>> I had thought of giving more control to the user by appending something
>>> to sort by to the variable names.
>>>
>>> For example, extending the above example with variable names ['domain',
>>> 'server', 'regex'], the names could be changed to ['domain_3',
>>> 'server_1', 'regex_2'] to impose replacement in the order ['server',
>>> 'regex', 'domain'].
>>> But what should we do with the suffix '_\d+'? (A prefix could be used,
>>> too.)
>>>
>>> We could look for a specially named variable like '_regex_order', which
>>> could hold a comma-separated list of the variable names in the desired
>>> order.
>>>
>>> The longer I think about it, the more I am inclined to take the simple
>>> ordering algorithm of length and then name. One can always make any regex
>>> longer by adding useless junk like '(?:WILLNOTBEFOUNDANYWAY)?' and in
>>> such a way influence the order.
>>
>> No, length of regex is not useful.
>
> But it is easy to do and can be done consistently before trying to match :)

Just because it is easy does not make it useful.

>>
>> More useful would be sorting by matched string.
>
> I will try to do a patch which will do that, but I think it will be more
> complex.

Yes, it will be more complex.
I think it is more likely to be correct, but that's not guaranteed,
which is why I think the user should have control.
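
To make "sorting by matched string" concrete, here is a minimal Python
sketch that tries every variable and picks the one with the longest match.
It is only an illustration under assumptions: the function name
longest_match is hypothetical, every value is treated as a regex, and ties
go to the variable defined first.

```python
import re


def longest_match(variables, text):
    """Return (name, matched_text) for the variable whose regex matches the longest span."""
    best = None
    for name, pattern in variables.items():
        m = re.search(pattern, text)
        # Strict comparison keeps the earliest-defined variable on ties.
        if m and (best is None or len(m.group(0)) > len(best[1])):
            best = (name, m.group(0))
    return best


variables = {'domain': 'example.com', 'server': 'www', 'regex': 'w.*'}
print(longest_match(variables, 'www.example.com'))  # ('regex', 'www.example.com')
```

Here the greedy 'w.*' wins over the more specific values, which shows why
the longest match is not guaranteed to be the one the user wants.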

>>
>> Sorting by name is awkward to use, and anyway what about non-regexes
>> that happen to match the same text?
>
> Well, in regex mode every string happens to be a regex. And does sorting
> by name include using (and possibly stripping off) a prefix or suffix?

It's not only regex matching that has potential ordering issues.

>>
>>
>> I don't think it's possible to automatically sort correctly by regex.
>
> Well, it is simple to order it correctly if you want it sorted by the
> current algorithm. But that is obviously not your preferred order.

I just don't think it's possible to guarantee the correct order automatically.
The best one can hope for is that it will be right more often than
not, but there will always be edge cases.

Which is why I don't think it's worth trying to guess what people will need.
The user needs to be in control.

> As I said, I think any repeatable
> ordering is better than no order.

It needs to be predictable (and documented).
The current order is repeatable (on a given system), but is not easily
predictable.

>>
>> So we should allow the user to control the search order, as I already
>> suggested a short while ago.
>
> Right, what is your suggestion of means to accomplish that order?

I already suggested using the order of definition of the variables on
the test plan.
That should be possible, and is easier to use than variable renaming -
which may result in awkward names.

> Would you like it to be another variable with a special name?
> (I called that one '_regex_order' above).

Not unless it's not possible to use the definition order.
It's even more awkward to use than variable renaming.

> What happens to variables that the user failed to mention?

Hopefully not relevant, but I imagine they would be handled in alpha
order after the others.


> Regards
>
>  Felix
>>>>>>>>>>>>>
>>>>>>>>>>>> posted
>>>>>>>>>>>
>>>>>>>>>>> value
>>>>>>>>>>>>>
>>>>>>>>>>>>> is
>>>>>>>>>>>>>
>>>>>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>>>>>> functionality.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>>>>>
>>>>>>>>>>>> token to
>>>>>>>>>>>
>>>>>>>>>>> be
>>>>>>>>>>>>>
>>>>>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What does the value look like?
>>>>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>>>>>
>>>>>>>>>>>> case, but
>>>>>>>>>>>
>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>>> wasn't
>>>>>>>>>>>>>
>>>>>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>>>>>
>>>>>>>>>>>> (called
>>>>>>>>>>>
>>>>>>>>>>> TEST)
>>>>>>>>>>>>>
>>>>>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well
>>>>>>>>>>>>> (all
>>>>>>>>>>>>> without
>>>>>>>>>>>>> double
>>>>>>>>>>>>> quotes).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>>>>
>>>>>>>>>>>>> That does not make sense.
>>>>>>>>>>>>> ".*" will match everything, including < and >, so the content
>>>>>>>>>>>>> would
>>>>>>>>>>>>> become
>>>>>>>>>>>>> ${TEST}
>>>>>>>>>>>>>
>>>>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've tried other expressions as well and I'm able to match
>>>>>>>>>>>>> anything
>>>>>>>>>>>>> within
>>>>>>>>>>>>> the
>>>>>>>>>>>>>
>>>>>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The weird thing is, that inside the outer <> characters there
>>>>>>>>>>>>> are
>>>>>>>>>>>>>
>>>>>>>>>>>> other
>>>>>>>>>>>
>>>>>>>>>>> <>
>>>>>>>>>>>>>
>>>>>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>>>>>
>>>>>>>>>>>> character.
>>>>>>>>>>>
>>>>>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>>>>>>
>>>>>>>>>>>> known
>>>>>>>>>>>
>>>>>>>>>>> issue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>
>>>
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
For additional commands, e-mail: user-help@jmeter.apache.org


Re: Test Script Recorder XML Regex Matching

Posted by Felix Schumacher <fe...@internetallee.de>.
Am 05.10.2014 um 14:35 schrieb sebb:
> On 5 October 2014 13:26, Felix Schumacher
> <fe...@internetallee.de> wrote:
>> Am 05.10.2014 um 11:30 schrieb sebb:
>>
>>> On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com>
>>> wrote:
>>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>>>> felix.schumacher@internetallee.de> wrote:
>>>>
>>>>> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>>>>>
>>>>>> Hi Felix,
>>>>>>
>>>>> Hi
>>>>> I agree with sebb, patch is interesting.
>>>>>> But it clearly needs to be documented (I think many users don't know
>>>>>> about
>>>>>> this feature which is really interesting) as long as code, reading
>>>>>> patch
>>>>>> first it wasn't clear for me what was intended.
>>>>>>
>>>>> I have added documentation to the patch and found two other things, that
>>>>> I
>>>>> changed
>>>>> in the same bug-entry.
>>>>>
>>>>> The random order of applying the matchers, seems a bit strange, so I
>>>>> sorted the matchers
>>>>> first by their length and if the matchers are the same length, then by
>>>>> the
>>>>> name of their keys. So
>>>>> the set
>>>>>    {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>>>>> would be applied in the order ['domain', 'regex', 'server'] since
>>>>> 'domain'
>>>>> has the longest matcher and
>>>>> 'regex' comes before 'server' alphabetically (matchers are both the same
>>>>> length).
>>>>>
>>>> Isn't it better to order by longest value or regexp ?
>>>> www is more specific than w.*
>>>> So would be :
>>>> domain, server , regex
>>> Or the code could try to match every variable and select the one that
>>> produces the longest match.
>>>
>>> But rather than try and sort the regexes, which is always going to be
>>> tricky to do "correctly" (whatever that means), maybe the user should
>>> be given control of the matching order.
>>>
>>> For example, it is probably possible to match by order of appearance.
>>>
>>> It would certainly be possible to match the variables in sorted order by
>>> name.
>>> This would be a bit more awkward to use than changing the order of
>>> variable definitions.
>> I just wanted to give a simple algorithm for ordering, which I think is
>> better than random ordering.
>>
>> Correctness will be hard to implement, when everyone has a different view on
>> the correct ordering.
>>
>> I had thought of giving more control to the user by appending the variable
>> names with something to sort by.
>>
>> For example extending the above example with variable names ['domain',
>> 'server', 'regex'] the names could be
>> changed to ['domain_3', 'server_1', 'regex_2'] to impose replacement in the
>> order ['server', 'regex', 'domain'].
>> But what should we do with the suffix '_\d+'? (A prefix could be used, too)
>>
>> We could look for a specially named variable like '_regex_order' which could
>> hold a comma-separated list of
>> the variable names in the desired order.
>>
>> The longer I think about it, the more I am inclined to take the simple
>> ordering algorithm of length and then name. One can
>> always make any regex longer by adding useless junk like
>> '(?:WILLNOTBEFOUNDANYWAY)?' and in such a way influence
>> the order.
> No, length of regex is not useful.
But it is easy to do and can be done consistently before trying to match :)
> More useful would be sorting by matched string.
I will try to do a patch which will do that, but I think it will be more 
complex.
> Sorting by name is awkward to use, and anyway what about non-regexes
> that happen to match the same text?
Well in regex mode every string happens to be a regex. And with sorting 
by name do you include using
(and possibly stripping off) a prefix or suffix?
>
> I don't think it's possible to automatically sort correctly by regex.
Well, it is simple to sort correctly if the order you want is the one 
the current algorithm produces. But that is
obviously not your preferred order. As I said, I think any repeatable 
ordering is better than no order.
> So we should allow the user to control the search order, as I already
> suggested a short while ago.
Right, how do you suggest the user should specify that order? Would 
you like it to be another variable with a special name?
(I called that one '_regex_order' above). What happens to variables 
that the user forgot to mention?
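
To make the question concrete, here is a minimal Python sketch of the
'_regex_order' idea. It is only an illustration, not the Java code from
the patch. The function name and the rule that unmentioned variables
follow in name order are assumptions; that rule is just one possible
answer to the question above.

```python
def order_from_control_variable(variables, control="_regex_order"):
    """Return variable names in the order listed in the control variable."""
    listed = [name.strip() for name in variables.get(control, "").split(",")]
    ordered = []
    for name in listed:
        # Skip blanks, unknown names, duplicates and the control variable itself.
        if name and name in variables and name != control and name not in ordered:
            ordered.append(name)
    # Assumption: variables the user did not mention follow in name order.
    leftovers = sorted(n for n in variables if n != control and n not in ordered)
    return ordered + leftovers


variables = {
    "domain": "example.com",
    "server": "www",
    "regex": "w.*",
    "_regex_order": "server, regex",
}
order = order_from_control_variable(variables)
print(order)
```

Here 'domain' is not mentioned in '_regex_order', so under the assumed
rule it is applied after the listed variables.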

Regards
  Felix
>>>>> If no one objects, I will submit it next week.
>>>>>
>>>>> Regards
>>>>>    Felix
>>>>>
>>>>>> Thanks for contributing
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>>>>>
>>>>>>    On 29 September 2014 15:49, Felix Schumacher
>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>
>>>>>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>>>>>
>>>>>>> <javascript:;>>:
>>>>>>>
>>>>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>>
>>>>>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>>>>>
>>>>>>>>>>    On 28 September 2014 18:11, Felix Schumacher
>>>>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>>>>>
>>>>>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>>>>>
>>>>>>>>>>> demonstrates the
>>>>>>>>>> issue. In order to reproduce:
>>>>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>>>>>
>>>>>>>>>>> localhost, do
>>>>>>>>>> not
>>>>>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>>>>
>>>>>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>>>>>
>>>>>>>>>>> post to
>>>>>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>>>>> post
>>>>>>>>>> request which is what we want to see.
>>>>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>>>> (I've made that "test 1") then it doesn't match the first
>>>>>>>>>>>> character (%). I think this is related.
>>>>>>>>>>>>
>>>>>>>>>>>> The project has a user defined variable called "TEST" with a
>>>>>>>>>>>> value of ".*", I've ticked the box
>>>>>>>>>>>>
>>>>>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>>>>> requests contain a parameter with these values:
>>>>>>>>>>>>
>>>>>>>>>>>> Test 1: %${TEST}
>>>>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>>>>
>>>>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>>>>
>>>>>>>>>>>> In the current implementation the regex will be matched against
>>>>>>>>>>>> a pattern which looks like
>>>>>>>>>>>>
>>>>>>>>>>>>     \b(YOUR_VALUE)\b
>>>>>>>>>>>>
>>>>>>>>>>>> As % and < are boundary characters they are excluded from your
>>>>>>>>>>>> pattern.
>>>>>>>>>>> This is deliberate.
>>>>>>>>>>> There were problems previously as partial values were being
>>>>>>>>>>> unexpectedly matched.
>>>>>>>>>>>
>>>>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>>>>
>>>>>>>>>> I thought so. Maybe more documentation would have helped,
>>>>>>>>>> but then it is regex...
>>>>>>>>>>
>>>>>>>>>>>    I would consider this a bug, or at least documentation could be
>>>>>>>>>>> a
>>>>>>>>>>> bit
>>>>>>>>>> more
>>>>>>>>>>>> concise.
>>>>>>>>>>>>
>>>>>>>>>>> Patches welcome.
>>>>>>>>>>>
>>>>>>>>>> A patch was attached :)
>>>>>>>>>>
>>>>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>>>>> Or at least some indication of where the documentation needs to be
>>>>>>>>> updated to clarify the current behaviour.
>>>>>>>>>
>>>>>>>> I will look into that.
>>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>    What is your opinion on the option to detect parens and modify the
>>>>>>> regex
>>>>>>> behavior?
>>>>>>>
>>>>>>> Looks good to me.
>>>>>>>
>>>>>>> The parens are very unlikely to have been used in existing tests, so
>>>>>>> the modified behaviour is unlikely to break anything.
>>>>>>> But we should document it in the release notes just in case.
>>>>>>>
>>>>>>>    Felix
>>>>>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>>>>> starts
>>>>>>>>>> with
>>>>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>>>>>
>>>>>>>>>>> building
>>>>>>>>>> its
>>>>>>>>>>>> own version.
>>>>>>>>>>>>
>>>>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of
>>>>>>>>>>> them.
>>>>>>>>>>>
>>>>>>>>>> I have already done so yesterday shortly after sending my mail. It
>>>>>>>>>> is
>>>>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>>>>
>>>>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>>>>
>>>>>>>>> such is
>>>>>>>>>
>>>>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>     Felix
>>>>>>>>>>
>>>>>>>>>>>> Also, see notes below.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>>>>> To: JMeter Users List
>>>>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>>>>
>>>>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>>>>
>>>>>>>>>>> posted
>>>>>>>>>> value
>>>>>>>>>>>> is
>>>>>>>>>>>>
>>>>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>>>>> functionality.
>>>>>>>>>>>>
>>>>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>>>>
>>>>>>>>>>> token to
>>>>>>>>>> be
>>>>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>>>>
>>>>>>>>>>>> What does the value look like?
>>>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>>>
>>>>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>>>>
>>>>>>>>>>> case, but
>>>>>>>>>> it
>>>>>>>>>>>> wasn't
>>>>>>>>>>>>
>>>>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>>>>
>>>>>>>>>>> (called
>>>>>>>>>> TEST)
>>>>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well
>>>>>>>>>>>> (all
>>>>>>>>>>>> without
>>>>>>>>>>>> double
>>>>>>>>>>>> quotes).
>>>>>>>>>>>>
>>>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>>>
>>>>>>>>>>>> That does not make sense.
>>>>>>>>>>>> ".*" will match everything, including < and >, so the content
>>>>>>>>>>>> would
>>>>>>>>>>>> become
>>>>>>>>>>>> ${TEST}
>>>>>>>>>>>>
>>>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>>>
>>>>>>>>>>>> I've tried other expressions as well and I'm able to match
>>>>>>>>>>>> anything
>>>>>>>>>>>> within
>>>>>>>>>>>> the
>>>>>>>>>>>>
>>>>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>>>>
>>>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>>>
>>>>>>>>>>>> The weird thing is, that inside the outer <> characters there are
>>>>>>>>>>>>
>>>>>>>>>>> other
>>>>>>>>>> <>
>>>>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>>>>
>>>>>>>>>>> character.
>>>>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>>>>> known
>>>>>>>>>> issue?
>>>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>>>
>>>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>>>
>>>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers
>>>>>>>>>>>>
>>




Re: Test Script Recorder XML Regex Matching

Posted by sebb <se...@gmail.com>.
On 5 October 2014 13:26, Felix Schumacher
<fe...@internetallee.de> wrote:
> Am 05.10.2014 um 11:30 schrieb sebb:
>
>> On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com>
>> wrote:
>>>
>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>>> felix.schumacher@internetallee.de> wrote:
>>>
>>>> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>>>>
>>>>> Hi Felix,
>>>>>
>>>> Hi
>>>> I agree with sebb, patch is interesting.
>>>>>
>>>>> But it clearly needs to be documented (I think many users don't know
>>>>> about
>>>>> this feature which is really interesting) as long as code, reading
>>>>> patch
>>>>> first it wasn't clear for me what was intended.
>>>>>
>>>> I have added documentation to the patch and found two other things, that
>>>> I
>>>> changed
>>>> in the same bug-entry.
>>>>
>>>> The random order of applying the matchers, seems a bit strange, so I
>>>> sorted the matchers
>>>> first by their length and if the matchers are the same length, then by
>>>> the
>>>> name of their keys. So
>>>> the set
>>>>   {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>>>> would be applied in the order ['domain', 'regex', 'server'] since
>>>> 'domain'
>>>> has the longest matcher and
>>>> 'regex' comes before 'server' alphabetically (matchers are both the same
>>>> length).
>>>>
>>> Isn't it better to order by longest value or regexp ?
>>> www is more specific than w.*
>>> So would be :
>>> domain, server , regex
>>
>> Or the code could try to match every variable and select the one that
>> produces the longest match.
>>
>> But rather than try and sort the regexes, which is always going to be
>> tricky to do "correctly" (whatever that means), maybe the user should
>> be given control of the matching order.
>>
>> For example, it is probably possible to match by order of appearance.
>>
>> It would certainly be possible to match the variables in sorted order by
>> name.
>> This would be a bit more awkward to use than changing the order of
>> variable definitions.
>
> I just wanted to give a simple algorithm for ordering, which I think is
> better than random ordering.
>
> Correctness will be hard to implement, when everyone has a different view on
> the correct ordering.
>
> I had thought of giving more control to the user by appending the variable
> names with something to sort by.
>
> For example extending the above example with variable names ['domain',
> 'server', 'regex'] the names could be
> changed to ['domain_3', 'server_1', 'regex_2'] to impose replacement in the
> order ['server', 'regex', 'domain'].
> But what should we do with the suffix '_\d+'? (A prefix could be used, too)
>
> We could look for a specially named variable like '_regex_order' which could
> hold a comma-separated list of
> the variable names in the desired order.
>
> The longer I think about it, the more I am inclined to take the simple
> ordering algorithm of length and then name. One can
> always make any regex longer by adding useless junk like
> '(?:WILLNOTBEFOUNDANYWAY)?' and in such a way influence
> the order.

No, length of regex is not useful.
More useful would be sorting by matched string.
Sorting by name is awkward to use, and anyway what about non-regexes
that happen to match the same text?

I don't think it's possible to automatically sort correctly by regex.
So we should allow the user to control the search order, as I already
suggested a short while ago.
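
As a rough Python sketch of "sorting by matched string": try every
variable against the recorded text and keep the one with the longest
match. This is an illustration only, not JMeter code. The function name
is hypothetical, ties go to the first name in alphabetical order (an
assumption), and the \b(...)\b wrapping that the recorder applies is
ignored here.

```python
import re


def pick_longest_match(text, variables):
    """Return (name, matched_text) for the variable with the longest match."""
    best = None
    for name in sorted(variables):  # fixed iteration order keeps ties repeatable
        match = re.search(variables[name], text)
        if match is None or not match.group(0):
            continue  # no match, or only an empty match
        if best is None or len(match.group(0)) > len(best[1]):
            best = (name, match.group(0))
    return best


variables = {"domain": "example.com", "server": "www", "regex": "w.*"}
winner = pick_longest_match("www.example.com", variables)
print(winner)
```

With this input 'w.*' wins because it swallows the whole string, which
shows why the order cannot really be decided without looking at the
text being matched.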

> Felix
>
>>
>>>
>>>> If no one objects, I will submit it next week.
>>>>
>>>> Regards
>>>>   Felix
>>>>
>>>>> Thanks for contributing
>>>>> Regards
>>>>>
>>>>>
>>>>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>>>>
>>>>>   On 29 September 2014 15:49, Felix Schumacher
>>>>>>
>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>
>>>>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>>>>
>>>>>> <javascript:;>>:
>>>>>>
>>>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>>>>
>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>
>>>>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>>>>
>>>>>>>>>   On 28 September 2014 18:11, Felix Schumacher
>>>>>>>>>>
>>>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>>>>
>>>>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>>>>
>>>>>>>>>> demonstrates the
>>>>>>>>>
>>>>>>>>> issue. In order to reproduce:
>>>>>>>>>>>
>>>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>>>>
>>>>>>>>>> localhost, do
>>>>>>>>>
>>>>>>>>> not
>>>>>>>>>>>
>>>>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>>>
>>>>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>>>>
>>>>>>>>>> post to
>>>>>>>>>
>>>>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>>>>
>>>>>>>>>> post
>>>>>>>>>
>>>>>>>>> request which is what we want to see.
>>>>>>>>>>>
>>>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>>> (I've made that "test 1") then it doesn't match the first
>>>>>>>>>>> character (%). I think this is related.
>>>>>>>>>>>
>>>>>>>>>>> The project has a user defined variable called "TEST" with a
>>>>>>>>>>> value of ".*", I've ticked the box
>>>>>>>>>>>
>>>>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>>>> requests contain a parameter with these values:
>>>>>>>>>>>
>>>>>>>>>>> Test 1: %${TEST}
>>>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>>>
>>>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>>>
>>>>>>>>>>> In the current implementation the regex will be matched against
>>>>>>>>>>> a pattern which looks like
>>>>>>>>>>>
>>>>>>>>>>>    \b(YOUR_VALUE)\b
>>>>>>>>>>>
>>>>>>>>>>> As % and < are boundary characters they are excluded from your
>>>>>>>>>>> pattern.
>>>>>>>>>> This is deliberate.
>>>>>>>>>> There were problems previously as partial values were being
>>>>>>>>>> unexpectedly matched.
>>>>>>>>>>
>>>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>>>
>>>>>>>>> I thought so. Maybe more documentation would have helped,
>>>>>>>>> but then it is regex...
>>>>>>>>>
>>>>>>>>>>   I would consider this a bug, or at least documentation could be
>>>>>>>>>> a
>>>>>>>>>> bit
>>>>>>>>>
>>>>>>>>> more
>>>>>>>>>>>
>>>>>>>>>>> concise.
>>>>>>>>>>>
>>>>>>>>>> Patches welcome.
>>>>>>>>>>
>>>>>>>>> A patch was attached :)
>>>>>>>>>
>>>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>>>> Or at least some indication of where the documentation needs to be
>>>>>>>> updated to clarify the current behaviour.
>>>>>>>>
>>>>>>> I will look into that.
>>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>   What is your opinion on the option to detect parens and modify the
>>>>>> regex
>>>>>> behavior?
>>>>>>
>>>>>> Looks good to me.
>>>>>>
>>>>>> The parens are very unlikely to have been used in existing tests, so
>>>>>> the modified behaviour is unlikely to break anything.
>>>>>> But we should document it in the release notes just in case.
>>>>>>
>>>>>>   Felix
>>>>>>>>
>>>>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>>>>
>>>>>>>>>> starts
>>>>>>>>>
>>>>>>>>> with
>>>>>>>>>>>
>>>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>>>>
>>>>>>>>>> building
>>>>>>>>>
>>>>>>>>> its
>>>>>>>>>>>
>>>>>>>>>>> own version.
>>>>>>>>>>>
>>>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of
>>>>>>>>>> them.
>>>>>>>>>>
>>>>>>>>> I have already done so yesterday shortly after sending my mail. It
>>>>>>>>> is
>>>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>>>
>>>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>>>
>>>>>>>> such is
>>>>>>>>
>>>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>    Felix
>>>>>>>>>
>>>>>>>>>>> Also, see notes below.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>>>> To: JMeter Users List
>>>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>>>
>>>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>>>
>>>>>>>>>> posted
>>>>>>>>>
>>>>>>>>> value
>>>>>>>>>>>
>>>>>>>>>>> is
>>>>>>>>>>>
>>>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>>>> functionality.
>>>>>>>>>>>
>>>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>>>
>>>>>>>>>> token to
>>>>>>>>>
>>>>>>>>> be
>>>>>>>>>>>
>>>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>>>
>>>>>>>>>>> What does the value look like?
>>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>>
>>>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>>>
>>>>>>>>>> case, but
>>>>>>>>>
>>>>>>>>> it
>>>>>>>>>>>
>>>>>>>>>>> wasn't
>>>>>>>>>>>
>>>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>>>
>>>>>>>>>> (called
>>>>>>>>>
>>>>>>>>> TEST)
>>>>>>>>>>>
>>>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well
>>>>>>>>>>> (all
>>>>>>>>>>> without
>>>>>>>>>>> double
>>>>>>>>>>> quotes).
>>>>>>>>>>>
>>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>>
>>>>>>>>>>> That does not make sense.
>>>>>>>>>>> ".*" will match everything, including < and >, so the content
>>>>>>>>>>> would
>>>>>>>>>>> become
>>>>>>>>>>> ${TEST}
>>>>>>>>>>>
>>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>>
>>>>>>>>>>> I've tried other expressions as well and I'm able to match
>>>>>>>>>>> anything
>>>>>>>>>>> within
>>>>>>>>>>> the
>>>>>>>>>>>
>>>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>>>
>>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>>
>>>>>>>>>>> The weird thing is, that inside the outer <> characters there are
>>>>>>>>>>>
>>>>>>>>>> other
>>>>>>>>>
>>>>>>>>> <>
>>>>>>>>>>>
>>>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>>>
>>>>>>>>>> character.
>>>>>>>>>
>>>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>>>>
>>>>>>>>>> known
>>>>>>>>>
>>>>>>>>> issue?
>>>>>>>>>>>
>>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>>
>>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>>
>>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>>
>>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>
>



Re: Test Script Recorder XML Regex Matching

Posted by Felix Schumacher <fe...@internetallee.de>.
Am 05.10.2014 um 11:30 schrieb sebb:
> On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com> wrote:
>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>> felix.schumacher@internetallee.de> wrote:
>>
>>> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>>>
>>>> Hi Felix,
>>>>
>>> Hi
>>> I agree with sebb, patch is interesting.
>>>> But it clearly needs to be documented (I think many users don't know about
>>>> this feature which is really interesting) as long as code, reading patch
>>>> first it wasn't clear for me what was intended.
>>>>
>>> I have added documentation to the patch and found two other things, that I
>>> changed
>>> in the same bug-entry.
>>>
>>> The random order of applying the matchers, seems a bit strange, so I
>>> sorted the matchers
>>> first by their length and if the matchers are the same length, then by the
>>> name of their keys. So
>>> the set
>>>   {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>>> would be applied in the order ['domain', 'regex', 'server'] since 'domain'
>>> has the longest matcher and
>>> 'regex' comes before 'server' alphabetically (matchers are both the same
>>> length).
>>>
>> Isn't it better to order by longest value or regexp ?
>> www is more specific than w.*
>> So would be :
>> domain, server , regex
> Or the code could try to match every variable and select the one that
> produces the longest match.
>
> But rather than try and sort the regexes, which is always going to be
> tricky to do "correctly" (whatever that means), maybe the user should
> be given control of the matching order.
>
> For example, it is probably possible to match by order of appearance.
>
> It would certainly be possible to match the variables in sorted order by name.
> This would be a bit more awkward to use than changing the order of
> variable definitions.
I just wanted to give a simple algorithm for ordering, which I think is 
better than random ordering.

Correctness will be hard to implement, when everyone has a different 
view on the correct ordering.

I had thought of giving more control to the user by appending a sort 
key to the variable names.

For example extending the above example with variable names ['domain', 
'server', 'regex'] the names could be
changed to ['domain_3', 'server_1', 'regex_2'] to impose replacement in 
the order ['server', 'regex', 'domain'].
But what should we do with the suffix '_\d+'? (A prefix could be used, too)
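
A minimal Python sketch of the suffix idea, only as an illustration and
not the Java code from the patch. It assumes the '_\d+' suffix is
stripped again after sorting, which is one possible answer to the open
question above. The helper name and the rule that names without a
suffix sort last are assumptions too.

```python
import re

# A trailing "_<digits>" is treated as the sort key.
SUFFIX = re.compile(r"^(?P<name>.+)_(?P<rank>\d+)$")


def order_by_suffix(names):
    """Sort names by numeric suffix and return them with the suffix removed."""
    ranked = []
    for name in names:
        match = SUFFIX.match(name)
        if match:
            ranked.append((int(match.group("rank")), match.group("name")))
        else:
            # Assumption: names without a suffix are applied last.
            ranked.append((float("inf"), name))
    ranked.sort()
    return [name for _, name in ranked]


suffix_order = order_by_suffix(["domain_3", "server_1", "regex_2"])
print(suffix_order)
```

One drawback that shows up quickly: a variable whose real name already
ends in '_<digits>' would be renamed by the stripping step.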

We could look for a specially named variable like '_regex_order' which 
could hold a comma-separated list of
the variable names in the desired order.

The longer I think about it, the more I am inclined to take the simple 
ordering algorithm of length and then name. One can
always make any regex longer by adding useless junk like 
'(?:WILLNOTBEFOUNDANYWAY)?' and in such a way influence
the order.
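
To illustrate the simple ordering (longest matcher first, ties broken by
variable name), here is a small Python sketch. It is not the code from
the patch, which is Java; the function name application_order is
hypothetical. The second call shows how padding a regex with an optional
group that never matches moves it to the front.

```python
def application_order(matchers):
    """Sort variable names by matcher length (longest first), then by name."""
    return sorted(matchers, key=lambda name: (-len(matchers[name]), name))


matchers = {"domain": "example.com", "server": "www", "regex": "w.*"}
print(application_order(matchers))
# ['domain', 'regex', 'server']

# Padding the regex makes it the longest matcher without changing what it matches.
padded = dict(matchers, regex="w.*(?:WILLNOTBEFOUNDANYWAY)?")
print(application_order(padded))
# ['regex', 'domain', 'server']
```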

Felix
>
>>
>>> If no one objects, I will submit it next week.
>>>
>>> Regards
>>>   Felix
>>>
>>>> Thanks for contributing
>>>> Regards
>>>>
>>>>
>>>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>>>
>>>>   On 29 September 2014 15:49, Felix Schumacher
>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>
>>>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>>>
>>>>> <javascript:;>>:
>>>>>
>>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>
>>>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>>>
>>>>>>>>   On 28 September 2014 18:11, Felix Schumacher
>>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>>
>>>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>>>
>>>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>>>
>>>>>>>>> demonstrates the
>>>>>>>> issue. In order to reproduce:
>>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>>>
>>>>>>>>> localhost, do
>>>>>>>> not
>>>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>>
>>>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>>>
>>>>>>>>> post to
>>>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>>> post
>>>>>>>> request which is what we want to see.
>>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>>
>>>>>>>>> (I've
>>>>>>>> made
>>>>>>>>>> that "test 1") then it doesn't match the first character (%). I
>>>>>>>>>>
>>>>>>>>> think
>>>>>>>> this
>>>>>>>>>> is related.
>>>>>>>>>>
>>>>>>>>>> The project has a user defined variable called "TEST" with a value
>>>>>>>>>>
>>>>>>>>> os
>>>>>>>> ".*",
>>>>>>>>>> I've ticked the box
>>>>>>>>>>
>>>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>>>
>>>>>>>>> requests
>>>>>>>> contain a parameter with these values:
>>>>>>>>>> Test 1: %${TEST}
>>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>>
>>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>>
>>>>>>>>>> In the current implementation the regex will be matched against a
>>>>>>>>>>
>>>>>>>>> pattern
>>>>>>>> which looks like
>>>>>>>>>>    \b(YOUR_VALUE)\b
>>>>>>>>>>
>>>>>>>>>> As % and < are boundary characters they are excluded from you
>>>>>>>>>>
>>>>>>>>> pattern.
>>>>>>>>> This is deliberate.
>>>>>>>>> There were problems previously as partial values were being
>>>>>>>>> unexpectedly matched.
>>>>>>>>>
>>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>>
>>>>>>>> I thought so. Maybe that would have been helped by adding more
>>>>>>>> documentation, but then it is regex...
>>>>>>>>
>>>>>>>>>   I would consider this a bug, or at least documentation could be a
>>>>>>>>> bit
>>>>>>>> more
>>>>>>>>>> concise.
>>>>>>>>>>
>>>>>>>>> Patches welcome.
>>>>>>>>>
>>>>>>>> A patch was attached :)
>>>>>>>>
>>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>>> Or at least some indication of where the documentation needs to be
>>>>>>> updated to clarify the current behaviour.
>>>>>>>
>>>>>> I will look into that.
>>>>>>
>>>>> Thanks.
>>>>>
>>>>>   What is your opinion on the option to detect parens and modify the regex
>>>>> behavior?
>>>>>
>>>>> Looks good to me.
>>>>>
>>>>> The parens are very unlikely to have been used in existing tests, so
>>>>> the modified behaviour is unlikely to break anything.
>>>>> But we should document it in the release notes just in case.
>>>>>
>>>>>   Felix
>>>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>>> starts
>>>>>>>> with
>>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>>>
>>>>>>>>> building
>>>>>>>> its
>>>>>>>>>> own version.
>>>>>>>>>>
>>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>>>>>>
>>>>>>>> I have already done so yesterday shortly after sending my mail. It is
>>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>>
>>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>>
>>>>>>> such is
>>>>>>>
>>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>    Felix
>>>>>>>>
>>>>>>>>>> Also, see notes below.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>>> To: JMeter Users List
>>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>>
>>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>>
>>>>>>>>> posted
>>>>>>>> value
>>>>>>>>>> is
>>>>>>>>>>
>>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>>> functionality.
>>>>>>>>>>
>>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>>
>>>>>>>>> token to
>>>>>>>> be
>>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>>
>>>>>>>>>> What does the value look like?
>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>
>>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>>
>>>>>>>>> case, but
>>>>>>>> it
>>>>>>>>>> wasn't
>>>>>>>>>>
>>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>>
>>>>>>>>> (called
>>>>>>>> TEST)
>>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well (all
>>>>>>>>>> without
>>>>>>>>>> double
>>>>>>>>>> quotes).
>>>>>>>>>>
>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>
>>>>>>>>>> That does not make sense.
>>>>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>>>>> become
>>>>>>>>>> ${TEST}
>>>>>>>>>>
>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>
>>>>>>>>>> I've tried other expressions as well and I'm able to match anything
>>>>>>>>>> within
>>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>>
>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>
>>>>>>>>>> The weird thing is, that inside the outer <> characters there are
>>>>>>>>>>
>>>>>>>>> other
>>>>>>>> <>
>>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>>
>>>>>>>>> character.
>>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>>> known
>>>>>>>> issue?
>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>
>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>
>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>
>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
For additional commands, e-mail: user-help@jmeter.apache.org


Re: Test Script Recorder XML Regex Matching

Posted by sebb <se...@gmail.com>.
On 4 October 2014 19:41, Philippe Mouawad <ph...@gmail.com> wrote:
> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
> felix.schumacher@internetallee.de> wrote:
>
>> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>>
>>> Hi Felix,
>>>
>> Hi
>
>> I agree with sebb, patch is interesting.
>>> But it clearly needs to be documented (I think many users don't know about
>>> this feature which is really interesting) as long as code, reading patch
>>> first it wasn't clear for me what was intended.
>>>
>> I have added documentation to the patch and found two other things, that I
>> changed
>> in the same bug-entry.
>>
>> The random order of applying the matchers, seems a bit strange, so I
>> sorted the matchers
>> first by their length and if the matchers are the same length, then by the
>> name of their keys. So
>> the set
>>  {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>> would be applied in the order ['domain', 'regex', 'server'] since 'domain'
>> has the longest matcher and
>> 'regex' comes before 'server' alphabetically (matchers are both the same
>> length).
>>
> Isn't it better to order by longest value or regexp ?
> www is more specific than w.*
> So would be :
> domain, server , regex

Or the code could try to match every variable and select the one that
produces the longest match.
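
A rough sketch of the longest-match idea, using Python's re module as a
stand-in for the regex engine JMeter uses (so details may differ). The
function name longest_match is hypothetical, and skipping empty matches
and breaking ties by name are assumptions of this sketch.

```python
import re


def longest_match(text, variables):
    """Return (name, length) of the variable whose pattern gives the longest match.

    Returns None if no pattern matches. Ties go to the name that sorts first.
    """
    best = None
    for name in sorted(variables):
        for match in re.finditer(variables[name], text):
            length = match.end() - match.start()
            if length == 0:
                continue  # ignore empty matches
            if best is None or length > best[1]:
                best = (name, length)
    return best


variables = {"domain": "example.com", "server": "www", "regex": "w.*"}
print(longest_match("www.example.com", variables))
# ('regex', 15)
```

Note that in this example the greedy 'w.*' wins, because it matches the
whole string.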

But rather than try and sort the regexes, which is always going to be
tricky to do "correctly" (whatever that means), maybe the user should
be given control of the matching order.

For example, it is probably possible to match by order of appearance.

It would certainly be possible to match the variables in sorted order by name.
This would be a bit more awkward to use than changing the order of
variable definitions.

>
>
>>
>> If no one objects, I will submit it next week.
>>
>> Regards
>>  Felix
>>
>>>
>>> Thanks for contributing
>>> Regards
>>>
>>>
>>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>>
>>>  On 29 September 2014 15:49, Felix Schumacher
>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>
>>>>>
>>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>>
>>>> <javascript:;>>:
>>>>
>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>
>>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>>
>>>>>>>  On 28 September 2014 18:11, Felix Schumacher
>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>
>>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>>
>>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>>
>>>>>>>> demonstrates the
>>>>>>
>>>>>>> issue. In order to reproduce:
>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>>
>>>>>>>> localhost, do
>>>>>>
>>>>>>> not
>>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>
>>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>>
>>>>>>>> post to
>>>>>>
>>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>>>
>>>>>>>> post
>>>>>>
>>>>>>> request which is what we want to see.
>>>>>>>>>
>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>
>>>>>>>> (I've
>>>>>>
>>>>>>> made
>>>>>>>>> that "test 1") then it doesn't match the first character (%). I
>>>>>>>>>
>>>>>>>> think
>>>>>>
>>>>>>> this
>>>>>>>>> is related.
>>>>>>>>>
>>>>>>>>> The project has a user defined variable called "TEST" with a value
>>>>>>>>>
>>>>>>>> os
>>>>>>
>>>>>>> ".*",
>>>>>>>>> I've ticked the box
>>>>>>>>>
>>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>>
>>>>>>>> requests
>>>>>>
>>>>>>> contain a parameter with these values:
>>>>>>>>> Test 1: %${TEST}
>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>
>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>
>>>>>>>>> In the current implementation the regex will be matched against a
>>>>>>>>>
>>>>>>>> pattern
>>>>>>
>>>>>>> which looks like
>>>>>>>>>   \b(YOUR_VALUE)\b
>>>>>>>>>
>>>>>>>>> As % and < are boundary characters they are excluded from you
>>>>>>>>>
>>>>>>>> pattern.
>>>>>>
>>>>>>>
>>>>>>>> This is deliberate.
>>>>>>>> There were problems previously as partial values were being
>>>>>>>> unexpectedly matched.
>>>>>>>>
>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>
>>>>>>>
>>>>>>> I thought so. Maybe that would have been helped by adding more
>>>>>>> documentation, but then it is regex...
>>>>>>>
>>>>>>>>
>>>>>>>>  I would consider this a bug, or at least documentation could be a
>>>>>>>>>
>>>>>>>> bit
>>>>>>
>>>>>>> more
>>>>>>>>> concise.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Patches welcome.
>>>>>>>>
>>>>>>> A patch was attached :)
>>>>>>>
>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>> Or at least some indication of where the documentation needs to be
>>>>>> updated to clarify the current behaviour.
>>>>>>
>>>>> I will look into that.
>>>>>
>>>> Thanks.
>>>>
>>>>  What is your opinion on the option to detect parens and modify the regex
>>>>>
>>>> behavior?
>>>>
>>>> Looks good to me.
>>>>
>>>> The parens are very unlikely to have been used in existing tests, so
>>>> the modified behaviour is unlikely to break anything.
>>>> But we should document it in the release notes just in case.
>>>>
>>>>  Felix
>>>>>
>>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>>>
>>>>>>>> starts
>>>>>>
>>>>>>> with
>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>>
>>>>>>>> building
>>>>>>
>>>>>>> its
>>>>>>>>> own version.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>>>>>
>>>>>>> I have already done so yesterday shortly after sending my mail. It is
>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>
>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>
>>>>>> such is
>>>>>>
>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>>   Felix
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, see notes below.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>> To: JMeter Users List
>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>
>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>
>>>>>>>> posted
>>>>>>
>>>>>>> value
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>> functionality.
>>>>>>>>>
>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>
>>>>>>>> token to
>>>>>>
>>>>>>> be
>>>>>>>>>
>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>
>>>>>>>>> What does the value look like?
>>>>>>>>> Does it have multiple lines?
>>>>>>>>>
>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>
>>>>>>>> case, but
>>>>>>
>>>>>>> it
>>>>>>>>> wasn't
>>>>>>>>>
>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>
>>>>>>>> (called
>>>>>>
>>>>>>> TEST)
>>>>>>>>>
>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well (all
>>>>>>>>> without
>>>>>>>>> double
>>>>>>>>> quotes).
>>>>>>>>>
>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>
>>>>>>>>> That does not make sense.
>>>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>>>> become
>>>>>>>>> ${TEST}
>>>>>>>>>
>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>
>>>>>>>>> I've tried other expressions as well and I'm able to match anything
>>>>>>>>> within
>>>>>>>>> the
>>>>>>>>>
>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>
>>>>>>>>> Again, that does not make sense.
>>>>>>>>>
>>>>>>>>> The weird thing is, that inside the outer <> characters there are
>>>>>>>>>
>>>>>>>> other
>>>>>>
>>>>>>> <>
>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>
>>>>>>>> character.
>>>>>>
>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>>>
>>>>>>>> known
>>>>>>
>>>>>>> issue?
>>>>>>>>>
>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>
>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>
>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>
>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>
>
>
> --
> Cordialement.
> Philippe Mouawad.



Re: Test Script Recorder XML Regex Matching

Posted by Felix Schumacher <fe...@internetallee.de>.
Am 04.10.2014 um 20:41 schrieb Philippe Mouawad:
> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
> felix.schumacher@internetallee.de> wrote:
>
>> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>>
>>> Hi Felix,
>>>
>> Hi
>> I agree with sebb, patch is interesting.
>>> But it clearly needs to be documented (I think many users don't know about
>>> this feature which is really interesting) as long as code, reading patch
>>> first it wasn't clear for me what was intended.
>>>
>> I have added documentation to the patch and found two other things, that I
>> changed
>> in the same bug-entry.
>>
>> The random order of applying the matchers, seems a bit strange, so I
>> sorted the matchers
>> first by their length and if the matchers are the same length, then by the
>> name of their keys. So
>> the set
>>   {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
>> would be applied in the order ['domain', 'regex', 'server'] since 'domain'
>> has the longest matcher and
>> 'regex' comes before 'server' alphabetically (matchers are both the same
>> length).
>>
> Isn't it better to order by longest value or regexp ?
> www is more specific than w.*
> So would be :
> domain, server , regex
Then we have the problem of deciding whether a string could be a regex. 
A simple approach would be to check whether it contains only 
alphanumeric characters.

On the other hand, if someone wanted a non-regex to be applied first, 
she could rename the variable so that it sorts first.
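
A minimal Python sketch of that alphanumeric check, combined with the
length-then-name ordering. The names looks_like_literal and
application_order are hypothetical, and restricting the check to ASCII
letters and digits is an assumption.

```python
import re


def looks_like_literal(value):
    """Heuristic: a value made only of ASCII letters and digits is not a regex."""
    return re.fullmatch(r"[A-Za-z0-9]+", value) is not None


def application_order(matchers):
    """Plain literals first, then longest matcher first, then by name."""
    return sorted(
        matchers,
        key=lambda name: (
            not looks_like_literal(matchers[name]),
            -len(matchers[name]),
            name,
        ),
    )


matchers = {"domain": "example.com", "server": "www", "regex": "w.*"}
print(application_order(matchers))
# ['server', 'domain', 'regex']
```

With this heuristic 'example.com' counts as a possible regex because of
the dot, so the result is not the order domain, server, regex suggested
above.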

Felix
>
>
>
>> If no one objects, I will submit it next week.
>>
>> Regards
>>   Felix
>>
>>> Thanks for contributing
>>> Regards
>>>
>>>
>>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>>
>>>   On 29 September 2014 15:49, Felix Schumacher
>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>
>>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>>
>>>> <javascript:;>>:
>>>>
>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>
>>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>>
>>>>>>>   On 28 September 2014 18:11, Felix Schumacher
>>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>>
>>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>>
>>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>>
>>>>>>>> demonstrates the
>>>>>>> issue. In order to reproduce:
>>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>>
>>>>>>>> localhost, do
>>>>>>> not
>>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>
>>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>>
>>>>>>>> post to
>>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>> post
>>>>>>> request which is what we want to see.
>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>
>>>>>>>> (I've
>>>>>>> made
>>>>>>>>> that "test 1") then it doesn't match the first character (%). I
>>>>>>>>>
>>>>>>>> think
>>>>>>> this
>>>>>>>>> is related.
>>>>>>>>>
>>>>>>>>> The project has a user defined variable called "TEST" with a value
>>>>>>>>>
>>>>>>>> os
>>>>>>> ".*",
>>>>>>>>> I've ticked the box
>>>>>>>>>
>>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>>
>>>>>>>> requests
>>>>>>> contain a parameter with these values:
>>>>>>>>> Test 1: %${TEST}
>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>
>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>
>>>>>>>>> In the current implementation the regex will be matched against a
>>>>>>>>>
>>>>>>>> pattern
>>>>>>> which looks like
>>>>>>>>>    \b(YOUR_VALUE)\b
>>>>>>>>>
>>>>>>>>> As % and < are boundary characters they are excluded from you
>>>>>>>>>
>>>>>>>> pattern.
>>>>>>>> This is deliberate.
>>>>>>>> There were problems previously as partial values were being
>>>>>>>> unexpectedly matched.
>>>>>>>>
>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>>
>>>>>>> I thought so. Maybe that would have been helped by adding more
>>>>>>> documentation, but then it is regex...
>>>>>>>
>>>>>>>>   I would consider this a bug, or at least documentation could be a
>>>>>>>> bit
>>>>>>> more
>>>>>>>>> concise.
>>>>>>>>>
>>>>>>>> Patches welcome.
>>>>>>>>
>>>>>>> A patch was attached :)
>>>>>>>
>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>> Or at least some indication of where the documentation needs to be
>>>>>> updated to clarify the current behaviour.
>>>>>>
>>>>> I will look into that.
>>>>>
>>>> Thanks.
>>>>
>>>>   What is your opinion on the option to detect parens and modify the regex
>>>> behavior?
>>>>
>>>> Looks good to me.
>>>>
>>>> The parens are very unlikely to have been used in existing tests, so
>>>> the modified behaviour is unlikely to break anything.
>>>> But we should document it in the release notes just in case.
>>>>
>>>>   Felix
>>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>> starts
>>>>>>> with
>>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>>
>>>>>>>> building
>>>>>>> its
>>>>>>>>> own version.
>>>>>>>>>
>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>>>>>
>>>>>>> I have already done so yesterday shortly after sending my mail. It is
>>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>
>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>
>>>>>> such is
>>>>>>
>>>>>>> ok, then I would add that to the existing documentation.
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>>    Felix
>>>>>>>
>>>>>>>>> Also, see notes below.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>> To: JMeter Users List
>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>
>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>
>>>>>>>> posted
>>>>>>> value
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>>> functionality.
>>>>>>>>>
>>>>>>>>> The request I'm recording has a post value containing XML (SAML
>>>>>>>>>
>>>>>>>> token to
>>>>>>> be
>>>>>>>>> exact) which I'd like to replace with a variable automatically.
>>>>>>>>>
>>>>>>>>> What does the value look like?
>>>>>>>>> Does it have multiple lines?
>>>>>>>>>
>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>
>>>>>>>> case, but
>>>>>>> it
>>>>>>>>> wasn't
>>>>>>>>>
>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>
>>>>>>>> (called
>>>>>>> TEST)
>>>>>>>>> with a value of "(?s)^.*$", I've tried "^.*$" and ".*" as well (all
>>>>>>>>> without
>>>>>>>>> double
>>>>>>>>> quotes).
>>>>>>>>>
>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>
>>>>>>>>> That does not make sense.
>>>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>>>> become
>>>>>>>>> ${TEST}
>>>>>>>>>
>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>
>>>>>>>>> I've tried other expressions as well and I'm able to match anything
>>>>>>>>> within
>>>>>>>>> the
>>>>>>>>>
>>>>>>>>> <> characters, but not those characters itself.
>>>>>>>>>
>>>>>>>>> Again, that does not make sense.
>>>>>>>>>
>>>>>>>>> The weird thing is, that inside the outer <> characters there are
>>>>>>>>>
>>>>>>>> other
>>>>>>> <>
>>>>>>>>> characters that are matched fine. It's just the first and last
>>>>>>>>>
>>>>>>>> character.
>>>>>>> Does anyone else have experienced the same thing, or is this a
>>>>>>>> known
>>>>>>> issue?
>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>
>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>
>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>
>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>



Re: Test Script Recorder XML Regex Matching

Posted by Philippe Mouawad <ph...@gmail.com>.
On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
felix.schumacher@internetallee.de> wrote:

> Am 29.09.2014 um 22:32 schrieb Philippe Mouawad:
>
>> Hi Felix,
>>
> Hi

> I agree with sebb, patch is interesting.
>> But it clearly needs to be documented (I think many users don't know about
>> this feature which is really interesting) as long as code, reading patch
>> first it wasn't clear for me what was intended.
>>
> I have added documentation to the patch and found two other things, that I
> changed
> in the same bug-entry.
>
> The random order of applying the matchers, seems a bit strange, so I
> sorted the matchers
> first by their length and if the matchers are the same length, then by the
> name of their keys. So
> the set
>  {'domain': 'example.com', 'server': 'www',  'regex': 'w.*' }
> would be applied in the order ['domain', 'regex', 'server'] since 'domain'
> has the longest matcher and
> 'regex' comes before 'server' alphabetically (matchers are both the same
> length).
>
Wouldn't it be better to order by longest value or regexp?
www is more specific than w.*
So the order would be:
domain, server, regex



>
> If no one objects, I will submit it next week.
>
> Regards
>  Felix
>
>>
>> Thanks for contributing
>> Regards
>>
>>
>> On Monday, September 29, 2014, sebb <se...@gmail.com> wrote:
>>
>>  On 29 September 2014 15:49, Felix Schumacher
>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>
>>>>
>>>> Am 29. September 2014 12:46:19 MESZ, schrieb sebb <sebbaz@gmail.com
>>>>
>>> <javascript:;>>:
>>>
>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>
>>>>>> Am 29.09.2014 11:56, schrieb sebb:
>>>>>>
>>>>>>  On 28 September 2014 18:11, Felix Schumacher
>>>>>>> <felix.schumacher@internetallee.de <javascript:;>> wrote:
>>>>>>>
>>>>>>>> Am 22.09.2014 um 11:13 schrieb Marijn Wijbenga:
>>>>>>>>
>>>>>>>> I've attached a jmeter project file and a html file that
>>>>>>>>
>>>>>>> demonstrates the
>>>>>
>>>>>> issue. In order to reproduce:
>>>>>>>> 1. Load up xml-bug-test.jmx in jmeter.
>>>>>>>> 2. Start the proxy (recorder)
>>>>>>>> 3. Place xml-bug-test.html on a webserver somewhere (if on
>>>>>>>>
>>>>>>> localhost, do
>>>>>
>>>>>> not
>>>>>>>> forget to remove localhost from proxy exclusion if applicable)
>>>>>>>> 4. Navigate with a browser to this file (using the proxy)
>>>>>>>> 5. Click both buttons in order.
>>>>>>>>
>>>>>>>> I could not post to a html file, hence the "test 2" button will
>>>>>>>>
>>>>>>> post to
>>>>>
>>>>>> Google. The page that loads has an error, but it still records the
>>>>>>>>
>>>>>>> post
>>>>>
>>>>>> request which is what we want to see.
>>>>>>>>
>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>
>>>>>>> (I've
>>>>>
>>>>>> made
>>>>>>>> that "test 1") then it doesn't match the first character (%). I
>>>>>>>>
>>>>>>> think
>>>>>
>>>>>> this
>>>>>>>> is related.
>>>>>>>>
>>>>>>>> The project has a user defined variable called "TEST" with a value
>>>>>>>>
>>>>>>> os
>>>>>
>>>>>> ".*",
>>>>>>>> I've ticked the box
>>>>>>>>
>>>>>>>> To see the results, in the recording controller the last two
>>>>>>>>
>>>>>>> requests
>>>>>
>>>>>> contain a parameter with these values:
>>>>>>>> Test 1: %${TEST}
>>>>>>>> Test 2: <${TEST}>
>>>>>>>>
>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>
>>>>>>>> In the current implementation the regex will be matched against a
>>>>>>>>
>>>>>>> pattern
>>>>>
>>>>>> which looks like
>>>>>>>>   \b(YOUR_VALUE)\b
>>>>>>>>
>>>>>>>> As % and < are boundary characters they are excluded from you
>>>>>>>>
>>>>>>> pattern.
>>>>>
>>>>>>
>>>>>>> This is deliberate.
>>>>>>> There were problems previously as partial values were being
>>>>>>> unexpectedly matched.
>>>>>>>
>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
>>>>>>>
>>>>>>
>>>>>> I thought so. Maybe that would have been helped by adding more
>>>>>> documentation, but then it is regex...
>>>>>>
>>>>>>>
>>>>>>>  I would consider this a bug, or at least documentation could be a
>>>>>>>>
>>>>>>> bit
>>>>>
>>>>>> more
>>>>>>>> concise.
>>>>>>>>
>>>>>>>
>>>>>>> Patches welcome.
>>>>>>>
>>>>>> A patch was attached :)
>>>>>>
>>>>> I meant that we would welcome a patch for the documentation.
>>>>> Or at least some indication of where the documentation needs to be
>>>>> updated to clarify the current behaviour.
>>>>>
>>>> I will look into that.
>>>>
>>> Thanks.
>>>
>>>  What is your opinion on the option to detect parens and modify the regex
>>>>
>>> behavior?
>>>
>>> Looks good to me.
>>>
>>> The parens are very unlikely to have been used in existing tests, so
>>> the modified behaviour is unlikely to break anything.
>>> But we should document it in the release notes just in case.
>>>
>>>  Felix
>>>>
>>>>> Attached is a patch against trunk, which checks the regex if it
>>>>>>>>
>>>>>>> starts
>>>>>
>>>>>> with
>>>>>>>> '(' and ends with ')' and uses the regex as given, instead of
>>>>>>>>
>>>>>>> building
>>>>>
>>>>>> its
>>>>>>>> own version.
>>>>>>>>
>>>>>>>
>>>>>>> Please use Bugzilla for patches; it's easier to keep track of them.
>>>>>>>
>>>>>> I have already done so yesterday shortly after sending my mail. It is
>>>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>
>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>
>>>>> such is
>>>>>
>>>>>> ok, then I would add that to the existing documentation.
>>>>>>
>>>>>>
>>>>>> Regards
>>>>>>   Felix
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Also, see notes below.
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: sebb [mailto:sebbaz@gmail.com <javascript:;>]
>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>> To: JMeter Users List
>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>
>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk <javascript:;>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>
>>>>>>> posted
>>>>>
>>>>>> value
>>>>>>>> is
>>>>>>>>
>>>>>>>> not being matched by the Test Script Recorder's Regex Matching
>>>>>>>> functionality.
>>>>>>>>
>>>>>>>> The request I'm recording has a post value containing XML (a SAML
>>>>>>>> token, to be exact) which I'd like to replace with a variable
>>>>>>>> automatically.
>>>>>>>>
>>>>>>>> What does the value look like?
>>>>>>>> Does it have multiple lines?
>>>>>>>>
>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>> case, but it wasn't.
>>>>>>>>
>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>> (called TEST) with a value of "(?s)^.*$". I've tried "^.*$" and
>>>>>>>> ".*" as well (all without double quotes).
>>>>>>>>
>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>
>>>>>>>> That does not make sense.
>>>>>>>> ".*" will match everything, including < and >, so the content would
>>>>>>>> become ${TEST}
>>>>>>>>
>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>
>>>>>>>> I've tried other expressions as well and I'm able to match anything
>>>>>>>> within the <> characters, but not those characters themselves.
>>>>>>>>
>>>>>>>> Again, that does not make sense.
>>>>>>>>
>>>>>>>> The weird thing is that inside the outer <> characters there are
>>>>>>>> other <> characters that are matched fine. It's just the first and
>>>>>>>> last character.
>>>>>
>>>>>>>> Has anyone else experienced the same thing, or is this a known
>>>>>>>> issue?
>>>>>>>>
>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>
>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>
>>>>>>>> No, the developers all follow this list.
>>>>>>>>
>>>>>>>> Great, please see attachment for an example.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>>
>>>>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
> For additional commands, e-mail: user-help@jmeter.apache.org
>
>


-- 
Regards.
Philippe Mouawad.