You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@brooklyn.apache.org by aledsage <gi...@git.apache.org> on 2017/03/10 16:54:15 UTC

[GitHub] brooklyn-server pull request #589: yaml-test assertions: fix use of “match...

GitHub user aledsage opened a pull request:

    https://github.com/apache/brooklyn-server/pull/589

    yaml-test assertions: fix use of \u201cmatches\u201d

    Previously `matches: <regex>` would not work well for multi-line.
    
    For example, if using something like:
    ```
    type: TestSshCommand
    brooklyn.config:
      assert.out:
      - matches: ".*foo.*"
    ```
    and the ssh command outputs:
    ```
    foo1
    foo2
    ```
    then it would not match.
    
    You can see this in pure java: `"foo1\nfoo2\n".matches(".*foo.*")` is false.
    
    This PR changes our "matches" to do two things: first it tries to match the regex against the entire string; but if that does't work then it splits the input string into multiple lines and checks if the regex matches against any single line - if it does, then it considers it a match.
    
    Do you agree that is what the user will find most intuitive?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aledsage/brooklyn-server fix-test-assert-for-matches

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/brooklyn-server/pull/589.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #589
    
----
commit 5bd690887a57b0ec3a9ccc7c619fe099a4f83991
Author: Aled Sage <al...@gmail.com>
Date:   2017-03-10T16:44:07Z

    yaml-test assertions: fix use of \u201cmatches\u201d
    
    Previously \u201cmatches: <regex>\u201d would not work well for multi-line.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by geomacy <gi...@git.apache.org>.
Github user geomacy commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    I've gone 360 degrees on this after a bit more thought and investigation!  I was going to say
    
    > Actually now that you mention it I think that's a good idea. i.e. Leave "matches" the way it is, just doing a `Matcher.match()` (without us modifying things with MULTILINE or DOTALL, which could confuse at times), but also provide `containsMatch` which gives us `Matcher.find()`, so we have both capabilities on offer.  
    
    But I've actually given up on that - `find` is of most use when you're wanting to iterate over the matches. Fair enough we could provide it as an alternative to `matches` that would go find the first match in the lines of output and return true if it's in there somewhere.  But it seems to leave us two assertions that do very much the same thing, which is not very minimal. 
    
    So I'm going to go radical and suggest we do nothing to the code, and just update the docs to mention the issue of matching multi lines and give examples of how to do that:
    ```
    type: TestSshCommand
    brooklyn.config:
      assert.out:
      - matches: "(?s).*foo.*"
    ```
    
    What do you say?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by aledsage <gi...@git.apache.org>.
Github user aledsage commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    @geomacy @neykov I'm getting too bogged down in the possibilities here, and can't decide what's best!
    
    I looked at automatically setting MULTILINE and DOTALL, but I think it's then impossible to turn them off again for the user (as demonstrated by the code below - the args to "compile" seem to take precedence over the `(?!s)` in the pattern, but I couldn't find docs that would say definitively if that's the case):
    ```
        @Test
        public void tmp() {
            String expected = "(?!s).*line2.*";
            String actual = "line1\nline2\n";
            Pattern matchesPattern = Pattern.compile(expected, Pattern.DOTALL | Pattern.MULTILINE);
            Asserts.assertFalse(matchesPattern.matcher(actual.toString()).matches());
        }
    ```
    
    I therefore lean towards @geomacy's suggestion of leaving `matches` as-is, and just documenting it.
    
    However, I still lean towards adding `containsMatch`, which uses `find()` *without* setting DOTALL or MULTILINE. I think it's reasonable to support all of `contains`, `matches` and `containsMatch`, with the behaviour being "obvious" to the user.
    
    The non-obvious bit is that `matches` (and `containsMatch`) will default to not DOTALL and not MULTILINE. Unfortunately neither way round would be intuitive to everyone [1].
    
    I'm therefore going to update the PR as described above!
    
    [1] See section "Using ^ and $ as Start of Line and End of Line Anchors" of http://www.regular-expressions.info/anchors.html, which talks about different behaviour in different systems - it says that for almost all programming languages and libraries, you have to explicitly enable multiline/dotall yourself.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by drigodwin <gi...@git.apache.org>.
Github user drigodwin commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    This PR now adds the useful additional functionality of `containsMatch` without changing any defaults. I think since this was useful to @aledsage it might be useful to other users, so it's worth adding. Any additional changing of defaults as @neykov described should be a topic for the mailing list.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by neykov <gi...@git.apache.org>.
Github user neykov commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    Wouldn't it make more sense to change the code to match the regex across all lines (`Pattern.MULTILINE` + `Pattern.DOTALL`)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by neykov <gi...@git.apache.org>.
Github user neykov commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    @aledsage Can you expand on your thoughts re multiline support by default. Why is it surprising to have it on by default and why would one need to disable it? Especially if combined with `contains` behaviour.
    I suppose this is up to personal preference so I'm fine with any decision. The dev mailing list could be useful to probe what more ppl think.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by aledsage <gi...@git.apache.org>.
Github user aledsage commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    @neykov regarding whether multiline support should be on by default, it feels like using the language defaults is a good idea as it will be less surprising to users (though I see the point about things like `grep`'s regex have different defaults; most languages seem to have the same defaults as java though).
    
    I still think supporting the `matches` and `find` functionality of the language is a good thing for a user's convenience - with `containsMatch` being our wording for `find`, because it's more fitting with our other names.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by tbouron <gi...@git.apache.org>.
Github user tbouron commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    Just to add my 2c: I think it's better to let the user decide what the regex should match. There are 2 cases:
    1. an author want to match pretty simple thing, i.e. most likely a simple string (no regex) within the same line => what we currently have works.
    2. a more advanced author want to match something more complex and uses regex. At that point, this person know how to do multiple regex, or is able to easily find out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by neykov <gi...@git.apache.org>.
Github user neykov commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    +1 for `containsMatch` (or just `contains`) with `MULTILINE` + `DOTALL`.
    It covers all use cases:
      * simple contains: `xxx`
      * match all: `^.*xxx.*$`
      * match single line: `(^|\n)xxx(\n|$)`
      * match across lines: `expected value:\n\txxx`
    
    Most of our tests are the "simple contains" case anyway.
    
    I think it's still worth changing `matches` to `MULTILINE` + `DOTALL`. For existing single-line tests there's no change and for multi-line results it's more natural (much like the IDE search works).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by aledsage <gi...@git.apache.org>.
Github user aledsage commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    Thanks @neykov and @geomacy - interesting!
    
    I'm now not sure what is intuitive to a user!
    
    ---
    FYI, my specific use-case is to write a yaml-test using `TestSshCommand` that checks for the presence of particular lines in a log file. The command I execute will grep the log, and then I want to assert that a couple of lines are present in the stdout (I use a a regex to try to match the pattern on each line).
    
    I also had in the back of my mind someone using `TestSshCommand` to execute a command and assert the stdout, but because of the `.bashrc` or whatever they get extra stdout lines that they weren't expecting.
    
    However, a power-user might well have a more complex regex, where they are trying to match across multi-line output. If we start messing with the defaults of DOTALL then it might well surprise them.
    
    ---
    Looking at http://stackoverflow.com/questions/3651725/match-multiline-text-using-regular-expression/3652402#3652402, I wonder if what I want (for my use-case at least) is something like `containsMatch: foo.*bar`. This would call `matcher.find()`.
    
    @geomacy @neykov what do you think - shall I drop these changes and instead add support for `containsMatch: ...`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by neykov <gi...@git.apache.org>.
Github user neykov commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    Alternatively if multi-lines are expected could do the same from the regex itself - `(?m)`, `(?s)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by aledsage <gi...@git.apache.org>.
Github user aledsage commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    @tbouron (cc @geomacy @neykov) agree, but I'm not sure from that what you think we should support. There are a few nuances to your second point (about regex):
    * Use of regex is not just for advanced authors - a lot of people use simple regexes such as `foo.*bar`.
    * The multiline support etc is language specific (e.g. see [1]). The use of `(?m)` and `(?s)` are java-specific I believe [2,3], so not nice for YAML users to expect them to know those. We'll need to document that, as Geoff says.
    * It feels (from my experience of using it) like there are two common use-cases for regex matching: one is matching everything (i.e. normal regex); the second is the "containsMatch" (e.g. using `TestSshCommand` to execute `ps aux` gives multiple line stdout, but a user is only interested in checking if it contains a string (on a single-line) that matches my regex; for that particular example some folk currently write `ps aux | grep -E ".*foo.*bar.*"` and check the exit code, but if the command fails, you don't get shown the actual output of `ps aux`, so some people end up executing `ps aux; ps aux | grep -E ".*foo.*bar.*"` to get around that!).
    
    Based on that, do you agree with these changes? What do you think we should support?
    
    [1] http://stackoverflow.com/a/159139/1393883
    [2] http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#MULTILINE
    [3] http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#DOTALL


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by geomacy <gi...@git.apache.org>.
Github user geomacy commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    +1 sounds a good approach to me 
    
    > I'm getting too bogged down in the possibilities here, and can't decide what's best!
    
    I know the feeling!  ;-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by geomacy <gi...@git.apache.org>.
Github user geomacy commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    I agree with @neykov about MULTILINE + DOTALL, but won't insist on it.  I think it's nice to support the multiline match without the test author having to supply the (?m) themselves.  
    
    I'm ok with merging this as-is, but will leave it to you to see if you'd prefer to change to MULTILINE + DOTALL first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by tbouron <gi...@git.apache.org>.
Github user tbouron commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    @aledsage My assumption was that those flags **were not** language specific, my bad.
    
    I think I will side with @geomacy and [his radical idea](https://github.com/apache/brooklyn-server/pull/589#issuecomment-285778134) => not touching the code but adding documentation on how to support multiline via the `(?s)` flag. I know it is java specific but the Brooklyn codebase is java, I don't see any reason against using it.
    
    Personally, I do prefer be in charge of what to match rather than letting Brooklyn trying to be smart and possibly fail.
    
    That being said, I'm happy to go with what the majority think it's best


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server issue #589: yaml-test assertions: fix use of “matches”

Posted by geomacy <gi...@git.apache.org>.
Github user geomacy commented on the issue:

    https://github.com/apache/brooklyn-server/pull/589
  
    Nice; I think this is probably a better match (no pun intended) to what test authors want. It is likely to make their lives easier, in that they don't have to spend time thinking about how to squash multiple lines.   You could argue that we should provide both ways of doing it and making that a configuration option; on balance though I would say let's not do that until there is a demand for it, let's keep it simple.  Will do a quick review of the code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] brooklyn-server pull request #589: yaml-test assertions: fix use of “match...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/brooklyn-server/pull/589


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---