You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@aurora.apache.org by Mark Chu-Carroll <mc...@twopensource.com> on 2014/06/11 15:18:30 UTC

Review Request 22457: Improve aurora "job diff" command.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

Review request for Aurora, David McLaughlin and Brian Wickman.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45749
-----------------------------------------------------------


ping?

- Mark Chu-Carroll


On June 13, 2014, 4:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 4:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:25 p.m., David McLaughlin wrote:
> > src/main/python/apache/aurora/client/cli/json_tree_diff.py, line 56
> > <https://reviews.apache.org/r/22457/diff/3/?file=609593#file609593line56>
> >
> >     Not a big deal given how rare it would occur, but you're using a delimiter that is a valid key character, so you could have ambiguous keys. 
> >     
> >     i.e. given the path of "hello.world" and value True should that refer to the path:
> >      
> >     { "hello.world": True }
> >     
> >     Or
> >     
> >     {
> >       "hello": {
> >         "world": True
> >       }
> >     }
> >     
> >     ?
> >     
> >     Returning the path as an n-tuple of keys avoids this.

But we know that these json trees are thrift; I'm exploiting the way that thrift serializes as lists and dicts to build the diff. And thrift doesn't allow "." in field names.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45832
-----------------------------------------------------------


On June 13, 2014, 4:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 4:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45832
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80822>

    Not a big deal given how rare it would occur, but you're using a delimiter that is a valid key character, so you could have ambiguous keys. 
    
    i.e. given the path of "hello.world" and value True should that refer to the path:
     
    { "hello.world": True }
    
    Or
    
    {
      "hello": {
        "world": True
      }
    }
    
    ?
    
    Returning the path as an n-tuple of keys avoids this.  


- David McLaughlin


On June 13, 2014, 8:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 8:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.
> 
> Mark Chu-Carroll wrote:
>     I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.
>     
>     I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.
>     
>     Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?
>     
>     In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
>        {u'values': [u'2', u'1']}
>     
>     In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.
>     
>     Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?
>     
>     
>     
>     
>
> 
> Maxim Khutornenko wrote:
>     The key point here is that tuples are sorted at every step during construction and specifically this line [1] ensures the consistency of sets on compare. So, the {u'values': [u'2', u'1']} becomes ((u'values', (u'1', u'2')),)) and so on when unwinding the recursion. 
>     
>     [1] - https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L202
>     
>     There is no documentation besides what is already there in the _diff_configs(). The concept is quite simple, so I am not sure what else needs to be documented there. Perhaps add an output example?
> 
> Mark Chu-Carroll wrote:
>     An output example would definitely help.
>     
>     Also, more on the rationale - the explanation in that code doesn't explain why it makes to go so far - particularly in the case of the new diff algorithm that we're discussing.
>     
>     For example: why convert lists to tuples? Lists have the same ordering guarantees and comparison guarantees as tuples. So you're not increasing the comparability, but you are making the resulting diffs harder to read?
>     
>     (In fact, what's going on in that _hashable and _diff_configs seems to not just be over-aggressive, but incorrect. A list [b, a] in a constraint expression can't be converted to [a, b]. As it stands, that will claim no diffs in cases where there's a significant diff, because it erases ordering even when it shouldn't.)
>     
>     Why convert dictionaries to tuples? Dictionaries don't have an ordering constraint; key-by-key comparison is fine. If you're dumping the config out to a file for comparison by the unix diff tool, that sort-of makes sense, because you need the keys to be in the same order for a text diff. But why convert to a tuple of tuples? Why not just do a list of tuples? Again, comparison wise it's equivalent, but readability-wise, a list of pairs is provides an extra bit of visual cue about the structure, which is helpful for the person reading the diff.
>     
>
> 
> Maxim Khutornenko wrote:
>     The specific case we were trying to solve in updater diff did not require list support (as TaskConfig does not have them). So your observation is correct, lists should not have been there. The correct implementation should have lists as a separate branch where the element order would be preserved.
>     
>     As for the dictionaries, they have to be flattened as they have the same inherent problems as sets. I have observed exactly the same unpredictable re-ordering behavior in both sets and dicts when converted to json.
>     
>     As for the "tuples of tuples" approach, it was chosen for simplicity. The diff results are not normally shown in the output so readability was not critical in updater.
> 
> Mark Chu-Carroll wrote:
>     Ok - so the dictionary flatting part doesn't apply in the new diff routine, because like-key comparison doesn't depend on key ordering.
>     
>     But the set issues are a problem. Based on your experience here, will set-flattening-and-ordering be enough to get correct results?
>

As long as dict keys are compared consistently and values are compared recursively, I think set flattening should be enough.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.
> 
> Mark Chu-Carroll wrote:
>     I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.
>     
>     I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.
>     
>     Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?
>     
>     In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
>        {u'values': [u'2', u'1']}
>     
>     In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.
>     
>     Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?
>     
>     
>     
>     
>
> 
> Maxim Khutornenko wrote:
>     The key point here is that tuples are sorted at every step during construction and specifically this line [1] ensures the consistency of sets on compare. So, the {u'values': [u'2', u'1']} becomes ((u'values', (u'1', u'2')),)) and so on when unwinding the recursion. 
>     
>     [1] - https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L202
>     
>     There is no documentation besides what is already there in the _diff_configs(). The concept is quite simple, so I am not sure what else needs to be documented there. Perhaps add an output example?

An output example would definitely help.

Also, more on the rationale - the explanation in that code doesn't explain why it makes to go so far - particularly in the case of the new diff algorithm that we're discussing.

For example: why convert lists to tuples? Lists have the same ordering guarantees and comparison guarantees as tuples. So you're not increasing the comparability, but you are making the resulting diffs harder to read?

(In fact, what's going on in that _hashable and _diff_configs seems to not just be over-aggressive, but incorrect. A list [b, a] in a constraint expression can't be converted to [a, b]. As it stands, that will claim no diffs in cases where there's a significant diff, because it erases ordering even when it shouldn't.)

Why convert dictionaries to tuples? Dictionaries don't have an ordering constraint; key-by-key comparison is fine. If you're dumping the config out to a file for comparison by the unix diff tool, that sort-of makes sense, because you need the keys to be in the same order for a text diff. But why convert to a tuple of tuples? Why not just do a list of tuples? Again, comparison wise it's equivalent, but readability-wise, a list of pairs is provides an extra bit of visual cue about the structure, which is helpful for the person reading the diff.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>

I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":

((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 

The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>

| If you use the new json-tree-diff, you won't have this problem; 

Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
{ 'set1': { ['k1':'v1', 'k2':'v2']} }
{ 'set2': { ['k2':'v2', 'k1':'v1']} }

It's "normal" to see element re-ordering like this during thrift struct serialization.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.

All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.

As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 

I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.

I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.

I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.

Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?

In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
   {u'values': [u'2', u'1']}

In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.

Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.

Here is a real TaskConfig json fragment and its reordered twin:

1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>

Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.  


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>

We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.  


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.
> 
> Mark Chu-Carroll wrote:
>     I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.
>     
>     I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.
>     
>     Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?
>     
>     In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
>        {u'values': [u'2', u'1']}
>     
>     In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.
>     
>     Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?
>     
>     
>     
>     
>
> 
> Maxim Khutornenko wrote:
>     The key point here is that tuples are sorted at every step during construction and specifically this line [1] ensures the consistency of sets on compare. So, the {u'values': [u'2', u'1']} becomes ((u'values', (u'1', u'2')),)) and so on when unwinding the recursion. 
>     
>     [1] - https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L202
>     
>     There is no documentation besides what is already there in the _diff_configs(). The concept is quite simple, so I am not sure what else needs to be documented there. Perhaps add an output example?
> 
> Mark Chu-Carroll wrote:
>     An output example would definitely help.
>     
>     Also, more on the rationale - the explanation in that code doesn't explain why it makes to go so far - particularly in the case of the new diff algorithm that we're discussing.
>     
>     For example: why convert lists to tuples? Lists have the same ordering guarantees and comparison guarantees as tuples. So you're not increasing the comparability, but you are making the resulting diffs harder to read?
>     
>     (In fact, what's going on in that _hashable and _diff_configs seems to not just be over-aggressive, but incorrect. A list [b, a] in a constraint expression can't be converted to [a, b]. As it stands, that will claim no diffs in cases where there's a significant diff, because it erases ordering even when it shouldn't.)
>     
>     Why convert dictionaries to tuples? Dictionaries don't have an ordering constraint; key-by-key comparison is fine. If you're dumping the config out to a file for comparison by the unix diff tool, that sort-of makes sense, because you need the keys to be in the same order for a text diff. But why convert to a tuple of tuples? Why not just do a list of tuples? Again, comparison wise it's equivalent, but readability-wise, a list of pairs is provides an extra bit of visual cue about the structure, which is helpful for the person reading the diff.
>     
>

The specific case we were trying to solve in updater diff did not require list support (as TaskConfig does not have them). So your observation is correct, lists should not have been there. The correct implementation should have lists as a separate branch where the element order would be preserved.

As for the dictionaries, they have to be flattened as they have the same inherent problems as sets. I have observed exactly the same unpredictable re-ordering behavior in both sets and dicts when converted to json.

As for the "tuples of tuples" approach, it was chosen for simplicity. The diff results are not normally shown in the output so readability was not critical in updater. 


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.

Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.

This isn't a new thing in this updated diff system - this is the current behavior.

If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
behavior, I'd rather not change that.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 13, 2014, 4:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 4:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]

I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.

In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On June 16, 2014, 9:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.
> 
> Mark Chu-Carroll wrote:
>     I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.
>     
>     I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.
>     
>     Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?
>     
>     In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
>        {u'values': [u'2', u'1']}
>     
>     In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.
>     
>     Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?
>     
>     
>     
>     
>

The key point here is that tuples are sorted at every step during construction and specifically this line [1] ensures the consistency of sets on compare. So, the {u'values': [u'2', u'1']} becomes ((u'values', (u'1', u'2')),)) and so on when unwinding the recursion. 

[1] - https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L202

There is no documentation besides what is already there in the _diff_configs(). The concept is quite simple, so I am not sure what else needs to be documented there. Perhaps add an output example?


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 2:59 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 2:59 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a dictionary, the comparison routine does the right thing: it compares key by key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our serializer: it's using an ordered list for an unordered constraint. Diff can't tell that the list structure isn't really supposed to be ordered; the serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without completely changing the underlying data structure (e.g. emitting fake keys for every element and converting set into a dictionary with sorted keys). Agree, the TJSONProtocol could be a bit smarter when dealing with python sets but the reality is that python sets are inherently "unhashable" structs. I have seen cases when completely repr-equivalent python sets were not comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts to a list. It's not an issue of being unhashable. But it should at least bother to be *uniform*. If you're converting a set to a list, chose a convention for ordering elements. Put 'em in lexicographic order, problem solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no way to tell that a given field was supposed to be a set, and should be compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and throwing away the new diff tool. I don't see any way to solve this, short of creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I am sure it can be solved here. Converting to sorted tuples does not produce a pretty diff but if we are targeting value-comparison diff output rather than serialized string compare the problem shifts into pretty-fying the sorted tuple output rather than handling diff mechanics.
> 
> Mark Chu-Carroll wrote:
>     Not sure what you mean by "converting to sorted tuples". Is there any documentation of what the "sorted tuple" format looks like? Or do I just need to reverse-engineer? (I hate doing that, because if I make any wrong assumptions, things get ugly.)
>
> 
> Maxim Khutornenko wrote:
>     I have modified the _diff_configs() in updater.py to print out the output and ran "./pants src/test/python/apache/aurora/client/api:updater -s -k test_diff_unordered_configs":
>     
>     ((u'constraints', (((u'constraint', ((u'limit', ((u'limit', 10),)),)), (u'name', u'limit')), ((u'constraint', ((u'values', (u'1', u'2')),)), (u'name', u'value')))), (u'contactEmail', u'foo@bar.com'), (u'diskMb', 1), (u'environment', u'test'), (u'executorConfig', ((u'data', u'test data'), (u'name', u'test'))), (u'jobName', u'jimbob'), (u'maxTaskFailures', 1), (u'metadata', (((u'key', u'k1'), (u'value', u'v1')), ((u'key', u'k2'), (u'value', u'v2')))), (u'numCpus', 1.0), (u'owner', ((u'role', u'mesos'),)), (u'priority', 0), (u'production', 1), (u'ramMb', 1), (u'requestedPorts', (u'142', u'3424', u'45235')), (u'taskLinks', ((u'task1', u'link1'), (u'task2', u'link2')))) 
>     
>     The idea here is to eliminate the ordering unpredictability by replacing all thrift structs with tuples, which are sortable. It sure isn't pretty but it works.
> 
> Mark Chu-Carroll wrote:
>     I understand the reason for doing it; what I still don't understand is just what the format here *is*. Is there any documentation on it? I can make a reasonable guess at just what the "tuplization" format is and what it means, but it's just a reasonable guess, based on looking at sample outputs and reverse engineering.
>     
>     I don't understand how this fixes anything. As far as I can tell, what it does is just replace the sets, dicts, and structs with tuples. But that doesn't change anything about consistency, as far as I can understand.
>     
>     Let me try to ask in a slightly different way, to get at what I'm not understanding. How does this solve the problem? At some point, you're still taking the set of values whose order is unpredictable, and stuffing them into a tuple, which is ordered. What about this format guarantees that the order in which the elements of a set are converted into a tuple will be consistent?
>     
>     In the example you gave before, you've got a constraints field, which is a set of strings. In normal JSON serialization, that turns into:
>        {u'values': [u'2', u'1']}
>     
>     In the serialization process, the json serializer took a field which was actually a set([u'1', u'2']), and serialized it, producing [u'2', u'1']. The serialization into a list imposed an order on the set. The problem is that we can't count on consistency in how that order is chosen.
>     
>     Now, in this new serialization format, we're doing something very similar - except that instead of serializing set([u'1', u'2']) into [u'1', u'2'], it serializes it as (u'1', u'2'). To serialize it into a tuple, the serialized needed to chose an order for the elements of the set. If I understand you correctly, you're saying that this tuplized format solves the ordering-consistency problem, somehow. But I don't see how. What about tuples guarantees order consistency where lists don't?
>     
>     
>     
>     
>
> 
> Maxim Khutornenko wrote:
>     The key point here is that tuples are sorted at every step during construction and specifically this line [1] ensures the consistency of sets on compare. So, the {u'values': [u'2', u'1']} becomes ((u'values', (u'1', u'2')),)) and so on when unwinding the recursion. 
>     
>     [1] - https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L202
>     
>     There is no documentation besides what is already there in the _diff_configs(). The concept is quite simple, so I am not sure what else needs to be documented there. Perhaps add an output example?
> 
> Mark Chu-Carroll wrote:
>     An output example would definitely help.
>     
>     Also, more on the rationale - the explanation in that code doesn't explain why it makes to go so far - particularly in the case of the new diff algorithm that we're discussing.
>     
>     For example: why convert lists to tuples? Lists have the same ordering guarantees and comparison guarantees as tuples. So you're not increasing the comparability, but you are making the resulting diffs harder to read?
>     
>     (In fact, what's going on in that _hashable and _diff_configs seems to not just be over-aggressive, but incorrect. A list [b, a] in a constraint expression can't be converted to [a, b]. As it stands, that will claim no diffs in cases where there's a significant diff, because it erases ordering even when it shouldn't.)
>     
>     Why convert dictionaries to tuples? Dictionaries don't have an ordering constraint; key-by-key comparison is fine. If you're dumping the config out to a file for comparison by the unix diff tool, that sort-of makes sense, because you need the keys to be in the same order for a text diff. But why convert to a tuple of tuples? Why not just do a list of tuples? Again, comparison wise it's equivalent, but readability-wise, a list of pairs is provides an extra bit of visual cue about the structure, which is helpful for the person reading the diff.
>     
>
> 
> Maxim Khutornenko wrote:
>     The specific case we were trying to solve in updater diff did not require list support (as TaskConfig does not have them). So your observation is correct, lists should not have been there. The correct implementation should have lists as a separate branch where the element order would be preserved.
>     
>     As for the dictionaries, they have to be flattened as they have the same inherent problems as sets. I have observed exactly the same unpredictable re-ordering behavior in both sets and dicts when converted to json.
>     
>     As for the "tuples of tuples" approach, it was chosen for simplicity. The diff results are not normally shown in the output so readability was not critical in updater.

Ok - so the dictionary flatting part doesn't apply in the new diff routine, because like-key comparison doesn't depend on key ordering.

But the set issues are a problem. Based on your experience here, will set-flattening-and-ordering be enough to get correct results?


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80827>

    I don't think it's enough to json-serialize a thrift task. This is bound to set/dict serialization sorting problem we have battled with in updater.py. Specifically this block: https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
    
    Without sorting, the diff will only work 99% of the time. This has been demonstrated in a few real-life job updates.


- Maxim Khutornenko


On June 13, 2014, 8:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 8:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review46747
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment82306>

    s/not/be?



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment82311>

    You may want to use temporary_file() here instead that cleans up after itself: https://github.com/twitter/commons/blob/master/src/python/twitter/common/contextutil/__init__.py#L93-L107



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment82312>

    Would be great to see populated sets here instead of empty arrays. The task.constraints field is the one with nested sets that would be perfect for testing set ordering.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment82313>

    How about "local has less tasks than remote" test case for completeness?


- Maxim Khutornenko


On June 26, 2014, 11:42 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 26, 2014, 11:42 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On July 29, 2014, 7:36 p.m., Bill Farner wrote:
> > Mark - looks like this was committed, reverted, then re-committed.  If it's now in, can you please close the review?

Sadly no. It was committed, reverted, recommitted, reverted, rerecommitted, rerereverted. 

Why it fails in the hudson build is still a mystery. The best guess so far is python version, but I haven't been able to successfully reproduce the test failures.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review49046
-----------------------------------------------------------


On July 17, 2014, 10:09 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated July 17, 2014, 10:09 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Bill Farner <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review49046
-----------------------------------------------------------


Mark - looks like this was committed, reverted, then re-committed.  If it's now in, can you please close the review?

- Bill Farner


On July 17, 2014, 2:09 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated July 17, 2014, 2:09 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58220
-----------------------------------------------------------


This patch does not apply cleanly on master (53f4e73), do you need to rebase?

- Aurora ReviewBot


On Sept. 9, 2014, 2:07 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Sept. 9, 2014, 2:07 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 8f349c09637c16e2499e85f2dc96eb7ccffd0aaf 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD e1f9ebf96774b8f5c75de8570c6ba87d953ab649 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58294
-----------------------------------------------------------


Master (53f4e73) is red with this patch.
  ./build-support/jenkins/build.sh

src.test.python.apache.aurora.client.api.api                                    .....   SUCCESS
src.test.python.apache.aurora.client.api.disambiguator                          .....   SUCCESS
src.test.python.apache.aurora.client.api.instance_watcher                       .....   SUCCESS
src.test.python.apache.aurora.client.api.job_monitor                            .....   SUCCESS
src.test.python.apache.aurora.client.api.mux                                    .....   SUCCESS
src.test.python.apache.aurora.client.api.quota_check                            .....   SUCCESS
src.test.python.apache.aurora.client.api.restarter                              .....   SUCCESS
src.test.python.apache.aurora.client.api.scheduler_client                       .....   SUCCESS
src.test.python.apache.aurora.client.api.sla                                    .....   SUCCESS
src.test.python.apache.aurora.client.api.updater                                .....   SUCCESS
src.test.python.apache.aurora.client.api.updater_util                           .....   SUCCESS
src.test.python.apache.aurora.client.binding_helper                             .....   SUCCESS
src.test.python.apache.aurora.client.cli.api                                    .....   SUCCESS
src.test.python.apache.aurora.client.cli.bridge                                 .....   SUCCESS
src.test.python.apache.aurora.client.cli.command_hooks                          .....   SUCCESS
src.test.python.apache.aurora.client.cli.cron                                   .....   SUCCESS
src.test.python.apache.aurora.client.cli.help                                   .....   SUCCESS
src.test.python.apache.aurora.client.cli.inspect                                .....   SUCCESS
src.test.python.apache.aurora.client.cli.job                                    .....   FAILURE
src.test.python.apache.aurora.client.config                                     .....   SUCCESS

- Aurora ReviewBot


On Oct. 24, 2014, 2:50 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 2:50 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On Oct. 24, 2014, 11:58 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, line 311
> > <https://reviews.apache.org/r/22457/diff/10/?file=732192#file732192line311>
> >
> >     This is no longer true. The `populated` field reprsents a single TaskConfig now. You'd need to inflate a task config list according to the local instance count.
> 
> Mark Chu-Carroll wrote:
>     So you changed the semantics of the API call without changing its type signature to reflect that? Really?
> 
> Maxim Khutornenko wrote:
>     We can't change type signatures without breaking backward compatibility. This deprecation was a long overdue change as there is no reason to return a set of completely identical structs over the wire now that all TaskConfigs are identical.

Backward compatibility is only backward compatibility if existing code is unbroken by a change. That is *not* the case here. This is a breaking change - it will break the comparisons of any client code that isn't explicitly updated to reflect the change. Semantic changes break backward compatibility - as this change shows!

The right way to handle something like this is to be explicit about the deprecation, not to sloppily overload an existing structure so that the change will slip by unnoticed because you kept the type signature, but changed the semantics.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58280
-----------------------------------------------------------


On Oct. 24, 2014, 10:50 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 10:50 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On Oct. 24, 2014, 3:58 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, line 311
> > <https://reviews.apache.org/r/22457/diff/10/?file=732192#file732192line311>
> >
> >     This is no longer true. The `populated` field reprsents a single TaskConfig now. You'd need to inflate a task config list according to the local instance count.
> 
> Mark Chu-Carroll wrote:
>     So you changed the semantics of the API call without changing its type signature to reflect that? Really?

We can't change type signatures without breaking backward compatibility. This deprecation was a long overdue change as there is no reason to return a set of completely identical structs over the wire now that all TaskConfigs are identical.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58280
-----------------------------------------------------------


On Oct. 24, 2014, 2:50 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 2:50 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On Oct. 24, 2014, 11:58 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, line 311
> > <https://reviews.apache.org/r/22457/diff/10/?file=732192#file732192line311>
> >
> >     This is no longer true. The `populated` field reprsents a single TaskConfig now. You'd need to inflate a task config list according to the local instance count.

So you changed the semantics of the API call without changing its type signature to reflect that? Really?


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58280
-----------------------------------------------------------


On Oct. 24, 2014, 10:50 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 10:50 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.

> On Oct. 24, 2014, 3:58 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, line 311
> > <https://reviews.apache.org/r/22457/diff/10/?file=732192#file732192line311>
> >
> >     This is no longer true. The `populated` field reprsents a single TaskConfig now. You'd need to inflate a task config list according to the local instance count.
> 
> Mark Chu-Carroll wrote:
>     So you changed the semantics of the API call without changing its type signature to reflect that? Really?
> 
> Maxim Khutornenko wrote:
>     We can't change type signatures without breaking backward compatibility. This deprecation was a long overdue change as there is no reason to return a set of completely identical structs over the wire now that all TaskConfigs are identical.
> 
> Mark Chu-Carroll wrote:
>     Backward compatibility is only backward compatibility if existing code is unbroken by a change. That is *not* the case here. This is a breaking change - it will break the comparisons of any client code that isn't explicitly updated to reflect the change. Semantic changes break backward compatibility - as this change shows!
>     
>     The right way to handle something like this is to be explicit about the deprecation, not to sloppily overload an existing structure so that the change will slip by unnoticed because you kept the type signature, but changed the semantics.

Ah, I see what the problem is. I accidentally referred to "populated" in my original comment by what I really meant was "taskConfig". Sorry about that.

You have a choice of using the new "taskConfig" field and expanding the result to the number of instances as I suggested or continue using the populatedDEPRECATED field and keep the current behavior. Given that this is a deep refactoring, I suggest the former.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58280
-----------------------------------------------------------


On Oct. 24, 2014, 2:50 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 2:50 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review58280
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment99224>

    This is no longer true. The `populated` field reprsents a single TaskConfig now. You'd need to inflate a task config list according to the local instance count.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99225>

    Please, refactor the correspondent method in util.py to give you a single scheduled task. There is nothing unique here that cannot be reused.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99228>

    A ScheduledTask does not have a `key` field. Please, avoid using dynamic properties as they are very confusing in test.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99226>

    These are not mocks anymore, please update method name.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99227>

    same here



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99229>

    Missing `job` field as a future replacment to owner/environment/jobName.



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment99230>

    Is this still needed? If so, would be great to have a bit of formatting around it to really help debugging when needed.


- Maxim Khutornenko


On Oct. 24, 2014, 2:50 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated Oct. 24, 2014, 2:50 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
>   src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
>   src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated Oct. 24, 2014, 10:50 a.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

I think I finally found the problem that was causing the jenkins failures. Since this rebase was a bit messy, can reviewers please take another look before I try pushing it?


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD 995570325bbb09ecbcc2ace5d223760c5d49367f 
  src/main/python/apache/aurora/client/cli/jobs.py 625cb80a33ae565b403fc71bb9795e4700e1aeb7 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 4692d31a9c128664273f71d15ee217dc060e66f0 
  src/test/python/apache/aurora/client/cli/test_diff.py 78694d7559f2041f27cd2a7e4cb81ca467f63ac2 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated Sept. 9, 2014, 10:07 a.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

Tried modifying the test, to see if that fixes the jenkins build issue.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description (updated)
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 8f349c09637c16e2499e85f2dc96eb7ccffd0aaf 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD e1f9ebf96774b8f5c75de8570c6ba87d953ab649 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated July 17, 2014, 10:09 a.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

Another stab at fixing the jenkins build for this change.

The failures could be explained by a different order of execution of
the tests: the error attributed to the failed test could have been
produced by another test leaving extraneous state behind. This change
adds extra cleanup steps to eliminate that.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated July 16, 2014, 3:26 p.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

Attempt to fix jenkins testing issue, by canonicalizing tasks in advance.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review47423
-----------------------------------------------------------


It's been a week since the last time I pinged this review, and still, I'm waiting on it. Brian, please, can you please review this?


- Mark Chu-Carroll


On June 27, 2014, 10:41 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 27, 2014, 10:41 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@apache.org>.
Brian, please, can I get a review on this?  It's been dragging on for weeks!



On Mon, Jun 30, 2014 at 8:06 AM, Mark Chu-Carroll <mc...@apache.org>
wrote:

> wickman, ping?
>
>
> On Fri, Jun 27, 2014 at 11:25 AM, Maxim Khutornenko <ma...@apache.org>
> wrote:
>
>>    This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/22457/
>>
>> Ship it!
>>
>> Ship It!
>>
>>
>> - Maxim Khutornenko
>>
>> On June 27th, 2014, 2:41 p.m. UTC, Mark Chu-Carroll wrote:
>>   Review request for Aurora, Maxim Khutornenko and Brian Wickman.
>> By Mark Chu-Carroll.
>>
>> *Updated June 27, 2014, 2:41 p.m.*
>>  *Bugs: * aurora-520 <https://issues.apache.org/jira/browse/aurora-520>
>>  *Repository: * aurora
>> Description
>>
>> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
>>
>> - Allow exclusion of semantically irrelevant fields.
>> - Provide a clearer list of the differences between configs.
>> - Provide a scripting-friendly alternative JSON syntax for diffs.
>>
>> The old diff behavior is still available under the "--use-shell-diff" option.
>>
>>   Testing
>>
>> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
>> All tests pass.
>>
>>   Diffs
>>
>>    - src/main/python/apache/aurora/client/cli/BUILD
>>    (ebe681a0d1735b7cc695dc3b7a14c4292d87ae32)
>>    - src/main/python/apache/aurora/client/cli/jobs.py
>>    (4fa03a6c9919651551238b0dc211ed69a8dfe565)
>>    - src/main/python/apache/aurora/client/cli/json_tree_diff.py
>>    (PRE-CREATION)
>>    - src/test/python/apache/aurora/client/cli/BUILD
>>    (3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378)
>>    - src/test/python/apache/aurora/client/cli/test_diff.py
>>    (38629b63c082cf81cb891dace2a70d9e8f418e18)
>>    - src/test/python/apache/aurora/client/cli/test_json_diff.py
>>    (PRE-CREATION)
>>
>> View Diff <https://reviews.apache.org/r/22457/diff/>
>>
>
>

Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@apache.org>.
wickman, ping?


On Fri, Jun 27, 2014 at 11:25 AM, Maxim Khutornenko <ma...@apache.org>
wrote:

>    This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
>
> Ship it!
>
> Ship It!
>
>
> - Maxim Khutornenko
>
> On June 27th, 2014, 2:41 p.m. UTC, Mark Chu-Carroll wrote:
>   Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> By Mark Chu-Carroll.
>
> *Updated June 27, 2014, 2:41 p.m.*
>  *Bugs: * aurora-520 <https://issues.apache.org/jira/browse/aurora-520>
>  *Repository: * aurora
> Description
>
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
>
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
>
> The old diff behavior is still available under the "--use-shell-diff" option.
>
>   Testing
>
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
>
>   Diffs
>
>    - src/main/python/apache/aurora/client/cli/BUILD
>    (ebe681a0d1735b7cc695dc3b7a14c4292d87ae32)
>    - src/main/python/apache/aurora/client/cli/jobs.py
>    (4fa03a6c9919651551238b0dc211ed69a8dfe565)
>    - src/main/python/apache/aurora/client/cli/json_tree_diff.py
>    (PRE-CREATION)
>    - src/test/python/apache/aurora/client/cli/BUILD
>    (3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378)
>    - src/test/python/apache/aurora/client/cli/test_diff.py
>    (38629b63c082cf81cb891dace2a70d9e8f418e18)
>    - src/test/python/apache/aurora/client/cli/test_json_diff.py
>    (PRE-CREATION)
>
> View Diff <https://reviews.apache.org/r/22457/diff/>
>

Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review46866
-----------------------------------------------------------

Ship it!


Ship It!

- Maxim Khutornenko


On June 27, 2014, 2:41 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 27, 2014, 2:41 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Brian Wickman <wi...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review47783
-----------------------------------------------------------

Ship it!


Ship It!

- Brian Wickman


On June 27, 2014, 2:41 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 27, 2014, 2:41 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 27, 2014, 10:41 a.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

Addressed maxims review.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 26, 2014, 7:42 a.m.)


Review request for Aurora, Maxim Khutornenko and Brian Wickman.


Changes
-------

Remove David McLaughlin from the reviewers list (at his request).


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review46714
-----------------------------------------------------------


Mark, can you replace me with Maxim on people line? This lgtm, but it seems like Brian and Maxim are more qualified to give feedback in this instance.

- David McLaughlin


On June 26, 2014, 1:16 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 26, 2014, 1:16 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 25, 2014, 9:16 p.m.)


Review request for Aurora, David McLaughlin and Brian Wickman.


Changes
-------

Updated the diff routine to canonicalize the ordering of elements in the lists that were set fields in thrift.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 18, 2014, 10:59 a.m.)


Review request for Aurora, David McLaughlin and Brian Wickman.


Changes
-------

Reviews.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45822
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80813>

    Yes. 
    
    This is not new functionality for aurora: this is what it's been doing forever.
    
    Unix diff tools have a standard non-standard interface: unlike most tools, they don't return 0 for successful completion - they return 0 or 1, depending on whether or not they found differences. 
    


- Mark Chu-Carroll


On June 13, 2014, 4:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 4:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Brian Wickman <wi...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45808
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80794>

    indentation on this comment should be dedented 1



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80796>

    given that DIFF_VIEWER is user-supplied, should we be performing this check?



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80804>

    from twitter.common.lang import Compatibility and use Compatibility.string
    
    StringTypes doesn't exist in python 3.x:
    
    mba=~=; python3.3
    Python 3.3.3 (default, Apr 15 2014, 11:17:36) 
    [GCC 4.2.1 Compatible Apple Clang 4.0 ((tags/Apple/clang-421.0.60))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from types import StringTypes
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: cannot import name StringTypes
    >>> 
    



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80803>

    isinstance(e, list)



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80805>

    isinstance(base_val, dict)
    isinstance(other_val, dict)
    



src/test/python/apache/aurora/client/cli/test_diff.py
<https://reviews.apache.org/r/22457/#comment80802>

    this sort of testing seems fragile as it relies upon the sort order of the python dictionary hash which is not guaranteed to be the same between VM implementations (or even within minor versions of VM implementations.)
    
    instead, you'll probably probably need to parse the output rather than checking string comparisons.


- Brian Wickman


On June 13, 2014, 8:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 8:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Kevin Sweeney <ke...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45834
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80826>

    while we're here can we change the default to unified diff (["diff", "-u"])? the ancient diff output is pretty-much unreadable.


- Kevin Sweeney


On June 13, 2014, 1:22 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 13, 2014, 1:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 13, 2014, 4:22 p.m.)


Review request for Aurora, David McLaughlin and Brian Wickman.


Changes
-------

Remove unnecessary and obfuscatory code.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.

> On June 11, 2014, 11:18 a.m., Maxim Khutornenko wrote:
> > Given this new diff routine lives in cli, I figure the updater is out of luck again? I was hoping to have a single diff engine shared between cli and updater to finally have a consistent diff story.

I'm happy to move it out. I'd like to first get it deployed here, just so that we can let people try it, and make sure it's what they want. If it doesn't get any complaints from users, and we're sure it's doing the right thing, we can move it to api.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45382
-----------------------------------------------------------


On June 11, 2014, 11 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 11 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Joe Smith <ya...@gmail.com>.

> On June 11, 2014, 8:18 a.m., Maxim Khutornenko wrote:
> > Given this new diff routine lives in cli, I figure the updater is out of luck again? I was hoping to have a single diff engine shared between cli and updater to finally have a consistent diff story.
> 
> Mark Chu-Carroll wrote:
>     I'm happy to move it out. I'd like to first get it deployed here, just so that we can let people try it, and make sure it's what they want. If it doesn't get any complaints from users, and we're sure it's doing the right thing, we can move it to api.
>

agreed, I noticed this too


- Joe


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45382
-----------------------------------------------------------


On June 11, 2014, 8 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 8 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45382
-----------------------------------------------------------


Given this new diff routine lives in cli, I figure the updater is out of luck again? I was hoping to have a single diff engine shared between cli and updater to finally have a consistent diff story. 

- Maxim Khutornenko


On June 11, 2014, 3 p.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 3 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Joe Smith <ya...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45635
-----------------------------------------------------------


cool, lgtm


src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80532>

    It's probably worthy- otherwise how else is a user going to see it?



src/main/python/apache/aurora/client/cli/jobs.py
<https://reviews.apache.org/r/22457/#comment80533>

    "the output of DIFF_VIEWER or unix diff" maybe?



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80539>

    is this actually called somewhere?
    
    (and I promise I tried to understand what this is doing, but I can't ~_~)
    
    (what does 'rebase' mean?)


- Joe Smith


On June 11, 2014, 8 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 8 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45649
-----------------------------------------------------------



src/main/python/apache/aurora/client/cli/json_tree_diff.py
<https://reviews.apache.org/r/22457/#comment80544>

    Oops - that's code that got refactored out into prune_structure. I completely forgot that I refactored it out - I even went back and cleaned it up and added documentation...
    
    Just to reward you for noticing: rebasing means taking a path expression and rewriting it to be relative to a new basepoint.
    
    Think of a filesystem: you get an absolute path /u/users/markcc/work/code/foo. You want the relative path to that file from markcc's home dir. That operation of rewriting the path from the absolute path relative to root, to a local path (./work/code/foo) relative to "/u/users/markcc" is rebasing. 
    


- Mark Chu-Carroll


On June 11, 2014, 11 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 11 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>


Re: Review Request 22457: Improve aurora "job diff" command.

Posted by Mark Chu-Carroll <mc...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/
-----------------------------------------------------------

(Updated June 11, 2014, 11 a.m.)


Review request for Aurora, David McLaughlin and Brian Wickman.


Changes
-------

Added a command-line parameter to allow users to specify fields excluded from the comparison.


Bugs: aurora-520
    https://issues.apache.org/jira/browse/aurora-520


Repository: aurora


Description
-------

Add a new diff method, which uses field-by-field comparison of JSON trees for comparing running job configurations to potentially updated configs.

- Allow exclusion of semantically irrelevant fields.
- Provide a clearer list of the differences between configs.
- Provide a scripting-friendly alternative JSON syntax for diffs.

The old diff behavior is still available under the "--use-shell-diff" option.


Diffs (updated)
-----

  src/main/python/apache/aurora/client/cli/BUILD ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
  src/main/python/apache/aurora/client/cli/jobs.py 4fa03a6c9919651551238b0dc211ed69a8dfe565 
  src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
  src/test/python/apache/aurora/client/cli/BUILD 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
  src/test/python/apache/aurora/client/cli/test_diff.py 38629b63c082cf81cb891dace2a70d9e8f418e18 
  src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 

Diff: https://reviews.apache.org/r/22457/diff/


Testing
-------

New unit tests of the JSON tree diff code, plus a bunch of new "job diff" tests of the new functionality.
All tests pass.


Thanks,

Mark Chu-Carroll