Posted to dev@daffodil.apache.org by "Beckerle, Mike" <mb...@owlcyberdefense.com> on 2021/01/11 19:55:39 UTC

debugger thoughts - parsing oriented

Here are some thoughts about the debugger for parsing.

* The debugger doesn't distinguish the big parsers that correspond to user-specified, possibly named entities (elements, model groups) from the finer-grained parsers that handle the details. It needs to distinguish these.

* There are really different kinds of steps, and they must be visually distinguishable in some obvious way:
  * element steps - tier one, correspond to infoset elements (hidden or not)
  * model-group steps - tier two
    * sequence - with special treatment in the debugger due to separated/unseparated
      * This suggests to me that parsers need hooks where they can specifically supply information to the debugger that is not ordinarily part of the PState, or where knowing how to interpret that PState data structure feels like the responsibility of the Parser class itself.
    * arrays - will have to be distinguished and treated specifically. There are fundamentally different kinds, and there's no point pretending they're all similar when some just compute N and others backtrack their way to success. A parse failure that ends an array needs to show very explicitly that the failure is suppressed and ends the array. Right now you can barely tell.
    * optional - when it is truly optional (not occursCountKind 'parsed', which is an array), these probably need special treatment to avoid introducing all the array overhead when it is only a 0 or 1 determination.
    * choice - with special treatment for computed vs. backtracking
      * backtracking needs special treatment generally in the debug/trace.
  * details steps - tier three
    * scanning for delimiters - special treatment for match/non-match
    * determining length
      * explicit - expression evaluation is key here because it can succeed or fail, and if it succeeds it can produce a correct or incorrect value.
      * pattern - definitely need to show and highlight length=0 due to no-match
    * evaluation of asserts
    * evaluation of discriminators
      * This suggests that establishing points of uncertainty, which are later impacted by discriminators/asserts or just plain parse errors, is itself important in the trace.
    * variable events
      * introduction of new variables
      * evaluation of setVariable or of a new variable's default value
      * Maybe even reading the value of a variable - but this could get very verbose quickly, so it would need to be suppressible. Maybe we have a command saying which variables you want to watch, with a 'watch all' option available.
      * watch points generally, on variables and on infoset paths, seem like a useful concept. A path that doesn't exist should not be an error but should just display as "foo/bar (doesn't exist)" when displaying the watchpoint values.
    * values - extracting bits or text and converting them to values is usually uninteresting and should be suppressed. The user can see what ended up in the infoset anyway, and if padding or escapes are wrong that will mostly be obvious.
      * Mostly I think value extraction and conversion, once length is known, are uninteresting to users.
  * Layers
    * These need to be heavily visible as boundaries (enter layer, exit layer), but otherwise I think they are not particularly eventful for parsing.
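
To make the tiering idea above concrete, here is a minimal sketch of how trace steps might be tagged by tier and filtered for display, with a suppressed failure shown distinctly from an ordinary one. It is written in Python purely for brevity and is not Daffodil code: Tier, Outcome, Step, and render are hypothetical names invented for this illustration, and placing array-ending failures on element steps is an assumption.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Tier(IntEnum):
    """Hypothetical tiers, following the three kinds of steps listed above."""
    ELEMENT = 1      # corresponds to an infoset element
    MODEL_GROUP = 2  # sequence, choice, and similar
    DETAIL = 3       # delimiter scanning, length determination, asserts, ...


class Outcome(Enum):
    OK = "ok"
    FAILED = "failed"
    SUPPRESSED = "failed, suppressed"  # failure absorbed by an enclosing construct


@dataclass
class Step:
    tier: Tier
    name: str
    outcome: Outcome = Outcome.OK
    depth: int = 0   # nesting depth, used only for indentation
    note: str = ""   # optional explanation, e.g. "ends array"


def render(steps, max_tier=Tier.MODEL_GROUP):
    """Return one trace line per step whose tier is at or above max_tier."""
    lines = []
    for s in steps:
        if s.tier > max_tier:
            continue  # finer-grained than the user asked to see
        line = f"{'  ' * s.depth}[T{s.tier.value}] {s.name}: {s.outcome.value}"
        if s.note:
            line += f" ({s.note})"
        lines.append(line)
    return lines


trace = [
    Step(Tier.ELEMENT, "item[1]"),
    Step(Tier.DETAIL, "scan for separator", depth=1),
    Step(Tier.ELEMENT, "item[2]", Outcome.SUPPRESSED, note="ends array"),
]
for line in render(trace):
    print(line)
# [T1] item[1]: ok
# [T1] item[2]: failed, suppressed (ends array)
```

Calling render(trace, Tier.DETAIL) would also include the indented tier-three line. The point of the sketch is only that the tier and the suppressed-failure status are explicit data on each step, so the display does not have to infer them.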

I guess the biggest point here is that the debugger/trace output needs to be different depending on the lengthKind and occursCountKind, and different for array, optional, and scalar elements.
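
As a rough illustration of that last point, the sketch below picks what the trace would emphasize from the DFDL properties of the element being parsed. It is Python for brevity, not Daffodil code. The function names are hypothetical, and the classification rules are assumptions drawn from the notes above: an element is treated as scalar when minOccurs and maxOccurs are both 1, as optional when they are 0 and 1 and occursCountKind is not 'parsed', and as an array otherwise.

```python
def occurrence_kind(min_occurs, max_occurs, occurs_count_kind):
    """Classify an element as 'scalar', 'optional', or 'array' (assumed rules).

    max_occurs may be None to stand for 'unbounded'.
    """
    if min_occurs == 1 and max_occurs == 1:
        return "scalar"
    if min_occurs == 0 and max_occurs == 1 and occurs_count_kind != "parsed":
        return "optional"
    return "array"


def array_strategy(occurs_count_kind):
    """Say how the number of occurrences is arrived at (assumed grouping)."""
    if occurs_count_kind in ("fixed", "expression"):
        return "computed N"
    if occurs_count_kind in ("parsed", "implicit"):
        return "backtracking: a suppressed parse failure ends the array"
    return "other"


def length_trace_focus(length_kind):
    """Pick what the trace should emphasize while determining length."""
    focus = {
        "explicit": "show the length expression and the value it produced",
        "pattern": "show the pattern match; highlight length=0 on no-match",
        "delimited": "show delimiter scanning, match vs. non-match",
    }
    return focus.get(length_kind, "show the determined length only")


print(occurrence_kind(0, 1, "implicit"))  # optional
print(occurrence_kind(0, 1, "parsed"))    # array
print(array_strategy("expression"))       # computed N
```

Keeping these decisions in small, separate functions is just one way to express that the trace format is selected per element kind rather than being one uniform format.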

Mike Beckerle | Principal Engineer
