Posted to dev@daffodil.apache.org by "Sloane, Brandon" <bs...@tresys.com> on 2019/06/27 14:20:40 UTC

Justification for separate DPathCompileInfo object

The comment on DPathCompileInfo says:


" What makes the circularity is that the runtime data structures (ElementRuntimeData in particular), are not lazy. Everything part of them is forced to be evaluated when those are constructed. So anything that needs even one member of an ERD is artificially dependent on *everything* in the ERD."

This isn't true, is it? Everything in ElementRuntimeData is passed by-name and assigned to a lazy member field.




Brandon T. Sloane

Associate, Services

bsloane@tresys.com | tresys.com

Re: Justification for separate DPathCompileInfo object

Posted by "Beckerle, Mike" <mb...@tresys.com>.
So I have to correct myself here.


Brandon, sorry to waste your time. The original comment you found in the code is simply no longer correct and needs to be changed.


I actually needed to look at ElementRuntimeData today.


As you said, every member is passed by name. I honestly hadn't looked because I thought you were speaking in hyperbole. But sure enough, every member is passed by name, so the object does not couple any of the computations. At least not until serialization.


Now I don't recall making this change, but I may well have because I got stuck in the same circularity hell that Brandon has been running into. At some point I just decided to decouple everything this way.


Apologies for the oversight. I should RTFC - read the !#&^$#% code. 😊


Of note, however: not every member of DPathElementCompileInfo is similarly passed by name, so those will still cause coupling/circularity bugs.


Those could be decoupled by following the pattern used by the RuntimeData objects and passing all their args by name.

I don't know of a better way to do this in Scala.
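
For readers unfamiliar with the pattern, here is a minimal sketch of passing constructor args by name and assigning each to a lazy member. LazyInfo, LazyInfoDemo, and the field names are hypothetical and are not Daffodil's actual classes; the sketch only shows that each member is evaluated independently, on first access.

```scala
// Hypothetical stand-in for a RuntimeData-style class; not Daffodil's real code.
final class LazyInfo(nameArg: => String, lengthArg: => Int) {
  // Each by-name argument is evaluated only when its own lazy val is first read.
  lazy val name: String = nameArg
  lazy val length: Int = lengthArg
}

object LazyInfoDemo {
  // Records which arguments have actually been evaluated, most recent first.
  var evaluated: List[String] = Nil

  def build(): LazyInfo = new LazyInfo(
    { evaluated ::= "name"; "e1" },
    { evaluated ::= "length"; 4 }
  )
}
```

Constructing the object with LazyInfoDemo.build() evaluates nothing, and reading name afterwards evaluates only the name argument, so a consumer of one member does not depend on the others.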


-mike


Re: Justification for separate DPathCompileInfo object

Posted by "Beckerle, Mike" <mb...@tresys.com>.
Some things are passed by name because they are, by definition, circular dependencies - a DPathCompileInfo points to its parent, which points to its child. Those linkages are created by way of the common lazy and call-by-name idiom that lets objects be allocated before all the args to the constructors have been evaluated.
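
A minimal sketch of that idiom, assuming a simplified two-node tree. Node and Tree are hypothetical names for illustration only and do not reflect the real DPathCompileInfo constructors.

```scala
// Hypothetical node type; the names are illustrative, not Daffodil's.
final class Node(val name: String, parentArg: => Option[Node], childrenArg: => Seq[Node]) {
  // The by-name args are not evaluated during construction, only on first read.
  lazy val parent: Option[Node] = parentArg
  lazy val children: Seq[Node] = childrenArg
}

object Tree {
  // Each definition refers to the other. Neither reference is followed until
  // parent or children is actually read, so both objects can be allocated.
  lazy val root: Node = new Node("root", None, Seq(child))
  lazy val child: Node = new Node("child", Some(root), Nil)
}
```

If parentArg and childrenArg were ordinary strict parameters, constructing root would require child, whose construction would require root again, and the initialization would never complete.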


The intent was that, other than the above case, everything else would be strict - not lazy.


This strictness, however, can lead to coupling and circularity bugs when the DPathCompileInfo objects have members that are accessed by the schema compiler. For the RuntimeData objects, those members are intended for use by the runtime system, so the schema compiler shouldn't really be using them.


But... given that we have to compile DPath expressions in the schema compiler, the notion that these DPathCompileInfo objects are "runtime system" data structures simply does not hold.


(Historically, the DPathCompileInfo objects were split out of the regular RuntimeData objects, in order to allow expressions to be compiled by the DPath compiler at "runtime" in the interactive debugger when paused at a breakpoint. In hindsight, it is not clear this was helpful long term. It does increase separation of concerns between the DPath expression language implementation and everything else about the runtime system.)


Given the heavy use of lazy evaluation in the schema compiler, and that these DPathCompileInfo objects will force all evaluations when they are serialized for the runtime system, I suspect we should change this behavior and just make all the parts of the DPathCompileInfo objects lazy.
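
A rough sketch of what "lazy everywhere, forced at one point" could look like. Info, ForceDemo, and forceAll are hypothetical; forceAll merely stands in for whatever step runs before serialization, and no real serialization is performed here.

```scala
// Hypothetical class; not Daffodil's actual DPathCompileInfo.
final class Info(nameArg: => String, lengthArg: => Int) {
  lazy val name: String = nameArg
  lazy val length: Int = lengthArg

  // Touch every lazy member so nothing is left unevaluated.
  // A pre-serialization step would do something along these lines.
  def forceAll(): Unit = {
    name
    length
    ()
  }
}

object ForceDemo {
  // length stands for a computation that cannot succeed yet; name is independent of it.
  def make(): Info = new Info("e1", sys.error("length not computable yet"))
}
```

With this shape, ForceDemo.make().name can be read during compilation even though length cannot be computed yet, and the failure only surfaces when forceAll() is called.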

