Posted to dev@daffodil.apache.org by Mike Beckerle <mb...@tresys.com> on 2017/10/20 17:28:22 UTC

Interoperability goals for Daffodil to support DFDL in OGF Standardization

This is an introduction to our "interoperability" goals, to be followed by a number of design topic threads about implementing the features and tools we need.

A good interoperability demonstration is the primary thing the Daffodil project needs to achieve in order to advance the DFDL specification from "proposed recommendation" to a full "recommendation" status with the OGF. This is critical to the credibility of DFDL as a standard.

OGF requires interoperability between two independent implementations. There are two other implementations of DFDL: the IBM one (part of the IBM Integration Bus product) and the ESA's DFDL4S (DFDL for Space). DFDL4S implements a much smaller subset of DFDL, so interoperability testing will focus on demonstrating interop between Daffodil and IBM DFDL.

Our goal is not total interoperability - the DFDL standard allows for certain feature subsets to be considered conforming implementations. Schemas that use features of DFDL outside of one implementation's subset won't run on that implementation. That is OK.

Our goal is for enough DFDL schemas to interoperate across Daffodil and IBM DFDL that it is clear that many schemas can be made to work portably.

The interoperability demonstration consists of these things:

  *   make Daffodil run the 4 DFDL schemas published by IBM on GitHub: ISO8583, IBM4690_TLOG, EDIFACT, and vCard.
  *   optionally: run FOUO formats like USMTF and other similar (uscg-ucop?) on IBM DFDL
  *   make Daffodil run most/more of its tests that are currently parked in scala-debug and associated with the traditionally IBM-oriented data formats. Some of these are IBM contributions to Daffodil that were made in order to provide some test interoperability.
  *   run a cross-validation test where Daffodil's TDML tests are run against IBM DFDL, so that we can discuss within the OGF DFDL workgroup the places where the two main implementations disagree on behavior.
     *   Positive tests should produce the same infoset (parsing) or data (unparsing).
     *   Negative tests will differ greatly in the diagnostic message content. The DFDL spec generally doesn't specify the content of diagnostics. It just says "... is an SDE" (schema definition error) or "... is a processing error", so that's all we can check for.
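
The comparison rule for the cross-validation test can be sketched roughly as follows. This is only an illustration in Java, not Daffodil or IBM DFDL code: the names CrossValidation, Kind, Outcome, and agree are hypothetical, and the plain string comparison stands in for whatever infoset or data comparison the TDML runner actually uses.

```java
import java.util.Objects;

// Hypothetical sketch: decides whether two implementations "agree" on one test.
final class CrossValidation {
    private CrossValidation() {}

    // The only error distinction the DFDL spec gives us to check.
    enum Kind { SUCCESS, SCHEMA_DEFINITION_ERROR, PROCESSING_ERROR }

    static final class Outcome {
        final Kind kind;
        // For SUCCESS: the infoset (parsing) or data (unparsing), as text.
        // For errors: the diagnostic message, which is implementation-specific.
        final String result;

        Outcome(Kind kind, String result) {
            this.kind = kind;
            this.result = result;
        }
    }

    static boolean agree(Outcome a, Outcome b) {
        if (a.kind != b.kind) {
            return false; // e.g. one succeeded and one failed, or SDE vs processing error
        }
        if (a.kind == Kind.SUCCESS) {
            // Positive test: results must match. A real runner would likely
            // compare infosets structurally rather than as raw strings.
            return Objects.equals(a.result, b.result);
        }
        // Negative test: same error class is all we can check; message text is ignored.
        return true;
    }
}
```

With this rule, two negative results with entirely different diagnostic text still count as agreement as long as both are SDEs or both are processing errors.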


As we do this, it all gets written up in an OGF "Experience Document" for the DFDL workgroup that documents the interoperability.


To achieve the above, we must implement missing features in Daffodil including at least:

  *   zoned, packed, ibm4690packed numbers
  *   prefix length
  *   unordered sequences
  *   ... perhaps a few other things


In addition, we must enhance the TDML language and runner so that tests can be run against Daffodil or IBM DFDL. Some tests will be specified as expected to run on both; others will be specified as expected to run only on Daffodil, or only on IBM DFDL, so as to test things known to be not (or not yet) implemented by one implementation or the other.
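
To make the idea concrete, here is one way the runner might model which implementations a test is expected to run on. This is a sketch only: how this gets expressed in TDML itself is a design question for a later thread, and the names TestTargeting, Implementation, TestCase, and shouldRunOn are made up for illustration.

```java
import java.util.EnumSet;

// Hypothetical sketch of per-test implementation targeting in a test runner.
final class TestTargeting {
    private TestTargeting() {}

    enum Implementation { DAFFODIL, IBM }

    static final class TestCase {
        final String name;
        // The implementations this test is expected to run on.
        private final EnumSet<Implementation> expectedOn;

        TestCase(String name, EnumSet<Implementation> expectedOn) {
            this.name = name;
            this.expectedOn = EnumSet.copyOf(expectedOn); // defensive copy
        }

        // The runner would skip the test for any implementation not listed.
        boolean shouldRunOn(Implementation impl) {
            return expectedOn.contains(impl);
        }
    }
}
```

A test of a feature that only Daffodil implements would be created with EnumSet.of(Implementation.DAFFODIL), and a portable test with EnumSet.allOf(Implementation.class).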


Important detail: IBM DFDL has to be loaded dynamically, and only if it is found on the classpath, so that it is not needed in order to build or run Daffodil.
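
One common JVM technique for this kind of optional dependency is to look the class up by name at run time instead of referencing it at compile time. A minimal sketch in Java, assuming nothing about IBM DFDL's actual class names (OptionalProcessorLoader is a hypothetical helper, and the caller supplies whatever class name is appropriate):

```java
import java.util.Optional;

// Hypothetical helper: locates an optional class without a compile-time dependency on it.
final class OptionalProcessorLoader {
    private OptionalProcessorLoader() {}

    /** Returns the class if it is on the classpath, or empty if it is absent. */
    static Optional<Class<?>> findOnClasspath(String className) {
        try {
            // initialize=false: only locate the class; do not run its static initializers yet.
            return Optional.of(Class.forName(
                className, false, OptionalProcessorLoader.class.getClassLoader()));
        } catch (ClassNotFoundException | LinkageError e) {
            // Absent (or unloadable) jar: treat the optional implementation as unavailable.
            return Optional.empty();
        }
    }

    static boolean isAvailable(String className) {
        return findOnClasspath(className).isPresent();
    }
}
```

Tests that target the optional implementation could then be skipped, rather than failed, when isAvailable returns false.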


Our JIRA (still on NCSA today) has a category "interoperability" which is used for the related tickets for this overall topic.


-----------------