Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/10/20 12:08:44 UTC

Why use DFDL? Why parse and unparse data formats?

Hi Folks,

It occurred to me that in my tutorial I have explained what DFDL is and how to use DFDL, but I never explained why DFDL should be used. Below is a slide that takes a stab at why. I am sure there are other reasons. Would you provide other reasons, please? I think answering why is a crucial thing. I consider this slide to be very important. /Roger

[cid:image003.png@01D5871D.A7A92860]

Re: Why use DFDL? Why parse and unparse data formats?

Posted by "Beckerle, Mike" <mb...@tresys.com>.
The first 3 points are about the "Cybersecurity Use Case" specifically. This is one small slice of computerdom. It is an important one, but it is a niche area that is beside the point for most people, who just want to get on with using the data for some actual end purpose.

Your last sentence on the slide wants to end with "data" not "tool".

Your phrase "Why parse and unparse data formats?" gives me some concern. I mean, do you have a choice?
So I'm assuming you have to use the data, and the issue is why use DFDL vs. other ways of handling the data:

The reasons to use DFDL are:

  *   it is an emerging open standard. In the long run standards give users power over vendors, reduce costs, increase skills leverage, etc.
  *   it is comprehensive - it can handle everything from military messaging formats to COBOL, binary and text and mixtures thereof. There are a few things it cannot describe as yet (TIFF for example), but it will evolve to cover those as well.
  *   it has superior unparsing capability to any existing data format description system - this is one area where the DFDL standard advances the state-of-the-art.
  *   there are multiple implementations including open-source and commercial.

Some additional related points:

  *   Why use DFDL to parse/unparse to/from an alternative standard textual form such as JSON or XML?

     *   Note that DFDL doesn't per se require this. It is one common way to use the Daffodil implementation.
     *   Note that not all DFDL implementations even support this. E.g., the ESA's DFDL4Space tool doesn't convert to/from JSON or XML. IBM DFDL can be used to convert data to XML, but when used in the most common ways, it takes data directly to/from the native internal data format used by that particular data-handling product/system. No intermediate step of XML/JSON is used.

So I'd say the above point is really about Daffodil, not DFDL generally. The above is about skills leverage and tools leverage. JSON support is built into all JavaScript-based platforms such as web browsers, Node.js, etc. XML has standard tools available from vendors. DFDL is another standard that adds capability for people who already have JSON/XML skills and/or tools. Use of a textual format as an intermediate form has significant QA benefits for most systems.

  *   Why learn and use the DFDL standard vs. some other approach?

     *   Such as any of the hundreds of data description tools/systems in the marketplace - one of which commonly comes with any given enterprise software package.
        *   Note that most of these tools are quite declarative, so the "be declarative" argument is orthogonal to the "why DFDL" argument.
        *   Note that many of these tools have much better user interfaces than Daffodil (today). In the short run this may be a good reason not to use DFDL.
           *    In the long run we expect the power of open-source and standards to address this.
     *   Such as just writing software code to handle/parse the data. The problem with this is it is typically procedural, not declarative.
        *   DFDL is also not Turing-complete. It is far easier to show correctness of a schema than of a program.
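
To make the declarative vs. procedural point concrete, here is a minimal sketch in Python. It is not DFDL and it is not how Daffodil works; the record layout, the field names, and the parse/unparse helpers are all made up for illustration. The idea it shows is that one description of the format, written as data, can drive both parsing and unparsing, and that a textual form such as JSON in the middle makes the result easy to inspect and test.

```python
import json
import struct

# Hypothetical record layout, described as data rather than as parsing code.
# Each entry is (field name, struct format code); big-endian is assumed.
RECORD_FORMAT = [
    ("msg_id", "H"),     # 2-byte unsigned integer
    ("priority", "B"),   # 1-byte unsigned integer
    ("callsign", "4s"),  # 4 bytes of ASCII text
]


def _struct_for(fmt):
    """Build one struct.Struct from the declarative field list."""
    return struct.Struct(">" + "".join(code for _, code in fmt))


def parse(fmt, data):
    """Turn raw bytes into a dict, driven only by the description."""
    values = _struct_for(fmt).unpack(data)
    record = {}
    for (name, code), value in zip(fmt, values):
        record[name] = value.decode("ascii") if code.endswith("s") else value
    return record


def unparse(fmt, record):
    """Turn a dict back into raw bytes, driven by the same description."""
    values = []
    for name, code in fmt:
        value = record[name]
        values.append(value.encode("ascii") if code.endswith("s") else value)
    return _struct_for(fmt).pack(*values)


raw = bytes([0x01, 0x02, 0x07]) + b"ABCD"
record = parse(RECORD_FORMAT, raw)

# A textual intermediate form is easy to read, diff, and check in tests.
as_json = json.dumps(record, sort_keys=True)

# Unparsing from the textual form should reproduce the original bytes.
round_tripped = unparse(RECORD_FORMAT, json.loads(as_json))
```

In this sketch the only thing that changes when the format changes is the RECORD_FORMAT table, which is the kind of thing that is far easier to review for correctness than hand-written parsing and serializing code.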

The first sub-point above (other data description tools/systems) is just the standards vs. non-standards argument. At this point, I am digressing into lots of points made in this slideshare deck in slides 3, 4, 5:
https://www.slideshare.net/mbeckerle/tresys-dfdl-data-format-description-language-daffodil-open-source-public-overview-100432615

Another point is about the benefits and power of standards. DFDL is simply better than existing ad-hoc data format description languages in that it is far more comprehensive than most commercial and other open-source systems, and it is an emerging open standard, with multiple implementations with a good deal of demonstrated interoperability: https://cwiki.apache.org/confluence/display/DAFFODIL/Daffodil+Compatibility+with+IBM+DFDL

DFDL is still quite new, and I would expect some users to choose other things until Daffodil gets out of Apache Incubator status, and the DFDL standard is fully ratified by Open Grid Forum, and is proposed as a standard by a larger/more-recognized body. The fact that IBM has DFDL in multiple products now is a strong statement of support going forward.






