Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/11/06 17:52:26 UTC

The Key Motivation for using DFDL is ...

Hi Folks,



Yesterday we said: DFDL is about describing the format of data that is in some external physical representation.



Last week we listed a bunch of use cases for DFDL.



What is the underlying motivation for all those use cases?



I assert that the underlying motivation is this: One wants to put the data into a form that one deems to be more useful. For example,

  *   one finds it more useful to have the data in the form of XML, or
  *   one finds it more useful to have the data in the form of Java objects, or
  *   one finds it more useful to have the data in the form of Apache NiFi record objects, or
  *   one finds it more useful to have the data in the form of the Apache Drill representation,
  *   and so forth.

Key Motivation for using DFDL

The key motivation for using DFDL is to convert data from a representation that one deems to be less useful to a representation that one deems to be more useful.



Do you agree that that is the key motivation for DFDL?



/Roger



Re: The Key Motivation for using DFDL is ...

Posted by "Sloane, Brandon" <bs...@tresys.com>.
> The key motivation for using DFDL is to convert data from a representation that one deems to be less useful to a representation that one deems to be more useful.

I believe this is accurate for the current use cases of DFDL, and it is certainly accurate for Daffodil. However, I believe that DFDL itself can be used for other purposes.

For instance, on the dev mailing list, there has been some discussion about introducing a DFDL backend to Apache Drill. As I understand it, Drill provides a query engine that supports a large variety of datastores. The hope would be that DFDL provides a simpler way of defining additional datastores. In effect, this would mean that DFDL allows users to query the data without converting it.

I can also imagine use cases for debugging tools. For instance, a key feature of Wireshark is packet dissectors, which take the raw bytes of a packet and display them as individual fields. What matters here is that you can jump between the "converted" form and the binary, which could not be done with a simple conversion, but could be done with DFDL (although, again, I am not aware of any DFDL processor that exposes this information).
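To make the "jump between the converted form and the binary" idea concrete, here is a minimal sketch in Python. It is not DFDL and uses no Daffodil API: the `FIELDS` table is a hypothetical stand-in for what a format description would provide, and `dissect` and `field_at` are names invented for this illustration. The point is only that a parser driven by a format description can record where each field's bytes live.

```python
import struct

# Hypothetical stand-in for a format description: (field name, struct code).
# A real tool would derive this from a DFDL schema.
FIELDS = [("version", "B"), ("length", "H"), ("flags", "B")]

def dissect(data):
    """Parse big-endian fields and return (name, value, start, end) tuples."""
    result = []
    offset = 0
    for name, code in FIELDS:
        size = struct.calcsize(">" + code)
        (value,) = struct.unpack_from(">" + code, data, offset)
        # Keep the byte range alongside the value so a viewer can map
        # from a field to its bytes and back again.
        result.append((name, value, offset, offset + size))
        offset += size
    return result

def field_at(dissection, byte_index):
    """Return the name of the field covering byte_index, or None."""
    for name, _value, start, end in dissection:
        if start <= byte_index < end:
            return name
    return None
```

With this, `dissect(packet)` gives the field view, and `field_at(dissection, i)` answers "which field does byte `i` belong to?", which is the navigation a plain conversion to XML would lose.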

Another use case that I have had spinning in the back of my mind is fuzzers. Given a DFDL description of a data format, you could write a program to generate both instances and non-instances of that data format. Because of the visibility DFDL gives into the data format, it would be possible to generate these automatically in a way that explores some of the corner cases of the format (as well as just typically fuzzing everything about it). Again, I am aware of no DFDL processor that currently does this.
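As a rough illustration of the fuzzer idea, the Python sketch below generates instances and non-instances from a format description. Everything in it is an assumption made for the example: `FORMAT` is a toy description (field name, struct code, allowed range), not a DFDL schema, and none of the functions correspond to an existing DFDL processor.

```python
import random
import struct

# Hypothetical format description: (field name, struct code, min, max).
# A real fuzzer would read constraints like these from a DFDL schema.
FORMAT = [
    ("version", "B", 1, 3),
    ("length", "H", 0, 1024),
    ("flags", "B", 0, 15),
]
CODES = ">" + "".join(code for _, code, _, _ in FORMAT)  # big-endian layout

def generate_instance(rng):
    """Return bytes in which every field lies inside its allowed range."""
    values = [rng.randint(lo, hi) for _, _, lo, hi in FORMAT]
    return struct.pack(CODES, *values)

def generate_non_instances(rng):
    """Yield (field name, bytes) with exactly one field just out of range."""
    for i, (name, _, _, hi) in enumerate(FORMAT):
        values = [rng.randint(lo, h) for _, _, lo, h in FORMAT]
        values[i] = hi + 1  # corner case: one past the allowed maximum
        yield name, struct.pack(CODES, *values)

def is_valid(data):
    """Check the range constraints; stands in for a validating parser."""
    values = struct.unpack(CODES, data)
    return all(lo <= v <= hi for v, (_, _, lo, hi) in zip(values, FORMAT))
```

Because the description states each field's range, the generator can aim at boundaries deliberately rather than flipping bits at random. A seeded `random.Random` keeps the generated cases reproducible.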