Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/11/07 13:42:54 UTC

Seek your help in creating a Euclid-like reasoning for why one should use DFDL

Hi Folks,
Long ago I read Euclid. It is a fabulous book. It begins with simple facts and then incrementally builds upon them using basic logic to formulate increasingly sophisticated concepts.
I want someone who has never heard of DFDL to understand why DFDL is relevant to them. I want to follow Euclid's approach: start with simple facts about data and then incrementally build upon them using basic logic. Hopefully, by the end of a series of steps the person will say, "Ah, yes, I see how DFDL is relevant to me." Below is my stab at this. Does each step follow from the previous steps? Is each step easy to understand? Are there typos? Am I missing steps? Are there flaws in my logic? Am I missing alternative explanations and conclusions? Please help! /Roger
DFDL is relevant to you ... here's why
1. Everywhere one turns one sees data being created and consumed.
2. Not only do humans create and consume data, but software and hardware do as well.
3. The data is represented (structured, formatted) in various ways.
4. A data consumer may find one representation more usable (efficient, practical, suitable) than another. For example, the data to be consumed is in a textual Comma Separated Value (CSV) representation but the consumer is a Java application which finds Java objects to be a more usable representation. As another example, the data to be consumed is in a binary representation but the consumer is a person who finds XML documents to be a more usable representation.
5. Consumers receiving data in a less desirable representation may wish to have it converted to a more usable representation.
6. A DFDL processor is a tool that converts one representation to another.
7. Converting a representation requires first understanding the representation.
8. Understanding a representation requires describing the representation.
9. DFDL is a language for describing data representations.
10. When data to be consumed is in a less desirable representation, use DFDL to describe the data, provide the DFDL description to a DFDL processor for converting the representation to a more usable representation, and then consume the data in its usable representation.
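
To make steps 6 through 10 concrete, here is a minimal Python sketch of the "describe, then convert, then consume" idea. It is only an analogy: the DESCRIPTION dictionary and the parse() function are hypothetical stand-ins for a DFDL description and a DFDL processor, and none of this is real DFDL syntax or a real DFDL API.

```python
# Toy illustration only: this is NOT DFDL syntax and NOT a real DFDL processor.
# DESCRIPTION stands in for the format description (step 9);
# parse() stands in for the processor that uses it (step 6).

DESCRIPTION = {
    "separator": ",",
    "fields": [("name", str), ("age", int), ("city", str)],
}


def parse(text, description):
    """Convert delimited text into a list of dicts, guided only by the description."""
    records = []
    for line in text.strip().splitlines():
        values = line.split(description["separator"])
        record = {}
        # Pair each described field with the raw value in the same position.
        for (field_name, field_type), raw in zip(description["fields"], values):
            record[field_name] = field_type(raw.strip())
        records.append(record)
    return records


# Less usable representation: raw comma-separated text.
csv_text = "Alice,30,Boston\nBob,25,Denver\n"

# More usable representation: structured records the consumer can work with.
records = parse(csv_text, DESCRIPTION)
print(records)
```

Note that parse() contains no knowledge of this particular format. Changing the separator or the fields means changing only the description, which is the point of steps 7 through 9.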

Re: Seek your help in creating a Euclid-like reasoning for why one should use DFDL

Posted by "Beckerle, Mike" <mb...@tresys.com>.

The one issue I would suggest you think about with this Euclidean approach is distinguishing the "conversion" of data you talk about from general "transformation" of data.

If you start with data in representation A (in some file format) and you want it in B (in an RDBMS), you have two problems:

(1) You need to describe data A to make it available to step (2)
(2) You need to express how to transform it into B (which may also involve describing B, or that might be implied by the transformation)

The point is DFDL is only about problem (1) above. Not problem (2).
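
A rough Python sketch of the two problems, kept deliberately separate, may help show where the line falls. Everything here is a hypothetical illustration: FORMAT_A, describe_and_parse(), and transform_to_b() are made-up names, and the code is neither DFDL nor a real database API.

```python
# Hypothetical toy example; not DFDL and not a real database API.

# Problem (1): describe representation A so its content becomes available.
FORMAT_A = {"separator": "|", "fields": ["last", "first", "born"]}


def describe_and_parse(text, fmt):
    """Expose the data in A as field/value records, with nothing added or changed."""
    return [
        dict(zip(fmt["fields"], line.split(fmt["separator"])))
        for line in text.strip().splitlines()
    ]


# Problem (2): transform those records into the shape that B wants.
# This logic is specific to B and cannot be derived from the description of A.
def transform_to_b(record):
    """Build a (full_name, birth_year) row, e.g. for insertion into a table."""
    return (f"{record['first']} {record['last']}", int(record["born"]))


data_a = "Lovelace|Ada|1815\nTuring|Alan|1912\n"

parsed = describe_and_parse(data_a, FORMAT_A)   # problem (1)
rows_b = [transform_to_b(r) for r in parsed]    # problem (2)

print(parsed)
print(rows_b)
```

In this sketch the first step only reveals what is already in the data, field by field. The second step makes decisions (merging two fields, reordering, changing a type) that belong to the target, not to the source format.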

So when you describe DFDL as "converting" from one representation to another desired representation, this "convert" is much more limited than fully general "transformation" of data, yet the two words aren't that clearly distinct in meaning. They're synonyms to most people. (Convert, Transform, Adapt, Evolve, Massage, Extrude, Morph, Map, ...)

There's some essence to the conversion that DFDL is about which is less powerful than general transformation.

This is a nuance that is hard to capture.







