Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/07/29 15:42:18 UTC

I created slides on alignment and fill pad bits

Hello DFDL community,

Special thanks to Mike for explaining this alignment/fill stuff. Thanks Mike!

To check my understanding, I created some slides. See below. I would appreciate any feedback, especially feedback on errors in my slides.  /Roger

[Attached slide images: image001.png through image019.png]


Re: I created slides on alignment and fill pad bits

Posted by "Beckerle, Mike" <mb...@tresys.com>.
Some thoughts.


1) If you are using leastSignificantBitFirst (LSBF) bitOrder, then you really should lay the bytes out from right to left. Then your example can sensibly include a field that spans a byte boundary.


As your example stands, no element spans a byte boundary, so this isn't critical, but I always encourage right-to-left byte order when displaying LSBF data, and left-to-right byte order when displaying MSBF data.
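
To make the display advice concrete, here is a minimal Python sketch. It is not Daffodil code; the function names, the two sample bytes, and the field offsets are all made up for illustration. It assumes the usual LSBF convention that bit positions count up from the least significant bit of the first byte. Printed right to left, the 6-bit field that spans the byte boundary shows up as one contiguous run of bits.

```python
def display_right_to_left(data: bytes) -> str:
    """Render bytes with the first byte on the right, each byte MSB on the left."""
    return " ".join(f"{b:08b}" for b in reversed(data))


def extract_lsbf(data: bytes, bit_offset: int, bit_length: int) -> int:
    """Read an unsigned field from LSBF data.

    bit_offset is 0-based, counted from the least significant bit
    of the first byte (an assumption of this sketch).
    """
    as_int = int.from_bytes(data, "little")
    return (as_int >> bit_offset) & ((1 << bit_length) - 1)


# Hypothetical two-byte message: a 3-bit field, then a 6-bit field
# that crosses the byte boundary.
data = bytes([0xA5, 0x03])

print(display_right_to_left(data))  # 00000011 10100101
print(extract_lsbf(data, 0, 3))     # 5  (rightmost three bits: 101)
print(extract_lsbf(data, 3, 6))     # 52 (the next six bits to the left: 110100)
```

In the right-to-left rendering, the six bits of the second field (110100) sit next to each other across the gap between the two bytes. In a left-to-right rendering of the same data they would be split into two separate pieces.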


2) I think of the two perspectives as "padding oriented" and "alignment oriented".


In your slide about perspective #2 (alignment oriented), you mention padding after the 2nd byte. I think this is incorrect: not that the format won't work, but it isn't consistent with an alignment-oriented perspective. For example, what tells you that this byte has unused bits? That is really the business of the next element after your "four-bits" element, which might be in an entirely different part of the format, or might be the start of the next message in the data stream, etc. If that next element has alignment="8", then the last 3 bits of byte 2 will be padding. If it has alignment="1", then it will begin at bit 6 of byte 2 and there will be no padding.


I think of your format (perspective #2) as ending at bit 5 of byte 2. That is, it is best to be silent about what comes next: whether the next element begins at bit 6, or causes bits to be filled and starts somewhere later.
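
The arithmetic behind the alignment-oriented view can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, positions are 0-based counts of bits already consumed, and alignment is assumed to be measured in bits. The number of fill bits depends only on where the previous element ended and on the alignment of whatever comes next.

```python
def fill_bits_before(position_bits: int, alignment_bits: int) -> int:
    """Bits that must be skipped so the next element starts on its alignment.

    position_bits: number of bits consumed so far (0-based start of next bit).
    alignment_bits: alignment of the NEXT element, in bits (assumed units).
    """
    return (-position_bits) % alignment_bits


def one_based(position_bits: int) -> tuple[int, int]:
    """Convert a 0-based bit position to a 1-based (byte, bit) pair."""
    byte_index, bit_index = divmod(position_bits, 8)
    return byte_index + 1, bit_index + 1


# The format ends at bit 5 of byte 2, so 8 + 5 = 13 bits are consumed.
end = 13

for alignment in (8, 1):
    fill = fill_bits_before(end, alignment)
    print(alignment, fill, one_based(end + fill))
# alignment 8: 3 fill bits, next element starts at byte 3, bit 1
# alignment 1: 0 fill bits, next element starts at byte 2, bit 6
```

Nothing in the calculation belongs to the element that ended at bit 5. The fill is entirely a property of the next element, which is why the format itself can stay silent about it.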


3) Security - A few thoughts here.


DFDL never allows creation (unparsing) of data with any unspecified bits. You can skip bits of input data and ignore them when parsing, but you cannot create data when unparsing that contains unspecified bits.


So a robust cybersecurity firewall is going to parse and then unparse the data, so that unspecified skipped bits cannot pass through the firewall.


But some people will want every bit to show up in the infoset. So no bits are technically being skipped. In such a schema there's an element encompassing every bit, even if the name of the element is "skipped" or "ignore".


But unless those bits are scrutinized, they're still just getting carried unchanged by the parse-unparse cycle. So if you decide to use elements that expose the "skipped" parts of the data in the infoset, you're allowing a bypass unless you inspect that data (e.g., to ensure it is always 0) or modify it on unparse (via dfdl:outputValueCalc="{ 0 }") to ensure it cannot be used as a covert channel.


Another way to look at it is this: if the bits aren't being used as part of the format syntax, or by the applications that consume the data, then a robust firewall *always* either modifies those bits (typically by setting to 0), or insists/validates that they contain specific values (typically 0).


Such scrutiny doesn't have to happen in the DFDL schema of course. Other validation steps can insist that the unused elements are all zeroed.


If you just skip such bits via alignment (as in your example), a DFDL parse-unparse cycle achieves this level of robustness every time.
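
The difference between the two approaches can be shown with a toy Python model. This is not the Daffodil API; the two-byte layout (a 5-bit field, 3 unused bits, then an 8-bit field, most significant bit first), the function names, and the zero fill value are all assumptions made for illustration. The `force_zero` flag plays the role that dfdl:outputValueCalc="{ 0 }" plays in a schema.

```python
FILL = 0b000  # assumed value written into skipped bits on unparse


def parse_skipping(data: bytes) -> dict:
    """Alignment-style parse: the 3 unused bits never reach the infoset."""
    return {"a": data[0] >> 3, "b": data[1]}


def unparse_skipping(infoset: dict) -> bytes:
    """Unparse must write something into the gap, so it writes the fill."""
    return bytes([(infoset["a"] << 3) | FILL, infoset["b"]])


def parse_exposing(data: bytes) -> dict:
    """Every bit is an element, including the unused ones."""
    return {"a": data[0] >> 3, "unused": data[0] & 0b111, "b": data[1]}


def unparse_exposing(infoset: dict, force_zero: bool = False) -> bytes:
    """Carries 'unused' through unchanged unless it is forced to zero."""
    unused = 0 if force_zero else infoset["unused"]
    return bytes([(infoset["a"] << 3) | unused, infoset["b"]])


# Hypothetical input with all three unused bits set (a covert payload).
tainted = bytes([0xAF, 0x42])

print(unparse_skipping(parse_skipping(tainted)).hex())        # a842
print(unparse_exposing(parse_exposing(tainted)).hex())        # af42
print(unparse_exposing(parse_exposing(tainted), True).hex())  # a842
```

In this model, the skipping round trip scrubs the unused bits with no extra work. The exposing round trip passes them through untouched until something checks or overwrites the "unused" element.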

