Posted to dev@daffodil.apache.org by Mike Beckerle <mb...@apache.org> on 2022/09/26 22:31:58 UTC
bits display of data for VSCode debugging
I wanted to share this bits dump so that people working on VSCode and the
data display aspects can see what kind of data display some users
have needed lately in order to debug data.
We've written little tools to create this for us in order to debug DFDL
schemas for mil-std-2045 and other mil-std data formats that use bitOrder
'leastSignificantBitFirst' and are not byte oriented.
These formats never waste a bit. Absolutely nothing is byte aligned, ever.
Text strings are infrequent, but when they do occur, 7 bits per character,
packed together with no wasted bits, is typical.
******* Data Dump *********
3322 2222 2222 1111 1111 1100 0000 0000 addr bitPos
1098 7654 3210 9876 5432 1098 7654 3210 hex dec(0b)
------------------------------------------------------------
0000 0000 0000 0000 0000 0001 0110 0010 :00000010 128
0101 0010 1110 0110 1100 1101 0100 1101 :00000014 160
0101 0011 0011 0001 1001 0000 0001 0001 :00000018 192
1000 0001 0001 1011 1100 1000 0000 0000 :0000001c 224
0000 0000 0000 0000 0010 1011 0000 0000 :00000020 256
0010 0001 0000 0100 1000 0110 1100 0100 :00000024 288
0100 0011 0111 1011 1100 0111 0010 0100 :00000028 320
0101 0111 1101 1100 0001 1101 0010 0101 :0000002c 352
0000 1101 0010 0100 1000 1111 1110 1100 :00000030 384
0111 0010 1100 1001 1100 0101 0001 0101 :00000034 416
0101 1000 1011 1010 1111 0011 0010 0010 :00000038 448
1111 1111 1101 0011 0001 1110 0101 1101 :0000003c 480
0001 1100 1111 0100 1110 1000 1010 1000 :00000040 512
1100 0010 1000 0011 1001 1110 1001 0100 :00000044 544
0100 1110 0111 1001 0110 1010 0010 0000 :00000048 576
                    0000 0011 1111 1111 :0000004c 608
------------------------------------------------------------
This is a right-to-left display of data where bytes are numbered right to
left, and bits are numbered right to left.
Text is most often 7-bit ASCII, but it is bit packed, and a character
can begin at any bit position. There are also 6-bit and 5-bit character
encodings.
The addresses are byte positions in hex, 0-based. The bit positions are
decimal, also 0-based. Each full row shows 32 bits (4 bytes).
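The tools that produced the dump are not shown in this message. As a rough sketch only, the row layout described above could be generated as follows. The function name `bits_dump`, the 4-bytes-per-row default, and the right-alignment of a short final row are assumptions made for illustration. The sample bytes were transcribed by hand from the last two rows of the dump.

```python
def bits_dump(data: bytes, start_addr: int = 0, bytes_per_row: int = 4) -> list[str]:
    """Format bytes as right-to-left rows of bits (hypothetical helper).

    Within a row, the lowest-addressed byte is on the right and, inside
    each byte, bit 0 (the least significant bit) is on the right.
    """
    row_width = bytes_per_row * 10 - 1  # 8 nibbles of 4 digits plus 7 spaces for 4 bytes
    lines = []
    for offset in range(0, len(data), bytes_per_row):
        chunk = data[offset:offset + bytes_per_row]
        nibbles = []
        for byte in reversed(chunk):  # highest address first, so it lands on the left
            digits = format(byte, "08b")
            nibbles += [digits[:4], digits[4:]]
        # Assumption: a short final row is right-aligned under the low bit columns.
        bits = " ".join(nibbles).rjust(row_width)
        addr = start_addr + offset
        lines.append(f"{bits} :{addr:08x} {addr * 8}")
    return lines


# Bytes at addresses 0x48..0x4d, transcribed from the dump.
sample = bytes.fromhex("206a794eff03")
for line in bits_dump(sample, start_addr=0x48):
    print(line)
```

The bit position in the last column is simply the byte address times 8, which is why 0x48 pairs with 576.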
I added alternating color highlighting to the last characters of this
example data, which are "This is a Test␡" (ending with a DEL). This just
shows that it is possible to pick out the 7-bit-wide characters. It also
illustrates why there is no text dump alongside this bits dump: because a
character can start on any bit boundary, there is no sensible
interpretation of this data as text until you identify the start bit of
the first character.
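To make the start-bit point concrete, here is a minimal sketch that decodes the last four rows of the dump as 7-bit characters, least significant bit first. The helper names are made up for illustration, and the start position of bit 513 is not stated in the message; it was worked out by hand from the dump rows.

```python
# Last four rows of the dump: (bit digits, 0-based position of the rightmost bit).
ROWS = [
    ("0001 1100 1111 0100 1110 1000 1010 1000", 512),
    ("1100 0010 1000 0011 1001 1110 1001 0100", 544),
    ("0100 1110 0111 1001 0110 1010 0010 0000", 576),
    ("0000 0011 1111 1111", 608),
]


def load_bits(rows):
    """Return a dict mapping absolute bit position to bit value (0 or 1)."""
    bits = {}
    for text, base in rows:
        digits = text.replace(" ", "")
        # The display runs right to left, so the rightmost digit is bit `base`.
        for offset, digit in enumerate(reversed(digits)):
            bits[base + offset] = int(digit)
    return bits


def decode_packed(bits, start_bit, count, width=7):
    """Decode `count` packed characters of `width` bits each, LSB first."""
    chars = []
    for n in range(count):
        first = start_bit + n * width
        code = sum(bits[first + i] << i for i in range(width))
        chars.append(chr(code))
    return "".join(chars)


bits = load_bits(ROWS)
text = decode_packed(bits, start_bit=513, count=15)
print(repr(text))  # 'This is a Test\x7f'

# Starting one bit earlier reads the same bits as different characters.
print(repr(decode_packed(bits, start_bit=512, count=15)))
```

Shifting the start by a single bit changes every character, which is why a fixed text column next to the bits would be misleading.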
Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com
Re: bits display of data for VSCode debugging
Posted by "Shearer, Davin" <sh...@ctc.com>.
I've got a reasonable attempt at a new wireframe for the data editor. We can use this to begin the discussion on how the UI can help satisfy our use cases. I have created a new issue (#311 at https://github.com/apache/daffodil-vscode/issues/311) and attached the wireframe diagram to the issue.
The use case as described by Mike has given me something complex to help streamline and polish the data editor.
Hope it helps,
Davin
-----------------------------------------------------------------
This message and any files transmitted within are intended
solely for the addressee or its representative and may contain
company proprietary information. If you are not the intended
recipient, notify the sender immediately and delete this
message. Publication, reproduction, forwarding, or content
disclosure is prohibited without the consent of the original
sender and may be unlawful.
Concurrent Technologies Corporation and its Affiliates.
www.ctc.com 1-800-282-4392
-----------------------------------------------------------------
Re: bits display of data for VSCode debugging
Posted by "Shearer, Davin" <sh...@ctc.com>.
I'm trying to think about what a reasonable GUI for this would even look like. What kinds of controls and displays make sense for helping to develop and debug a schema of this nature?
Brainstorming...
One idea is stretching every x bits out to an aligned 8 bits (so we can display them). For example, given x=7, a starting address as a bit offset, the number of bits to process, and whether the least significant bit is on the right (last) or on the left (first), we can stretch 7-bit ASCII so we can see the text. I can envision a GUI viewport where we see the bits, maybe with a click and drag to select some range of bits (this is going to depend on LSB location), then apply some function like this "bit stretcher" and have another viewport with the stretched results.
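As a rough sketch of what such a "bit stretcher" might look like, the function below takes the inputs listed above (field width, starting bit offset, number of bits, and bit order) and emits one byte per field. The name `stretch_bits` and its parameters are hypothetical and not part of Daffodil or the VS Code extension. The sample bytes are the contents of addresses 0x40 through 0x4d, transcribed by hand from the dump.

```python
def stretch_bits(data: bytes, start_bit: int, bit_count: int,
                 width: int = 7, lsb_first: bool = True) -> bytes:
    """Expand consecutive `width`-bit fields into one byte each (width <= 8)."""

    def bit_at(pos: int) -> int:
        byte = data[pos // 8]
        # LSB first: position 0 in a byte is its lowest bit. MSB first: its highest.
        shift = pos % 8 if lsb_first else 7 - pos % 8
        return (byte >> shift) & 1

    out = bytearray()
    # Only whole fields are emitted; leftover bits at the end are ignored.
    for first in range(start_bit, start_bit + bit_count - width + 1, width):
        value = 0
        for i in range(width):
            if lsb_first:
                value |= bit_at(first + i) << i
            else:
                value = (value << 1) | bit_at(first + i)
        out.append(value)
    return bytes(out)


# Bytes 0x40..0x4d of the dump; the text starts 1 bit into byte 0x40.
packed = bytes.fromhex("a8e8f41c949e83c2206a794eff03")
stretched = stretch_bits(packed, start_bit=1, bit_count=105)
print(stretched)  # b'This is a Test\x7f'
```

A second viewport could then render the returned bytes with an ordinary hex or text view, since every field now sits in its own byte.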
I see you, Steve, and others have developed https://github.com/DFDLSchemas/mil-std-2045 so we have some examples of how DFDL is supposed to parse that kind of data. What tools would you like to see in the debugger that would facilitate developing and debugging these schemas? Wireframing this out will be extremely helpful.
-Davin