Posted to dev@daffodil.apache.org by Mike Beckerle <mb...@apache.org> on 2022/09/26 22:31:58 UTC

bits display of data for VSCode debugging

I wanted to share this bits dump so that people working on VSCode and the
data display aspects can see what kind of data display some users have
needed lately in order to debug data.

We've written little tools to create this for us in order to debug DFDL
schemas for mil-std-2045 and other mil-std data formats that use bitOrder
'leastSignificantBitFirst' and are not byte oriented.

These formats never waste a bit. Nothing is ever byte aligned. Text
strings are infrequent, but when they do occur, 7 bits per character,
packed together with no wasted bits, is typical.

******* Data Dump *********

3322 2222  2222 1111  1111 1100  0000 0000      addr bitPos
1098 7654  3210 9876  5432 1098  7654 3210      hex  dec(0b)
------------------------------------------------------------
0000 0000  0000 0000  0000 0001  0110 0010 :00000010    128
0101 0010  1110 0110  1100 1101  0100 1101 :00000014    160
0101 0011  0011 0001  1001 0000  0001 0001 :00000018    192
1000 0001  0001 1011  1100 1000  0000 0000 :0000001c    224
0000 0000  0000 0000  0010 1011  0000 0000 :00000020    256
0010 0001  0000 0100  1000 0110  1100 0100 :00000024    288
0100 0011  0111 1011  1100 0111  0010 0100 :00000028    320
0101 0111  1101 1100  0001 1101  0010 0101 :0000002c    352
0000 1101  0010 0100  1000 1111  1110 1100 :00000030    384
0111 0010  1100 1001  1100 0101  0001 0101 :00000034    416
0101 1000  1011 1010  1111 0011  0010 0010 :00000038    448
1111 1111  1101 0011  0001 1110  0101 1101 :0000003c    480
0001 1100  1111 0100  1110 1000  1010 1000 :00000040    512
1100 0010  1000 0011  1001 1110  1001 0100 :00000044    544
0100 1110  0111 1001  0110 1010  0010 0000 :00000048    576
                      0000 0011  1111 1111 :0000004c    608
------------------------------------------------------------



This is a right-to-left display of data where bytes are numbered right to
left, and bits are numbered right to left.

Text is most often 7-bit ASCII, but it is bit packed, so a character can
begin at any bit position. There are also 6-bit and 5-bit character
encodings.


The addresses are byte positions, 0-based. The bit positions are decimal,
also 0-based.
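
As a rough illustration of the layout described above, here is a minimal
Python sketch that formats bytes into the same right-to-left rows. It is
not the tool mentioned earlier in this message (that code was not shared);
the function name rtl_bit_dump and its parameters are made up for this
example, and the sample bytes were transcribed by hand from the last four
rows of the dump.

```python
def rtl_bit_dump(data: bytes, base_addr: int = 0, bytes_per_row: int = 4) -> list[str]:
    """Format data as right-to-left rows of bits (hypothetical helper).

    Each row ends with the 0-based byte address in hex and the 0-based
    bit position in decimal of the row's rightmost byte.
    """
    # One byte prints as "nnnn nnnn" (9 chars); bytes are joined by 2 spaces.
    row_width = bytes_per_row * 11 - 2
    lines = []
    for offset in range(0, len(data), bytes_per_row):
        chunk = data[offset:offset + bytes_per_row]
        # The highest-addressed byte goes on the left, so both byte numbers
        # and bit numbers grow from right to left.
        groups = [f"{b >> 4:04b} {b & 0xF:04b}" for b in reversed(chunk)]
        # A short final row is right-justified so its bits stay in column.
        bits = "  ".join(groups).rjust(row_width)
        addr = base_addr + offset
        lines.append(f"{bits} :{addr:08x}    {addr * 8}")
    return lines


# Bytes at addresses 0x40..0x4d, transcribed from the dump above.
TAIL = bytes.fromhex("a8e8f41c949e83c2206a794eff03")

for line in rtl_bit_dump(TAIL, base_addr=0x40):
    print(line)
```

Treating the bit position as simply the byte address times 8 matches the
addr and bitPos columns in the dump (for example 0x40 = 64, and
64 * 8 = 512).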


I added alternating color highlighting to the last characters of this
example data, which are "This is a Test␡" (ending with a DEL), just to
show that it is possible to pick out the 7-bit-wide characters. It also
illustrates why there is no text dump alongside this bits dump: given that
a character can start on any bit boundary, there's no sensible
interpretation of this data as text until you identify the start bit of
the first character.
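
Since color does not survive in plain text, the following Python sketch
lists where each 7-bit character sits instead. It assumes the buffer can
be read as one little-endian integer, which is one way to model bitOrder
'leastSignificantBitFirst'. The helper name locate_chars is hypothetical,
and the start position 513 is not stated in this message; it was found by
hand by matching the quoted text against the last rows of the dump.

```python
def locate_chars(data: bytes, base_addr: int, first_bit: int,
                 count: int, width: int = 7) -> list[tuple[int, str]]:
    """Return (absolute bit position, character) pairs (hypothetical helper).

    base_addr is the byte address of data[0]; first_bit is the absolute,
    0-based bit position of the first character's least significant bit.
    """
    base_bit = base_addr * 8
    if first_bit < base_bit or first_bit + count * width > base_bit + len(data) * 8:
        raise ValueError("requested bits fall outside the buffer")
    # Bit 0 is the LSB of byte 0, so the buffer acts like a little-endian integer.
    value = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    chars = []
    for i in range(count):
        pos = first_bit + i * width
        chars.append((pos, chr((value >> (pos - base_bit)) & mask)))
    return chars


# Bytes at addresses 0x40..0x4d, transcribed from the dump above.
TAIL = bytes.fromhex("a8e8f41c949e83c2206a794eff03")

spans = locate_chars(TAIL, base_addr=0x40, first_bit=513, count=15)
for pos, ch in spans:
    print(pos, repr(ch))

# Starting one bit earlier gives a different, meaningless string.
shifted = locate_chars(TAIL, base_addr=0x40, first_bit=512, count=15)
```

The second call shows the point made above: the same bits read from a
neighboring start position do not produce the text.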


Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com

Re: bits display of data for VSCode debugging

Posted by "Shearer, Davin" <sh...@ctc.com>.
I've got a reasonable attempt at a new wireframe for the data editor.  We can use this to begin the discussion on how the UI can help satisfy our use cases.  I have created a new issue (#311 at https://github.com/apache/daffodil-vscode/issues/311) and attached the wireframe diagram to the issue.

The use case Mike described has given me something complex to work against as I streamline and polish the data editor.

Hope it helps,
Davin

-----------------------------------------------------------------
This message and any files transmitted within are intended
solely for the addressee or its representative and may contain
company proprietary information.  If you are not the intended
recipient, notify the sender immediately and delete this
message.  Publication, reproduction, forwarding, or content
disclosure is prohibited without the consent of the original
sender and may be unlawful.

Concurrent Technologies Corporation and its Affiliates.
www.ctc.com  1-800-282-4392
-----------------------------------------------------------------

Re: bits display of data for VSCode debugging

Posted by "Shearer, Davin" <sh...@ctc.com>.
I'm trying to think about what a reasonable GUI for this would even look like.  What kinds of controls and displays make sense for helping to develop and debug a schema of this nature?

Brainstorming...

One idea is stretching every x bits to an aligned 8 bits so we can display them.  For example, given x=7, a starting address as a bit offset, the number of bits to process, and whether the LSB (for least-significant-bit-first bit order) is on the right (last) or left (first), we can stretch 7-bit ASCII so we can see the text.  I can envision a GUI viewport where we see the bits, maybe with a click and drag to select some range of bits (this is going to depend on LSB location), then apply some function like this "bit stretcher" and show the stretched results in another viewport.
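
To make the "bit stretcher" idea concrete, here is a minimal Python sketch under a few assumptions: the name stretch_bits and its parameters are invented for illustration, x is limited to 8 bits or fewer, and for the most-significant-bit-first case bit 0 is taken to be the MSB of the first byte.  It is a sketch of the idea, not a proposal for how the data editor should implement it.

```python
def stretch_bits(data: bytes, start_bit: int, n_bits: int,
                 width: int, lsb_first: bool = True) -> bytes:
    """Zero-extend each width-bit unit to one aligned byte (hypothetical helper)."""
    if not 1 <= width <= 8:
        raise ValueError("width must be between 1 and 8")
    total = len(data) * 8
    if start_bit < 0 or n_bits < 0 or start_bit + n_bits > total:
        raise ValueError("requested bits fall outside the buffer")
    # LSB first: bit 0 is the LSB of byte 0 (little-endian integer view).
    # MSB first: bit 0 is the MSB of byte 0 (big-endian integer view).
    value = int.from_bytes(data, "little" if lsb_first else "big")
    mask = (1 << width) - 1
    out = bytearray()
    # Only whole units are stretched; leftover bits at the end are ignored.
    for pos in range(start_bit, start_bit + n_bits - width + 1, width):
        shift = pos if lsb_first else total - pos - width
        out.append((value >> shift) & mask)
    return bytes(out)


# Bytes at addresses 0x40..0x4d of Mike's dump, transcribed by hand.
TAIL = bytes.fromhex("a8e8f41c949e83c2206a794eff03")

# 15 characters of 7 bits each, starting 1 bit into this buffer.
stretched = stretch_bits(TAIL, start_bit=1, n_bits=105, width=7)
print(stretched)

# The same idea for MSB-first data: "Hi" packed as two 7-bit units.
print(stretch_bits(bytes([0x91, 0xA4]), 0, 14, 7, lsb_first=False))
```

The stretched bytes can then go through any ordinary byte-oriented text or hex view, which is what the second viewport would show.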

I see you, Steve, and others have developed https://github.com/DFDLSchemas/mil-std-2045 so we have some examples of how DFDL is supposed to parse that kind of data.  What tools would you like to see in the debugger that would facilitate developing and debugging these schemas?  Wireframing this out will be extremely helpful.

-Davin

