You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@plc4x.apache.org by Łukasz Dywicki <lu...@code-house.org> on 2020/04/10 15:57:54 UTC

Big and littleendian fields in one mspec

I am doing some tests of ADS serialization.

I've run into some troubles with payload which is generated with new
driver. I'm not sure if that's my fault or generated code.

I did a verification of what Wireshark shows and how ads structures are
parsed. There is a gap I think. For example ams port number 1000
(0x1027) is read as 4135.

Obviously I used wrong structures while implementing protocol logic in
first place, but now I am uncertain of how fields are encoded. How we
mark field as little endian when rest of payload is big endian? Do we
have `uint_le`?

As far I remember route creation logic I was tracking last week used
combination of LE and BE.

Best regards,
Łukasz

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Julian,

I am currently not aware of a protocol that mixes the endianness. I have heard that there are some modbus implementations that send the payload with different endianness for some implementations, but we shouldn't handle that on mspec level. Because then we would have super-complex mspec and/or different variants of the driver. Here I would prefer to have connection string arguments telling the drivers how to process the payload if this differs from the default.

And yes ... bringing the BE/LE into the code generation would make the code generation code more complex, the generated code slightly and bloat the mspec format (in my opinion)

And seeing that Etienne was able to implement the EIP mspec without much help (which was the first real BE protocol we have) seems to prove that it is not too complex to understand if you rid yourself of this idea that a "byte" is something different than an signed or unsigned 8 bit integer value.

I usually make a "byte" a "int 8" even if I think an "uint 8" would be more correct because of the convenience, that in Java a byte is an 8 bit signed integer which makes processing it simpler.

Chris



Am 14.04.20, 12:26 schrieb "Julian Feinauer" <j....@pragmaticminds.de>:

    Hi,
    
    as far as I see it there are technically two ways to achieve the change between BE / LE.
    One ist he one suggested by Lukasz, i.e. always specifying byte order for each read / write item in mspec.
    The other one ist o leave it out of mspec and only define it in the ByteReader / Writer (similar to how java or netty do it with the ByteBuffer).
    
    I cannot judge all the implications oft he first approach, the second is definetly more pragmatic (and keeps things orthogonal).
    The only situation where we would need this is a protocol which mixes byte orderings. Do we have such a protocol?
    Otherwise I am with Chris and would try to stay pragmatic.
    For me it looks like this would mean quite a bit of work to implement this (not speaking about additional complexity that could be introduced).
    
    Is my summary correct so far?
    
    Julian
    
    Am 14.04.20, 10:52 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    
        Hey Christian,
        The problem I face is quite simple. State type in mspec i declared as a
        bunch of bits. Type length is fixed, however not available anywhere in
        mspec or code generator to re-arrange bytes upfront. All we have exposed
        at reader/write level is read/write bit.
        To be fair LE/BE support in read/write buffers is limited just to
        numbers. There is no such support for raw bytes or bits, cause for that
        you need to declare a length of LE/BE sequence.
        
        I would love if I could just declare State as '2 byte little endian' so
        it would be read properly upfront and parsed with no changes in
        generated code, however I'm not sure how to do it and where. That's why
        I'm playing with different things described in earlier mail.
        Since all type handling is general I am just afraid of more complicated
        scenarios where we have variable length structures such as arrays.
        
        Best regards,
        Łukasz
        
        
        On 14.04.2020 09:41, Christofer Dutz wrote:
        > Hi Lukasz,
        > 
        > I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
        > For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE. 
        > mspec doesn't even know about endianness.
        > 
        > Up till now the endianness doesn't have an effect on bit-fields or single-bit ints. 
        > It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.
        > 
        > That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
        > 
        > So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
        > 
        > I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.
        > 
        > So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
        > 
        > Chris
        > 
        > 
        > 
        > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
        > 
        >     Hey Niclas,
        >     I realized how old the old things are when I started preparing
        >     automation training for mere mortals and got into history of frames and
        >     even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
        >     
        >     Getting back to the point - yes. I been thinking how to address the byte
        >     order in effective way. Here are two approaches I have for now:
        >     
        >     A) My initial attempt is just a temporary buffer which is then written
        >     in reverse order to caller. For reading it is similar - just getting N
        >     bytes in reversed order. The hard part is.. knowing N. I had to add a
        >     static calculation in order to allocate valid buffer sizes. I tend to
        >     work but I'm not happy with this approach cause it involves additional work.
        >     B) Second idea I've got is really simple and relies on code generation.
        >     We know in which order fields are coming. Here I'm referring to a State
        >     field which is just bunch of bits. If we would group fields in bytes and
        >     generate code in reverse order then it has chance to work. Requirement
        >     for that - ability to know basic field sizes upfront.
        >     C) Try to combine above with bit-io or Read/WriteBuffers as these are
        >     places which know actual position and state of buffers which are being
        >     read/written.
        >     
        >     Now, getting to two cases which are a problem. CommandId and State. So
        >     with command id situation is simple as it is declared as enum and it is
        >     read as uint. We know size upfront and can generate valid method call
        >     (readIntLE).
        >     [enum uint 16 little endian 'CommandId'
        >         ['0x00' INVALID]
        >         ['0x01' ADS_READ_DEVICE_INFO]
        >         ['0x02' ADS_READ]
        >         ['0x03' ADS_WRITE]
        >         ['0x04' ADS_READ_STATE]
        >         ['0x05' ADS_WRITE_CONTROL]
        >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
        >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
        >         ['0x08' ADS_DEVICE_NOTIFICATION]
        >         ['0x09' ADS_READ_WRITE]
        >     ]
        >     
        >     Second candidate is what I'm stuck right now sniping next cycles of
        >     problems. So in case of State we have complex type composed from 2
        >     bytes. A note here - instead of two bytes we might have a variable
        >     length type which includes array or other variable section.
        >     [type little endian 'State'
        >         [simple     bit 'broadcast'             ]
        >         [reserved   int 7 '0x0'                 ]
        >         [simple     bit 'initCommand'           ]
        >         [simple     bit 'updCommand'            ]
        >         [simple     bit 'timestampAdded'        ]
        >         [simple     bit 'highPriorityCommand'   ]
        >         [simple     bit 'systemCommand'         ]
        >         [simple     bit 'adsCommand'            ]
        >         [simple     bit 'noReturn'              ]
        >         [simple     bit 'response'              ]
        >     ]
        >     
        >     The order of reading big endian encoded data to impose little endian
        >     shift would be (please correct me if I'm wrong):
        >     1) init
        >     2) udp
        >     3) add timestamp
        >     4) priority
        >     5) system
        >     6) ads
        >     7) noreturn
        >     8) response (end of byte 1)
        >     9) broadcast
        >     10) reserved (end of byte )
        >     We can do same trick for writing, by re-arranging fields. By this way we
        >     avoid any additional byte level operations.
        >     
        >     Overall trouble with generated driver is to declare "how much" bytes
        >     should be read and interpreted. We have precise size information at the
        >     runtime - due to length fields, we can leverage it at generation time,
        >     but then we won't be able to cover all cases.
        >     
        >     I would love to keep it simple and do not break things thus I need your
        >     advice on how to approach this problem in a valid way.
        >     
        >     Cheers,
        >     Łukasz
        >     
        >     
        >     On 13.04.2020 03:26, Niclas Hedhman wrote:
        >     > <anecdotal-rant>
        >     > For us who were around and shaping the protocols in the 1980s, and people
        >     > before us (and before standards like RS-232), a lot of the "specifications"
        >     > came out of "observation of implementation we managed to get to work",
        >     > rather than "implement this spec". A lot was due to extreme memory
        >     > constraints (in my case, multi-tasking operating system, serial protocol
        >     > 187kbps, interpreted programming language with floating point ops and user
        >     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
        >     > like what other people were doing, sharing experiences and so on.
        >     > 
        >     > And there were many "innovative" ways to squeeze just a little bit extra
        >     > out of the hardware, resulting in "hard to understand" consequences. Bit
        >     > packing was a typical one, multiple functions packed into a single byte.
        >     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
        >     > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
        >     > parity and clever use of address and mask to create a slave-to-slave direct
        >     > protocol, where the master's role was to signal which slave "owned" the
        >     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
        >     > ROM) and something like 150 bytes RAM for comm protocol.
        >     > 
        >     > Could you implement a compatible device to this with PLC4X and modern
        >     > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
        >     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
        >     > cycles on something that was automatic on one of the slowest long-lived
        >     > microcontroller ever.
        >     > </anecdotal-rant>
        >     > 
        >     > My point was only to highlight that some of the strange things you see in
        >     > protocols today, have its roots in pre-standardization days. Today no one
        >     > would go down that route, because the hardware cost nothing now (8031  +
        >     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
        >     > longevity of software is more important.
        >     > 
        >     > Cheers
        >     > Niclas
        >     > 
        >     > 
        >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
        >     > wrote:
        >     > 
        >     >> Hi Lukasz,
        >     >>
        >     >> I think it really gets tricky when using BE and having some byte-odd-sizes
        >     >> ... I remember in the Firmata protocol there were some bitmasks and then 10
        >     >> bit uint as BE ... not it really got tricky as the specs were written from
        >     >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
        >     >> instead of describing how the bits actually travel over the wire.
        >     >>
        >     >> Chris
        >     >>
        >     >>
        >     >>
        >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
        >     >>
        >     >>     I've made some progress with topic by modyfing mspec and allowing
        >     >>     'little endian' flag on fields. This moved me further to next issue -
        >     >>     which is whole type encoded little endian.
        >     >>
        >     >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
        >     >>     various flags.
        >     >>     There are two cases which require different approach - reading and
        >     >>     writing. So for reading we need to swap N bytes based on type length.
        >     >>     For writing we need to alocate buffer for N bytes and swap them before
        >     >>     writing.
        >     >>
        >     >>     I am stuck now with freemaker templates and bit-io.
        >     >>
        >     >>     Cheers,
        >     >>     Łukasz
        >     >>
        >     >>
        >     >>
        >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
        >     >>     > I am doing some tests of ADS serialization.
        >     >>     >
        >     >>     > I've run into some troubles with payload which is generated with new
        >     >>     > driver. I'm not sure if that's my fault or generated code.
        >     >>     >
        >     >>     > I did a verification of what Wireshark shows and how ads structures
        >     >> are
        >     >>     > parsed. There is a gap I think. For example ams port number 1000
        >     >>     > (0x1027) is read as 4135.
        >     >>     >
        >     >>     > Obviously I used wrong structures while implementing protocol logic
        >     >> in
        >     >>     > first place, but now I am uncertain of how fields are encoded. How we
        >     >>     > mark field as little endian when rest of payload is big endian? Do we
        >     >>     > have `uint_le`?
        >     >>     >
        >     >>     > As far I remember route creation logic I was tracking last week used
        >     >>     > combination of LE and BE.
        >     >>     >
        >     >>     > Best regards,
        >     >>     > Łukasz
        >     >>     >
        >     >>
        >     >>
        >     >>
        >     > 
        >     
        >

Re: Big and littleendian fields in one mspec

Posted by Julian Feinauer <j....@pragmaticminds.de>.

Hi,

as far as I see it there are technically two ways to achieve the change between BE / LE.
One ist he one suggested by Lukasz, i.e. always specifying byte order for each read / write item in mspec.
The other one ist o leave it out of mspec and only define it in the ByteReader / Writer (similar to how java or netty do it with the ByteBuffer).

I cannot judge all the implications oft he first approach, the second is definetly more pragmatic (and keeps things orthogonal).
The only situation where we would need this is a protocol which mixes byte orderings. Do we have such a protocol?
Otherwise I am with Chris and would try to stay pragmatic.
For me it looks like this would mean quite a bit of work to implement this (not speaking about additional complexity that could be introduced).

Is my summary correct so far?

Julian

Am 14.04.20, 10:52 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    Hey Christian,
    The problem I face is quite simple. State type in mspec i declared as a
    bunch of bits. Type length is fixed, however not available anywhere in
    mspec or code generator to re-arrange bytes upfront. All we have exposed
    at reader/write level is read/write bit.
    To be fair LE/BE support in read/write buffers is limited just to
    numbers. There is no such support for raw bytes or bits, cause for that
    you need to declare a length of LE/BE sequence.
    
    I would love if I could just declare State as '2 byte little endian' so
    it would be read properly upfront and parsed with no changes in
    generated code, however I'm not sure how to do it and where. That's why
    I'm playing with different things described in earlier mail.
    Since all type handling is general I am just afraid of more complicated
    scenarios where we have variable length structures such as arrays.
    
    Best regards,
    Łukasz
    
    
    On 14.04.2020 09:41, Christofer Dutz wrote:
    > Hi Lukasz,
    > 
    > I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
    > For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE. 
    > mspec doesn't even know about endianness.
    > 
    > Up till now the endianness doesn't have an effect on bit-fields or single-bit ints. 
    > It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.
    > 
    > That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
    > 
    > So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
    > 
    > I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.
    > 
    > So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
    > 
    > Chris
    > 
    > 
    > 
    > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    > 
    >     Hey Niclas,
    >     I realized how old the old things are when I started preparing
    >     automation training for mere mortals and got into history of frames and
    >     even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
    >     
    >     Getting back to the point - yes. I been thinking how to address the byte
    >     order in effective way. Here are two approaches I have for now:
    >     
    >     A) My initial attempt is just a temporary buffer which is then written
    >     in reverse order to caller. For reading it is similar - just getting N
    >     bytes in reversed order. The hard part is.. knowing N. I had to add a
    >     static calculation in order to allocate valid buffer sizes. I tend to
    >     work but I'm not happy with this approach cause it involves additional work.
    >     B) Second idea I've got is really simple and relies on code generation.
    >     We know in which order fields are coming. Here I'm referring to a State
    >     field which is just bunch of bits. If we would group fields in bytes and
    >     generate code in reverse order then it has chance to work. Requirement
    >     for that - ability to know basic field sizes upfront.
    >     C) Try to combine above with bit-io or Read/WriteBuffers as these are
    >     places which know actual position and state of buffers which are being
    >     read/written.
    >     
    >     Now, getting to two cases which are a problem. CommandId and State. So
    >     with command id situation is simple as it is declared as enum and it is
    >     read as uint. We know size upfront and can generate valid method call
    >     (readIntLE).
    >     [enum uint 16 little endian 'CommandId'
    >         ['0x00' INVALID]
    >         ['0x01' ADS_READ_DEVICE_INFO]
    >         ['0x02' ADS_READ]
    >         ['0x03' ADS_WRITE]
    >         ['0x04' ADS_READ_STATE]
    >         ['0x05' ADS_WRITE_CONTROL]
    >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >         ['0x09' ADS_READ_WRITE]
    >     ]
    >     
    >     Second candidate is what I'm stuck right now sniping next cycles of
    >     problems. So in case of State we have complex type composed from 2
    >     bytes. A note here - instead of two bytes we might have a variable
    >     length type which includes array or other variable section.
    >     [type little endian 'State'
    >         [simple     bit 'broadcast'             ]
    >         [reserved   int 7 '0x0'                 ]
    >         [simple     bit 'initCommand'           ]
    >         [simple     bit 'updCommand'            ]
    >         [simple     bit 'timestampAdded'        ]
    >         [simple     bit 'highPriorityCommand'   ]
    >         [simple     bit 'systemCommand'         ]
    >         [simple     bit 'adsCommand'            ]
    >         [simple     bit 'noReturn'              ]
    >         [simple     bit 'response'              ]
    >     ]
    >     
    >     The order of reading big endian encoded data to impose little endian
    >     shift would be (please correct me if I'm wrong):
    >     1) init
    >     2) udp
    >     3) add timestamp
    >     4) priority
    >     5) system
    >     6) ads
    >     7) noreturn
    >     8) response (end of byte 1)
    >     9) broadcast
    >     10) reserved (end of byte )
    >     We can do same trick for writing, by re-arranging fields. By this way we
    >     avoid any additional byte level operations.
    >     
    >     Overall trouble with generated driver is to declare "how much" bytes
    >     should be read and interpreted. We have precise size information at the
    >     runtime - due to length fields, we can leverage it at generation time,
    >     but then we won't be able to cover all cases.
    >     
    >     I would love to keep it simple and do not break things thus I need your
    >     advice on how to approach this problem in a valid way.
    >     
    >     Cheers,
    >     Łukasz
    >     
    >     
    >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >     > <anecdotal-rant>
    >     > For us who were around and shaping the protocols in the 1980s, and people
    >     > before us (and before standards like RS-232), a lot of the "specifications"
    >     > came out of "observation of implementation we managed to get to work",
    >     > rather than "implement this spec". A lot was due to extreme memory
    >     > constraints (in my case, multi-tasking operating system, serial protocol
    >     > 187kbps, interpreted programming language with floating point ops and user
    >     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
    >     > like what other people were doing, sharing experiences and so on.
    >     > 
    >     > And there were many "innovative" ways to squeeze just a little bit extra
    >     > out of the hardware, resulting in "hard to understand" consequences. Bit
    >     > packing was a typical one, multiple functions packed into a single byte.
    >     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >     > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
    >     > parity and clever use of address and mask to create a slave-to-slave direct
    >     > protocol, where the master's role was to signal which slave "owned" the
    >     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
    >     > ROM) and something like 150 bytes RAM for comm protocol.
    >     > 
    >     > Could you implement a compatible device to this with PLC4X and modern
    >     > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
    >     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
    >     > cycles on something that was automatic on one of the slowest long-lived
    >     > microcontroller ever.
    >     > </anecdotal-rant>
    >     > 
    >     > My point was only to highlight that some of the strange things you see in
    >     > protocols today, have its roots in pre-standardization days. Today no one
    >     > would go down that route, because the hardware cost nothing now (8031  +
    >     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
    >     > longevity of software is more important.
    >     > 
    >     > Cheers
    >     > Niclas
    >     > 
    >     > 
    >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
    >     > wrote:
    >     > 
    >     >> Hi Lukasz,
    >     >>
    >     >> I think it really gets tricky when using BE and having some byte-odd-sizes
    >     >> ... I remember in the Firmata protocol there were some bitmasks and then 10
    >     >> bit uint as BE ... not it really got tricky as the specs were written from
    >     >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
    >     >> instead of describing how the bits actually travel over the wire.
    >     >>
    >     >> Chris
    >     >>
    >     >>
    >     >>
    >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    >     >>
    >     >>     I've made some progress with topic by modyfing mspec and allowing
    >     >>     'little endian' flag on fields. This moved me further to next issue -
    >     >>     which is whole type encoded little endian.
    >     >>
    >     >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
    >     >>     various flags.
    >     >>     There are two cases which require different approach - reading and
    >     >>     writing. So for reading we need to swap N bytes based on type length.
    >     >>     For writing we need to alocate buffer for N bytes and swap them before
    >     >>     writing.
    >     >>
    >     >>     I am stuck now with freemaker templates and bit-io.
    >     >>
    >     >>     Cheers,
    >     >>     Łukasz
    >     >>
    >     >>
    >     >>
    >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >     >>     > I am doing some tests of ADS serialization.
    >     >>     >
    >     >>     > I've run into some troubles with payload which is generated with new
    >     >>     > driver. I'm not sure if that's my fault or generated code.
    >     >>     >
    >     >>     > I did a verification of what Wireshark shows and how ads structures
    >     >> are
    >     >>     > parsed. There is a gap I think. For example ams port number 1000
    >     >>     > (0x1027) is read as 4135.
    >     >>     >
    >     >>     > Obviously I used wrong structures while implementing protocol logic
    >     >> in
    >     >>     > first place, but now I am uncertain of how fields are encoded. How we
    >     >>     > mark field as little endian when rest of payload is big endian? Do we
    >     >>     > have `uint_le`?
    >     >>     >
    >     >>     > As far I remember route creation logic I was tracking last week used
    >     >>     > combination of LE and BE.
    >     >>     >
    >     >>     > Best regards,
    >     >>     > Łukasz
    >     >>     >
    >     >>
    >     >>
    >     >>
    >     > 
    >     
    >

Re: Big and littleendian fields in one mspec

Posted by Julian Feinauer <j....@pragmaticminds.de>.

That is really good news, thanks to both of you Lukasz and Chris.

Not only that we got a technical solution but that we manage it time after time to come to joint solutions in the project (which is way more important than finding technical solutions IMHO).
Hope to be able to contribute more, soon : )

Julian

Am 06.05.20, 01:33 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    Last Saturday I promised to Chris I will confirm the case and I just did.

    As suggested - I rearranged field order in State type and made a test to
    proof myself, that it works:
    https://github.com/apache/plc4x/commit/3010fdd15524410dfd55746562ed44bcee0fe80f

    I've closed related issue (PLC4X-193) and created pull requests. Changes
    in mspec are not needed for that case.

    It took quite long, but we managed to find solution.

    Cheers,
    Łukasz


    On 01.05.2020 10:53, Christofer Dutz wrote:
    > Hi folks,
    > 
    > so Lukasz just sent me a little pcap file with some AMS packets in there and my assumption was correct:
    > 
    > for example in the byte-stream I can see the "flags" being 0x0400 ... However wireshark read this as unsigned short LE and therefore re-aranged the bytes:
    >     StateFlags: 0x0004
    >         .... .... .... ...0 = RESPONSE: Not set
    >         .... .... .... ..0. = NO RETURN: Not set
    >         .... .... .... .1.. = ADS COMMAND: Set
    >         .... .... .... 0... = SYSTEM COMMAND: Not set
    >         .... .... ...0 .... = HIGH PRIORITY COMMAND: Not set
    >         .... .... ..0. .... = TIMESTAMP ADDED: Not set
    >         .... .... .0.. .... = UDP COMMAND: Not set
    >         .... .... 0... .... = INIT COMMAND: Not set
    >         0... .... .... .... = BROADCAST: Not set
    > 
    > So in above example only one field is true: "ADS COMMAND" ... so if I decode 0x0400 to binary that’s: 00000100 00000000
    > 
    > So this exactly fits my proposed mspec definition of: 
    > 
    >     [type 'State'
    >             [simple     bit 'initCommand'           ]
    >             [simple     bit 'updCommand'            ]
    >             [simple     bit 'timestampAdded'        ]
    >             [simple     bit 'highPriorityCommand'   ]
    >             [simple     bit 'systemCommand'         ]
    >             [simple     bit 'adsCommand'            ]
    >             [simple     bit 'noReturn'              ]
    >             [simple     bit 'response'              ]
    >             [simple     bit 'broadcast'             ]
    >             [reserved   int 7 '0x0'                 ]
    >         ]
    > 
    > So I strongly object introducing any special endianness "feature" for this (in this case) ... 
    > we might encounter something where we need it, but for this we don't.
    > 
    > Chris
    > 
    > 
    > 
    > Am 30.04.20, 23:02 schrieb "Christofer Dutz" <ch...@c-ware.de>:
    > 
    >     Hi Lukasz,
    > 
    >     I wasn't suggesting to look how WireShark decodes things. Cause I would assume that they read a short in little endian and then (on the rearranged bytes) apply some bit checks. 
    > 
    >     I was more suggesting that you look into the bytes Wireshark shows you and to check their exact position.
    > 
    >     When implementing drivers, I used the Mac's Calculator tool quite often to see the bits for a given byte value.
    > 
    >     In your example I can't see any need for endianness at all ... it's all just bits and one empty block which forms a full byte together with the first bit.
    >     As this block doesn't cross any byte boundaries I can't see any problems with this.
    > 
    >     Assuming I understand what you are implying with 
    > 
    >     [type little endian 'State'
    >             [simple     bit 'broadcast'             ]
    >             [reserved   int 7 '0x0'                 ]
    >             [simple     bit 'initCommand'           ]
    >             [simple     bit 'updCommand'            ]
    >             [simple     bit 'timestampAdded'        ]
    >             [simple     bit 'highPriorityCommand'   ]
    >             [simple     bit 'systemCommand'         ]
    >             [simple     bit 'adsCommand'            ]
    >             [simple     bit 'noReturn'              ]
    >             [simple     bit 'response'              ]
    >         ]
    > 
    >     Then it should be equal to:
    > 
    >     [type 'State'
    >             [simple     bit 'initCommand'           ]
    >             [simple     bit 'updCommand'            ]
    >             [simple     bit 'timestampAdded'        ]
    >             [simple     bit 'highPriorityCommand'   ]
    >             [simple     bit 'systemCommand'         ]
    >             [simple     bit 'adsCommand'            ]
    >             [simple     bit 'noReturn'              ]
    >             [simple     bit 'response'              ]
    >             [simple     bit 'broadcast'             ]
    >             [reserved   int 7 '0x0'                 ]
    >         ]
    > 
    >     Which also seems to be the exact order how the bits go over the wire ... I mean ... it doesn't make any sense to put the 7 empty bits in the middle.
    > 
    >     Chris
    > 
    > 
    >     Am 30.04.20, 22:42 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    > 
    >         Hey Chris and others,
    >         I let this topic to dust and freeze a bit, however I would like to lift
    >         it up again. Since the 0.7 release is now in discussion and ADS is not
    >         ported yet I believe it is reasonable to validate this problem.
    > 
    >         As you suggested I've checked wireshark sources to see how they parse
    >         state fragment of AMS header. While I can not read C properly and was
    >         unable to locate actual reading logic, I do see they care about endianness:
    >         https://github.com/wireshark/wireshark/blob/5cf3fd03f1538b37704d83b6c38b8223f9036108/plugins/epan/ethercat/packet-ams.c#L429
    > 
    >         Their frame traversal goes in following order:
    >         - response
    >         - no return
    >         - ads cmd
    >         - system cmd
    >         - high priority
    >         - timestamp add
    >         - udp
    >         - init cmd
    >         - broadcast
    > 
    >         When I've tried to model mspec with respect to little endian order, I
    >         end up with following field order:
    >         - init cmd
    >         - udp
    >         - timestamp add
    >         - high priority
    >         - system cmd
    >         - ads cmd
    >         - no return
    >         - response
    >         - [reserved]
    >         - broadcast
    > 
    >         I haven't seen ADS protocol book, my main source of information is
    >         infosys which is rather short in this area. I doubt if they would
    >         describe fields in above order.
    > 
    >         My initial attempts were similar to what you said - to read the value
    >         and then shift bytes. With your feedback I started to look for better
    >         ways to solve it. With my last attempt I solved the problem by
    >         re-adjusting fields during code generation, so they appear in a valid
    >         byte order. Beside additional flag at the mspec there is a need for this
    >         helper:
    >         build-utils/protocol-base-mspec/src/main/java/org/apache/plc4x/plugins/codegenerator/language/mspec/model/definitions/DefaultComplexTypeDefinition.java
    > 
    >         In above commit the State declaration in the mspec looks like this:
    >         [type little endian 'State'
    >             [simple     bit 'broadcast'             ]
    >             [reserved   int 7 '0x0'                 ]
    >             [simple     bit 'initCommand'           ]
    >             [simple     bit 'updCommand'            ]
    >             [simple     bit 'timestampAdded'        ]
    >             [simple     bit 'highPriorityCommand'   ]
    >             [simple     bit 'systemCommand'         ]
    >             [simple     bit 'adsCommand'            ]
    >             [simple     bit 'noReturn'              ]
    >             [simple     bit 'response'              ]
    >         ]
    > 
    >         Maybe it is not in line with wireshark, but it is consistent and:
    >         1) requires minor change in code generator
    >         2) does not involve additional processing of data
    >         3) mspec is still easy to write
    >         4) type declaration is in line with wireshark output when reading from
    >         the left and not from the top. ;-)
    >         5) it can be adjusted to be 100% consistent with wireshark by changing
    >         helper logic. ;-)
    > 
    >         Best regards,
    >         Łukasz
    > 
    > 
    >         On 14.04.2020 21:25, Christofer Dutz wrote:
    >         > Hi Lukasz,
    >         > 
    >         > I see it differently.
    >         > 
    >         > If you say something is LE/BE then it should be consistent. 
    >         > 
    >         > I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:
    >         > 
    >         > "Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."
    >         > 
    >         > In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
    >         > Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 
    >         > 
    >         > I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
    >         > I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
    >         > In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 
    >         > 
    >         > So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.
    >         > 
    >         > So I would like to keep it simple, even if the developer will have to do a little more brain-work. 
    >         > 
    >         > I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.
    >         > 
    >         > What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)
    >         > 
    >         > Chris
    >         > 
    >         > 
    >         > Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    >         > 
    >         >     Hey Chris,
    >         >     Sorry for wrongly addressed mail, it was intended to go to mailing list.
    >         >     
    >         >     To the point - I think that what you propose with reordering fields
    >         >     manually shift the load from the tool to the developer. It will lead to
    >         >     further implementation errors caused by a need to reorder fields while
    >         >     creating spec. It will be inconsistent with rest of places where things
    >         >     are in "natural order" (big endian) except one or two types which
    >         >     collects bit flags which will have to be declared in different order.
    >         >     What I did in my naive implementation with "little endian" flag on the
    >         >     State type is re-arrangement of fields during generation time.
    >         >     It does not require any byte shifting at the runtime and keeps natural
    >         >     order of fields without the need for manual reordering of flags in mspec
    >         >     declaration.
    >         >     
    >         >     I understand that keeping mspec simple is an important priority, however
    >         >     doing it at the cost of developers who will use it to implement codecs
    >         >     will not help us in long term.
    >         >     I'm quite sure that sooner or later we will get a  protocol with mix of
    >         >     BE/LE fields.
    >         >     
    >         >     Cheers,
    >         >     Łukasz
    >         >     
    >         >     
    >         >     On 14.04.2020 14:57, Christofer Dutz wrote:
    >         >     > Hi Lukasz (adding dev@ to the recipients again)
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > you are not quite correct. Enum types do declare a base type where they
    >         >     > are declared … not where they are used.
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > If you have bit-fields then why don’t you define them as sequence of
    >         >     > bits? I really can’t understand why you need to flip anything.
    >         >     > 
    >         >     > I think you’re trying to do something similar to Sebastian when he
    >         >     > wanted to introduce some bit-mask types.
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > I could imagine that in the spec it might be written … these two bytes
    >         >     > are a bit-mask and the bit 1 means X bit 10 means Y, …
    >         >     > 
    >         >     > So why not define the bit fields in the sequence they are sent over the
    >         >     > wire?
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
    >         >     >  |1  |0
    >         >     > 
    >         >     > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Then the order they are sent is:
    >         >     > 
    >         >     > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
    >         >     > 
    >         >     > So you have to declare the bit fields in the order:
    >         >     > 
    >         >     >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
    >         >     >  |E  | F  |G|H
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > So I don’t quite understand what you’re trying to flip here.
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Chris
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > *Von: *Łukasz Dywicki <lu...@code-house.org>
    >         >     > *Datum: *Dienstag, 14. April 2020 um 13:26
    >         >     > *An: *Christofer Dutz <ch...@c-ware.de>
    >         >     > *Betreff: *Re: Big and littleendian fields in one mspec
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > I've added the endianness switch to the types in my experiments. These
    >         >     > are not needed for enums as these are read as int and then mapped to
    >         >     > constants. I referred to these as an example - enum does have a length
    >         >     > indication and type behind it.
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Again, I will bring the issue - StateIO currently fails to do anything
    >         >     > useful due to BE/LE switch. Because this type uses single bits which
    >         >     > need to be flipped before reading or after writing otherwise they end up
    >         >     > in improper order. Out of 16 bits 9 are in use so we have a lot of
    >         >     > possible combinations (2^9), which seems too much to create an enum.
    >         >     > 
    >         >     > Since you didn't like my initial proposal and modifications (as they
    >         >     > might be redundant to what is available in the buffer API), how would
    >         >     > you handle State serialization without affecting mspec and without
    >         >     > further complication to code generation?
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > Best,
    >         >     > 
    >         >     > Łukasz
    >         >     > 
    >         >     >  
    >         >     > 
    >         >     > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
    >         >     > <ma...@c-ware.de>> napisał(a):
    >         >     > 
    >         >     >     Hi Lukasz,
    >         >     > 
    >         >     >     but we don't have a:
    >         >     >     [enum uint 16 little endian 'CommandId']
    >         >     >     But only a:
    >         >     >     [enum uint 16 'CommandId']
    >         >     > 
    >         >     >     And in your case I think perhaps the constants are not correct. So
    >         >     >     having a 16 bit uint will result in a 4 digit hex string:
    >         >     >     So if you are having problems in mapping them to your constants,
    >         >     >     perhaps the constants are in the wrong endianness.
    >         >     > 
    >         >     >     I have encountered (but can't recall where) that the Enum constants
    >         >     >     in a BE protocol were written down in LE notation.
    >         >     >     Of course the thing can't work then.
    >         >     > 
    >         >     >     So if for example you have the constant "1" in a BE protocol with a
    >         >     >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
    >         >     > 
    >         >     >     Chris
    >         >     > 
    >         >     > 
    >         >     > 
    >         >     > 
    >         >     >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
    >         >     >     <ma...@code-house.org>>:
    >         >     > 
    >         >     >         The legendary "two bytes" (shall we create a band with this
    >         >     >     name?!) are
    >         >     >         coming from unfortunate State type. I don't mind these to be an
    >         >     >         int/sint/uint or whathever - this is matter of interpretation.
    >         >     > 
    >         >     >         If you could please check again my earlier messages you will find
    >         >     >         struggle I have which is - how much given type takes.
    >         >     > 
    >         >     >         We have that for enums
    >         >     >         [enum uint 16 little endian 'CommandId']
    >         >     > 
    >         >     >         but we don't have that for complex types
    >         >     >         [type 'State']
    >         >     > 
    >         >     >         This leads to situation that we don't know how to read and interpret
    >         >     >         incoming byte sequence or how to properly write it back to stream.
    >         >     > 
    >         >     >         It would be great if we could limit the issue just to
    >         >     >         // read
    >         >     >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
    >         >     >     true->LE
    >         >     >         // write
    >         >     >         wio = new WriteBuffer(2);
    >         >     >         StateIO.staticSerialize(wio, new State(...));
    >         >     >         io.writeUnsignedInt(8, wio.read, true); // true->LE
    >         >     > 
    >         >     >         However to do that first we need to know that State is a uint 16
    >         >     >     LE and
    >         >     >         then do a lot of mambo-jambo in code generators to cope with
    >         >     >     that. :-)
    >         >     > 
    >         >     >         Best,
    >         >     >         Łukasz
    >         >     > 
    >         >     > 
    >         >     >         On 14.04.2020 12:20, Christofer Dutz wrote:
    >         >     >         > Oh gee ... I totally remember having exactly the same
    >         >     >     discussion with Sebastian about "bytes" ...
    >         >     >         >
    >         >     >         > The problem is there is generally no "byte" "int" and "uint"
    >         >     >     because a byte = 8 bits is either signed or unsigned. That should
    >         >     >     the 3rd option be? Perhaps-signed?
    >         >     >         > So generally I use "uint 8" for what you refer to as a "byte 8".
    >         >     >         >
    >         >     >         > Wo what would be the difference between your "2 byte little
    >         >     >     endian" and an "uint 16" with a LE ReadBuffer?
    >         >     >         >
    >         >     >         > I think what you have to rid yourself of thinking of int as
    >         >     >     number and a byte not being a number.
    >         >     >         >
    >         >     >         > Chris
    >         >     >         >
    >         >     >         >
    >         >     >         >
    >         >     >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
    >         >     >     <luke@code-house.org <ma...@code-house.org>>:
    >         >     >         >
    >         >     >         >     Hey Christian,
    >         >     >         >     The problem I face is quite simple. State type in mspec i
    >         >     >     declared as a
    >         >     >         >     bunch of bits. Type length is fixed, however not available
    >         >     >     anywhere in
    >         >     >         >     mspec or code generator to re-arrange bytes upfront. All
    >         >     >     we have exposed
    >         >     >         >     at reader/write level is read/write bit.
    >         >     >         >     To be fair LE/BE support in read/write buffers is limited
    >         >     >     just to
    >         >     >         >     numbers. There is no such support for raw bytes or bits,
    >         >     >     cause for that
    >         >     >         >     you need to declare a length of LE/BE sequence.
    >         >     >         >     
    >         >     >         >     I would love if I could just declare State as '2 byte
    >         >     >     little endian' so
    >         >     >         >     it would be read properly upfront and parsed with no
    >         >     >     changes in
    >         >     >         >     generated code, however I'm not sure how to do it and
    >         >     >     where. That's why
    >         >     >         >     I'm playing with different things described in earlier mail.
    >         >     >         >     Since all type handling is general I am just afraid of
    >         >     >     more complicated
    >         >     >         >     scenarios where we have variable length structures such as
    >         >     >     arrays.
    >         >     >         >     
    >         >     >         >     Best regards,
    >         >     >         >     Łukasz
    >         >     >         >     
    >         >     >         >     
    >         >     >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
    >         >     >         >     > Hi Lukasz,
    >         >     >         >     >
    >         >     >         >     > I am not sure I am understanding the problems you are
    >         >     >     facing. We already have LE and BE protocols.
    >         >     >         >     > For example EIP is LE and the rest is generally BE. It
    >         >     >     seems that ADS/AMS is also LE.
    >         >     >         >     > mspec doesn't even know about endianness.
    >         >     >         >     >
    >         >     >         >     > Up till now the endianness doesn't have an effect on
    >         >     >     bit-fields or single-bit ints.
    >         >     >         >     > It only starts to affect if a field goes from one byte
    >         >     >     to the next, which is usually for (u)int and floating point values.
    >         >     >         >     >
    >         >     >         >     > That's why we have created the Read/WriteBuffers to set
    >         >     >     their endianness in the constructor.
    >         >     >         >     >
    >         >     >         >     > So if you're creating a driver for ADS/AMS which is LE,
    >         >     >     then you write the mspec according to the sequence the information
    >         >     >     is located in the transferred bytes and have the Read/WriteBuffer
    >         >     >     handle the endianness issue.
    >         >     >         >     >
    >         >     >         >     > I do see a problem when there are drivers that use mixed
    >         >     >     endianness, but we have still to encounter such a protocol.
    >         >     >         >     >
    >         >     >         >     > So I have to admit that I don't like any of the mspec
    >         >     >     changes you proposed, as I think you are just not using the tools we
    >         >     >     have the right way.
    >         >     >         >     >
    >         >     >         >     > Chris
    >         >     >         >     >
    >         >     >         >     >
    >         >     >         >     >
    >         >     >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
    >         >     >     <luke@code-house.org <ma...@code-house.org>>:
    >         >     >         >     >
    >         >     >         >     >     Hey Niclas,
    >         >     >         >     >     I realized how old the old things are when I started
    >         >     >     preparing
    >         >     >         >     >     automation training for mere mortals and got into
    >         >     >     history of frames and
    >         >     >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
    >         >     >     older than I. ;-)
    >         >     >         >     >     
    >         >     >         >     >     Getting back to the point - yes. I been thinking how
    >         >     >     to address the byte
    >         >     >         >     >     order in effective way. Here are two approaches I
    >         >     >     have for now:
    >         >     >         >     >     
    >         >     >         >     >     A) My initial attempt is just a temporary buffer
    >         >     >     which is then written
    >         >     >         >     >     in reverse order to caller. For reading it is
    >         >     >     similar - just getting N
    >         >     >         >     >     bytes in reversed order. The hard part is.. knowing
    >         >     >     N. I had to add a
    >         >     >         >     >     static calculation in order to allocate valid buffer
    >         >     >     sizes. I tend to
    >         >     >         >     >     work but I'm not happy with this approach cause it
    >         >     >     involves additional work.
    >         >     >         >     >     B) Second idea I've got is really simple and relies
    >         >     >     on code generation.
    >         >     >         >     >     We know in which order fields are coming. Here I'm
    >         >     >     referring to a State
    >         >     >         >     >     field which is just bunch of bits. If we would group
    >         >     >     fields in bytes and
    >         >     >         >     >     generate code in reverse order then it has chance to
    >         >     >     work. Requirement
    >         >     >         >     >     for that - ability to know basic field sizes upfront.
    >         >     >         >     >     C) Try to combine above with bit-io or
    >         >     >     Read/WriteBuffers as these are
    >         >     >         >     >     places which know actual position and state of
    >         >     >     buffers which are being
    >         >     >         >     >     read/written.
    >         >     >         >     >     
    >         >     >         >     >     Now, getting to two cases which are a problem.
    >         >     >     CommandId and State. So
    >         >     >         >     >     with command id situation is simple as it is
    >         >     >     declared as enum and it is
    >         >     >         >     >     read as uint. We know size upfront and can generate
    >         >     >     valid method call
    >         >     >         >     >     (readIntLE).
    >         >     >         >     >     [enum uint 16 little endian 'CommandId'
    >         >     >         >     >         ['0x00' INVALID]
    >         >     >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
    >         >     >         >     >         ['0x02' ADS_READ]
    >         >     >         >     >         ['0x03' ADS_WRITE]
    >         >     >         >     >         ['0x04' ADS_READ_STATE]
    >         >     >         >     >         ['0x05' ADS_WRITE_CONTROL]
    >         >     >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >         >     >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >         >     >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >         >     >         >     >         ['0x09' ADS_READ_WRITE]
    >         >     >         >     >     ]
    >         >     >         >     >     
    >         >     >         >     >     Second candidate is what I'm stuck right now sniping
    >         >     >     next cycles of
    >         >     >         >     >     problems. So in case of State we have complex type
    >         >     >     composed from 2
    >         >     >         >     >     bytes. A note here - instead of two bytes we might
    >         >     >     have a variable
    >         >     >         >     >     length type which includes array or other variable
    >         >     >     section.
    >         >     >         >     >     [type little endian 'State'
    >         >     >         >     >         [simple     bit 'broadcast'             ]
    >         >     >         >     >         [reserved   int 7 '0x0'                 ]
    >         >     >         >     >         [simple     bit 'initCommand'           ]
    >         >     >         >     >         [simple     bit 'updCommand'            ]
    >         >     >         >     >         [simple     bit 'timestampAdded'        ]
    >         >     >         >     >         [simple     bit 'highPriorityCommand'   ]
    >         >     >         >     >         [simple     bit 'systemCommand'         ]
    >         >     >         >     >         [simple     bit 'adsCommand'            ]
    >         >     >         >     >         [simple     bit 'noReturn'              ]
    >         >     >         >     >         [simple     bit 'response'              ]
    >         >     >         >     >     ]
    >         >     >         >     >     
    >         >     >         >     >     The order of reading big endian encoded data to
    >         >     >     impose little endian
    >         >     >         >     >     shift would be (please correct me if I'm wrong):
    >         >     >         >     >     1) init
    >         >     >         >     >     2) udp
    >         >     >         >     >     3) add timestamp
    >         >     >         >     >     4) priority
    >         >     >         >     >     5) system
    >         >     >         >     >     6) ads
    >         >     >         >     >     7) noreturn
    >         >     >         >     >     8) response (end of byte 1)
    >         >     >         >     >     9) broadcast
    >         >     >         >     >     10) reserved (end of byte )
    >         >     >         >     >     We can do same trick for writing, by re-arranging
    >         >     >     fields. By this way we
    >         >     >         >     >     avoid any additional byte level operations.
    >         >     >         >     >     
    >         >     >         >     >     Overall trouble with generated driver is to declare
    >         >     >     "how much" bytes
    >         >     >         >     >     should be read and interpreted. We have precise size
    >         >     >     information at the
    >         >     >         >     >     runtime - due to length fields, we can leverage it
    >         >     >     at generation time,
    >         >     >         >     >     but then we won't be able to cover all cases.
    >         >     >         >     >     
    >         >     >         >     >     I would love to keep it simple and do not break
    >         >     >     things thus I need your
    >         >     >         >     >     advice on how to approach this problem in a valid way.
    >         >     >         >     >     
    >         >     >         >     >     Cheers,
    >         >     >         >     >     Łukasz
    >         >     >         >     >     
    >         >     >         >     >     
    >         >     >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >         >     >         >     >     > <anecdotal-rant>
    >         >     >         >     >     > For us who were around and shaping the protocols
    >         >     >     in the 1980s, and people
    >         >     >         >     >     > before us (and before standards like RS-232), a
    >         >     >     lot of the "specifications"
    >         >     >         >     >     > came out of "observation of implementation we
    >         >     >     managed to get to work",
    >         >     >         >     >     > rather than "implement this spec". A lot was due
    >         >     >     to extreme memory
    >         >     >         >     >     > constraints (in my case, multi-tasking operating
    >         >     >     system, serial protocol
    >         >     >         >     >     > 187kbps, interpreted programming language with
    >         >     >     floating point ops and user
    >         >     >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
    >         >     >     general lack of information,
    >         >     >         >     >     > like what other people were doing, sharing
    >         >     >     experiences and so on.
    >         >     >         >     >     >
    >         >     >         >     >     > And there were many "innovative" ways to squeeze
    >         >     >     just a little bit extra
    >         >     >         >     >     > out of the hardware, resulting in "hard to
    >         >     >     understand" consequences. Bit
    >         >     >         >     >     > packing was a typical one, multiple functions
    >         >     >     packed into a single byte.
    >         >     >         >     >     > Look at page 14 in
    >         >     >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >         >     >         >     >     > and read up on "UART Enahanced Mode", and we used
    >         >     >     this, i.e. 9 bits, no
    >         >     >         >     >     > parity and clever use of address and mask to
    >         >     >     create a slave-to-slave direct
    >         >     >         >     >     > protocol, where the master's role was to signal
    >         >     >     which slave "owned" the
    >         >     >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
    >         >     >     protocol was about 1kB
    >         >     >         >     >     > ROM) and something like 150 bytes RAM for comm
    >         >     >     protocol.
    >         >     >         >     >     >
    >         >     >         >     >     > Could you implement a compatible device to this
    >         >     >     with PLC4X and modern
    >         >     >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
    >         >     >     but bit-banging is needed
    >         >     >         >     >     > to support the 9bit data (+start and stop bits)
    >         >     >     and an awful lot of CPU
    >         >     >         >     >     > cycles on something that was automatic on one of
    >         >     >     the slowest long-lived
    >         >     >         >     >     > microcontroller ever.
    >         >     >         >     >     > </anecdotal-rant>
    >         >     >         >     >     >
    >         >     >         >     >     > My point was only to highlight that some of the
    >         >     >     strange things you see in
    >         >     >         >     >     > protocols today, have its roots in
    >         >     >     pre-standardization days. Today no one
    >         >     >         >     >     > would go down that route, because the hardware
    >         >     >     cost nothing now (8031  +
    >         >     >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
    >         >     >     ~$50 in 1983's currency) and
    >         >     >         >     >     > longevity of software is more important.
    >         >     >         >     >     >
    >         >     >         >     >     > Cheers
    >         >     >         >     >     > Niclas
    >         >     >         >     >     >
    >         >     >         >     >     >
    >         >     >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
    >         >     >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
    >         >     >         >     >     > wrote:
    >         >     >         >     >     >
    >         >     >         >     >     >> Hi Lukasz,
    >         >     >         >     >     >>
    >         >     >         >     >     >> I think it really gets tricky when using BE and
    >         >     >     having some byte-odd-sizes
    >         >     >         >     >     >> ... I remember in the Firmata protocol there were
    >         >     >     some bitmasks and then 10
    >         >     >         >     >     >> bit uint as BE ... not it really got tricky as
    >         >     >     the specs were written from
    >         >     >         >     >     >> a point of view: You read 16 bits BE and then the
    >         >     >     first6 bits mean XYZ
    >         >     >         >     >     >> instead of describing how the bits actually
    >         >     >     travel over the wire.
    >         >     >         >     >     >>
    >         >     >         >     >     >> Chris
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
    >         >     >     <luke@code-house.org <ma...@code-house.org>>:
    >         >     >         >     >     >>
    >         >     >         >     >     >>     I've made some progress with topic by
    >         >     >     modyfing mspec and allowing
    >         >     >         >     >     >>     'little endian' flag on fields. This moved me
    >         >     >     further to next issue -
    >         >     >         >     >     >>     which is whole type encoded little endian.
    >         >     >         >     >     >>
    >         >     >         >     >     >>     In ADS driver such type is State, which has 2
    >         >     >     bytes and uses 8 bits for
    >         >     >         >     >     >>     various flags.
    >         >     >         >     >     >>     There are two cases which require different
    >         >     >     approach - reading and
    >         >     >         >     >     >>     writing. So for reading we need to swap N
    >         >     >     bytes based on type length.
    >         >     >         >     >     >>     For writing we need to alocate buffer for N
    >         >     >     bytes and swap them before
    >         >     >         >     >     >>     writing.
    >         >     >         >     >     >>
    >         >     >         >     >     >>     I am stuck now with freemaker templates and
    >         >     >     bit-io.
    >         >     >         >     >     >>
    >         >     >         >     >     >>     Cheers,
    >         >     >         >     >     >>     Łukasz
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >         >     >         >     >     >>     > I am doing some tests of ADS serialization.
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>     > I've run into some troubles with payload
    >         >     >     which is generated with new
    >         >     >         >     >     >>     > driver. I'm not sure if that's my fault or
    >         >     >     generated code.
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>     > I did a verification of what Wireshark
    >         >     >     shows and how ads structures
    >         >     >         >     >     >> are
    >         >     >         >     >     >>     > parsed. There is a gap I think. For example
    >         >     >     ams port number 1000
    >         >     >         >     >     >>     > (0x1027) is read as 4135.
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>     > Obviously I used wrong structures while
    >         >     >     implementing protocol logic
    >         >     >         >     >     >> in
    >         >     >         >     >     >>     > first place, but now I am uncertain of how
    >         >     >     fields are encoded. How we
    >         >     >         >     >     >>     > mark field as little endian when rest of
    >         >     >     payload is big endian? Do we
    >         >     >         >     >     >>     > have `uint_le`?
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>     > As far I remember route creation logic I
    >         >     >     was tracking last week used
    >         >     >         >     >     >>     > combination of LE and BE.
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>     > Best regards,
    >         >     >         >     >     >>     > Łukasz
    >         >     >         >     >     >>     >
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >>
    >         >     >         >     >     >
    >         >     >         >     >     
    >         >     >         >     >
    >         >     >         >     
    >         >     >         >
    >         >     > 
    >         >     
    >         > 
    > 
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

Last Saturday I promised to Chris I will confirm the case and I just did.

As suggested - I rearranged field order in State type and made a test to
proof myself, that it works:
https://github.com/apache/plc4x/commit/3010fdd15524410dfd55746562ed44bcee0fe80f

I've closed related issue (PLC4X-193) and created pull requests. Changes
in mspec are not needed for that case.

It took quite long, but we managed to find solution.

Cheers,
Łukasz


On 01.05.2020 10:53, Christofer Dutz wrote:
> Hi folks,
> 
> so Lukasz just sent me a little pcap file with some AMS packets in there and my assumption was correct:
> 
> for example in the byte-stream I can see the "flags" being 0x0400 ... However wireshark read this as unsigned short LE and therefore re-aranged the bytes:
>     StateFlags: 0x0004
>         .... .... .... ...0 = RESPONSE: Not set
>         .... .... .... ..0. = NO RETURN: Not set
>         .... .... .... .1.. = ADS COMMAND: Set
>         .... .... .... 0... = SYSTEM COMMAND: Not set
>         .... .... ...0 .... = HIGH PRIORITY COMMAND: Not set
>         .... .... ..0. .... = TIMESTAMP ADDED: Not set
>         .... .... .0.. .... = UDP COMMAND: Not set
>         .... .... 0... .... = INIT COMMAND: Not set
>         0... .... .... .... = BROADCAST: Not set
> 
> So in above example only one field is true: "ADS COMMAND" ... so if I decode 0x0400 to binary that’s: 00000100 00000000
> 
> So this exactly fits my proposed mspec definition of: 
> 
>     [type 'State'
>             [simple     bit 'initCommand'           ]
>             [simple     bit 'updCommand'            ]
>             [simple     bit 'timestampAdded'        ]
>             [simple     bit 'highPriorityCommand'   ]
>             [simple     bit 'systemCommand'         ]
>             [simple     bit 'adsCommand'            ]
>             [simple     bit 'noReturn'              ]
>             [simple     bit 'response'              ]
>             [simple     bit 'broadcast'             ]
>             [reserved   int 7 '0x0'                 ]
>         ]
> 
> So I strongly object introducing any special endianness "feature" for this (in this case) ... 
> we might encounter something where we need it, but for this we don't.
> 
> Chris
> 
> 
> 
> Am 30.04.20, 23:02 schrieb "Christofer Dutz" <ch...@c-ware.de>:
> 
>     Hi Lukasz,
> 
>     I wasn't suggesting to look how WireShark decodes things. Cause I would assume that they read a short in little endian and then (on the rearranged bytes) apply some bit checks. 
> 
>     I was more suggesting that you look into the bytes Wireshark shows you and to check their exact position.
> 
>     When implementing drivers, I used the Mac's Calculator tool quite often to see the bits for a given byte value.
> 
>     In your example I can't see any need for endianness at all ... it's all just bits and one empty block which forms a full byte together with the first bit.
>     As this block doesn't cross any byte boundaries I can't see any problems with this.
> 
>     Assuming I understand what you are implying with 
> 
>     [type little endian 'State'
>             [simple     bit 'broadcast'             ]
>             [reserved   int 7 '0x0'                 ]
>             [simple     bit 'initCommand'           ]
>             [simple     bit 'updCommand'            ]
>             [simple     bit 'timestampAdded'        ]
>             [simple     bit 'highPriorityCommand'   ]
>             [simple     bit 'systemCommand'         ]
>             [simple     bit 'adsCommand'            ]
>             [simple     bit 'noReturn'              ]
>             [simple     bit 'response'              ]
>         ]
> 
>     Then it should be equal to:
> 
>     [type 'State'
>             [simple     bit 'initCommand'           ]
>             [simple     bit 'updCommand'            ]
>             [simple     bit 'timestampAdded'        ]
>             [simple     bit 'highPriorityCommand'   ]
>             [simple     bit 'systemCommand'         ]
>             [simple     bit 'adsCommand'            ]
>             [simple     bit 'noReturn'              ]
>             [simple     bit 'response'              ]
>             [simple     bit 'broadcast'             ]
>             [reserved   int 7 '0x0'                 ]
>         ]
> 
>     Which also seems to be the exact order how the bits go over the wire ... I mean ... it doesn't make any sense to put the 7 empty bits in the middle.
> 
>     Chris
> 
> 
>     Am 30.04.20, 22:42 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
> 
>         Hey Chris and others,
>         I let this topic to dust and freeze a bit, however I would like to lift
>         it up again. Since the 0.7 release is now in discussion and ADS is not
>         ported yet I believe it is reasonable to validate this problem.
> 
>         As you suggested I've checked wireshark sources to see how they parse
>         state fragment of AMS header. While I can not read C properly and was
>         unable to locate actual reading logic, I do see they care about endianness:
>         https://github.com/wireshark/wireshark/blob/5cf3fd03f1538b37704d83b6c38b8223f9036108/plugins/epan/ethercat/packet-ams.c#L429
> 
>         Their frame traversal goes in following order:
>         - response
>         - no return
>         - ads cmd
>         - system cmd
>         - high priority
>         - timestamp add
>         - udp
>         - init cmd
>         - broadcast
> 
>         When I've tried to model mspec with respect to little endian order, I
>         end up with following field order:
>         - init cmd
>         - udp
>         - timestamp add
>         - high priority
>         - system cmd
>         - ads cmd
>         - no return
>         - response
>         - [reserved]
>         - broadcast
> 
>         I haven't seen ADS protocol book, my main source of information is
>         infosys which is rather short in this area. I doubt if they would
>         describe fields in above order.
> 
>         My initial attempts were similar to what you said - to read the value
>         and then shift bytes. With your feedback I started to look for better
>         ways to solve it. With my last attempt I solved the problem by
>         re-adjusting fields during code generation, so they appear in a valid
>         byte order. Beside additional flag at the mspec there is a need for this
>         helper:
>         build-utils/protocol-base-mspec/src/main/java/org/apache/plc4x/plugins/codegenerator/language/mspec/model/definitions/DefaultComplexTypeDefinition.java
> 
>         In above commit the State declaration in the mspec looks like this:
>         [type little endian 'State'
>             [simple     bit 'broadcast'             ]
>             [reserved   int 7 '0x0'                 ]
>             [simple     bit 'initCommand'           ]
>             [simple     bit 'updCommand'            ]
>             [simple     bit 'timestampAdded'        ]
>             [simple     bit 'highPriorityCommand'   ]
>             [simple     bit 'systemCommand'         ]
>             [simple     bit 'adsCommand'            ]
>             [simple     bit 'noReturn'              ]
>             [simple     bit 'response'              ]
>         ]
> 
>         Maybe it is not in line with wireshark, but it is consistent and:
>         1) requires minor change in code generator
>         2) does not involve additional processing of data
>         3) mspec is still easy to write
>         4) type declaration is in line with wireshark output when reading from
>         the left and not from the top. ;-)
>         5) it can be adjusted to be 100% consistent with wireshark by changing
>         helper logic. ;-)
> 
>         Best regards,
>         Łukasz
> 
> 
>         On 14.04.2020 21:25, Christofer Dutz wrote:
>         > Hi Lukasz,
>         > 
>         > I see it differently.
>         > 
>         > If you say something is LE/BE then it should be consistent. 
>         > 
>         > I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:
>         > 
>         > "Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."
>         > 
>         > In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
>         > Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 
>         > 
>         > I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
>         > I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
>         > In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 
>         > 
>         > So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.
>         > 
>         > So I would like to keep it simple, even if the developer will have to do a little more brain-work. 
>         > 
>         > I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.
>         > 
>         > What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)
>         > 
>         > Chris
>         > 
>         > 
>         > Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
>         > 
>         >     Hey Chris,
>         >     Sorry for wrongly addressed mail, it was intended to go to mailing list.
>         >     
>         >     To the point - I think that what you propose with reordering fields
>         >     manually shift the load from the tool to the developer. It will lead to
>         >     further implementation errors caused by a need to reorder fields while
>         >     creating spec. It will be inconsistent with rest of places where things
>         >     are in "natural order" (big endian) except one or two types which
>         >     collects bit flags which will have to be declared in different order.
>         >     What I did in my naive implementation with "little endian" flag on the
>         >     State type is re-arrangement of fields during generation time.
>         >     It does not require any byte shifting at the runtime and keeps natural
>         >     order of fields without the need for manual reordering of flags in mspec
>         >     declaration.
>         >     
>         >     I understand that keeping mspec simple is an important priority, however
>         >     doing it at the cost of developers who will use it to implement codecs
>         >     will not help us in long term.
>         >     I'm quite sure that sooner or later we will get a  protocol with mix of
>         >     BE/LE fields.
>         >     
>         >     Cheers,
>         >     Łukasz
>         >     
>         >     
>         >     On 14.04.2020 14:57, Christofer Dutz wrote:
>         >     > Hi Lukasz (adding dev@ to the recipients again)
>         >     > 
>         >     >  
>         >     > 
>         >     > you are not quite correct. Enum types do declare a base type where they
>         >     > are declared … not where they are used.
>         >     > 
>         >     >  
>         >     > 
>         >     > If you have bit-fields then why don’t you define them as sequence of
>         >     > bits? I really can’t understand why you need to flip anything.
>         >     > 
>         >     > I think you’re trying to do something similar to Sebastian when he
>         >     > wanted to introduce some bit-mask types.
>         >     > 
>         >     >  
>         >     > 
>         >     > I could imagine that in the spec it might be written … these two bytes
>         >     > are a bit-mask and the bit 1 means X bit 10 means Y, …
>         >     > 
>         >     > So why not define the bit fields in the sequence they are sent over the
>         >     > wire?
>         >     > 
>         >     >  
>         >     > 
>         >     > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
>         >     > 
>         >     >  
>         >     > 
>         >     >  
>         >     > 
>         >     > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
>         >     >  |1  |0
>         >     > 
>         >     > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
>         >     > 
>         >     >  
>         >     > 
>         >     > Then the order they are sent is:
>         >     > 
>         >     > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
>         >     > 
>         >     > So you have to declare the bit fields in the order:
>         >     > 
>         >     >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
>         >     >  |E  | F  |G|H
>         >     > 
>         >     >  
>         >     > 
>         >     > So I don’t quite understand what you’re trying to flip here.
>         >     > 
>         >     >  
>         >     > 
>         >     > Chris
>         >     > 
>         >     >  
>         >     > 
>         >     > *Von: *Łukasz Dywicki <lu...@code-house.org>
>         >     > *Datum: *Dienstag, 14. April 2020 um 13:26
>         >     > *An: *Christofer Dutz <ch...@c-ware.de>
>         >     > *Betreff: *Re: Big and littleendian fields in one mspec
>         >     > 
>         >     >  
>         >     > 
>         >     > I've added the endianness switch to the types in my experiments. These
>         >     > are not needed for enums as these are read as int and then mapped to
>         >     > constants. I referred to these as an example - enum does have a length
>         >     > indication and type behind it.
>         >     > 
>         >     >  
>         >     > 
>         >     > Again, I will bring the issue - StateIO currently fails to do anything
>         >     > useful due to BE/LE switch. Because this type uses single bits which
>         >     > need to be flipped before reading or after writing otherwise they end up
>         >     > in improper order. Out of 16 bits 9 are in use so we have a lot of
>         >     > possible combinations (2^9), which seems too much to create an enum.
>         >     > 
>         >     > Since you didn't like my initial proposal and modifications (as they
>         >     > might be redundant to what is available in the buffer API), how would
>         >     > you handle State serialization without affecting mspec and without
>         >     > further complication to code generation?
>         >     > 
>         >     >  
>         >     > 
>         >     > Best,
>         >     > 
>         >     > Łukasz
>         >     > 
>         >     >  
>         >     > 
>         >     > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
>         >     > <ma...@c-ware.de>> napisał(a):
>         >     > 
>         >     >     Hi Lukasz,
>         >     > 
>         >     >     but we don't have a:
>         >     >     [enum uint 16 little endian 'CommandId']
>         >     >     But only a:
>         >     >     [enum uint 16 'CommandId']
>         >     > 
>         >     >     And in your case I think perhaps the constants are not correct. So
>         >     >     having a 16 bit uint will result in a 4 digit hex string:
>         >     >     So if you are having problems in mapping them to your constants,
>         >     >     perhaps the constants are in the wrong endianness.
>         >     > 
>         >     >     I have encountered (but can't recall where) that the Enum constants
>         >     >     in a BE protocol were written down in LE notation.
>         >     >     Of course the thing can't work then.
>         >     > 
>         >     >     So if for example you have the constant "1" in a BE protocol with a
>         >     >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
>         >     > 
>         >     >     Chris
>         >     > 
>         >     > 
>         >     > 
>         >     > 
>         >     >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
>         >     >     <ma...@code-house.org>>:
>         >     > 
>         >     >         The legendary "two bytes" (shall we create a band with this
>         >     >     name?!) are
>         >     >         coming from unfortunate State type. I don't mind these to be an
>         >     >         int/sint/uint or whathever - this is matter of interpretation.
>         >     > 
>         >     >         If you could please check again my earlier messages you will find
>         >     >         struggle I have which is - how much given type takes.
>         >     > 
>         >     >         We have that for enums
>         >     >         [enum uint 16 little endian 'CommandId']
>         >     > 
>         >     >         but we don't have that for complex types
>         >     >         [type 'State']
>         >     > 
>         >     >         This leads to situation that we don't know how to read and interpret
>         >     >         incoming byte sequence or how to properly write it back to stream.
>         >     > 
>         >     >         It would be great if we could limit the issue just to
>         >     >         // read
>         >     >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
>         >     >     true->LE
>         >     >         // write
>         >     >         wio = new WriteBuffer(2);
>         >     >         StateIO.staticSerialize(wio, new State(...));
>         >     >         io.writeUnsignedInt(8, wio.read, true); // true->LE
>         >     > 
>         >     >         However to do that first we need to know that State is a uint 16
>         >     >     LE and
>         >     >         then do a lot of mambo-jambo in code generators to cope with
>         >     >     that. :-)
>         >     > 
>         >     >         Best,
>         >     >         Łukasz
>         >     > 
>         >     > 
>         >     >         On 14.04.2020 12:20, Christofer Dutz wrote:
>         >     >         > Oh gee ... I totally remember having exactly the same
>         >     >     discussion with Sebastian about "bytes" ...
>         >     >         >
>         >     >         > The problem is there is generally no "byte" "int" and "uint"
>         >     >     because a byte = 8 bits is either signed or unsigned. That should
>         >     >     the 3rd option be? Perhaps-signed?
>         >     >         > So generally I use "uint 8" for what you refer to as a "byte 8".
>         >     >         >
>         >     >         > Wo what would be the difference between your "2 byte little
>         >     >     endian" and an "uint 16" with a LE ReadBuffer?
>         >     >         >
>         >     >         > I think what you have to rid yourself of thinking of int as
>         >     >     number and a byte not being a number.
>         >     >         >
>         >     >         > Chris
>         >     >         >
>         >     >         >
>         >     >         >
>         >     >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
>         >     >     <luke@code-house.org <ma...@code-house.org>>:
>         >     >         >
>         >     >         >     Hey Christian,
>         >     >         >     The problem I face is quite simple. State type in mspec i
>         >     >     declared as a
>         >     >         >     bunch of bits. Type length is fixed, however not available
>         >     >     anywhere in
>         >     >         >     mspec or code generator to re-arrange bytes upfront. All
>         >     >     we have exposed
>         >     >         >     at reader/write level is read/write bit.
>         >     >         >     To be fair LE/BE support in read/write buffers is limited
>         >     >     just to
>         >     >         >     numbers. There is no such support for raw bytes or bits,
>         >     >     cause for that
>         >     >         >     you need to declare a length of LE/BE sequence.
>         >     >         >     
>         >     >         >     I would love if I could just declare State as '2 byte
>         >     >     little endian' so
>         >     >         >     it would be read properly upfront and parsed with no
>         >     >     changes in
>         >     >         >     generated code, however I'm not sure how to do it and
>         >     >     where. That's why
>         >     >         >     I'm playing with different things described in earlier mail.
>         >     >         >     Since all type handling is general I am just afraid of
>         >     >     more complicated
>         >     >         >     scenarios where we have variable length structures such as
>         >     >     arrays.
>         >     >         >     
>         >     >         >     Best regards,
>         >     >         >     Łukasz
>         >     >         >     
>         >     >         >     
>         >     >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
>         >     >         >     > Hi Lukasz,
>         >     >         >     >
>         >     >         >     > I am not sure I am understanding the problems you are
>         >     >     facing. We already have LE and BE protocols.
>         >     >         >     > For example EIP is LE and the rest is generally BE. It
>         >     >     seems that ADS/AMS is also LE.
>         >     >         >     > mspec doesn't even know about endianness.
>         >     >         >     >
>         >     >         >     > Up till now the endianness doesn't have an effect on
>         >     >     bit-fields or single-bit ints.
>         >     >         >     > It only starts to affect if a field goes from one byte
>         >     >     to the next, which is usually for (u)int and floating point values.
>         >     >         >     >
>         >     >         >     > That's why we have created the Read/WriteBuffers to set
>         >     >     their endianness in the constructor.
>         >     >         >     >
>         >     >         >     > So if you're creating a driver for ADS/AMS which is LE,
>         >     >     then you write the mspec according to the sequence the information
>         >     >     is located in the transferred bytes and have the Read/WriteBuffer
>         >     >     handle the endianness issue.
>         >     >         >     >
>         >     >         >     > I do see a problem when there are drivers that use mixed
>         >     >     endianness, but we have still to encounter such a protocol.
>         >     >         >     >
>         >     >         >     > So I have to admit that I don't like any of the mspec
>         >     >     changes you proposed, as I think you are just not using the tools we
>         >     >     have the right way.
>         >     >         >     >
>         >     >         >     > Chris
>         >     >         >     >
>         >     >         >     >
>         >     >         >     >
>         >     >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
>         >     >     <luke@code-house.org <ma...@code-house.org>>:
>         >     >         >     >
>         >     >         >     >     Hey Niclas,
>         >     >         >     >     I realized how old the old things are when I started
>         >     >     preparing
>         >     >         >     >     automation training for mere mortals and got into
>         >     >     history of frames and
>         >     >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
>         >     >     older than I. ;-)
>         >     >         >     >     
>         >     >         >     >     Getting back to the point - yes. I been thinking how
>         >     >     to address the byte
>         >     >         >     >     order in effective way. Here are two approaches I
>         >     >     have for now:
>         >     >         >     >     
>         >     >         >     >     A) My initial attempt is just a temporary buffer
>         >     >     which is then written
>         >     >         >     >     in reverse order to caller. For reading it is
>         >     >     similar - just getting N
>         >     >         >     >     bytes in reversed order. The hard part is.. knowing
>         >     >     N. I had to add a
>         >     >         >     >     static calculation in order to allocate valid buffer
>         >     >     sizes. I tend to
>         >     >         >     >     work but I'm not happy with this approach cause it
>         >     >     involves additional work.
>         >     >         >     >     B) Second idea I've got is really simple and relies
>         >     >     on code generation.
>         >     >         >     >     We know in which order fields are coming. Here I'm
>         >     >     referring to a State
>         >     >         >     >     field which is just bunch of bits. If we would group
>         >     >     fields in bytes and
>         >     >         >     >     generate code in reverse order then it has chance to
>         >     >     work. Requirement
>         >     >         >     >     for that - ability to know basic field sizes upfront.
>         >     >         >     >     C) Try to combine above with bit-io or
>         >     >     Read/WriteBuffers as these are
>         >     >         >     >     places which know actual position and state of
>         >     >     buffers which are being
>         >     >         >     >     read/written.
>         >     >         >     >     
>         >     >         >     >     Now, getting to two cases which are a problem.
>         >     >     CommandId and State. So
>         >     >         >     >     with command id situation is simple as it is
>         >     >     declared as enum and it is
>         >     >         >     >     read as uint. We know size upfront and can generate
>         >     >     valid method call
>         >     >         >     >     (readIntLE).
>         >     >         >     >     [enum uint 16 little endian 'CommandId'
>         >     >         >     >         ['0x00' INVALID]
>         >     >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
>         >     >         >     >         ['0x02' ADS_READ]
>         >     >         >     >         ['0x03' ADS_WRITE]
>         >     >         >     >         ['0x04' ADS_READ_STATE]
>         >     >         >     >         ['0x05' ADS_WRITE_CONTROL]
>         >     >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
>         >     >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
>         >     >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
>         >     >         >     >         ['0x09' ADS_READ_WRITE]
>         >     >         >     >     ]
>         >     >         >     >     
>         >     >         >     >     Second candidate is what I'm stuck right now sniping
>         >     >     next cycles of
>         >     >         >     >     problems. So in case of State we have complex type
>         >     >     composed from 2
>         >     >         >     >     bytes. A note here - instead of two bytes we might
>         >     >     have a variable
>         >     >         >     >     length type which includes array or other variable
>         >     >     section.
>         >     >         >     >     [type little endian 'State'
>         >     >         >     >         [simple     bit 'broadcast'             ]
>         >     >         >     >         [reserved   int 7 '0x0'                 ]
>         >     >         >     >         [simple     bit 'initCommand'           ]
>         >     >         >     >         [simple     bit 'updCommand'            ]
>         >     >         >     >         [simple     bit 'timestampAdded'        ]
>         >     >         >     >         [simple     bit 'highPriorityCommand'   ]
>         >     >         >     >         [simple     bit 'systemCommand'         ]
>         >     >         >     >         [simple     bit 'adsCommand'            ]
>         >     >         >     >         [simple     bit 'noReturn'              ]
>         >     >         >     >         [simple     bit 'response'              ]
>         >     >         >     >     ]
>         >     >         >     >     
>         >     >         >     >     The order of reading big endian encoded data to
>         >     >     impose little endian
>         >     >         >     >     shift would be (please correct me if I'm wrong):
>         >     >         >     >     1) init
>         >     >         >     >     2) udp
>         >     >         >     >     3) add timestamp
>         >     >         >     >     4) priority
>         >     >         >     >     5) system
>         >     >         >     >     6) ads
>         >     >         >     >     7) noreturn
>         >     >         >     >     8) response (end of byte 1)
>         >     >         >     >     9) broadcast
>         >     >         >     >     10) reserved (end of byte )
>         >     >         >     >     We can do same trick for writing, by re-arranging
>         >     >     fields. By this way we
>         >     >         >     >     avoid any additional byte level operations.
>         >     >         >     >     
>         >     >         >     >     Overall trouble with generated driver is to declare
>         >     >     "how much" bytes
>         >     >         >     >     should be read and interpreted. We have precise size
>         >     >     information at the
>         >     >         >     >     runtime - due to length fields, we can leverage it
>         >     >     at generation time,
>         >     >         >     >     but then we won't be able to cover all cases.
>         >     >         >     >     
>         >     >         >     >     I would love to keep it simple and do not break
>         >     >     things thus I need your
>         >     >         >     >     advice on how to approach this problem in a valid way.
>         >     >         >     >     
>         >     >         >     >     Cheers,
>         >     >         >     >     Łukasz
>         >     >         >     >     
>         >     >         >     >     
>         >     >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
>         >     >         >     >     > <anecdotal-rant>
>         >     >         >     >     > For us who were around and shaping the protocols
>         >     >     in the 1980s, and people
>         >     >         >     >     > before us (and before standards like RS-232), a
>         >     >     lot of the "specifications"
>         >     >         >     >     > came out of "observation of implementation we
>         >     >     managed to get to work",
>         >     >         >     >     > rather than "implement this spec". A lot was due
>         >     >     to extreme memory
>         >     >         >     >     > constraints (in my case, multi-tasking operating
>         >     >     system, serial protocol
>         >     >         >     >     > 187kbps, interpreted programming language with
>         >     >     floating point ops and user
>         >     >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
>         >     >     general lack of information,
>         >     >         >     >     > like what other people were doing, sharing
>         >     >     experiences and so on.
>         >     >         >     >     >
>         >     >         >     >     > And there were many "innovative" ways to squeeze
>         >     >     just a little bit extra
>         >     >         >     >     > out of the hardware, resulting in "hard to
>         >     >     understand" consequences. Bit
>         >     >         >     >     > packing was a typical one, multiple functions
>         >     >     packed into a single byte.
>         >     >         >     >     > Look at page 14 in
>         >     >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
>         >     >         >     >     > and read up on "UART Enahanced Mode", and we used
>         >     >     this, i.e. 9 bits, no
>         >     >         >     >     > parity and clever use of address and mask to
>         >     >     create a slave-to-slave direct
>         >     >         >     >     > protocol, where the master's role was to signal
>         >     >     which slave "owned" the
>         >     >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
>         >     >     protocol was about 1kB
>         >     >         >     >     > ROM) and something like 150 bytes RAM for comm
>         >     >     protocol.
>         >     >         >     >     >
>         >     >         >     >     > Could you implement a compatible device to this
>         >     >     with PLC4X and modern
>         >     >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
>         >     >     but bit-banging is needed
>         >     >         >     >     > to support the 9bit data (+start and stop bits)
>         >     >     and an awful lot of CPU
>         >     >         >     >     > cycles on something that was automatic on one of
>         >     >     the slowest long-lived
>         >     >         >     >     > microcontroller ever.
>         >     >         >     >     > </anecdotal-rant>
>         >     >         >     >     >
>         >     >         >     >     > My point was only to highlight that some of the
>         >     >     strange things you see in
>         >     >         >     >     > protocols today, have its roots in
>         >     >     pre-standardization days. Today no one
>         >     >         >     >     > would go down that route, because the hardware
>         >     >     cost nothing now (8031  +
>         >     >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
>         >     >     ~$50 in 1983's currency) and
>         >     >         >     >     > longevity of software is more important.
>         >     >         >     >     >
>         >     >         >     >     > Cheers
>         >     >         >     >     > Niclas
>         >     >         >     >     >
>         >     >         >     >     >
>         >     >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
>         >     >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
>         >     >         >     >     > wrote:
>         >     >         >     >     >
>         >     >         >     >     >> Hi Lukasz,
>         >     >         >     >     >>
>         >     >         >     >     >> I think it really gets tricky when using BE and
>         >     >     having some byte-odd-sizes
>         >     >         >     >     >> ... I remember in the Firmata protocol there were
>         >     >     some bitmasks and then 10
>         >     >         >     >     >> bit uint as BE ... not it really got tricky as
>         >     >     the specs were written from
>         >     >         >     >     >> a point of view: You read 16 bits BE and then the
>         >     >     first6 bits mean XYZ
>         >     >         >     >     >> instead of describing how the bits actually
>         >     >     travel over the wire.
>         >     >         >     >     >>
>         >     >         >     >     >> Chris
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
>         >     >     <luke@code-house.org <ma...@code-house.org>>:
>         >     >         >     >     >>
>         >     >         >     >     >>     I've made some progress with topic by
>         >     >     modyfing mspec and allowing
>         >     >         >     >     >>     'little endian' flag on fields. This moved me
>         >     >     further to next issue -
>         >     >         >     >     >>     which is whole type encoded little endian.
>         >     >         >     >     >>
>         >     >         >     >     >>     In ADS driver such type is State, which has 2
>         >     >     bytes and uses 8 bits for
>         >     >         >     >     >>     various flags.
>         >     >         >     >     >>     There are two cases which require different
>         >     >     approach - reading and
>         >     >         >     >     >>     writing. So for reading we need to swap N
>         >     >     bytes based on type length.
>         >     >         >     >     >>     For writing we need to alocate buffer for N
>         >     >     bytes and swap them before
>         >     >         >     >     >>     writing.
>         >     >         >     >     >>
>         >     >         >     >     >>     I am stuck now with freemaker templates and
>         >     >     bit-io.
>         >     >         >     >     >>
>         >     >         >     >     >>     Cheers,
>         >     >         >     >     >>     Łukasz
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>         >     >         >     >     >>     > I am doing some tests of ADS serialization.
>         >     >         >     >     >>     >
>         >     >         >     >     >>     > I've run into some troubles with payload
>         >     >     which is generated with new
>         >     >         >     >     >>     > driver. I'm not sure if that's my fault or
>         >     >     generated code.
>         >     >         >     >     >>     >
>         >     >         >     >     >>     > I did a verification of what Wireshark
>         >     >     shows and how ads structures
>         >     >         >     >     >> are
>         >     >         >     >     >>     > parsed. There is a gap I think. For example
>         >     >     ams port number 1000
>         >     >         >     >     >>     > (0x1027) is read as 4135.
>         >     >         >     >     >>     >
>         >     >         >     >     >>     > Obviously I used wrong structures while
>         >     >     implementing protocol logic
>         >     >         >     >     >> in
>         >     >         >     >     >>     > first place, but now I am uncertain of how
>         >     >     fields are encoded. How we
>         >     >         >     >     >>     > mark field as little endian when rest of
>         >     >     payload is big endian? Do we
>         >     >         >     >     >>     > have `uint_le`?
>         >     >         >     >     >>     >
>         >     >         >     >     >>     > As far I remember route creation logic I
>         >     >     was tracking last week used
>         >     >         >     >     >>     > combination of LE and BE.
>         >     >         >     >     >>     >
>         >     >         >     >     >>     > Best regards,
>         >     >         >     >     >>     > Łukasz
>         >     >         >     >     >>     >
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >>
>         >     >         >     >     >
>         >     >         >     >     
>         >     >         >     >
>         >     >         >     
>         >     >         >
>         >     > 
>         >     
>         > 
> 
>

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi folks,

so Lukasz just sent me a little pcap file with some AMS packets in there and my assumption was correct:

for example in the byte-stream I can see the "flags" being 0x0400 ... However wireshark read this as unsigned short LE and therefore re-aranged the bytes:
    StateFlags: 0x0004
        .... .... .... ...0 = RESPONSE: Not set
        .... .... .... ..0. = NO RETURN: Not set
        .... .... .... .1.. = ADS COMMAND: Set
        .... .... .... 0... = SYSTEM COMMAND: Not set
        .... .... ...0 .... = HIGH PRIORITY COMMAND: Not set
        .... .... ..0. .... = TIMESTAMP ADDED: Not set
        .... .... .0.. .... = UDP COMMAND: Not set
        .... .... 0... .... = INIT COMMAND: Not set
        0... .... .... .... = BROADCAST: Not set

So in above example only one field is true: "ADS COMMAND" ... so if I decode 0x0400 to binary that’s: 00000100 00000000

So this exactly fits my proposed mspec definition of: 

    [type 'State'
            [simple     bit 'initCommand'           ]
            [simple     bit 'updCommand'            ]
            [simple     bit 'timestampAdded'        ]
            [simple     bit 'highPriorityCommand'   ]
            [simple     bit 'systemCommand'         ]
            [simple     bit 'adsCommand'            ]
            [simple     bit 'noReturn'              ]
            [simple     bit 'response'              ]
            [simple     bit 'broadcast'             ]
            [reserved   int 7 '0x0'                 ]
        ]

So I strongly object introducing any special endianness "feature" for this (in this case) ... 
we might encounter something where we need it, but for this we don't.

Chris



Am 30.04.20, 23:02 schrieb "Christofer Dutz" <ch...@c-ware.de>:

    Hi Lukasz,

    I wasn't suggesting to look how WireShark decodes things. Cause I would assume that they read a short in little endian and then (on the rearranged bytes) apply some bit checks. 

    I was more suggesting that you look into the bytes Wireshark shows you and to check their exact position.

    When implementing drivers, I used the Mac's Calculator tool quite often to see the bits for a given byte value.

    In your example I can't see any need for endianness at all ... it's all just bits and one empty block which forms a full byte together with the first bit.
    As this block doesn't cross any byte boundaries I can't see any problems with this.

    Assuming I understand what you are implying with 

    [type little endian 'State'
            [simple     bit 'broadcast'             ]
            [reserved   int 7 '0x0'                 ]
            [simple     bit 'initCommand'           ]
            [simple     bit 'updCommand'            ]
            [simple     bit 'timestampAdded'        ]
            [simple     bit 'highPriorityCommand'   ]
            [simple     bit 'systemCommand'         ]
            [simple     bit 'adsCommand'            ]
            [simple     bit 'noReturn'              ]
            [simple     bit 'response'              ]
        ]

    Then it should be equal to:

    [type 'State'
            [simple     bit 'initCommand'           ]
            [simple     bit 'updCommand'            ]
            [simple     bit 'timestampAdded'        ]
            [simple     bit 'highPriorityCommand'   ]
            [simple     bit 'systemCommand'         ]
            [simple     bit 'adsCommand'            ]
            [simple     bit 'noReturn'              ]
            [simple     bit 'response'              ]
            [simple     bit 'broadcast'             ]
            [reserved   int 7 '0x0'                 ]
        ]

    Which also seems to be the exact order how the bits go over the wire ... I mean ... it doesn't make any sense to put the 7 empty bits in the middle.

    Chris


    Am 30.04.20, 22:42 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

        Hey Chris and others,
        I let this topic to dust and freeze a bit, however I would like to lift
        it up again. Since the 0.7 release is now in discussion and ADS is not
        ported yet I believe it is reasonable to validate this problem.

        As you suggested I've checked wireshark sources to see how they parse
        state fragment of AMS header. While I can not read C properly and was
        unable to locate actual reading logic, I do see they care about endianness:
        https://github.com/wireshark/wireshark/blob/5cf3fd03f1538b37704d83b6c38b8223f9036108/plugins/epan/ethercat/packet-ams.c#L429

        Their frame traversal goes in following order:
        - response
        - no return
        - ads cmd
        - system cmd
        - high priority
        - timestamp add
        - udp
        - init cmd
        - broadcast

        When I've tried to model mspec with respect to little endian order, I
        end up with following field order:
        - init cmd
        - udp
        - timestamp add
        - high priority
        - system cmd
        - ads cmd
        - no return
        - response
        - [reserved]
        - broadcast

        I haven't seen ADS protocol book, my main source of information is
        infosys which is rather short in this area. I doubt if they would
        describe fields in above order.

        My initial attempts were similar to what you said - to read the value
        and then shift bytes. With your feedback I started to look for better
        ways to solve it. With my last attempt I solved the problem by
        re-adjusting fields during code generation, so they appear in a valid
        byte order. Beside additional flag at the mspec there is a need for this
        helper:
        build-utils/protocol-base-mspec/src/main/java/org/apache/plc4x/plugins/codegenerator/language/mspec/model/definitions/DefaultComplexTypeDefinition.java

        In above commit the State declaration in the mspec looks like this:
        [type little endian 'State'
            [simple     bit 'broadcast'             ]
            [reserved   int 7 '0x0'                 ]
            [simple     bit 'initCommand'           ]
            [simple     bit 'updCommand'            ]
            [simple     bit 'timestampAdded'        ]
            [simple     bit 'highPriorityCommand'   ]
            [simple     bit 'systemCommand'         ]
            [simple     bit 'adsCommand'            ]
            [simple     bit 'noReturn'              ]
            [simple     bit 'response'              ]
        ]

        Maybe it is not in line with wireshark, but it is consistent and:
        1) requires minor change in code generator
        2) does not involve additional processing of data
        3) mspec is still easy to write
        4) type declaration is in line with wireshark output when reading from
        the left and not from the top. ;-)
        5) it can be adjusted to be 100% consistent with wireshark by changing
        helper logic. ;-)

        Best regards,
        Łukasz


        On 14.04.2020 21:25, Christofer Dutz wrote:
        > Hi Lukasz,
        > 
        > I see it differently.
        > 
        > If you say something is LE/BE then it should be consistent. 
        > 
        > I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:
        > 
        > "Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."
        > 
        > In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
        > Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 
        > 
        > I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
        > I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
        > In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 
        > 
        > So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.
        > 
        > So I would like to keep it simple, even if the developer will have to do a little more brain-work. 
        > 
        > I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.
        > 
        > What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)
        > 
        > Chris
        > 
        > 
        > Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
        > 
        >     Hey Chris,
        >     Sorry for wrongly addressed mail, it was intended to go to mailing list.
        >     
        >     To the point - I think that what you propose with reordering fields
        >     manually shift the load from the tool to the developer. It will lead to
        >     further implementation errors caused by a need to reorder fields while
        >     creating spec. It will be inconsistent with rest of places where things
        >     are in "natural order" (big endian) except one or two types which
        >     collects bit flags which will have to be declared in different order.
        >     What I did in my naive implementation with "little endian" flag on the
        >     State type is re-arrangement of fields during generation time.
        >     It does not require any byte shifting at the runtime and keeps natural
        >     order of fields without the need for manual reordering of flags in mspec
        >     declaration.
        >     
        >     I understand that keeping mspec simple is an important priority, however
        >     doing it at the cost of developers who will use it to implement codecs
        >     will not help us in long term.
        >     I'm quite sure that sooner or later we will get a  protocol with mix of
        >     BE/LE fields.
        >     
        >     Cheers,
        >     Łukasz
        >     
        >     
        >     On 14.04.2020 14:57, Christofer Dutz wrote:
        >     > Hi Lukasz (adding dev@ to the recipients again)
        >     > 
        >     >  
        >     > 
        >     > you are not quite correct. Enum types do declare a base type where they
        >     > are declared … not where they are used.
        >     > 
        >     >  
        >     > 
        >     > If you have bit-fields then why don’t you define them as sequence of
        >     > bits? I really can’t understand why you need to flip anything.
        >     > 
        >     > I think you’re trying to do something similar to Sebastian when he
        >     > wanted to introduce some bit-mask types.
        >     > 
        >     >  
        >     > 
        >     > I could imagine that in the spec it might be written … these two bytes
        >     > are a bit-mask and the bit 1 means X bit 10 means Y, …
        >     > 
        >     > So why not define the bit fields in the sequence they are sent over the
        >     > wire?
        >     > 
        >     >  
        >     > 
        >     > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
        >     > 
        >     >  
        >     > 
        >     >  
        >     > 
        >     > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
        >     >  |1  |0
        >     > 
        >     > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
        >     > 
        >     >  
        >     > 
        >     > Then the order they are sent is:
        >     > 
        >     > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
        >     > 
        >     > So you have to declare the bit fields in the order:
        >     > 
        >     >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
        >     >  |E  | F  |G|H
        >     > 
        >     >  
        >     > 
        >     > So I don’t quite understand what you’re trying to flip here.
        >     > 
        >     >  
        >     > 
        >     > Chris
        >     > 
        >     >  
        >     > 
        >     > *Von: *Łukasz Dywicki <lu...@code-house.org>
        >     > *Datum: *Dienstag, 14. April 2020 um 13:26
        >     > *An: *Christofer Dutz <ch...@c-ware.de>
        >     > *Betreff: *Re: Big and littleendian fields in one mspec
        >     > 
        >     >  
        >     > 
        >     > I've added the endianness switch to the types in my experiments. These
        >     > are not needed for enums as these are read as int and then mapped to
        >     > constants. I referred to these as an example - enum does have a length
        >     > indication and type behind it.
        >     > 
        >     >  
        >     > 
        >     > Again, I will bring the issue - StateIO currently fails to do anything
        >     > useful due to BE/LE switch. Because this type uses single bits which
        >     > need to be flipped before reading or after writing otherwise they end up
        >     > in improper order. Out of 16 bits 9 are in use so we have a lot of
        >     > possible combinations (2^9), which seems too much to create an enum.
        >     > 
        >     > Since you didn't like my initial proposal and modifications (as they
        >     > might be redundant to what is available in the buffer API), how would
        >     > you handle State serialization without affecting mspec and without
        >     > further complication to code generation?
        >     > 
        >     >  
        >     > 
        >     > Best,
        >     > 
        >     > Łukasz
        >     > 
        >     >  
        >     > 
        >     > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
        >     > <ma...@c-ware.de>> napisał(a):
        >     > 
        >     >     Hi Lukasz,
        >     > 
        >     >     but we don't have a:
        >     >     [enum uint 16 little endian 'CommandId']
        >     >     But only a:
        >     >     [enum uint 16 'CommandId']
        >     > 
        >     >     And in your case I think perhaps the constants are not correct. So
        >     >     having a 16 bit uint will result in a 4 digit hex string:
        >     >     So if you are having problems in mapping them to your constants,
        >     >     perhaps the constants are in the wrong endianness.
        >     > 
        >     >     I have encountered (but can't recall where) that the Enum constants
        >     >     in a BE protocol were written down in LE notation.
        >     >     Of course the thing can't work then.
        >     > 
        >     >     So if for example you have the constant "1" in a BE protocol with a
        >     >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
        >     > 
        >     >     Chris
        >     > 
        >     > 
        >     > 
        >     > 
        >     >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
        >     >     <ma...@code-house.org>>:
        >     > 
        >     >         The legendary "two bytes" (shall we create a band with this
        >     >     name?!) are
        >     >         coming from unfortunate State type. I don't mind these to be an
        >     >         int/sint/uint or whathever - this is matter of interpretation.
        >     > 
        >     >         If you could please check again my earlier messages you will find
        >     >         struggle I have which is - how much given type takes.
        >     > 
        >     >         We have that for enums
        >     >         [enum uint 16 little endian 'CommandId']
        >     > 
        >     >         but we don't have that for complex types
        >     >         [type 'State']
        >     > 
        >     >         This leads to situation that we don't know how to read and interpret
        >     >         incoming byte sequence or how to properly write it back to stream.
        >     > 
        >     >         It would be great if we could limit the issue just to
        >     >         // read
        >     >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
        >     >     true->LE
        >     >         // write
        >     >         wio = new WriteBuffer(2);
        >     >         StateIO.staticSerialize(wio, new State(...));
        >     >         io.writeUnsignedInt(8, wio.read, true); // true->LE
        >     > 
        >     >         However to do that first we need to know that State is a uint 16
        >     >     LE and
        >     >         then do a lot of mambo-jambo in code generators to cope with
        >     >     that. :-)
        >     > 
        >     >         Best,
        >     >         Łukasz
        >     > 
        >     > 
        >     >         On 14.04.2020 12:20, Christofer Dutz wrote:
        >     >         > Oh gee ... I totally remember having exactly the same
        >     >     discussion with Sebastian about "bytes" ...
        >     >         >
        >     >         > The problem is there is generally no "byte" "int" and "uint"
        >     >     because a byte = 8 bits is either signed or unsigned. That should
        >     >     the 3rd option be? Perhaps-signed?
        >     >         > So generally I use "uint 8" for what you refer to as a "byte 8".
        >     >         >
        >     >         > Wo what would be the difference between your "2 byte little
        >     >     endian" and an "uint 16" with a LE ReadBuffer?
        >     >         >
        >     >         > I think what you have to rid yourself of thinking of int as
        >     >     number and a byte not being a number.
        >     >         >
        >     >         > Chris
        >     >         >
        >     >         >
        >     >         >
        >     >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
        >     >     <luke@code-house.org <ma...@code-house.org>>:
        >     >         >
        >     >         >     Hey Christian,
        >     >         >     The problem I face is quite simple. State type in mspec i
        >     >     declared as a
        >     >         >     bunch of bits. Type length is fixed, however not available
        >     >     anywhere in
        >     >         >     mspec or code generator to re-arrange bytes upfront. All
        >     >     we have exposed
        >     >         >     at reader/write level is read/write bit.
        >     >         >     To be fair LE/BE support in read/write buffers is limited
        >     >     just to
        >     >         >     numbers. There is no such support for raw bytes or bits,
        >     >     cause for that
        >     >         >     you need to declare a length of LE/BE sequence.
        >     >         >     
        >     >         >     I would love if I could just declare State as '2 byte
        >     >     little endian' so
        >     >         >     it would be read properly upfront and parsed with no
        >     >     changes in
        >     >         >     generated code, however I'm not sure how to do it and
        >     >     where. That's why
        >     >         >     I'm playing with different things described in earlier mail.
        >     >         >     Since all type handling is general I am just afraid of
        >     >     more complicated
        >     >         >     scenarios where we have variable length structures such as
        >     >     arrays.
        >     >         >     
        >     >         >     Best regards,
        >     >         >     Łukasz
        >     >         >     
        >     >         >     
        >     >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
        >     >         >     > Hi Lukasz,
        >     >         >     >
        >     >         >     > I am not sure I am understanding the problems you are
        >     >     facing. We already have LE and BE protocols.
        >     >         >     > For example EIP is LE and the rest is generally BE. It
        >     >     seems that ADS/AMS is also LE.
        >     >         >     > mspec doesn't even know about endianness.
        >     >         >     >
        >     >         >     > Up till now the endianness doesn't have an effect on
        >     >     bit-fields or single-bit ints.
        >     >         >     > It only starts to affect if a field goes from one byte
        >     >     to the next, which is usually for (u)int and floating point values.
        >     >         >     >
        >     >         >     > That's why we have created the Read/WriteBuffers to set
        >     >     their endianness in the constructor.
        >     >         >     >
        >     >         >     > So if you're creating a driver for ADS/AMS which is LE,
        >     >     then you write the mspec according to the sequence the information
        >     >     is located in the transferred bytes and have the Read/WriteBuffer
        >     >     handle the endianness issue.
        >     >         >     >
        >     >         >     > I do see a problem when there are drivers that use mixed
        >     >     endianness, but we have still to encounter such a protocol.
        >     >         >     >
        >     >         >     > So I have to admit that I don't like any of the mspec
        >     >     changes you proposed, as I think you are just not using the tools we
        >     >     have the right way.
        >     >         >     >
        >     >         >     > Chris
        >     >         >     >
        >     >         >     >
        >     >         >     >
        >     >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
        >     >     <luke@code-house.org <ma...@code-house.org>>:
        >     >         >     >
        >     >         >     >     Hey Niclas,
        >     >         >     >     I realized how old the old things are when I started
        >     >     preparing
        >     >         >     >     automation training for mere mortals and got into
        >     >     history of frames and
        >     >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
        >     >     older than I. ;-)
        >     >         >     >     
        >     >         >     >     Getting back to the point - yes. I been thinking how
        >     >     to address the byte
        >     >         >     >     order in effective way. Here are two approaches I
        >     >     have for now:
        >     >         >     >     
        >     >         >     >     A) My initial attempt is just a temporary buffer
        >     >     which is then written
        >     >         >     >     in reverse order to caller. For reading it is
        >     >     similar - just getting N
        >     >         >     >     bytes in reversed order. The hard part is.. knowing
        >     >     N. I had to add a
        >     >         >     >     static calculation in order to allocate valid buffer
        >     >     sizes. I tend to
        >     >         >     >     work but I'm not happy with this approach cause it
        >     >     involves additional work.
        >     >         >     >     B) Second idea I've got is really simple and relies
        >     >     on code generation.
        >     >         >     >     We know in which order fields are coming. Here I'm
        >     >     referring to a State
        >     >         >     >     field which is just bunch of bits. If we would group
        >     >     fields in bytes and
        >     >         >     >     generate code in reverse order then it has chance to
        >     >     work. Requirement
        >     >         >     >     for that - ability to know basic field sizes upfront.
        >     >         >     >     C) Try to combine above with bit-io or
        >     >     Read/WriteBuffers as these are
        >     >         >     >     places which know actual position and state of
        >     >     buffers which are being
        >     >         >     >     read/written.
        >     >         >     >     
        >     >         >     >     Now, getting to two cases which are a problem.
        >     >     CommandId and State. So
        >     >         >     >     with command id situation is simple as it is
        >     >     declared as enum and it is
        >     >         >     >     read as uint. We know size upfront and can generate
        >     >     valid method call
        >     >         >     >     (readIntLE).
        >     >         >     >     [enum uint 16 little endian 'CommandId'
        >     >         >     >         ['0x00' INVALID]
        >     >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
        >     >         >     >         ['0x02' ADS_READ]
        >     >         >     >         ['0x03' ADS_WRITE]
        >     >         >     >         ['0x04' ADS_READ_STATE]
        >     >         >     >         ['0x05' ADS_WRITE_CONTROL]
        >     >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
        >     >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
        >     >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
        >     >         >     >         ['0x09' ADS_READ_WRITE]
        >     >         >     >     ]
        >     >         >     >     
        >     >         >     >     Second candidate is what I'm stuck right now sniping
        >     >     next cycles of
        >     >         >     >     problems. So in case of State we have complex type
        >     >     composed from 2
        >     >         >     >     bytes. A note here - instead of two bytes we might
        >     >     have a variable
        >     >         >     >     length type which includes array or other variable
        >     >     section.
        >     >         >     >     [type little endian 'State'
        >     >         >     >         [simple     bit 'broadcast'             ]
        >     >         >     >         [reserved   int 7 '0x0'                 ]
        >     >         >     >         [simple     bit 'initCommand'           ]
        >     >         >     >         [simple     bit 'updCommand'            ]
        >     >         >     >         [simple     bit 'timestampAdded'        ]
        >     >         >     >         [simple     bit 'highPriorityCommand'   ]
        >     >         >     >         [simple     bit 'systemCommand'         ]
        >     >         >     >         [simple     bit 'adsCommand'            ]
        >     >         >     >         [simple     bit 'noReturn'              ]
        >     >         >     >         [simple     bit 'response'              ]
        >     >         >     >     ]
        >     >         >     >     
        >     >         >     >     The order of reading big endian encoded data to
        >     >     impose little endian
        >     >         >     >     shift would be (please correct me if I'm wrong):
        >     >         >     >     1) init
        >     >         >     >     2) udp
        >     >         >     >     3) add timestamp
        >     >         >     >     4) priority
        >     >         >     >     5) system
        >     >         >     >     6) ads
        >     >         >     >     7) noreturn
        >     >         >     >     8) response (end of byte 1)
        >     >         >     >     9) broadcast
        >     >         >     >     10) reserved (end of byte )
        >     >         >     >     We can do same trick for writing, by re-arranging
        >     >     fields. By this way we
        >     >         >     >     avoid any additional byte level operations.
        >     >         >     >     
        >     >         >     >     Overall trouble with generated driver is to declare
        >     >     "how much" bytes
        >     >         >     >     should be read and interpreted. We have precise size
        >     >     information at the
        >     >         >     >     runtime - due to length fields, we can leverage it
        >     >     at generation time,
        >     >         >     >     but then we won't be able to cover all cases.
        >     >         >     >     
        >     >         >     >     I would love to keep it simple and do not break
        >     >     things thus I need your
        >     >         >     >     advice on how to approach this problem in a valid way.
        >     >         >     >     
        >     >         >     >     Cheers,
        >     >         >     >     Łukasz
        >     >         >     >     
        >     >         >     >     
        >     >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
        >     >         >     >     > <anecdotal-rant>
        >     >         >     >     > For us who were around and shaping the protocols
        >     >     in the 1980s, and people
        >     >         >     >     > before us (and before standards like RS-232), a
        >     >     lot of the "specifications"
        >     >         >     >     > came out of "observation of implementation we
        >     >     managed to get to work",
        >     >         >     >     > rather than "implement this spec". A lot was due
        >     >     to extreme memory
        >     >         >     >     > constraints (in my case, multi-tasking operating
        >     >     system, serial protocol
        >     >         >     >     > 187kbps, interpreted programming language with
        >     >     floating point ops and user
        >     >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
        >     >     general lack of information,
        >     >         >     >     > like what other people were doing, sharing
        >     >     experiences and so on.
        >     >         >     >     >
        >     >         >     >     > And there were many "innovative" ways to squeeze
        >     >     just a little bit extra
        >     >         >     >     > out of the hardware, resulting in "hard to
        >     >     understand" consequences. Bit
        >     >         >     >     > packing was a typical one, multiple functions
        >     >     packed into a single byte.
        >     >         >     >     > Look at page 14 in
        >     >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
        >     >         >     >     > and read up on "UART Enahanced Mode", and we used
        >     >     this, i.e. 9 bits, no
        >     >         >     >     > parity and clever use of address and mask to
        >     >     create a slave-to-slave direct
        >     >         >     >     > protocol, where the master's role was to signal
        >     >     which slave "owned" the
        >     >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
        >     >     protocol was about 1kB
        >     >         >     >     > ROM) and something like 150 bytes RAM for comm
        >     >     protocol.
        >     >         >     >     >
        >     >         >     >     > Could you implement a compatible device to this
        >     >     with PLC4X and modern
        >     >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
        >     >     but bit-banging is needed
        >     >         >     >     > to support the 9bit data (+start and stop bits)
        >     >     and an awful lot of CPU
        >     >         >     >     > cycles on something that was automatic on one of
        >     >     the slowest long-lived
        >     >         >     >     > microcontroller ever.
        >     >         >     >     > </anecdotal-rant>
        >     >         >     >     >
        >     >         >     >     > My point was only to highlight that some of the
        >     >     strange things you see in
        >     >         >     >     > protocols today, have its roots in
        >     >     pre-standardization days. Today no one
        >     >         >     >     > would go down that route, because the hardware
        >     >     cost nothing now (8031  +
        >     >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
        >     >     ~$50 in 1983's currency) and
        >     >         >     >     > longevity of software is more important.
        >     >         >     >     >
        >     >         >     >     > Cheers
        >     >         >     >     > Niclas
        >     >         >     >     >
        >     >         >     >     >
        >     >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
        >     >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
        >     >         >     >     > wrote:
        >     >         >     >     >
        >     >         >     >     >> Hi Lukasz,
        >     >         >     >     >>
        >     >         >     >     >> I think it really gets tricky when using BE and
        >     >     having some byte-odd-sizes
        >     >         >     >     >> ... I remember in the Firmata protocol there were
        >     >     some bitmasks and then 10
        >     >         >     >     >> bit uint as BE ... not it really got tricky as
        >     >     the specs were written from
        >     >         >     >     >> a point of view: You read 16 bits BE and then the
        >     >     first6 bits mean XYZ
        >     >         >     >     >> instead of describing how the bits actually
        >     >     travel over the wire.
        >     >         >     >     >>
        >     >         >     >     >> Chris
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
        >     >     <luke@code-house.org <ma...@code-house.org>>:
        >     >         >     >     >>
        >     >         >     >     >>     I've made some progress with topic by
        >     >     modyfing mspec and allowing
        >     >         >     >     >>     'little endian' flag on fields. This moved me
        >     >     further to next issue -
        >     >         >     >     >>     which is whole type encoded little endian.
        >     >         >     >     >>
        >     >         >     >     >>     In ADS driver such type is State, which has 2
        >     >     bytes and uses 8 bits for
        >     >         >     >     >>     various flags.
        >     >         >     >     >>     There are two cases which require different
        >     >     approach - reading and
        >     >         >     >     >>     writing. So for reading we need to swap N
        >     >     bytes based on type length.
        >     >         >     >     >>     For writing we need to alocate buffer for N
        >     >     bytes and swap them before
        >     >         >     >     >>     writing.
        >     >         >     >     >>
        >     >         >     >     >>     I am stuck now with freemaker templates and
        >     >     bit-io.
        >     >         >     >     >>
        >     >         >     >     >>     Cheers,
        >     >         >     >     >>     Łukasz
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
        >     >         >     >     >>     > I am doing some tests of ADS serialization.
        >     >         >     >     >>     >
        >     >         >     >     >>     > I've run into some troubles with payload
        >     >     which is generated with new
        >     >         >     >     >>     > driver. I'm not sure if that's my fault or
        >     >     generated code.
        >     >         >     >     >>     >
        >     >         >     >     >>     > I did a verification of what Wireshark
        >     >     shows and how ads structures
        >     >         >     >     >> are
        >     >         >     >     >>     > parsed. There is a gap I think. For example
        >     >     ams port number 1000
        >     >         >     >     >>     > (0x1027) is read as 4135.
        >     >         >     >     >>     >
        >     >         >     >     >>     > Obviously I used wrong structures while
        >     >     implementing protocol logic
        >     >         >     >     >> in
        >     >         >     >     >>     > first place, but now I am uncertain of how
        >     >     fields are encoded. How we
        >     >         >     >     >>     > mark field as little endian when rest of
        >     >     payload is big endian? Do we
        >     >         >     >     >>     > have `uint_le`?
        >     >         >     >     >>     >
        >     >         >     >     >>     > As far I remember route creation logic I
        >     >     was tracking last week used
        >     >         >     >     >>     > combination of LE and BE.
        >     >         >     >     >>     >
        >     >         >     >     >>     > Best regards,
        >     >         >     >     >>     > Łukasz
        >     >         >     >     >>     >
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >>
        >     >         >     >     >
        >     >         >     >     
        >     >         >     >
        >     >         >     
        >     >         >
        >     > 
        >     
        >

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz,

I wasn't suggesting to look how WireShark decodes things. Cause I would assume that they read a short in little endian and then (on the rearranged bytes) apply some bit checks. 

I was more suggesting that you look into the bytes Wireshark shows you and to check their exact position.

When implementing drivers, I used the Mac's Calculator tool quite often to see the bits for a given byte value.

In your example I can't see any need for endianness at all ... it's all just bits and one empty block which forms a full byte together with the first bit.
As this block doesn't cross any byte boundaries I can't see any problems with this.

Assuming I understand what you are implying with 

[type little endian 'State'
        [simple     bit 'broadcast'             ]
        [reserved   int 7 '0x0'                 ]
        [simple     bit 'initCommand'           ]
        [simple     bit 'updCommand'            ]
        [simple     bit 'timestampAdded'        ]
        [simple     bit 'highPriorityCommand'   ]
        [simple     bit 'systemCommand'         ]
        [simple     bit 'adsCommand'            ]
        [simple     bit 'noReturn'              ]
        [simple     bit 'response'              ]
    ]

Then it should be equal to:

[type 'State'
        [simple     bit 'initCommand'           ]
        [simple     bit 'updCommand'            ]
        [simple     bit 'timestampAdded'        ]
        [simple     bit 'highPriorityCommand'   ]
        [simple     bit 'systemCommand'         ]
        [simple     bit 'adsCommand'            ]
        [simple     bit 'noReturn'              ]
        [simple     bit 'response'              ]
        [simple     bit 'broadcast'             ]
        [reserved   int 7 '0x0'                 ]
    ]

Which also seems to be the exact order how the bits go over the wire ... I mean ... it doesn't make any sense to put the 7 empty bits in the middle.

Chris


Am 30.04.20, 22:42 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    Hey Chris and others,
    I let this topic to dust and freeze a bit, however I would like to lift
    it up again. Since the 0.7 release is now in discussion and ADS is not
    ported yet I believe it is reasonable to validate this problem.

    As you suggested I've checked wireshark sources to see how they parse
    state fragment of AMS header. While I can not read C properly and was
    unable to locate actual reading logic, I do see they care about endianness:
    https://github.com/wireshark/wireshark/blob/5cf3fd03f1538b37704d83b6c38b8223f9036108/plugins/epan/ethercat/packet-ams.c#L429

    Their frame traversal goes in following order:
    - response
    - no return
    - ads cmd
    - system cmd
    - high priority
    - timestamp add
    - udp
    - init cmd
    - broadcast

    When I've tried to model mspec with respect to little endian order, I
    end up with following field order:
    - init cmd
    - udp
    - timestamp add
    - high priority
    - system cmd
    - ads cmd
    - no return
    - response
    - [reserved]
    - broadcast

    I haven't seen ADS protocol book, my main source of information is
    infosys which is rather short in this area. I doubt if they would
    describe fields in above order.

    My initial attempts were similar to what you said - to read the value
    and then shift bytes. With your feedback I started to look for better
    ways to solve it. With my last attempt I solved the problem by
    re-adjusting fields during code generation, so they appear in a valid
    byte order. Beside additional flag at the mspec there is a need for this
    helper:
    build-utils/protocol-base-mspec/src/main/java/org/apache/plc4x/plugins/codegenerator/language/mspec/model/definitions/DefaultComplexTypeDefinition.java

    In above commit the State declaration in the mspec looks like this:
    [type little endian 'State'
        [simple     bit 'broadcast'             ]
        [reserved   int 7 '0x0'                 ]
        [simple     bit 'initCommand'           ]
        [simple     bit 'updCommand'            ]
        [simple     bit 'timestampAdded'        ]
        [simple     bit 'highPriorityCommand'   ]
        [simple     bit 'systemCommand'         ]
        [simple     bit 'adsCommand'            ]
        [simple     bit 'noReturn'              ]
        [simple     bit 'response'              ]
    ]

    Maybe it is not in line with wireshark, but it is consistent and:
    1) requires minor change in code generator
    2) does not involve additional processing of data
    3) mspec is still easy to write
    4) type declaration is in line with wireshark output when reading from
    the left and not from the top. ;-)
    5) it can be adjusted to be 100% consistent with wireshark by changing
    helper logic. ;-)

    Best regards,
    Łukasz


    On 14.04.2020 21:25, Christofer Dutz wrote:
    > Hi Lukasz,
    > 
    > I see it differently.
    > 
    > If you say something is LE/BE then it should be consistent. 
    > 
    > I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:
    > 
    > "Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."
    > 
    > In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
    > Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 
    > 
    > I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
    > I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
    > In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 
    > 
    > So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.
    > 
    > So I would like to keep it simple, even if the developer will have to do a little more brain-work. 
    > 
    > I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.
    > 
    > What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)
    > 
    > Chris
    > 
    > 
    > Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    > 
    >     Hey Chris,
    >     Sorry for wrongly addressed mail, it was intended to go to mailing list.
    >     
    >     To the point - I think that what you propose with reordering fields
    >     manually shift the load from the tool to the developer. It will lead to
    >     further implementation errors caused by a need to reorder fields while
    >     creating spec. It will be inconsistent with rest of places where things
    >     are in "natural order" (big endian) except one or two types which
    >     collects bit flags which will have to be declared in different order.
    >     What I did in my naive implementation with "little endian" flag on the
    >     State type is re-arrangement of fields during generation time.
    >     It does not require any byte shifting at the runtime and keeps natural
    >     order of fields without the need for manual reordering of flags in mspec
    >     declaration.
    >     
    >     I understand that keeping mspec simple is an important priority, however
    >     doing it at the cost of developers who will use it to implement codecs
    >     will not help us in long term.
    >     I'm quite sure that sooner or later we will get a  protocol with mix of
    >     BE/LE fields.
    >     
    >     Cheers,
    >     Łukasz
    >     
    >     
    >     On 14.04.2020 14:57, Christofer Dutz wrote:
    >     > Hi Lukasz (adding dev@ to the recipients again)
    >     > 
    >     >  
    >     > 
    >     > you are not quite correct. Enum types do declare a base type where they
    >     > are declared … not where they are used.
    >     > 
    >     >  
    >     > 
    >     > If you have bit-fields then why don’t you define them as sequence of
    >     > bits? I really can’t understand why you need to flip anything.
    >     > 
    >     > I think you’re trying to do something similar to Sebastian when he
    >     > wanted to introduce some bit-mask types.
    >     > 
    >     >  
    >     > 
    >     > I could imagine that in the spec it might be written … these two bytes
    >     > are a bit-mask and the bit 1 means X bit 10 means Y, …
    >     > 
    >     > So why not define the bit fields in the sequence they are sent over the
    >     > wire?
    >     > 
    >     >  
    >     > 
    >     > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
    >     > 
    >     >  
    >     > 
    >     >  
    >     > 
    >     > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
    >     >  |1  |0
    >     > 
    >     > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
    >     > 
    >     >  
    >     > 
    >     > Then the order they are sent is:
    >     > 
    >     > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
    >     > 
    >     > So you have to declare the bit fields in the order:
    >     > 
    >     >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
    >     >  |E  | F  |G|H
    >     > 
    >     >  
    >     > 
    >     > So I don’t quite understand what you’re trying to flip here.
    >     > 
    >     >  
    >     > 
    >     > Chris
    >     > 
    >     >  
    >     > 
    >     > *Von: *Łukasz Dywicki <lu...@code-house.org>
    >     > *Datum: *Dienstag, 14. April 2020 um 13:26
    >     > *An: *Christofer Dutz <ch...@c-ware.de>
    >     > *Betreff: *Re: Big and littleendian fields in one mspec
    >     > 
    >     >  
    >     > 
    >     > I've added the endianness switch to the types in my experiments. These
    >     > are not needed for enums as these are read as int and then mapped to
    >     > constants. I referred to these as an example - enum does have a length
    >     > indication and type behind it.
    >     > 
    >     >  
    >     > 
    >     > Again, I will bring the issue - StateIO currently fails to do anything
    >     > useful due to BE/LE switch. Because this type uses single bits which
    >     > need to be flipped before reading or after writing otherwise they end up
    >     > in improper order. Out of 16 bits 9 are in use so we have a lot of
    >     > possible combinations (2^9), which seems too much to create an enum.
    >     > 
    >     > Since you didn't like my initial proposal and modifications (as they
    >     > might be redundant to what is available in the buffer API), how would
    >     > you handle State serialization without affecting mspec and without
    >     > further complication to code generation?
    >     > 
    >     >  
    >     > 
    >     > Best,
    >     > 
    >     > Łukasz
    >     > 
    >     >  
    >     > 
    >     > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
    >     > <ma...@c-ware.de>> napisał(a):
    >     > 
    >     >     Hi Lukasz,
    >     > 
    >     >     but we don't have a:
    >     >     [enum uint 16 little endian 'CommandId']
    >     >     But only a:
    >     >     [enum uint 16 'CommandId']
    >     > 
    >     >     And in your case I think perhaps the constants are not correct. So
    >     >     having a 16 bit uint will result in a 4 digit hex string:
    >     >     So if you are having problems in mapping them to your constants,
    >     >     perhaps the constants are in the wrong endianness.
    >     > 
    >     >     I have encountered (but can't recall where) that the Enum constants
    >     >     in a BE protocol were written down in LE notation.
    >     >     Of course the thing can't work then.
    >     > 
    >     >     So if for example you have the constant "1" in a BE protocol with a
    >     >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
    >     > 
    >     >     Chris
    >     > 
    >     > 
    >     > 
    >     > 
    >     >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
    >     >     <ma...@code-house.org>>:
    >     > 
    >     >         The legendary "two bytes" (shall we create a band with this
    >     >     name?!) are
    >     >         coming from unfortunate State type. I don't mind these to be an
    >     >         int/sint/uint or whathever - this is matter of interpretation.
    >     > 
    >     >         If you could please check again my earlier messages you will find
    >     >         struggle I have which is - how much given type takes.
    >     > 
    >     >         We have that for enums
    >     >         [enum uint 16 little endian 'CommandId']
    >     > 
    >     >         but we don't have that for complex types
    >     >         [type 'State']
    >     > 
    >     >         This leads to situation that we don't know how to read and interpret
    >     >         incoming byte sequence or how to properly write it back to stream.
    >     > 
    >     >         It would be great if we could limit the issue just to
    >     >         // read
    >     >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
    >     >     true->LE
    >     >         // write
    >     >         wio = new WriteBuffer(2);
    >     >         StateIO.staticSerialize(wio, new State(...));
    >     >         io.writeUnsignedInt(8, wio.read, true); // true->LE
    >     > 
    >     >         However to do that first we need to know that State is a uint 16
    >     >     LE and
    >     >         then do a lot of mambo-jambo in code generators to cope with
    >     >     that. :-)
    >     > 
    >     >         Best,
    >     >         Łukasz
    >     > 
    >     > 
    >     >         On 14.04.2020 12:20, Christofer Dutz wrote:
    >     >         > Oh gee ... I totally remember having exactly the same
    >     >     discussion with Sebastian about "bytes" ...
    >     >         >
    >     >         > The problem is there is generally no "byte" "int" and "uint"
    >     >     because a byte = 8 bits is either signed or unsigned. That should
    >     >     the 3rd option be? Perhaps-signed?
    >     >         > So generally I use "uint 8" for what you refer to as a "byte 8".
    >     >         >
    >     >         > Wo what would be the difference between your "2 byte little
    >     >     endian" and an "uint 16" with a LE ReadBuffer?
    >     >         >
    >     >         > I think what you have to rid yourself of thinking of int as
    >     >     number and a byte not being a number.
    >     >         >
    >     >         > Chris
    >     >         >
    >     >         >
    >     >         >
    >     >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
    >     >     <luke@code-house.org <ma...@code-house.org>>:
    >     >         >
    >     >         >     Hey Christian,
    >     >         >     The problem I face is quite simple. State type in mspec i
    >     >     declared as a
    >     >         >     bunch of bits. Type length is fixed, however not available
    >     >     anywhere in
    >     >         >     mspec or code generator to re-arrange bytes upfront. All
    >     >     we have exposed
    >     >         >     at reader/write level is read/write bit.
    >     >         >     To be fair LE/BE support in read/write buffers is limited
    >     >     just to
    >     >         >     numbers. There is no such support for raw bytes or bits,
    >     >     cause for that
    >     >         >     you need to declare a length of LE/BE sequence.
    >     >         >     
    >     >         >     I would love if I could just declare State as '2 byte
    >     >     little endian' so
    >     >         >     it would be read properly upfront and parsed with no
    >     >     changes in
    >     >         >     generated code, however I'm not sure how to do it and
    >     >     where. That's why
    >     >         >     I'm playing with different things described in earlier mail.
    >     >         >     Since all type handling is general I am just afraid of
    >     >     more complicated
    >     >         >     scenarios where we have variable length structures such as
    >     >     arrays.
    >     >         >     
    >     >         >     Best regards,
    >     >         >     Łukasz
    >     >         >     
    >     >         >     
    >     >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
    >     >         >     > Hi Lukasz,
    >     >         >     >
    >     >         >     > I am not sure I am understanding the problems you are
    >     >     facing. We already have LE and BE protocols.
    >     >         >     > For example EIP is LE and the rest is generally BE. It
    >     >     seems that ADS/AMS is also LE.
    >     >         >     > mspec doesn't even know about endianness.
    >     >         >     >
    >     >         >     > Up till now the endianness doesn't have an effect on
    >     >     bit-fields or single-bit ints.
    >     >         >     > It only starts to affect if a field goes from one byte
    >     >     to the next, which is usually for (u)int and floating point values.
    >     >         >     >
    >     >         >     > That's why we have created the Read/WriteBuffers to set
    >     >     their endianness in the constructor.
    >     >         >     >
    >     >         >     > So if you're creating a driver for ADS/AMS which is LE,
    >     >     then you write the mspec according to the sequence the information
    >     >     is located in the transferred bytes and have the Read/WriteBuffer
    >     >     handle the endianness issue.
    >     >         >     >
    >     >         >     > I do see a problem when there are drivers that use mixed
    >     >     endianness, but we have still to encounter such a protocol.
    >     >         >     >
    >     >         >     > So I have to admit that I don't like any of the mspec
    >     >     changes you proposed, as I think you are just not using the tools we
    >     >     have the right way.
    >     >         >     >
    >     >         >     > Chris
    >     >         >     >
    >     >         >     >
    >     >         >     >
    >     >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
    >     >     <luke@code-house.org <ma...@code-house.org>>:
    >     >         >     >
    >     >         >     >     Hey Niclas,
    >     >         >     >     I realized how old the old things are when I started
    >     >     preparing
    >     >         >     >     automation training for mere mortals and got into
    >     >     history of frames and
    >     >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
    >     >     older than I. ;-)
    >     >         >     >     
    >     >         >     >     Getting back to the point - yes. I been thinking how
    >     >     to address the byte
    >     >         >     >     order in effective way. Here are two approaches I
    >     >     have for now:
    >     >         >     >     
    >     >         >     >     A) My initial attempt is just a temporary buffer
    >     >     which is then written
    >     >         >     >     in reverse order to caller. For reading it is
    >     >     similar - just getting N
    >     >         >     >     bytes in reversed order. The hard part is.. knowing
    >     >     N. I had to add a
    >     >         >     >     static calculation in order to allocate valid buffer
    >     >     sizes. I tend to
    >     >         >     >     work but I'm not happy with this approach cause it
    >     >     involves additional work.
    >     >         >     >     B) Second idea I've got is really simple and relies
    >     >     on code generation.
    >     >         >     >     We know in which order fields are coming. Here I'm
    >     >     referring to a State
    >     >         >     >     field which is just bunch of bits. If we would group
    >     >     fields in bytes and
    >     >         >     >     generate code in reverse order then it has chance to
    >     >     work. Requirement
    >     >         >     >     for that - ability to know basic field sizes upfront.
    >     >         >     >     C) Try to combine above with bit-io or
    >     >     Read/WriteBuffers as these are
    >     >         >     >     places which know actual position and state of
    >     >     buffers which are being
    >     >         >     >     read/written.
    >     >         >     >     
    >     >         >     >     Now, getting to two cases which are a problem.
    >     >     CommandId and State. So
    >     >         >     >     with command id situation is simple as it is
    >     >     declared as enum and it is
    >     >         >     >     read as uint. We know size upfront and can generate
    >     >     valid method call
    >     >         >     >     (readIntLE).
    >     >         >     >     [enum uint 16 little endian 'CommandId'
    >     >         >     >         ['0x00' INVALID]
    >     >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
    >     >         >     >         ['0x02' ADS_READ]
    >     >         >     >         ['0x03' ADS_WRITE]
    >     >         >     >         ['0x04' ADS_READ_STATE]
    >     >         >     >         ['0x05' ADS_WRITE_CONTROL]
    >     >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >     >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >     >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >     >         >     >         ['0x09' ADS_READ_WRITE]
    >     >         >     >     ]
    >     >         >     >     
    >     >         >     >     Second candidate is what I'm stuck right now sniping
    >     >     next cycles of
    >     >         >     >     problems. So in case of State we have complex type
    >     >     composed from 2
    >     >         >     >     bytes. A note here - instead of two bytes we might
    >     >     have a variable
    >     >         >     >     length type which includes array or other variable
    >     >     section.
    >     >         >     >     [type little endian 'State'
    >     >         >     >         [simple     bit 'broadcast'             ]
    >     >         >     >         [reserved   int 7 '0x0'                 ]
    >     >         >     >         [simple     bit 'initCommand'           ]
    >     >         >     >         [simple     bit 'updCommand'            ]
    >     >         >     >         [simple     bit 'timestampAdded'        ]
    >     >         >     >         [simple     bit 'highPriorityCommand'   ]
    >     >         >     >         [simple     bit 'systemCommand'         ]
    >     >         >     >         [simple     bit 'adsCommand'            ]
    >     >         >     >         [simple     bit 'noReturn'              ]
    >     >         >     >         [simple     bit 'response'              ]
    >     >         >     >     ]
    >     >         >     >     
    >     >         >     >     The order of reading big endian encoded data to
    >     >     impose little endian
    >     >         >     >     shift would be (please correct me if I'm wrong):
    >     >         >     >     1) init
    >     >         >     >     2) udp
    >     >         >     >     3) add timestamp
    >     >         >     >     4) priority
    >     >         >     >     5) system
    >     >         >     >     6) ads
    >     >         >     >     7) noreturn
    >     >         >     >     8) response (end of byte 1)
    >     >         >     >     9) broadcast
    >     >         >     >     10) reserved (end of byte )
    >     >         >     >     We can do same trick for writing, by re-arranging
    >     >     fields. By this way we
    >     >         >     >     avoid any additional byte level operations.
    >     >         >     >     
    >     >         >     >     Overall trouble with generated driver is to declare
    >     >     "how much" bytes
    >     >         >     >     should be read and interpreted. We have precise size
    >     >     information at the
    >     >         >     >     runtime - due to length fields, we can leverage it
    >     >     at generation time,
    >     >         >     >     but then we won't be able to cover all cases.
    >     >         >     >     
    >     >         >     >     I would love to keep it simple and do not break
    >     >     things thus I need your
    >     >         >     >     advice on how to approach this problem in a valid way.
    >     >         >     >     
    >     >         >     >     Cheers,
    >     >         >     >     Łukasz
    >     >         >     >     
    >     >         >     >     
    >     >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >     >         >     >     > <anecdotal-rant>
    >     >         >     >     > For us who were around and shaping the protocols
    >     >     in the 1980s, and people
    >     >         >     >     > before us (and before standards like RS-232), a
    >     >     lot of the "specifications"
    >     >         >     >     > came out of "observation of implementation we
    >     >     managed to get to work",
    >     >         >     >     > rather than "implement this spec". A lot was due
    >     >     to extreme memory
    >     >         >     >     > constraints (in my case, multi-tasking operating
    >     >     system, serial protocol
    >     >         >     >     > 187kbps, interpreted programming language with
    >     >     floating point ops and user
    >     >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
    >     >     general lack of information,
    >     >         >     >     > like what other people were doing, sharing
    >     >     experiences and so on.
    >     >         >     >     >
    >     >         >     >     > And there were many "innovative" ways to squeeze
    >     >     just a little bit extra
    >     >         >     >     > out of the hardware, resulting in "hard to
    >     >     understand" consequences. Bit
    >     >         >     >     > packing was a typical one, multiple functions
    >     >     packed into a single byte.
    >     >         >     >     > Look at page 14 in
    >     >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >     >         >     >     > and read up on "UART Enahanced Mode", and we used
    >     >     this, i.e. 9 bits, no
    >     >         >     >     > parity and clever use of address and mask to
    >     >     create a slave-to-slave direct
    >     >         >     >     > protocol, where the master's role was to signal
    >     >     which slave "owned" the
    >     >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
    >     >     protocol was about 1kB
    >     >         >     >     > ROM) and something like 150 bytes RAM for comm
    >     >     protocol.
    >     >         >     >     >
    >     >         >     >     > Could you implement a compatible device to this
    >     >     with PLC4X and modern
    >     >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
    >     >     but bit-banging is needed
    >     >         >     >     > to support the 9bit data (+start and stop bits)
    >     >     and an awful lot of CPU
    >     >         >     >     > cycles on something that was automatic on one of
    >     >     the slowest long-lived
    >     >         >     >     > microcontroller ever.
    >     >         >     >     > </anecdotal-rant>
    >     >         >     >     >
    >     >         >     >     > My point was only to highlight that some of the
    >     >     strange things you see in
    >     >         >     >     > protocols today, have its roots in
    >     >     pre-standardization days. Today no one
    >     >         >     >     > would go down that route, because the hardware
    >     >     cost nothing now (8031  +
    >     >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
    >     >     ~$50 in 1983's currency) and
    >     >         >     >     > longevity of software is more important.
    >     >         >     >     >
    >     >         >     >     > Cheers
    >     >         >     >     > Niclas
    >     >         >     >     >
    >     >         >     >     >
    >     >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
    >     >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
    >     >         >     >     > wrote:
    >     >         >     >     >
    >     >         >     >     >> Hi Lukasz,
    >     >         >     >     >>
    >     >         >     >     >> I think it really gets tricky when using BE and
    >     >     having some byte-odd-sizes
    >     >         >     >     >> ... I remember in the Firmata protocol there were
    >     >     some bitmasks and then 10
    >     >         >     >     >> bit uint as BE ... not it really got tricky as
    >     >     the specs were written from
    >     >         >     >     >> a point of view: You read 16 bits BE and then the
    >     >     first6 bits mean XYZ
    >     >         >     >     >> instead of describing how the bits actually
    >     >     travel over the wire.
    >     >         >     >     >>
    >     >         >     >     >> Chris
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
    >     >     <luke@code-house.org <ma...@code-house.org>>:
    >     >         >     >     >>
    >     >         >     >     >>     I've made some progress with topic by
    >     >     modyfing mspec and allowing
    >     >         >     >     >>     'little endian' flag on fields. This moved me
    >     >     further to next issue -
    >     >         >     >     >>     which is whole type encoded little endian.
    >     >         >     >     >>
    >     >         >     >     >>     In ADS driver such type is State, which has 2
    >     >     bytes and uses 8 bits for
    >     >         >     >     >>     various flags.
    >     >         >     >     >>     There are two cases which require different
    >     >     approach - reading and
    >     >         >     >     >>     writing. So for reading we need to swap N
    >     >     bytes based on type length.
    >     >         >     >     >>     For writing we need to alocate buffer for N
    >     >     bytes and swap them before
    >     >         >     >     >>     writing.
    >     >         >     >     >>
    >     >         >     >     >>     I am stuck now with freemaker templates and
    >     >     bit-io.
    >     >         >     >     >>
    >     >         >     >     >>     Cheers,
    >     >         >     >     >>     Łukasz
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >     >         >     >     >>     > I am doing some tests of ADS serialization.
    >     >         >     >     >>     >
    >     >         >     >     >>     > I've run into some troubles with payload
    >     >     which is generated with new
    >     >         >     >     >>     > driver. I'm not sure if that's my fault or
    >     >     generated code.
    >     >         >     >     >>     >
    >     >         >     >     >>     > I did a verification of what Wireshark
    >     >     shows and how ads structures
    >     >         >     >     >> are
    >     >         >     >     >>     > parsed. There is a gap I think. For example
    >     >     ams port number 1000
    >     >         >     >     >>     > (0x1027) is read as 4135.
    >     >         >     >     >>     >
    >     >         >     >     >>     > Obviously I used wrong structures while
    >     >     implementing protocol logic
    >     >         >     >     >> in
    >     >         >     >     >>     > first place, but now I am uncertain of how
    >     >     fields are encoded. How we
    >     >         >     >     >>     > mark field as little endian when rest of
    >     >     payload is big endian? Do we
    >     >         >     >     >>     > have `uint_le`?
    >     >         >     >     >>     >
    >     >         >     >     >>     > As far I remember route creation logic I
    >     >     was tracking last week used
    >     >         >     >     >>     > combination of LE and BE.
    >     >         >     >     >>     >
    >     >         >     >     >>     > Best regards,
    >     >         >     >     >>     > Łukasz
    >     >         >     >     >>     >
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >>
    >     >         >     >     >
    >     >         >     >     
    >     >         >     >
    >     >         >     
    >     >         >
    >     > 
    >     
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

Hey Chris and others,
I let this topic to dust and freeze a bit, however I would like to lift
it up again. Since the 0.7 release is now in discussion and ADS is not
ported yet I believe it is reasonable to validate this problem.

As you suggested I've checked wireshark sources to see how they parse
state fragment of AMS header. While I can not read C properly and was
unable to locate actual reading logic, I do see they care about endianness:
https://github.com/wireshark/wireshark/blob/5cf3fd03f1538b37704d83b6c38b8223f9036108/plugins/epan/ethercat/packet-ams.c#L429

Their frame traversal goes in following order:
- response
- no return
- ads cmd
- system cmd
- high priority
- timestamp add
- udp
- init cmd
- broadcast

When I've tried to model mspec with respect to little endian order, I
end up with following field order:
- init cmd
- udp
- timestamp add
- high priority
- system cmd
- ads cmd
- no return
- response
- [reserved]
- broadcast

I haven't seen ADS protocol book, my main source of information is
infosys which is rather short in this area. I doubt if they would
describe fields in above order.

My initial attempts were similar to what you said - to read the value
and then shift bytes. With your feedback I started to look for better
ways to solve it. With my last attempt I solved the problem by
re-adjusting fields during code generation, so they appear in a valid
byte order. Beside additional flag at the mspec there is a need for this
helper:
build-utils/protocol-base-mspec/src/main/java/org/apache/plc4x/plugins/codegenerator/language/mspec/model/definitions/DefaultComplexTypeDefinition.java

In above commit the State declaration in the mspec looks like this:
[type little endian 'State'
    [simple     bit 'broadcast'             ]
    [reserved   int 7 '0x0'                 ]
    [simple     bit 'initCommand'           ]
    [simple     bit 'updCommand'            ]
    [simple     bit 'timestampAdded'        ]
    [simple     bit 'highPriorityCommand'   ]
    [simple     bit 'systemCommand'         ]
    [simple     bit 'adsCommand'            ]
    [simple     bit 'noReturn'              ]
    [simple     bit 'response'              ]
]

Maybe it is not in line with wireshark, but it is consistent and:
1) requires minor change in code generator
2) does not involve additional processing of data
3) mspec is still easy to write
4) type declaration is in line with wireshark output when reading from
the left and not from the top. ;-)
5) it can be adjusted to be 100% consistent with wireshark by changing
helper logic. ;-)

Best regards,
Łukasz


On 14.04.2020 21:25, Christofer Dutz wrote:
> Hi Lukasz,
> 
> I see it differently.
> 
> If you say something is LE/BE then it should be consistent. 
> 
> I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:
> 
> "Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."
> 
> In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
> Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 
> 
> I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
> I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
> In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 
> 
> So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.
> 
> So I would like to keep it simple, even if the developer will have to do a little more brain-work. 
> 
> I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.
> 
> What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)
> 
> Chris
> 
> 
> Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
> 
>     Hey Chris,
>     Sorry for wrongly addressed mail, it was intended to go to mailing list.
>     
>     To the point - I think that what you propose with reordering fields
>     manually shift the load from the tool to the developer. It will lead to
>     further implementation errors caused by a need to reorder fields while
>     creating spec. It will be inconsistent with rest of places where things
>     are in "natural order" (big endian) except one or two types which
>     collects bit flags which will have to be declared in different order.
>     What I did in my naive implementation with "little endian" flag on the
>     State type is re-arrangement of fields during generation time.
>     It does not require any byte shifting at the runtime and keeps natural
>     order of fields without the need for manual reordering of flags in mspec
>     declaration.
>     
>     I understand that keeping mspec simple is an important priority, however
>     doing it at the cost of developers who will use it to implement codecs
>     will not help us in long term.
>     I'm quite sure that sooner or later we will get a  protocol with mix of
>     BE/LE fields.
>     
>     Cheers,
>     Łukasz
>     
>     
>     On 14.04.2020 14:57, Christofer Dutz wrote:
>     > Hi Lukasz (adding dev@ to the recipients again)
>     > 
>     >  
>     > 
>     > you are not quite correct. Enum types do declare a base type where they
>     > are declared … not where they are used.
>     > 
>     >  
>     > 
>     > If you have bit-fields then why don’t you define them as sequence of
>     > bits? I really can’t understand why you need to flip anything.
>     > 
>     > I think you’re trying to do something similar to Sebastian when he
>     > wanted to introduce some bit-mask types.
>     > 
>     >  
>     > 
>     > I could imagine that in the spec it might be written … these two bytes
>     > are a bit-mask and the bit 1 means X bit 10 means Y, …
>     > 
>     > So why not define the bit fields in the sequence they are sent over the
>     > wire?
>     > 
>     >  
>     > 
>     > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
>     > 
>     >  
>     > 
>     >  
>     > 
>     > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
>     >  |1  |0
>     > 
>     > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
>     > 
>     >  
>     > 
>     > Then the order they are sent is:
>     > 
>     > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
>     > 
>     > So you have to declare the bit fields in the order:
>     > 
>     >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
>     >  |E  | F  |G|H
>     > 
>     >  
>     > 
>     > So I don’t quite understand what you’re trying to flip here.
>     > 
>     >  
>     > 
>     > Chris
>     > 
>     >  
>     > 
>     > *Von: *Łukasz Dywicki <lu...@code-house.org>
>     > *Datum: *Dienstag, 14. April 2020 um 13:26
>     > *An: *Christofer Dutz <ch...@c-ware.de>
>     > *Betreff: *Re: Big and littleendian fields in one mspec
>     > 
>     >  
>     > 
>     > I've added the endianness switch to the types in my experiments. These
>     > are not needed for enums as these are read as int and then mapped to
>     > constants. I referred to these as an example - enum does have a length
>     > indication and type behind it.
>     > 
>     >  
>     > 
>     > Again, I will bring the issue - StateIO currently fails to do anything
>     > useful due to BE/LE switch. Because this type uses single bits which
>     > need to be flipped before reading or after writing otherwise they end up
>     > in improper order. Out of 16 bits 9 are in use so we have a lot of
>     > possible combinations (2^9), which seems too much to create an enum.
>     > 
>     > Since you didn't like my initial proposal and modifications (as they
>     > might be redundant to what is available in the buffer API), how would
>     > you handle State serialization without affecting mspec and without
>     > further complication to code generation?
>     > 
>     >  
>     > 
>     > Best,
>     > 
>     > Łukasz
>     > 
>     >  
>     > 
>     > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
>     > <ma...@c-ware.de>> napisał(a):
>     > 
>     >     Hi Lukasz,
>     > 
>     >     but we don't have a:
>     >     [enum uint 16 little endian 'CommandId']
>     >     But only a:
>     >     [enum uint 16 'CommandId']
>     > 
>     >     And in your case I think perhaps the constants are not correct. So
>     >     having a 16 bit uint will result in a 4 digit hex string:
>     >     So if you are having problems in mapping them to your constants,
>     >     perhaps the constants are in the wrong endianness.
>     > 
>     >     I have encountered (but can't recall where) that the Enum constants
>     >     in a BE protocol were written down in LE notation.
>     >     Of course the thing can't work then.
>     > 
>     >     So if for example you have the constant "1" in a BE protocol with a
>     >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
>     > 
>     >     Chris
>     > 
>     > 
>     > 
>     > 
>     >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
>     >     <ma...@code-house.org>>:
>     > 
>     >         The legendary "two bytes" (shall we create a band with this
>     >     name?!) are
>     >         coming from unfortunate State type. I don't mind these to be an
>     >         int/sint/uint or whathever - this is matter of interpretation.
>     > 
>     >         If you could please check again my earlier messages you will find
>     >         struggle I have which is - how much given type takes.
>     > 
>     >         We have that for enums
>     >         [enum uint 16 little endian 'CommandId']
>     > 
>     >         but we don't have that for complex types
>     >         [type 'State']
>     > 
>     >         This leads to situation that we don't know how to read and interpret
>     >         incoming byte sequence or how to properly write it back to stream.
>     > 
>     >         It would be great if we could limit the issue just to
>     >         // read
>     >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
>     >     true->LE
>     >         // write
>     >         wio = new WriteBuffer(2);
>     >         StateIO.staticSerialize(wio, new State(...));
>     >         io.writeUnsignedInt(8, wio.read, true); // true->LE
>     > 
>     >         However to do that first we need to know that State is a uint 16
>     >     LE and
>     >         then do a lot of mambo-jambo in code generators to cope with
>     >     that. :-)
>     > 
>     >         Best,
>     >         Łukasz
>     > 
>     > 
>     >         On 14.04.2020 12:20, Christofer Dutz wrote:
>     >         > Oh gee ... I totally remember having exactly the same
>     >     discussion with Sebastian about "bytes" ...
>     >         >
>     >         > The problem is there is generally no "byte" "int" and "uint"
>     >     because a byte = 8 bits is either signed or unsigned. That should
>     >     the 3rd option be? Perhaps-signed?
>     >         > So generally I use "uint 8" for what you refer to as a "byte 8".
>     >         >
>     >         > Wo what would be the difference between your "2 byte little
>     >     endian" and an "uint 16" with a LE ReadBuffer?
>     >         >
>     >         > I think what you have to rid yourself of thinking of int as
>     >     number and a byte not being a number.
>     >         >
>     >         > Chris
>     >         >
>     >         >
>     >         >
>     >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
>     >     <luke@code-house.org <ma...@code-house.org>>:
>     >         >
>     >         >     Hey Christian,
>     >         >     The problem I face is quite simple. State type in mspec i
>     >     declared as a
>     >         >     bunch of bits. Type length is fixed, however not available
>     >     anywhere in
>     >         >     mspec or code generator to re-arrange bytes upfront. All
>     >     we have exposed
>     >         >     at reader/write level is read/write bit.
>     >         >     To be fair LE/BE support in read/write buffers is limited
>     >     just to
>     >         >     numbers. There is no such support for raw bytes or bits,
>     >     cause for that
>     >         >     you need to declare a length of LE/BE sequence.
>     >         >     
>     >         >     I would love if I could just declare State as '2 byte
>     >     little endian' so
>     >         >     it would be read properly upfront and parsed with no
>     >     changes in
>     >         >     generated code, however I'm not sure how to do it and
>     >     where. That's why
>     >         >     I'm playing with different things described in earlier mail.
>     >         >     Since all type handling is general I am just afraid of
>     >     more complicated
>     >         >     scenarios where we have variable length structures such as
>     >     arrays.
>     >         >     
>     >         >     Best regards,
>     >         >     Łukasz
>     >         >     
>     >         >     
>     >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
>     >         >     > Hi Lukasz,
>     >         >     >
>     >         >     > I am not sure I am understanding the problems you are
>     >     facing. We already have LE and BE protocols.
>     >         >     > For example EIP is LE and the rest is generally BE. It
>     >     seems that ADS/AMS is also LE.
>     >         >     > mspec doesn't even know about endianness.
>     >         >     >
>     >         >     > Up till now the endianness doesn't have an effect on
>     >     bit-fields or single-bit ints.
>     >         >     > It only starts to affect if a field goes from one byte
>     >     to the next, which is usually for (u)int and floating point values.
>     >         >     >
>     >         >     > That's why we have created the Read/WriteBuffers to set
>     >     their endianness in the constructor.
>     >         >     >
>     >         >     > So if you're creating a driver for ADS/AMS which is LE,
>     >     then you write the mspec according to the sequence the information
>     >     is located in the transferred bytes and have the Read/WriteBuffer
>     >     handle the endianness issue.
>     >         >     >
>     >         >     > I do see a problem when there are drivers that use mixed
>     >     endianness, but we have still to encounter such a protocol.
>     >         >     >
>     >         >     > So I have to admit that I don't like any of the mspec
>     >     changes you proposed, as I think you are just not using the tools we
>     >     have the right way.
>     >         >     >
>     >         >     > Chris
>     >         >     >
>     >         >     >
>     >         >     >
>     >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
>     >     <luke@code-house.org <ma...@code-house.org>>:
>     >         >     >
>     >         >     >     Hey Niclas,
>     >         >     >     I realized how old the old things are when I started
>     >     preparing
>     >         >     >     automation training for mere mortals and got into
>     >     history of frames and
>     >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
>     >     older than I. ;-)
>     >         >     >     
>     >         >     >     Getting back to the point - yes. I been thinking how
>     >     to address the byte
>     >         >     >     order in effective way. Here are two approaches I
>     >     have for now:
>     >         >     >     
>     >         >     >     A) My initial attempt is just a temporary buffer
>     >     which is then written
>     >         >     >     in reverse order to caller. For reading it is
>     >     similar - just getting N
>     >         >     >     bytes in reversed order. The hard part is.. knowing
>     >     N. I had to add a
>     >         >     >     static calculation in order to allocate valid buffer
>     >     sizes. I tend to
>     >         >     >     work but I'm not happy with this approach cause it
>     >     involves additional work.
>     >         >     >     B) Second idea I've got is really simple and relies
>     >     on code generation.
>     >         >     >     We know in which order fields are coming. Here I'm
>     >     referring to a State
>     >         >     >     field which is just bunch of bits. If we would group
>     >     fields in bytes and
>     >         >     >     generate code in reverse order then it has chance to
>     >     work. Requirement
>     >         >     >     for that - ability to know basic field sizes upfront.
>     >         >     >     C) Try to combine above with bit-io or
>     >     Read/WriteBuffers as these are
>     >         >     >     places which know actual position and state of
>     >     buffers which are being
>     >         >     >     read/written.
>     >         >     >     
>     >         >     >     Now, getting to two cases which are a problem.
>     >     CommandId and State. So
>     >         >     >     with command id situation is simple as it is
>     >     declared as enum and it is
>     >         >     >     read as uint. We know size upfront and can generate
>     >     valid method call
>     >         >     >     (readIntLE).
>     >         >     >     [enum uint 16 little endian 'CommandId'
>     >         >     >         ['0x00' INVALID]
>     >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
>     >         >     >         ['0x02' ADS_READ]
>     >         >     >         ['0x03' ADS_WRITE]
>     >         >     >         ['0x04' ADS_READ_STATE]
>     >         >     >         ['0x05' ADS_WRITE_CONTROL]
>     >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
>     >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
>     >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
>     >         >     >         ['0x09' ADS_READ_WRITE]
>     >         >     >     ]
>     >         >     >     
>     >         >     >     Second candidate is what I'm stuck right now sniping
>     >     next cycles of
>     >         >     >     problems. So in case of State we have complex type
>     >     composed from 2
>     >         >     >     bytes. A note here - instead of two bytes we might
>     >     have a variable
>     >         >     >     length type which includes array or other variable
>     >     section.
>     >         >     >     [type little endian 'State'
>     >         >     >         [simple     bit 'broadcast'             ]
>     >         >     >         [reserved   int 7 '0x0'                 ]
>     >         >     >         [simple     bit 'initCommand'           ]
>     >         >     >         [simple     bit 'updCommand'            ]
>     >         >     >         [simple     bit 'timestampAdded'        ]
>     >         >     >         [simple     bit 'highPriorityCommand'   ]
>     >         >     >         [simple     bit 'systemCommand'         ]
>     >         >     >         [simple     bit 'adsCommand'            ]
>     >         >     >         [simple     bit 'noReturn'              ]
>     >         >     >         [simple     bit 'response'              ]
>     >         >     >     ]
>     >         >     >     
>     >         >     >     The order of reading big endian encoded data to
>     >     impose little endian
>     >         >     >     shift would be (please correct me if I'm wrong):
>     >         >     >     1) init
>     >         >     >     2) udp
>     >         >     >     3) add timestamp
>     >         >     >     4) priority
>     >         >     >     5) system
>     >         >     >     6) ads
>     >         >     >     7) noreturn
>     >         >     >     8) response (end of byte 1)
>     >         >     >     9) broadcast
>     >         >     >     10) reserved (end of byte )
>     >         >     >     We can do same trick for writing, by re-arranging
>     >     fields. By this way we
>     >         >     >     avoid any additional byte level operations.
>     >         >     >     
>     >         >     >     Overall trouble with generated driver is to declare
>     >     "how much" bytes
>     >         >     >     should be read and interpreted. We have precise size
>     >     information at the
>     >         >     >     runtime - due to length fields, we can leverage it
>     >     at generation time,
>     >         >     >     but then we won't be able to cover all cases.
>     >         >     >     
>     >         >     >     I would love to keep it simple and do not break
>     >     things thus I need your
>     >         >     >     advice on how to approach this problem in a valid way.
>     >         >     >     
>     >         >     >     Cheers,
>     >         >     >     Łukasz
>     >         >     >     
>     >         >     >     
>     >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
>     >         >     >     > <anecdotal-rant>
>     >         >     >     > For us who were around and shaping the protocols
>     >     in the 1980s, and people
>     >         >     >     > before us (and before standards like RS-232), a
>     >     lot of the "specifications"
>     >         >     >     > came out of "observation of implementation we
>     >     managed to get to work",
>     >         >     >     > rather than "implement this spec". A lot was due
>     >     to extreme memory
>     >         >     >     > constraints (in my case, multi-tasking operating
>     >     system, serial protocol
>     >         >     >     > 187kbps, interpreted programming language with
>     >     floating point ops and user
>     >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
>     >     general lack of information,
>     >         >     >     > like what other people were doing, sharing
>     >     experiences and so on.
>     >         >     >     >
>     >         >     >     > And there were many "innovative" ways to squeeze
>     >     just a little bit extra
>     >         >     >     > out of the hardware, resulting in "hard to
>     >     understand" consequences. Bit
>     >         >     >     > packing was a typical one, multiple functions
>     >     packed into a single byte.
>     >         >     >     > Look at page 14 in
>     >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
>     >         >     >     > and read up on "UART Enahanced Mode", and we used
>     >     this, i.e. 9 bits, no
>     >         >     >     > parity and clever use of address and mask to
>     >     create a slave-to-slave direct
>     >         >     >     > protocol, where the master's role was to signal
>     >     which slave "owned" the
>     >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
>     >     protocol was about 1kB
>     >         >     >     > ROM) and something like 150 bytes RAM for comm
>     >     protocol.
>     >         >     >     >
>     >         >     >     > Could you implement a compatible device to this
>     >     with PLC4X and modern
>     >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
>     >     but bit-banging is needed
>     >         >     >     > to support the 9bit data (+start and stop bits)
>     >     and an awful lot of CPU
>     >         >     >     > cycles on something that was automatic on one of
>     >     the slowest long-lived
>     >         >     >     > microcontroller ever.
>     >         >     >     > </anecdotal-rant>
>     >         >     >     >
>     >         >     >     > My point was only to highlight that some of the
>     >     strange things you see in
>     >         >     >     > protocols today, have its roots in
>     >     pre-standardization days. Today no one
>     >         >     >     > would go down that route, because the hardware
>     >     cost nothing now (8031  +
>     >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
>     >     ~$50 in 1983's currency) and
>     >         >     >     > longevity of software is more important.
>     >         >     >     >
>     >         >     >     > Cheers
>     >         >     >     > Niclas
>     >         >     >     >
>     >         >     >     >
>     >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
>     >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
>     >         >     >     > wrote:
>     >         >     >     >
>     >         >     >     >> Hi Lukasz,
>     >         >     >     >>
>     >         >     >     >> I think it really gets tricky when using BE and
>     >     having some byte-odd-sizes
>     >         >     >     >> ... I remember in the Firmata protocol there were
>     >     some bitmasks and then 10
>     >         >     >     >> bit uint as BE ... not it really got tricky as
>     >     the specs were written from
>     >         >     >     >> a point of view: You read 16 bits BE and then the
>     >     first6 bits mean XYZ
>     >         >     >     >> instead of describing how the bits actually
>     >     travel over the wire.
>     >         >     >     >>
>     >         >     >     >> Chris
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
>     >     <luke@code-house.org <ma...@code-house.org>>:
>     >         >     >     >>
>     >         >     >     >>     I've made some progress with topic by
>     >     modyfing mspec and allowing
>     >         >     >     >>     'little endian' flag on fields. This moved me
>     >     further to next issue -
>     >         >     >     >>     which is whole type encoded little endian.
>     >         >     >     >>
>     >         >     >     >>     In ADS driver such type is State, which has 2
>     >     bytes and uses 8 bits for
>     >         >     >     >>     various flags.
>     >         >     >     >>     There are two cases which require different
>     >     approach - reading and
>     >         >     >     >>     writing. So for reading we need to swap N
>     >     bytes based on type length.
>     >         >     >     >>     For writing we need to alocate buffer for N
>     >     bytes and swap them before
>     >         >     >     >>     writing.
>     >         >     >     >>
>     >         >     >     >>     I am stuck now with freemaker templates and
>     >     bit-io.
>     >         >     >     >>
>     >         >     >     >>     Cheers,
>     >         >     >     >>     Łukasz
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>     >         >     >     >>     > I am doing some tests of ADS serialization.
>     >         >     >     >>     >
>     >         >     >     >>     > I've run into some troubles with payload
>     >     which is generated with new
>     >         >     >     >>     > driver. I'm not sure if that's my fault or
>     >     generated code.
>     >         >     >     >>     >
>     >         >     >     >>     > I did a verification of what Wireshark
>     >     shows and how ads structures
>     >         >     >     >> are
>     >         >     >     >>     > parsed. There is a gap I think. For example
>     >     ams port number 1000
>     >         >     >     >>     > (0x1027) is read as 4135.
>     >         >     >     >>     >
>     >         >     >     >>     > Obviously I used wrong structures while
>     >     implementing protocol logic
>     >         >     >     >> in
>     >         >     >     >>     > first place, but now I am uncertain of how
>     >     fields are encoded. How we
>     >         >     >     >>     > mark field as little endian when rest of
>     >     payload is big endian? Do we
>     >         >     >     >>     > have `uint_le`?
>     >         >     >     >>     >
>     >         >     >     >>     > As far I remember route creation logic I
>     >     was tracking last week used
>     >         >     >     >>     > combination of LE and BE.
>     >         >     >     >>     >
>     >         >     >     >>     > Best regards,
>     >         >     >     >>     > Łukasz
>     >         >     >     >>     >
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >>
>     >         >     >     >
>     >         >     >     
>     >         >     >
>     >         >     
>     >         >
>     > 
>     
>

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz,

I see it differently.

If you say something is LE/BE then it should be consistent. 

I for my part like it if the mspec is consistent with what I see in WireShark. Otherwise you would always have to do a sort of 2 step parsing:

"Read 4 bytes in BE and then on the rearranged byte sequence take the first 3 bits and call them X and the next bit after that Y and so on."

In this case you would also have to know how large the chunk is to read in order to know how to decode it's content.
Having a historically partitioned set of two 16 bit bit-fields would look completely different to a 32 bit bit-field. 

I don't like adding this extra level, even if it's probably the way protocols were implemented when lacking the ability to write individual bits.
I guess the usual process was to read a BE short (2 bytes) and to the resulting unsigned short value apply different logical or operations to get the information.
In PLC4X we can simply read any number of bits and don't have to twist our minds in which chunks I have to read things first. 

So I think the way you are proposing is probably the wider used version due to legacy restrictions, but also the more complicated way.

So I would like to keep it simple, even if the developer will have to do a little more brain-work. 

I mean all we are doing in this project is going the extra mile so others don't have to ... at least the ones contributing are doing that.

What do the others say to this? (Ideally ones that have already used mspec or ones that understand the problem)

Chris


Am 14.04.20, 15:46 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    Hey Chris,
    Sorry for wrongly addressed mail, it was intended to go to mailing list.
    
    To the point - I think that what you propose with reordering fields
    manually shift the load from the tool to the developer. It will lead to
    further implementation errors caused by a need to reorder fields while
    creating spec. It will be inconsistent with rest of places where things
    are in "natural order" (big endian) except one or two types which
    collects bit flags which will have to be declared in different order.
    What I did in my naive implementation with "little endian" flag on the
    State type is re-arrangement of fields during generation time.
    It does not require any byte shifting at the runtime and keeps natural
    order of fields without the need for manual reordering of flags in mspec
    declaration.
    
    I understand that keeping mspec simple is an important priority, however
    doing it at the cost of developers who will use it to implement codecs
    will not help us in long term.
    I'm quite sure that sooner or later we will get a  protocol with mix of
    BE/LE fields.
    
    Cheers,
    Łukasz
    
    
    On 14.04.2020 14:57, Christofer Dutz wrote:
    > Hi Lukasz (adding dev@ to the recipients again)
    > 
    >  
    > 
    > you are not quite correct. Enum types do declare a base type where they
    > are declared … not where they are used.
    > 
    >  
    > 
    > If you have bit-fields then why don’t you define them as sequence of
    > bits? I really can’t understand why you need to flip anything.
    > 
    > I think you’re trying to do something similar to Sebastian when he
    > wanted to introduce some bit-mask types.
    > 
    >  
    > 
    > I could imagine that in the spec it might be written … these two bytes
    > are a bit-mask and the bit 1 means X bit 10 means Y, …
    > 
    > So why not define the bit fields in the sequence they are sent over the
    > wire?
    > 
    >  
    > 
    > Assuming you have a bit field of 16 bits BE uint. If the spec now says:
    > 
    >  
    > 
    >  
    > 
    > Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
    >  |1  |0
    > 
    > Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
    > 
    >  
    > 
    > Then the order they are sent is:
    > 
    > 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
    > 
    > So you have to declare the bit fields in the order:
    > 
    >                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
    >  |E  | F  |G|H
    > 
    >  
    > 
    > So I don’t quite understand what you’re trying to flip here.
    > 
    >  
    > 
    > Chris
    > 
    >  
    > 
    > *Von: *Łukasz Dywicki <lu...@code-house.org>
    > *Datum: *Dienstag, 14. April 2020 um 13:26
    > *An: *Christofer Dutz <ch...@c-ware.de>
    > *Betreff: *Re: Big and littleendian fields in one mspec
    > 
    >  
    > 
    > I've added the endianness switch to the types in my experiments. These
    > are not needed for enums as these are read as int and then mapped to
    > constants. I referred to these as an example - enum does have a length
    > indication and type behind it.
    > 
    >  
    > 
    > Again, I will bring the issue - StateIO currently fails to do anything
    > useful due to BE/LE switch. Because this type uses single bits which
    > need to be flipped before reading or after writing otherwise they end up
    > in improper order. Out of 16 bits 9 are in use so we have a lot of
    > possible combinations (2^9), which seems too much to create an enum.
    > 
    > Since you didn't like my initial proposal and modifications (as they
    > might be redundant to what is available in the buffer API), how would
    > you handle State serialization without affecting mspec and without
    > further complication to code generation?
    > 
    >  
    > 
    > Best,
    > 
    > Łukasz
    > 
    >  
    > 
    > wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
    > <ma...@c-ware.de>> napisał(a):
    > 
    >     Hi Lukasz,
    > 
    >     but we don't have a:
    >     [enum uint 16 little endian 'CommandId']
    >     But only a:
    >     [enum uint 16 'CommandId']
    > 
    >     And in your case I think perhaps the constants are not correct. So
    >     having a 16 bit uint will result in a 4 digit hex string:
    >     So if you are having problems in mapping them to your constants,
    >     perhaps the constants are in the wrong endianness.
    > 
    >     I have encountered (but can't recall where) that the Enum constants
    >     in a BE protocol were written down in LE notation.
    >     Of course the thing can't work then.
    > 
    >     So if for example you have the constant "1" in a BE protocol with a
    >     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
    > 
    >     Chris
    > 
    > 
    > 
    > 
    >     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
    >     <ma...@code-house.org>>:
    > 
    >         The legendary "two bytes" (shall we create a band with this
    >     name?!) are
    >         coming from unfortunate State type. I don't mind these to be an
    >         int/sint/uint or whathever - this is matter of interpretation.
    > 
    >         If you could please check again my earlier messages you will find
    >         struggle I have which is - how much given type takes.
    > 
    >         We have that for enums
    >         [enum uint 16 little endian 'CommandId']
    > 
    >         but we don't have that for complex types
    >         [type 'State']
    > 
    >         This leads to situation that we don't know how to read and interpret
    >         incoming byte sequence or how to properly write it back to stream.
    > 
    >         It would be great if we could limit the issue just to
    >         // read
    >         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
    >     true->LE
    >         // write
    >         wio = new WriteBuffer(2);
    >         StateIO.staticSerialize(wio, new State(...));
    >         io.writeUnsignedInt(8, wio.read, true); // true->LE
    > 
    >         However to do that first we need to know that State is a uint 16
    >     LE and
    >         then do a lot of mambo-jambo in code generators to cope with
    >     that. :-)
    > 
    >         Best,
    >         Łukasz
    > 
    > 
    >         On 14.04.2020 12:20, Christofer Dutz wrote:
    >         > Oh gee ... I totally remember having exactly the same
    >     discussion with Sebastian about "bytes" ...
    >         >
    >         > The problem is there is generally no "byte" "int" and "uint"
    >     because a byte = 8 bits is either signed or unsigned. That should
    >     the 3rd option be? Perhaps-signed?
    >         > So generally I use "uint 8" for what you refer to as a "byte 8".
    >         >
    >         > Wo what would be the difference between your "2 byte little
    >     endian" and an "uint 16" with a LE ReadBuffer?
    >         >
    >         > I think what you have to rid yourself of thinking of int as
    >     number and a byte not being a number.
    >         >
    >         > Chris
    >         >
    >         >
    >         >
    >         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
    >     <luke@code-house.org <ma...@code-house.org>>:
    >         >
    >         >     Hey Christian,
    >         >     The problem I face is quite simple. State type in mspec i
    >     declared as a
    >         >     bunch of bits. Type length is fixed, however not available
    >     anywhere in
    >         >     mspec or code generator to re-arrange bytes upfront. All
    >     we have exposed
    >         >     at reader/write level is read/write bit.
    >         >     To be fair LE/BE support in read/write buffers is limited
    >     just to
    >         >     numbers. There is no such support for raw bytes or bits,
    >     cause for that
    >         >     you need to declare a length of LE/BE sequence.
    >         >     
    >         >     I would love if I could just declare State as '2 byte
    >     little endian' so
    >         >     it would be read properly upfront and parsed with no
    >     changes in
    >         >     generated code, however I'm not sure how to do it and
    >     where. That's why
    >         >     I'm playing with different things described in earlier mail.
    >         >     Since all type handling is general I am just afraid of
    >     more complicated
    >         >     scenarios where we have variable length structures such as
    >     arrays.
    >         >     
    >         >     Best regards,
    >         >     Łukasz
    >         >     
    >         >     
    >         >     On 14.04.2020 09:41, Christofer Dutz wrote:
    >         >     > Hi Lukasz,
    >         >     >
    >         >     > I am not sure I am understanding the problems you are
    >     facing. We already have LE and BE protocols.
    >         >     > For example EIP is LE and the rest is generally BE. It
    >     seems that ADS/AMS is also LE.
    >         >     > mspec doesn't even know about endianness.
    >         >     >
    >         >     > Up till now the endianness doesn't have an effect on
    >     bit-fields or single-bit ints.
    >         >     > It only starts to affect if a field goes from one byte
    >     to the next, which is usually for (u)int and floating point values.
    >         >     >
    >         >     > That's why we have created the Read/WriteBuffers to set
    >     their endianness in the constructor.
    >         >     >
    >         >     > So if you're creating a driver for ADS/AMS which is LE,
    >     then you write the mspec according to the sequence the information
    >     is located in the transferred bytes and have the Read/WriteBuffer
    >     handle the endianness issue.
    >         >     >
    >         >     > I do see a problem when there are drivers that use mixed
    >     endianness, but we have still to encounter such a protocol.
    >         >     >
    >         >     > So I have to admit that I don't like any of the mspec
    >     changes you proposed, as I think you are just not using the tools we
    >     have the right way.
    >         >     >
    >         >     > Chris
    >         >     >
    >         >     >
    >         >     >
    >         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
    >     <luke@code-house.org <ma...@code-house.org>>:
    >         >     >
    >         >     >     Hey Niclas,
    >         >     >     I realized how old the old things are when I started
    >     preparing
    >         >     >     automation training for mere mortals and got into
    >     history of frames and
    >         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
    >     older than I. ;-)
    >         >     >     
    >         >     >     Getting back to the point - yes. I been thinking how
    >     to address the byte
    >         >     >     order in effective way. Here are two approaches I
    >     have for now:
    >         >     >     
    >         >     >     A) My initial attempt is just a temporary buffer
    >     which is then written
    >         >     >     in reverse order to caller. For reading it is
    >     similar - just getting N
    >         >     >     bytes in reversed order. The hard part is.. knowing
    >     N. I had to add a
    >         >     >     static calculation in order to allocate valid buffer
    >     sizes. I tend to
    >         >     >     work but I'm not happy with this approach cause it
    >     involves additional work.
    >         >     >     B) Second idea I've got is really simple and relies
    >     on code generation.
    >         >     >     We know in which order fields are coming. Here I'm
    >     referring to a State
    >         >     >     field which is just bunch of bits. If we would group
    >     fields in bytes and
    >         >     >     generate code in reverse order then it has chance to
    >     work. Requirement
    >         >     >     for that - ability to know basic field sizes upfront.
    >         >     >     C) Try to combine above with bit-io or
    >     Read/WriteBuffers as these are
    >         >     >     places which know actual position and state of
    >     buffers which are being
    >         >     >     read/written.
    >         >     >     
    >         >     >     Now, getting to two cases which are a problem.
    >     CommandId and State. So
    >         >     >     with command id situation is simple as it is
    >     declared as enum and it is
    >         >     >     read as uint. We know size upfront and can generate
    >     valid method call
    >         >     >     (readIntLE).
    >         >     >     [enum uint 16 little endian 'CommandId'
    >         >     >         ['0x00' INVALID]
    >         >     >         ['0x01' ADS_READ_DEVICE_INFO]
    >         >     >         ['0x02' ADS_READ]
    >         >     >         ['0x03' ADS_WRITE]
    >         >     >         ['0x04' ADS_READ_STATE]
    >         >     >         ['0x05' ADS_WRITE_CONTROL]
    >         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >         >     >         ['0x09' ADS_READ_WRITE]
    >         >     >     ]
    >         >     >     
    >         >     >     Second candidate is what I'm stuck right now sniping
    >     next cycles of
    >         >     >     problems. So in case of State we have complex type
    >     composed from 2
    >         >     >     bytes. A note here - instead of two bytes we might
    >     have a variable
    >         >     >     length type which includes array or other variable
    >     section.
    >         >     >     [type little endian 'State'
    >         >     >         [simple     bit 'broadcast'             ]
    >         >     >         [reserved   int 7 '0x0'                 ]
    >         >     >         [simple     bit 'initCommand'           ]
    >         >     >         [simple     bit 'updCommand'            ]
    >         >     >         [simple     bit 'timestampAdded'        ]
    >         >     >         [simple     bit 'highPriorityCommand'   ]
    >         >     >         [simple     bit 'systemCommand'         ]
    >         >     >         [simple     bit 'adsCommand'            ]
    >         >     >         [simple     bit 'noReturn'              ]
    >         >     >         [simple     bit 'response'              ]
    >         >     >     ]
    >         >     >     
    >         >     >     The order of reading big endian encoded data to
    >     impose little endian
    >         >     >     shift would be (please correct me if I'm wrong):
    >         >     >     1) init
    >         >     >     2) udp
    >         >     >     3) add timestamp
    >         >     >     4) priority
    >         >     >     5) system
    >         >     >     6) ads
    >         >     >     7) noreturn
    >         >     >     8) response (end of byte 1)
    >         >     >     9) broadcast
    >         >     >     10) reserved (end of byte )
    >         >     >     We can do same trick for writing, by re-arranging
    >     fields. By this way we
    >         >     >     avoid any additional byte level operations.
    >         >     >     
    >         >     >     Overall trouble with generated driver is to declare
    >     "how much" bytes
    >         >     >     should be read and interpreted. We have precise size
    >     information at the
    >         >     >     runtime - due to length fields, we can leverage it
    >     at generation time,
    >         >     >     but then we won't be able to cover all cases.
    >         >     >     
    >         >     >     I would love to keep it simple and do not break
    >     things thus I need your
    >         >     >     advice on how to approach this problem in a valid way.
    >         >     >     
    >         >     >     Cheers,
    >         >     >     Łukasz
    >         >     >     
    >         >     >     
    >         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >         >     >     > <anecdotal-rant>
    >         >     >     > For us who were around and shaping the protocols
    >     in the 1980s, and people
    >         >     >     > before us (and before standards like RS-232), a
    >     lot of the "specifications"
    >         >     >     > came out of "observation of implementation we
    >     managed to get to work",
    >         >     >     > rather than "implement this spec". A lot was due
    >     to extreme memory
    >         >     >     > constraints (in my case, multi-tasking operating
    >     system, serial protocol
    >         >     >     > 187kbps, interpreted programming language with
    >     floating point ops and user
    >         >     >     > applications in 2kB RAM and 8kB EPROM) and a
    >     general lack of information,
    >         >     >     > like what other people were doing, sharing
    >     experiences and so on.
    >         >     >     >
    >         >     >     > And there were many "innovative" ways to squeeze
    >     just a little bit extra
    >         >     >     > out of the hardware, resulting in "hard to
    >     understand" consequences. Bit
    >         >     >     > packing was a typical one, multiple functions
    >     packed into a single byte.
    >         >     >     > Look at page 14 in
    >     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >         >     >     > and read up on "UART Enahanced Mode", and we used
    >     this, i.e. 9 bits, no
    >         >     >     > parity and clever use of address and mask to
    >     create a slave-to-slave direct
    >         >     >     > protocol, where the master's role was to signal
    >     which slave "owned" the
    >         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
    >     protocol was about 1kB
    >         >     >     > ROM) and something like 150 bytes RAM for comm
    >     protocol.
    >         >     >     >
    >         >     >     > Could you implement a compatible device to this
    >     with PLC4X and modern
    >         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
    >     but bit-banging is needed
    >         >     >     > to support the 9bit data (+start and stop bits)
    >     and an awful lot of CPU
    >         >     >     > cycles on something that was automatic on one of
    >     the slowest long-lived
    >         >     >     > microcontroller ever.
    >         >     >     > </anecdotal-rant>
    >         >     >     >
    >         >     >     > My point was only to highlight that some of the
    >     strange things you see in
    >         >     >     > protocols today, have its roots in
    >     pre-standardization days. Today no one
    >         >     >     > would go down that route, because the hardware
    >     cost nothing now (8031  +
    >         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
    >     ~$50 in 1983's currency) and
    >         >     >     > longevity of software is more important.
    >         >     >     >
    >         >     >     > Cheers
    >         >     >     > Niclas
    >         >     >     >
    >         >     >     >
    >         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
    >     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
    >         >     >     > wrote:
    >         >     >     >
    >         >     >     >> Hi Lukasz,
    >         >     >     >>
    >         >     >     >> I think it really gets tricky when using BE and
    >     having some byte-odd-sizes
    >         >     >     >> ... I remember in the Firmata protocol there were
    >     some bitmasks and then 10
    >         >     >     >> bit uint as BE ... not it really got tricky as
    >     the specs were written from
    >         >     >     >> a point of view: You read 16 bits BE and then the
    >     first6 bits mean XYZ
    >         >     >     >> instead of describing how the bits actually
    >     travel over the wire.
    >         >     >     >>
    >         >     >     >> Chris
    >         >     >     >>
    >         >     >     >>
    >         >     >     >>
    >         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
    >     <luke@code-house.org <ma...@code-house.org>>:
    >         >     >     >>
    >         >     >     >>     I've made some progress with topic by
    >     modyfing mspec and allowing
    >         >     >     >>     'little endian' flag on fields. This moved me
    >     further to next issue -
    >         >     >     >>     which is whole type encoded little endian.
    >         >     >     >>
    >         >     >     >>     In ADS driver such type is State, which has 2
    >     bytes and uses 8 bits for
    >         >     >     >>     various flags.
    >         >     >     >>     There are two cases which require different
    >     approach - reading and
    >         >     >     >>     writing. So for reading we need to swap N
    >     bytes based on type length.
    >         >     >     >>     For writing we need to alocate buffer for N
    >     bytes and swap them before
    >         >     >     >>     writing.
    >         >     >     >>
    >         >     >     >>     I am stuck now with freemaker templates and
    >     bit-io.
    >         >     >     >>
    >         >     >     >>     Cheers,
    >         >     >     >>     Łukasz
    >         >     >     >>
    >         >     >     >>
    >         >     >     >>
    >         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >         >     >     >>     > I am doing some tests of ADS serialization.
    >         >     >     >>     >
    >         >     >     >>     > I've run into some troubles with payload
    >     which is generated with new
    >         >     >     >>     > driver. I'm not sure if that's my fault or
    >     generated code.
    >         >     >     >>     >
    >         >     >     >>     > I did a verification of what Wireshark
    >     shows and how ads structures
    >         >     >     >> are
    >         >     >     >>     > parsed. There is a gap I think. For example
    >     ams port number 1000
    >         >     >     >>     > (0x1027) is read as 4135.
    >         >     >     >>     >
    >         >     >     >>     > Obviously I used wrong structures while
    >     implementing protocol logic
    >         >     >     >> in
    >         >     >     >>     > first place, but now I am uncertain of how
    >     fields are encoded. How we
    >         >     >     >>     > mark field as little endian when rest of
    >     payload is big endian? Do we
    >         >     >     >>     > have `uint_le`?
    >         >     >     >>     >
    >         >     >     >>     > As far I remember route creation logic I
    >     was tracking last week used
    >         >     >     >>     > combination of LE and BE.
    >         >     >     >>     >
    >         >     >     >>     > Best regards,
    >         >     >     >>     > Łukasz
    >         >     >     >>     >
    >         >     >     >>
    >         >     >     >>
    >         >     >     >>
    >         >     >     >
    >         >     >     
    >         >     >
    >         >     
    >         >
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

Hey Chris,
Sorry for wrongly addressed mail, it was intended to go to mailing list.

To the point - I think that what you propose with reordering fields
manually shift the load from the tool to the developer. It will lead to
further implementation errors caused by a need to reorder fields while
creating spec. It will be inconsistent with rest of places where things
are in "natural order" (big endian) except one or two types which
collects bit flags which will have to be declared in different order.
What I did in my naive implementation with "little endian" flag on the
State type is re-arrangement of fields during generation time.
It does not require any byte shifting at the runtime and keeps natural
order of fields without the need for manual reordering of flags in mspec
declaration.

I understand that keeping mspec simple is an important priority, however
doing it at the cost of developers who will use it to implement codecs
will not help us in long term.
I'm quite sure that sooner or later we will get a  protocol with mix of
BE/LE fields.

Cheers,
Łukasz


On 14.04.2020 14:57, Christofer Dutz wrote:
> Hi Lukasz (adding dev@ to the recipients again)
> 
>  
> 
> you are not quite correct. Enum types do declare a base type where they
> are declared … not where they are used.
> 
>  
> 
> If you have bit-fields then why don’t you define them as sequence of
> bits? I really can’t understand why you need to flip anything.
> 
> I think you’re trying to do something similar to Sebastian when he
> wanted to introduce some bit-mask types.
> 
>  
> 
> I could imagine that in the spec it might be written … these two bytes
> are a bit-mask and the bit 1 means X bit 10 means Y, …
> 
> So why not define the bit fields in the sequence they are sent over the
> wire?
> 
>  
> 
> Assuming you have a bit field of 16 bits BE uint. If the spec now says:
> 
>  
> 
>  
> 
> Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2
>  |1  |0
> 
> Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P
> 
>  
> 
> Then the order they are sent is:
> 
> 7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
> 
> So you have to declare the bit fields in the order:
> 
>                                I  | J | K |L | M|N |O |P  |A  |B  |C  |D
>  |E  | F  |G|H
> 
>  
> 
> So I don’t quite understand what you’re trying to flip here.
> 
>  
> 
> Chris
> 
>  
> 
> *Von: *Łukasz Dywicki <lu...@code-house.org>
> *Datum: *Dienstag, 14. April 2020 um 13:26
> *An: *Christofer Dutz <ch...@c-ware.de>
> *Betreff: *Re: Big and littleendian fields in one mspec
> 
>  
> 
> I've added the endianness switch to the types in my experiments. These
> are not needed for enums as these are read as int and then mapped to
> constants. I referred to these as an example - enum does have a length
> indication and type behind it.
> 
>  
> 
> Again, I will bring the issue - StateIO currently fails to do anything
> useful due to BE/LE switch. Because this type uses single bits which
> need to be flipped before reading or after writing otherwise they end up
> in improper order. Out of 16 bits 9 are in use so we have a lot of
> possible combinations (2^9), which seems too much to create an enum.
> 
> Since you didn't like my initial proposal and modifications (as they
> might be redundant to what is available in the buffer API), how would
> you handle State serialization without affecting mspec and without
> further complication to code generation?
> 
>  
> 
> Best,
> 
> Łukasz
> 
>  
> 
> wt., 14 kwi 2020 o 12:47 Christofer Dutz <christofer.dutz@c-ware.de
> <ma...@c-ware.de>> napisał(a):
> 
>     Hi Lukasz,
> 
>     but we don't have a:
>     [enum uint 16 little endian 'CommandId']
>     But only a:
>     [enum uint 16 'CommandId']
> 
>     And in your case I think perhaps the constants are not correct. So
>     having a 16 bit uint will result in a 4 digit hex string:
>     So if you are having problems in mapping them to your constants,
>     perhaps the constants are in the wrong endianness.
> 
>     I have encountered (but can't recall where) that the Enum constants
>     in a BE protocol were written down in LE notation.
>     Of course the thing can't work then.
> 
>     So if for example you have the constant "1" in a BE protocol with a
>     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
> 
>     Chris
> 
> 
> 
> 
>     Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <luke@code-house.org
>     <ma...@code-house.org>>:
> 
>         The legendary "two bytes" (shall we create a band with this
>     name?!) are
>         coming from unfortunate State type. I don't mind these to be an
>         int/sint/uint or whathever - this is matter of interpretation.
> 
>         If you could please check again my earlier messages you will find
>         struggle I have which is - how much given type takes.
> 
>         We have that for enums
>         [enum uint 16 little endian 'CommandId']
> 
>         but we don't have that for complex types
>         [type 'State']
> 
>         This leads to situation that we don't know how to read and interpret
>         incoming byte sequence or how to properly write it back to stream.
> 
>         It would be great if we could limit the issue just to
>         // read
>         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); //
>     true->LE
>         // write
>         wio = new WriteBuffer(2);
>         StateIO.staticSerialize(wio, new State(...));
>         io.writeUnsignedInt(8, wio.read, true); // true->LE
> 
>         However to do that first we need to know that State is a uint 16
>     LE and
>         then do a lot of mambo-jambo in code generators to cope with
>     that. :-)
> 
>         Best,
>         Łukasz
> 
> 
>         On 14.04.2020 12:20, Christofer Dutz wrote:
>         > Oh gee ... I totally remember having exactly the same
>     discussion with Sebastian about "bytes" ...
>         >
>         > The problem is there is generally no "byte" "int" and "uint"
>     because a byte = 8 bits is either signed or unsigned. That should
>     the 3rd option be? Perhaps-signed?
>         > So generally I use "uint 8" for what you refer to as a "byte 8".
>         >
>         > Wo what would be the difference between your "2 byte little
>     endian" and an "uint 16" with a LE ReadBuffer?
>         >
>         > I think what you have to rid yourself of thinking of int as
>     number and a byte not being a number.
>         >
>         > Chris
>         >
>         >
>         >
>         > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki"
>     <luke@code-house.org <ma...@code-house.org>>:
>         >
>         >     Hey Christian,
>         >     The problem I face is quite simple. State type in mspec i
>     declared as a
>         >     bunch of bits. Type length is fixed, however not available
>     anywhere in
>         >     mspec or code generator to re-arrange bytes upfront. All
>     we have exposed
>         >     at reader/write level is read/write bit.
>         >     To be fair LE/BE support in read/write buffers is limited
>     just to
>         >     numbers. There is no such support for raw bytes or bits,
>     cause for that
>         >     you need to declare a length of LE/BE sequence.
>         >     
>         >     I would love if I could just declare State as '2 byte
>     little endian' so
>         >     it would be read properly upfront and parsed with no
>     changes in
>         >     generated code, however I'm not sure how to do it and
>     where. That's why
>         >     I'm playing with different things described in earlier mail.
>         >     Since all type handling is general I am just afraid of
>     more complicated
>         >     scenarios where we have variable length structures such as
>     arrays.
>         >     
>         >     Best regards,
>         >     Łukasz
>         >     
>         >     
>         >     On 14.04.2020 09:41, Christofer Dutz wrote:
>         >     > Hi Lukasz,
>         >     >
>         >     > I am not sure I am understanding the problems you are
>     facing. We already have LE and BE protocols.
>         >     > For example EIP is LE and the rest is generally BE. It
>     seems that ADS/AMS is also LE.
>         >     > mspec doesn't even know about endianness.
>         >     >
>         >     > Up till now the endianness doesn't have an effect on
>     bit-fields or single-bit ints.
>         >     > It only starts to affect if a field goes from one byte
>     to the next, which is usually for (u)int and floating point values.
>         >     >
>         >     > That's why we have created the Read/WriteBuffers to set
>     their endianness in the constructor.
>         >     >
>         >     > So if you're creating a driver for ADS/AMS which is LE,
>     then you write the mspec according to the sequence the information
>     is located in the transferred bytes and have the Read/WriteBuffer
>     handle the endianness issue.
>         >     >
>         >     > I do see a problem when there are drivers that use mixed
>     endianness, but we have still to encounter such a protocol.
>         >     >
>         >     > So I have to admit that I don't like any of the mspec
>     changes you proposed, as I think you are just not using the tools we
>     have the right way.
>         >     >
>         >     > Chris
>         >     >
>         >     >
>         >     >
>         >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki"
>     <luke@code-house.org <ma...@code-house.org>>:
>         >     >
>         >     >     Hey Niclas,
>         >     >     I realized how old the old things are when I started
>     preparing
>         >     >     automation training for mere mortals and got into
>     history of frames and
>         >     >     even cabling. Mr. Modbus and EIA-485 is definitely
>     older than I. ;-)
>         >     >     
>         >     >     Getting back to the point - yes. I been thinking how
>     to address the byte
>         >     >     order in effective way. Here are two approaches I
>     have for now:
>         >     >     
>         >     >     A) My initial attempt is just a temporary buffer
>     which is then written
>         >     >     in reverse order to caller. For reading it is
>     similar - just getting N
>         >     >     bytes in reversed order. The hard part is.. knowing
>     N. I had to add a
>         >     >     static calculation in order to allocate valid buffer
>     sizes. I tend to
>         >     >     work but I'm not happy with this approach cause it
>     involves additional work.
>         >     >     B) Second idea I've got is really simple and relies
>     on code generation.
>         >     >     We know in which order fields are coming. Here I'm
>     referring to a State
>         >     >     field which is just bunch of bits. If we would group
>     fields in bytes and
>         >     >     generate code in reverse order then it has chance to
>     work. Requirement
>         >     >     for that - ability to know basic field sizes upfront.
>         >     >     C) Try to combine above with bit-io or
>     Read/WriteBuffers as these are
>         >     >     places which know actual position and state of
>     buffers which are being
>         >     >     read/written.
>         >     >     
>         >     >     Now, getting to two cases which are a problem.
>     CommandId and State. So
>         >     >     with command id situation is simple as it is
>     declared as enum and it is
>         >     >     read as uint. We know size upfront and can generate
>     valid method call
>         >     >     (readIntLE).
>         >     >     [enum uint 16 little endian 'CommandId'
>         >     >         ['0x00' INVALID]
>         >     >         ['0x01' ADS_READ_DEVICE_INFO]
>         >     >         ['0x02' ADS_READ]
>         >     >         ['0x03' ADS_WRITE]
>         >     >         ['0x04' ADS_READ_STATE]
>         >     >         ['0x05' ADS_WRITE_CONTROL]
>         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
>         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
>         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
>         >     >         ['0x09' ADS_READ_WRITE]
>         >     >     ]
>         >     >     
>         >     >     Second candidate is what I'm stuck right now sniping
>     next cycles of
>         >     >     problems. So in case of State we have complex type
>     composed from 2
>         >     >     bytes. A note here - instead of two bytes we might
>     have a variable
>         >     >     length type which includes array or other variable
>     section.
>         >     >     [type little endian 'State'
>         >     >         [simple     bit 'broadcast'             ]
>         >     >         [reserved   int 7 '0x0'                 ]
>         >     >         [simple     bit 'initCommand'           ]
>         >     >         [simple     bit 'updCommand'            ]
>         >     >         [simple     bit 'timestampAdded'        ]
>         >     >         [simple     bit 'highPriorityCommand'   ]
>         >     >         [simple     bit 'systemCommand'         ]
>         >     >         [simple     bit 'adsCommand'            ]
>         >     >         [simple     bit 'noReturn'              ]
>         >     >         [simple     bit 'response'              ]
>         >     >     ]
>         >     >     
>         >     >     The order of reading big endian encoded data to
>     impose little endian
>         >     >     shift would be (please correct me if I'm wrong):
>         >     >     1) init
>         >     >     2) udp
>         >     >     3) add timestamp
>         >     >     4) priority
>         >     >     5) system
>         >     >     6) ads
>         >     >     7) noreturn
>         >     >     8) response (end of byte 1)
>         >     >     9) broadcast
>         >     >     10) reserved (end of byte )
>         >     >     We can do same trick for writing, by re-arranging
>     fields. By this way we
>         >     >     avoid any additional byte level operations.
>         >     >     
>         >     >     Overall trouble with generated driver is to declare
>     "how much" bytes
>         >     >     should be read and interpreted. We have precise size
>     information at the
>         >     >     runtime - due to length fields, we can leverage it
>     at generation time,
>         >     >     but then we won't be able to cover all cases.
>         >     >     
>         >     >     I would love to keep it simple and do not break
>     things thus I need your
>         >     >     advice on how to approach this problem in a valid way.
>         >     >     
>         >     >     Cheers,
>         >     >     Łukasz
>         >     >     
>         >     >     
>         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
>         >     >     > <anecdotal-rant>
>         >     >     > For us who were around and shaping the protocols
>     in the 1980s, and people
>         >     >     > before us (and before standards like RS-232), a
>     lot of the "specifications"
>         >     >     > came out of "observation of implementation we
>     managed to get to work",
>         >     >     > rather than "implement this spec". A lot was due
>     to extreme memory
>         >     >     > constraints (in my case, multi-tasking operating
>     system, serial protocol
>         >     >     > 187kbps, interpreted programming language with
>     floating point ops and user
>         >     >     > applications in 2kB RAM and 8kB EPROM) and a
>     general lack of information,
>         >     >     > like what other people were doing, sharing
>     experiences and so on.
>         >     >     >
>         >     >     > And there were many "innovative" ways to squeeze
>     just a little bit extra
>         >     >     > out of the hardware, resulting in "hard to
>     understand" consequences. Bit
>         >     >     > packing was a typical one, multiple functions
>     packed into a single byte.
>         >     >     > Look at page 14 in
>     https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
>         >     >     > and read up on "UART Enahanced Mode", and we used
>     this, i.e. 9 bits, no
>         >     >     > parity and clever use of address and mask to
>     create a slave-to-slave direct
>         >     >     > protocol, where the master's role was to signal
>     which slave "owned" the
>         >     >     > cable. Yeah, in that 8kB ROM limitation (I think
>     protocol was about 1kB
>         >     >     > ROM) and something like 150 bytes RAM for comm
>     protocol.
>         >     >     >
>         >     >     > Could you implement a compatible device to this
>     with PLC4X and modern
>         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly
>     but bit-banging is needed
>         >     >     > to support the 9bit data (+start and stop bits)
>     and an awful lot of CPU
>         >     >     > cycles on something that was automatic on one of
>     the slowest long-lived
>         >     >     > microcontroller ever.
>         >     >     > </anecdotal-rant>
>         >     >     >
>         >     >     > My point was only to highlight that some of the
>     strange things you see in
>         >     >     > protocols today, have its roots in
>     pre-standardization days. Today no one
>         >     >     > would go down that route, because the hardware
>     cost nothing now (8031  +
>         >     >     > 8kB EPROM + 2kB static RAM + battery backup =>
>     ~$50 in 1983's currency) and
>         >     >     > longevity of software is more important.
>         >     >     >
>         >     >     > Cheers
>         >     >     > Niclas
>         >     >     >
>         >     >     >
>         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz
>     <christofer.dutz@c-ware.de <ma...@c-ware.de>>
>         >     >     > wrote:
>         >     >     >
>         >     >     >> Hi Lukasz,
>         >     >     >>
>         >     >     >> I think it really gets tricky when using BE and
>     having some byte-odd-sizes
>         >     >     >> ... I remember in the Firmata protocol there were
>     some bitmasks and then 10
>         >     >     >> bit uint as BE ... not it really got tricky as
>     the specs were written from
>         >     >     >> a point of view: You read 16 bits BE and then the
>     first6 bits mean XYZ
>         >     >     >> instead of describing how the bits actually
>     travel over the wire.
>         >     >     >>
>         >     >     >> Chris
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki"
>     <luke@code-house.org <ma...@code-house.org>>:
>         >     >     >>
>         >     >     >>     I've made some progress with topic by
>     modyfing mspec and allowing
>         >     >     >>     'little endian' flag on fields. This moved me
>     further to next issue -
>         >     >     >>     which is whole type encoded little endian.
>         >     >     >>
>         >     >     >>     In ADS driver such type is State, which has 2
>     bytes and uses 8 bits for
>         >     >     >>     various flags.
>         >     >     >>     There are two cases which require different
>     approach - reading and
>         >     >     >>     writing. So for reading we need to swap N
>     bytes based on type length.
>         >     >     >>     For writing we need to alocate buffer for N
>     bytes and swap them before
>         >     >     >>     writing.
>         >     >     >>
>         >     >     >>     I am stuck now with freemaker templates and
>     bit-io.
>         >     >     >>
>         >     >     >>     Cheers,
>         >     >     >>     Łukasz
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>         >     >     >>     > I am doing some tests of ADS serialization.
>         >     >     >>     >
>         >     >     >>     > I've run into some troubles with payload
>     which is generated with new
>         >     >     >>     > driver. I'm not sure if that's my fault or
>     generated code.
>         >     >     >>     >
>         >     >     >>     > I did a verification of what Wireshark
>     shows and how ads structures
>         >     >     >> are
>         >     >     >>     > parsed. There is a gap I think. For example
>     ams port number 1000
>         >     >     >>     > (0x1027) is read as 4135.
>         >     >     >>     >
>         >     >     >>     > Obviously I used wrong structures while
>     implementing protocol logic
>         >     >     >> in
>         >     >     >>     > first place, but now I am uncertain of how
>     fields are encoded. How we
>         >     >     >>     > mark field as little endian when rest of
>     payload is big endian? Do we
>         >     >     >>     > have `uint_le`?
>         >     >     >>     >
>         >     >     >>     > As far I remember route creation logic I
>     was tracking last week used
>         >     >     >>     > combination of LE and BE.
>         >     >     >>     >
>         >     >     >>     > Best regards,
>         >     >     >>     > Łukasz
>         >     >     >>     >
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >
>         >     >     
>         >     >
>         >     
>         >
>

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz (adding dev@ to the recipients again)

you are not quite correct. Enum types do declare a base type where they are declared … not where they are used.

If you have bit-fields then why don’t you define them as sequence of bits? I really can’t understand why you need to flip anything.
I think you’re trying to do something similar to Sebastian when he wanted to introduce some bit-mask types.

I could imagine that in the spec it might be written … these two bytes are a bit-mask and the bit 1 means X bit 10 means Y, …
So why not define the bit fields in the sequence they are sent over the wire?

Assuming you have a bit field of 16 bits BE uint. If the spec now says:


Bit:                        15|14|13|12|11|10|9 |8 |7 |6 |5  |4 |3  |2  |1  |0
Meaning:            A  |B  |C  |D  |E  | F  |G|H |I  | J | K |L | M|N |O |P

Then the order they are sent is:
7 |6 |5  |4 |3  |2  |1  |0 | 15|14|13|12|11|10|9 |8
So you have to declare the bit fields in the order:
                               I  | J | K |L | M|N |O |P  |A  |B  |C  |D  |E  | F  |G|H

So I don’t quite understand what you’re trying to flip here.

Chris

Von: Łukasz Dywicki <lu...@code-house.org>
Datum: Dienstag, 14. April 2020 um 13:26
An: Christofer Dutz <ch...@c-ware.de>
Betreff: Re: Big and littleendian fields in one mspec

I've added the endianness switch to the types in my experiments. These are not needed for enums as these are read as int and then mapped to constants. I referred to these as an example - enum does have a length indication and type behind it.

Again, I will bring the issue - StateIO currently fails to do anything useful due to BE/LE switch. Because this type uses single bits which need to be flipped before reading or after writing otherwise they end up in improper order. Out of 16 bits 9 are in use so we have a lot of possible combinations (2^9), which seems too much to create an enum.
Since you didn't like my initial proposal and modifications (as they might be redundant to what is available in the buffer API), how would you handle State serialization without affecting mspec and without further complication to code generation?

Best,
Łukasz

wt., 14 kwi 2020 o 12:47 Christofer Dutz <ch...@c-ware.de>> napisał(a):
Hi Lukasz,

but we don't have a:
[enum uint 16 little endian 'CommandId']
But only a:
[enum uint 16 'CommandId']

And in your case I think perhaps the constants are not correct. So having a 16 bit uint will result in a 4 digit hex string:
So if you are having problems in mapping them to your constants, perhaps the constants are in the wrong endianness.

I have encountered (but can't recall where) that the Enum constants in a BE protocol were written down in LE notation.
Of course the thing can't work then.

So if for example you have the constant "1" in a BE protocol with a uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.

Chris




Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <lu...@code-house.org>>:

    The legendary "two bytes" (shall we create a band with this name?!) are
    coming from unfortunate State type. I don't mind these to be an
    int/sint/uint or whathever - this is matter of interpretation.

    If you could please check again my earlier messages you will find
    struggle I have which is - how much given type takes.

    We have that for enums
    [enum uint 16 little endian 'CommandId']

    but we don't have that for complex types
    [type 'State']

    This leads to situation that we don't know how to read and interpret
    incoming byte sequence or how to properly write it back to stream.

    It would be great if we could limit the issue just to
    // read
    state = StateIO.parseStatic(io.readUnsignedInt(16, true)); // true->LE
    // write
    wio = new WriteBuffer(2);
    StateIO.staticSerialize(wio, new State(...));
    io.writeUnsignedInt(8, wio.read, true); // true->LE

    However to do that first we need to know that State is a uint 16 LE and
    then do a lot of mambo-jambo in code generators to cope with that. :-)

    Best,
    Łukasz


    On 14.04.2020 12:20, Christofer Dutz wrote:
    > Oh gee ... I totally remember having exactly the same discussion with Sebastian about "bytes" ...
    >
    > The problem is there is generally no "byte" "int" and "uint" because a byte = 8 bits is either signed or unsigned. That should the 3rd option be? Perhaps-signed?
    > So generally I use "uint 8" for what you refer to as a "byte 8".
    >
    > Wo what would be the difference between your "2 byte little endian" and an "uint 16" with a LE ReadBuffer?
    >
    > I think what you have to rid yourself of thinking of int as number and a byte not being a number.
    >
    > Chris
    >
    >
    >
    > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki" <lu...@code-house.org>>:
    >
    >     Hey Christian,
    >     The problem I face is quite simple. State type in mspec i declared as a
    >     bunch of bits. Type length is fixed, however not available anywhere in
    >     mspec or code generator to re-arrange bytes upfront. All we have exposed
    >     at reader/write level is read/write bit.
    >     To be fair LE/BE support in read/write buffers is limited just to
    >     numbers. There is no such support for raw bytes or bits, cause for that
    >     you need to declare a length of LE/BE sequence.
    >
    >     I would love if I could just declare State as '2 byte little endian' so
    >     it would be read properly upfront and parsed with no changes in
    >     generated code, however I'm not sure how to do it and where. That's why
    >     I'm playing with different things described in earlier mail.
    >     Since all type handling is general I am just afraid of more complicated
    >     scenarios where we have variable length structures such as arrays.
    >
    >     Best regards,
    >     Łukasz
    >
    >
    >     On 14.04.2020 09:41, Christofer Dutz wrote:
    >     > Hi Lukasz,
    >     >
    >     > I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
    >     > For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE.
    >     > mspec doesn't even know about endianness.
    >     >
    >     > Up till now the endianness doesn't have an effect on bit-fields or single-bit ints.
    >     > It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.
    >     >
    >     > That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
    >     >
    >     > So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
    >     >
    >     > I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.
    >     >
    >     > So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
    >     >
    >     > Chris
    >     >
    >     >
    >     >
    >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>>:
    >     >
    >     >     Hey Niclas,
    >     >     I realized how old the old things are when I started preparing
    >     >     automation training for mere mortals and got into history of frames and
    >     >     even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
    >     >
    >     >     Getting back to the point - yes. I been thinking how to address the byte
    >     >     order in effective way. Here are two approaches I have for now:
    >     >
    >     >     A) My initial attempt is just a temporary buffer which is then written
    >     >     in reverse order to caller. For reading it is similar - just getting N
    >     >     bytes in reversed order. The hard part is.. knowing N. I had to add a
    >     >     static calculation in order to allocate valid buffer sizes. I tend to
    >     >     work but I'm not happy with this approach cause it involves additional work.
    >     >     B) Second idea I've got is really simple and relies on code generation.
    >     >     We know in which order fields are coming. Here I'm referring to a State
    >     >     field which is just bunch of bits. If we would group fields in bytes and
    >     >     generate code in reverse order then it has chance to work. Requirement
    >     >     for that - ability to know basic field sizes upfront.
    >     >     C) Try to combine above with bit-io or Read/WriteBuffers as these are
    >     >     places which know actual position and state of buffers which are being
    >     >     read/written.
    >     >
    >     >     Now, getting to two cases which are a problem. CommandId and State. So
    >     >     with command id situation is simple as it is declared as enum and it is
    >     >     read as uint. We know size upfront and can generate valid method call
    >     >     (readIntLE).
    >     >     [enum uint 16 little endian 'CommandId'
    >     >         ['0x00' INVALID]
    >     >         ['0x01' ADS_READ_DEVICE_INFO]
    >     >         ['0x02' ADS_READ]
    >     >         ['0x03' ADS_WRITE]
    >     >         ['0x04' ADS_READ_STATE]
    >     >         ['0x05' ADS_WRITE_CONTROL]
    >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >     >         ['0x09' ADS_READ_WRITE]
    >     >     ]
    >     >
    >     >     Second candidate is what I'm stuck right now sniping next cycles of
    >     >     problems. So in case of State we have complex type composed from 2
    >     >     bytes. A note here - instead of two bytes we might have a variable
    >     >     length type which includes array or other variable section.
    >     >     [type little endian 'State'
    >     >         [simple     bit 'broadcast'             ]
    >     >         [reserved   int 7 '0x0'                 ]
    >     >         [simple     bit 'initCommand'           ]
    >     >         [simple     bit 'updCommand'            ]
    >     >         [simple     bit 'timestampAdded'        ]
    >     >         [simple     bit 'highPriorityCommand'   ]
    >     >         [simple     bit 'systemCommand'         ]
    >     >         [simple     bit 'adsCommand'            ]
    >     >         [simple     bit 'noReturn'              ]
    >     >         [simple     bit 'response'              ]
    >     >     ]
    >     >
    >     >     The order of reading big endian encoded data to impose little endian
    >     >     shift would be (please correct me if I'm wrong):
    >     >     1) init
    >     >     2) udp
    >     >     3) add timestamp
    >     >     4) priority
    >     >     5) system
    >     >     6) ads
    >     >     7) noreturn
    >     >     8) response (end of byte 1)
    >     >     9) broadcast
    >     >     10) reserved (end of byte )
    >     >     We can do same trick for writing, by re-arranging fields. By this way we
    >     >     avoid any additional byte level operations.
    >     >
    >     >     Overall trouble with generated driver is to declare "how much" bytes
    >     >     should be read and interpreted. We have precise size information at the
    >     >     runtime - due to length fields, we can leverage it at generation time,
    >     >     but then we won't be able to cover all cases.
    >     >
    >     >     I would love to keep it simple and do not break things thus I need your
    >     >     advice on how to approach this problem in a valid way.
    >     >
    >     >     Cheers,
    >     >     Łukasz
    >     >
    >     >
    >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >     >     > <anecdotal-rant>
    >     >     > For us who were around and shaping the protocols in the 1980s, and people
    >     >     > before us (and before standards like RS-232), a lot of the "specifications"
    >     >     > came out of "observation of implementation we managed to get to work",
    >     >     > rather than "implement this spec". A lot was due to extreme memory
    >     >     > constraints (in my case, multi-tasking operating system, serial protocol
    >     >     > 187kbps, interpreted programming language with floating point ops and user
    >     >     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
    >     >     > like what other people were doing, sharing experiences and so on.
    >     >     >
    >     >     > And there were many "innovative" ways to squeeze just a little bit extra
    >     >     > out of the hardware, resulting in "hard to understand" consequences. Bit
    >     >     > packing was a typical one, multiple functions packed into a single byte.
    >     >     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >     >     > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
    >     >     > parity and clever use of address and mask to create a slave-to-slave direct
    >     >     > protocol, where the master's role was to signal which slave "owned" the
    >     >     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
    >     >     > ROM) and something like 150 bytes RAM for comm protocol.
    >     >     >
    >     >     > Could you implement a compatible device to this with PLC4X and modern
    >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
    >     >     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
    >     >     > cycles on something that was automatic on one of the slowest long-lived
    >     >     > microcontroller ever.
    >     >     > </anecdotal-rant>
    >     >     >
    >     >     > My point was only to highlight that some of the strange things you see in
    >     >     > protocols today, have its roots in pre-standardization days. Today no one
    >     >     > would go down that route, because the hardware cost nothing now (8031  +
    >     >     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
    >     >     > longevity of software is more important.
    >     >     >
    >     >     > Cheers
    >     >     > Niclas
    >     >     >
    >     >     >
    >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>>
    >     >     > wrote:
    >     >     >
    >     >     >> Hi Lukasz,
    >     >     >>
    >     >     >> I think it really gets tricky when using BE and having some byte-odd-sizes
    >     >     >> ... I remember in the Firmata protocol there were some bitmasks and then 10
    >     >     >> bit uint as BE ... not it really got tricky as the specs were written from
    >     >     >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
    >     >     >> instead of describing how the bits actually travel over the wire.
    >     >     >>
    >     >     >> Chris
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>>:
    >     >     >>
    >     >     >>     I've made some progress with topic by modyfing mspec and allowing
    >     >     >>     'little endian' flag on fields. This moved me further to next issue -
    >     >     >>     which is whole type encoded little endian.
    >     >     >>
    >     >     >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
    >     >     >>     various flags.
    >     >     >>     There are two cases which require different approach - reading and
    >     >     >>     writing. So for reading we need to swap N bytes based on type length.
    >     >     >>     For writing we need to alocate buffer for N bytes and swap them before
    >     >     >>     writing.
    >     >     >>
    >     >     >>     I am stuck now with freemaker templates and bit-io.
    >     >     >>
    >     >     >>     Cheers,
    >     >     >>     Łukasz
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >     >     >>     > I am doing some tests of ADS serialization.
    >     >     >>     >
    >     >     >>     > I've run into some troubles with payload which is generated with new
    >     >     >>     > driver. I'm not sure if that's my fault or generated code.
    >     >     >>     >
    >     >     >>     > I did a verification of what Wireshark shows and how ads structures
    >     >     >> are
    >     >     >>     > parsed. There is a gap I think. For example ams port number 1000
    >     >     >>     > (0x1027) is read as 4135.
    >     >     >>     >
    >     >     >>     > Obviously I used wrong structures while implementing protocol logic
    >     >     >> in
    >     >     >>     > first place, but now I am uncertain of how fields are encoded. How we
    >     >     >>     > mark field as little endian when rest of payload is big endian? Do we
    >     >     >>     > have `uint_le`?
    >     >     >>     >
    >     >     >>     > As far I remember route creation logic I was tracking last week used
    >     >     >>     > combination of LE and BE.
    >     >     >>     >
    >     >     >>     > Best regards,
    >     >     >>     > Łukasz
    >     >     >>     >
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     >
    >     >
    >     >
    >
    >

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz,

but we don't have a:
[enum uint 16 little endian 'CommandId']
But only a:
[enum uint 16 'CommandId']

And in your case I think perhaps the constants are not correct. So having a 16 bit uint will result in a 4 digit hex string:
So if you are having problems in mapping them to your constants, perhaps the constants are in the wrong endianness.

I have encountered (but can't recall where) that the Enum constants in a BE protocol were written down in LE notation.
Of course the thing can't work then.

So if for example you have the constant "1" in a BE protocol with a uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.

Chris




Am 14.04.20, 12:37 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    The legendary "two bytes" (shall we create a band with this name?!) are
    coming from unfortunate State type. I don't mind these to be an
    int/sint/uint or whathever - this is matter of interpretation.
    
    If you could please check again my earlier messages you will find
    struggle I have which is - how much given type takes.
    
    We have that for enums
    [enum uint 16 little endian 'CommandId']
    
    but we don't have that for complex types
    [type 'State']
    
    This leads to situation that we don't know how to read and interpret
    incoming byte sequence or how to properly write it back to stream.
    
    It would be great if we could limit the issue just to
    // read
    state = StateIO.parseStatic(io.readUnsignedInt(16, true)); // true->LE
    // write
    wio = new WriteBuffer(2);
    StateIO.staticSerialize(wio, new State(...));
    io.writeUnsignedInt(8, wio.read, true); // true->LE
    
    However to do that first we need to know that State is a uint 16 LE and
    then do a lot of mambo-jambo in code generators to cope with that. :-)
    
    Best,
    Łukasz
    
    
    On 14.04.2020 12:20, Christofer Dutz wrote:
    > Oh gee ... I totally remember having exactly the same discussion with Sebastian about "bytes" ...
    > 
    > The problem is there is generally no "byte" "int" and "uint" because a byte = 8 bits is either signed or unsigned. That should the 3rd option be? Perhaps-signed?
    > So generally I use "uint 8" for what you refer to as a "byte 8".
    > 
    > Wo what would be the difference between your "2 byte little endian" and an "uint 16" with a LE ReadBuffer?
    > 
    > I think what you have to rid yourself of thinking of int as number and a byte not being a number.
    > 
    > Chris
    > 
    > 
    > 
    > Am 14.04.20, 10:52 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    > 
    >     Hey Christian,
    >     The problem I face is quite simple. State type in mspec i declared as a
    >     bunch of bits. Type length is fixed, however not available anywhere in
    >     mspec or code generator to re-arrange bytes upfront. All we have exposed
    >     at reader/write level is read/write bit.
    >     To be fair LE/BE support in read/write buffers is limited just to
    >     numbers. There is no such support for raw bytes or bits, cause for that
    >     you need to declare a length of LE/BE sequence.
    >     
    >     I would love if I could just declare State as '2 byte little endian' so
    >     it would be read properly upfront and parsed with no changes in
    >     generated code, however I'm not sure how to do it and where. That's why
    >     I'm playing with different things described in earlier mail.
    >     Since all type handling is general I am just afraid of more complicated
    >     scenarios where we have variable length structures such as arrays.
    >     
    >     Best regards,
    >     Łukasz
    >     
    >     
    >     On 14.04.2020 09:41, Christofer Dutz wrote:
    >     > Hi Lukasz,
    >     > 
    >     > I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
    >     > For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE. 
    >     > mspec doesn't even know about endianness.
    >     > 
    >     > Up till now the endianness doesn't have an effect on bit-fields or single-bit ints. 
    >     > It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.
    >     > 
    >     > That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
    >     > 
    >     > So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
    >     > 
    >     > I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.
    >     > 
    >     > So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
    >     > 
    >     > Chris
    >     > 
    >     > 
    >     > 
    >     > Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    >     > 
    >     >     Hey Niclas,
    >     >     I realized how old the old things are when I started preparing
    >     >     automation training for mere mortals and got into history of frames and
    >     >     even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
    >     >     
    >     >     Getting back to the point - yes. I been thinking how to address the byte
    >     >     order in effective way. Here are two approaches I have for now:
    >     >     
    >     >     A) My initial attempt is just a temporary buffer which is then written
    >     >     in reverse order to caller. For reading it is similar - just getting N
    >     >     bytes in reversed order. The hard part is.. knowing N. I had to add a
    >     >     static calculation in order to allocate valid buffer sizes. I tend to
    >     >     work but I'm not happy with this approach cause it involves additional work.
    >     >     B) Second idea I've got is really simple and relies on code generation.
    >     >     We know in which order fields are coming. Here I'm referring to a State
    >     >     field which is just bunch of bits. If we would group fields in bytes and
    >     >     generate code in reverse order then it has chance to work. Requirement
    >     >     for that - ability to know basic field sizes upfront.
    >     >     C) Try to combine above with bit-io or Read/WriteBuffers as these are
    >     >     places which know actual position and state of buffers which are being
    >     >     read/written.
    >     >     
    >     >     Now, getting to two cases which are a problem. CommandId and State. So
    >     >     with command id situation is simple as it is declared as enum and it is
    >     >     read as uint. We know size upfront and can generate valid method call
    >     >     (readIntLE).
    >     >     [enum uint 16 little endian 'CommandId'
    >     >         ['0x00' INVALID]
    >     >         ['0x01' ADS_READ_DEVICE_INFO]
    >     >         ['0x02' ADS_READ]
    >     >         ['0x03' ADS_WRITE]
    >     >         ['0x04' ADS_READ_STATE]
    >     >         ['0x05' ADS_WRITE_CONTROL]
    >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
    >     >         ['0x09' ADS_READ_WRITE]
    >     >     ]
    >     >     
    >     >     Second candidate is what I'm stuck right now sniping next cycles of
    >     >     problems. So in case of State we have complex type composed from 2
    >     >     bytes. A note here - instead of two bytes we might have a variable
    >     >     length type which includes array or other variable section.
    >     >     [type little endian 'State'
    >     >         [simple     bit 'broadcast'             ]
    >     >         [reserved   int 7 '0x0'                 ]
    >     >         [simple     bit 'initCommand'           ]
    >     >         [simple     bit 'updCommand'            ]
    >     >         [simple     bit 'timestampAdded'        ]
    >     >         [simple     bit 'highPriorityCommand'   ]
    >     >         [simple     bit 'systemCommand'         ]
    >     >         [simple     bit 'adsCommand'            ]
    >     >         [simple     bit 'noReturn'              ]
    >     >         [simple     bit 'response'              ]
    >     >     ]
    >     >     
    >     >     The order of reading big endian encoded data to impose little endian
    >     >     shift would be (please correct me if I'm wrong):
    >     >     1) init
    >     >     2) udp
    >     >     3) add timestamp
    >     >     4) priority
    >     >     5) system
    >     >     6) ads
    >     >     7) noreturn
    >     >     8) response (end of byte 1)
    >     >     9) broadcast
    >     >     10) reserved (end of byte )
    >     >     We can do same trick for writing, by re-arranging fields. By this way we
    >     >     avoid any additional byte level operations.
    >     >     
    >     >     Overall trouble with generated driver is to declare "how much" bytes
    >     >     should be read and interpreted. We have precise size information at the
    >     >     runtime - due to length fields, we can leverage it at generation time,
    >     >     but then we won't be able to cover all cases.
    >     >     
    >     >     I would love to keep it simple and do not break things thus I need your
    >     >     advice on how to approach this problem in a valid way.
    >     >     
    >     >     Cheers,
    >     >     Łukasz
    >     >     
    >     >     
    >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
    >     >     > <anecdotal-rant>
    >     >     > For us who were around and shaping the protocols in the 1980s, and people
    >     >     > before us (and before standards like RS-232), a lot of the "specifications"
    >     >     > came out of "observation of implementation we managed to get to work",
    >     >     > rather than "implement this spec". A lot was due to extreme memory
    >     >     > constraints (in my case, multi-tasking operating system, serial protocol
    >     >     > 187kbps, interpreted programming language with floating point ops and user
    >     >     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
    >     >     > like what other people were doing, sharing experiences and so on.
    >     >     > 
    >     >     > And there were many "innovative" ways to squeeze just a little bit extra
    >     >     > out of the hardware, resulting in "hard to understand" consequences. Bit
    >     >     > packing was a typical one, multiple functions packed into a single byte.
    >     >     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    >     >     > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
    >     >     > parity and clever use of address and mask to create a slave-to-slave direct
    >     >     > protocol, where the master's role was to signal which slave "owned" the
    >     >     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
    >     >     > ROM) and something like 150 bytes RAM for comm protocol.
    >     >     > 
    >     >     > Could you implement a compatible device to this with PLC4X and modern
    >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
    >     >     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
    >     >     > cycles on something that was automatic on one of the slowest long-lived
    >     >     > microcontroller ever.
    >     >     > </anecdotal-rant>
    >     >     > 
    >     >     > My point was only to highlight that some of the strange things you see in
    >     >     > protocols today, have its roots in pre-standardization days. Today no one
    >     >     > would go down that route, because the hardware cost nothing now (8031  +
    >     >     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
    >     >     > longevity of software is more important.
    >     >     > 
    >     >     > Cheers
    >     >     > Niclas
    >     >     > 
    >     >     > 
    >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
    >     >     > wrote:
    >     >     > 
    >     >     >> Hi Lukasz,
    >     >     >>
    >     >     >> I think it really gets tricky when using BE and having some byte-odd-sizes
    >     >     >> ... I remember in the Firmata protocol there were some bitmasks and then 10
    >     >     >> bit uint as BE ... not it really got tricky as the specs were written from
    >     >     >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
    >     >     >> instead of describing how the bits actually travel over the wire.
    >     >     >>
    >     >     >> Chris
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    >     >     >>
    >     >     >>     I've made some progress with topic by modyfing mspec and allowing
    >     >     >>     'little endian' flag on fields. This moved me further to next issue -
    >     >     >>     which is whole type encoded little endian.
    >     >     >>
    >     >     >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
    >     >     >>     various flags.
    >     >     >>     There are two cases which require different approach - reading and
    >     >     >>     writing. So for reading we need to swap N bytes based on type length.
    >     >     >>     For writing we need to alocate buffer for N bytes and swap them before
    >     >     >>     writing.
    >     >     >>
    >     >     >>     I am stuck now with freemaker templates and bit-io.
    >     >     >>
    >     >     >>     Cheers,
    >     >     >>     Łukasz
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >     >     >>     > I am doing some tests of ADS serialization.
    >     >     >>     >
    >     >     >>     > I've run into some troubles with payload which is generated with new
    >     >     >>     > driver. I'm not sure if that's my fault or generated code.
    >     >     >>     >
    >     >     >>     > I did a verification of what Wireshark shows and how ads structures
    >     >     >> are
    >     >     >>     > parsed. There is a gap I think. For example ams port number 1000
    >     >     >>     > (0x1027) is read as 4135.
    >     >     >>     >
    >     >     >>     > Obviously I used wrong structures while implementing protocol logic
    >     >     >> in
    >     >     >>     > first place, but now I am uncertain of how fields are encoded. How we
    >     >     >>     > mark field as little endian when rest of payload is big endian? Do we
    >     >     >>     > have `uint_le`?
    >     >     >>     >
    >     >     >>     > As far I remember route creation logic I was tracking last week used
    >     >     >>     > combination of LE and BE.
    >     >     >>     >
    >     >     >>     > Best regards,
    >     >     >>     > Łukasz
    >     >     >>     >
    >     >     >>
    >     >     >>
    >     >     >>
    >     >     > 
    >     >     
    >     > 
    >     
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

Hey Christian,
The problem I face is quite simple. State type in mspec i declared as a
bunch of bits. Type length is fixed, however not available anywhere in
mspec or code generator to re-arrange bytes upfront. All we have exposed
at reader/write level is read/write bit.
To be fair LE/BE support in read/write buffers is limited just to
numbers. There is no such support for raw bytes or bits, cause for that
you need to declare a length of LE/BE sequence.

I would love if I could just declare State as '2 byte little endian' so
it would be read properly upfront and parsed with no changes in
generated code, however I'm not sure how to do it and where. That's why
I'm playing with different things described in earlier mail.
Since all type handling is general I am just afraid of more complicated
scenarios where we have variable length structures such as arrays.

Best regards,
Łukasz


On 14.04.2020 09:41, Christofer Dutz wrote:
> Hi Lukasz,
> 
> I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
> For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE. 
> mspec doesn't even know about endianness.
> 
> Up till now the endianness doesn't have an effect on bit-fields or single-bit ints. 
> It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.
> 
> That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
> 
> So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
> 
> I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.
> 
> So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
> 
> Chris
> 
> 
> 
> Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
> 
>     Hey Niclas,
>     I realized how old the old things are when I started preparing
>     automation training for mere mortals and got into history of frames and
>     even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
>     
>     Getting back to the point - yes. I been thinking how to address the byte
>     order in effective way. Here are two approaches I have for now:
>     
>     A) My initial attempt is just a temporary buffer which is then written
>     in reverse order to caller. For reading it is similar - just getting N
>     bytes in reversed order. The hard part is.. knowing N. I had to add a
>     static calculation in order to allocate valid buffer sizes. I tend to
>     work but I'm not happy with this approach cause it involves additional work.
>     B) Second idea I've got is really simple and relies on code generation.
>     We know in which order fields are coming. Here I'm referring to a State
>     field which is just bunch of bits. If we would group fields in bytes and
>     generate code in reverse order then it has chance to work. Requirement
>     for that - ability to know basic field sizes upfront.
>     C) Try to combine above with bit-io or Read/WriteBuffers as these are
>     places which know actual position and state of buffers which are being
>     read/written.
>     
>     Now, getting to two cases which are a problem. CommandId and State. So
>     with command id situation is simple as it is declared as enum and it is
>     read as uint. We know size upfront and can generate valid method call
>     (readIntLE).
>     [enum uint 16 little endian 'CommandId'
>         ['0x00' INVALID]
>         ['0x01' ADS_READ_DEVICE_INFO]
>         ['0x02' ADS_READ]
>         ['0x03' ADS_WRITE]
>         ['0x04' ADS_READ_STATE]
>         ['0x05' ADS_WRITE_CONTROL]
>         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
>         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
>         ['0x08' ADS_DEVICE_NOTIFICATION]
>         ['0x09' ADS_READ_WRITE]
>     ]
>     
>     Second candidate is what I'm stuck right now sniping next cycles of
>     problems. So in case of State we have complex type composed from 2
>     bytes. A note here - instead of two bytes we might have a variable
>     length type which includes array or other variable section.
>     [type little endian 'State'
>         [simple     bit 'broadcast'             ]
>         [reserved   int 7 '0x0'                 ]
>         [simple     bit 'initCommand'           ]
>         [simple     bit 'updCommand'            ]
>         [simple     bit 'timestampAdded'        ]
>         [simple     bit 'highPriorityCommand'   ]
>         [simple     bit 'systemCommand'         ]
>         [simple     bit 'adsCommand'            ]
>         [simple     bit 'noReturn'              ]
>         [simple     bit 'response'              ]
>     ]
>     
>     The order of reading big endian encoded data to impose little endian
>     shift would be (please correct me if I'm wrong):
>     1) init
>     2) udp
>     3) add timestamp
>     4) priority
>     5) system
>     6) ads
>     7) noreturn
>     8) response (end of byte 1)
>     9) broadcast
>     10) reserved (end of byte )
>     We can do same trick for writing, by re-arranging fields. By this way we
>     avoid any additional byte level operations.
>     
>     Overall trouble with generated driver is to declare "how much" bytes
>     should be read and interpreted. We have precise size information at the
>     runtime - due to length fields, we can leverage it at generation time,
>     but then we won't be able to cover all cases.
>     
>     I would love to keep it simple and do not break things thus I need your
>     advice on how to approach this problem in a valid way.
>     
>     Cheers,
>     Łukasz
>     
>     
>     On 13.04.2020 03:26, Niclas Hedhman wrote:
>     > <anecdotal-rant>
>     > For us who were around and shaping the protocols in the 1980s, and people
>     > before us (and before standards like RS-232), a lot of the "specifications"
>     > came out of "observation of implementation we managed to get to work",
>     > rather than "implement this spec". A lot was due to extreme memory
>     > constraints (in my case, multi-tasking operating system, serial protocol
>     > 187kbps, interpreted programming language with floating point ops and user
>     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
>     > like what other people were doing, sharing experiences and so on.
>     > 
>     > And there were many "innovative" ways to squeeze just a little bit extra
>     > out of the hardware, resulting in "hard to understand" consequences. Bit
>     > packing was a typical one, multiple functions packed into a single byte.
>     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
>     > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
>     > parity and clever use of address and mask to create a slave-to-slave direct
>     > protocol, where the master's role was to signal which slave "owned" the
>     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
>     > ROM) and something like 150 bytes RAM for comm protocol.
>     > 
>     > Could you implement a compatible device to this with PLC4X and modern
>     > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
>     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
>     > cycles on something that was automatic on one of the slowest long-lived
>     > microcontroller ever.
>     > </anecdotal-rant>
>     > 
>     > My point was only to highlight that some of the strange things you see in
>     > protocols today, have its roots in pre-standardization days. Today no one
>     > would go down that route, because the hardware cost nothing now (8031  +
>     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
>     > longevity of software is more important.
>     > 
>     > Cheers
>     > Niclas
>     > 
>     > 
>     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
>     > wrote:
>     > 
>     >> Hi Lukasz,
>     >>
>     >> I think it really gets tricky when using BE and having some byte-odd-sizes
>     >> ... I remember in the Firmata protocol there were some bitmasks and then 10
>     >> bit uint as BE ... not it really got tricky as the specs were written from
>     >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
>     >> instead of describing how the bits actually travel over the wire.
>     >>
>     >> Chris
>     >>
>     >>
>     >>
>     >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
>     >>
>     >>     I've made some progress with topic by modyfing mspec and allowing
>     >>     'little endian' flag on fields. This moved me further to next issue -
>     >>     which is whole type encoded little endian.
>     >>
>     >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
>     >>     various flags.
>     >>     There are two cases which require different approach - reading and
>     >>     writing. So for reading we need to swap N bytes based on type length.
>     >>     For writing we need to alocate buffer for N bytes and swap them before
>     >>     writing.
>     >>
>     >>     I am stuck now with freemaker templates and bit-io.
>     >>
>     >>     Cheers,
>     >>     Łukasz
>     >>
>     >>
>     >>
>     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>     >>     > I am doing some tests of ADS serialization.
>     >>     >
>     >>     > I've run into some troubles with payload which is generated with new
>     >>     > driver. I'm not sure if that's my fault or generated code.
>     >>     >
>     >>     > I did a verification of what Wireshark shows and how ads structures
>     >> are
>     >>     > parsed. There is a gap I think. For example ams port number 1000
>     >>     > (0x1027) is read as 4135.
>     >>     >
>     >>     > Obviously I used wrong structures while implementing protocol logic
>     >> in
>     >>     > first place, but now I am uncertain of how fields are encoded. How we
>     >>     > mark field as little endian when rest of payload is big endian? Do we
>     >>     > have `uint_le`?
>     >>     >
>     >>     > As far I remember route creation logic I was tracking last week used
>     >>     > combination of LE and BE.
>     >>     >
>     >>     > Best regards,
>     >>     > Łukasz
>     >>     >
>     >>
>     >>
>     >>
>     > 
>     
>

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz,

I am not sure I am understanding the problems you are facing. We already have LE and BE protocols.
For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE. 
mspec doesn't even know about endianness.

Up till now the endianness doesn't have an effect on bit-fields or single-bit ints. 
It only starts to affect if a field goes from one byte to the next, which is usually for (u)int and floating point values.

That's why we have created the Read/WriteBuffers to set their endianness in the constructor.

So if you're creating a driver for ADS/AMS which is LE, then you write the mspec according to the sequence the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.

I do see a problem when there are drivers that use mixed endianness, but we have still to encounter such a protocol.

So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.

Chris



Am 14.04.20, 00:32 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    Hey Niclas,
    I realized how old the old things are when I started preparing
    automation training for mere mortals and got into history of frames and
    even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)
    
    Getting back to the point - yes. I been thinking how to address the byte
    order in effective way. Here are two approaches I have for now:
    
    A) My initial attempt is just a temporary buffer which is then written
    in reverse order to caller. For reading it is similar - just getting N
    bytes in reversed order. The hard part is.. knowing N. I had to add a
    static calculation in order to allocate valid buffer sizes. I tend to
    work but I'm not happy with this approach cause it involves additional work.
    B) Second idea I've got is really simple and relies on code generation.
    We know in which order fields are coming. Here I'm referring to a State
    field which is just bunch of bits. If we would group fields in bytes and
    generate code in reverse order then it has chance to work. Requirement
    for that - ability to know basic field sizes upfront.
    C) Try to combine above with bit-io or Read/WriteBuffers as these are
    places which know actual position and state of buffers which are being
    read/written.
    
    Now, getting to two cases which are a problem. CommandId and State. So
    with command id situation is simple as it is declared as enum and it is
    read as uint. We know size upfront and can generate valid method call
    (readIntLE).
    [enum uint 16 little endian 'CommandId'
        ['0x00' INVALID]
        ['0x01' ADS_READ_DEVICE_INFO]
        ['0x02' ADS_READ]
        ['0x03' ADS_WRITE]
        ['0x04' ADS_READ_STATE]
        ['0x05' ADS_WRITE_CONTROL]
        ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
        ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
        ['0x08' ADS_DEVICE_NOTIFICATION]
        ['0x09' ADS_READ_WRITE]
    ]
    
    Second candidate is what I'm stuck right now sniping next cycles of
    problems. So in case of State we have complex type composed from 2
    bytes. A note here - instead of two bytes we might have a variable
    length type which includes array or other variable section.
    [type little endian 'State'
        [simple     bit 'broadcast'             ]
        [reserved   int 7 '0x0'                 ]
        [simple     bit 'initCommand'           ]
        [simple     bit 'updCommand'            ]
        [simple     bit 'timestampAdded'        ]
        [simple     bit 'highPriorityCommand'   ]
        [simple     bit 'systemCommand'         ]
        [simple     bit 'adsCommand'            ]
        [simple     bit 'noReturn'              ]
        [simple     bit 'response'              ]
    ]
    
    The order of reading big endian encoded data to impose little endian
    shift would be (please correct me if I'm wrong):
    1) init
    2) udp
    3) add timestamp
    4) priority
    5) system
    6) ads
    7) noreturn
    8) response (end of byte 1)
    9) broadcast
    10) reserved (end of byte )
    We can do same trick for writing, by re-arranging fields. By this way we
    avoid any additional byte level operations.
    
    Overall trouble with generated driver is to declare "how much" bytes
    should be read and interpreted. We have precise size information at the
    runtime - due to length fields, we can leverage it at generation time,
    but then we won't be able to cover all cases.
    
    I would love to keep it simple and do not break things thus I need your
    advice on how to approach this problem in a valid way.
    
    Cheers,
    Łukasz
    
    
    On 13.04.2020 03:26, Niclas Hedhman wrote:
    > <anecdotal-rant>
    > For us who were around and shaping the protocols in the 1980s, and people
    > before us (and before standards like RS-232), a lot of the "specifications"
    > came out of "observation of implementation we managed to get to work",
    > rather than "implement this spec". A lot was due to extreme memory
    > constraints (in my case, multi-tasking operating system, serial protocol
    > 187kbps, interpreted programming language with floating point ops and user
    > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
    > like what other people were doing, sharing experiences and so on.
    > 
    > And there were many "innovative" ways to squeeze just a little bit extra
    > out of the hardware, resulting in "hard to understand" consequences. Bit
    > packing was a typical one, multiple functions packed into a single byte.
    > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
    > and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
    > parity and clever use of address and mask to create a slave-to-slave direct
    > protocol, where the master's role was to signal which slave "owned" the
    > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
    > ROM) and something like 150 bytes RAM for comm protocol.
    > 
    > Could you implement a compatible device to this with PLC4X and modern
    > hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
    > to support the 9bit data (+start and stop bits) and an awful lot of CPU
    > cycles on something that was automatic on one of the slowest long-lived
    > microcontroller ever.
    > </anecdotal-rant>
    > 
    > My point was only to highlight that some of the strange things you see in
    > protocols today, have its roots in pre-standardization days. Today no one
    > would go down that route, because the hardware cost nothing now (8031  +
    > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
    > longevity of software is more important.
    > 
    > Cheers
    > Niclas
    > 
    > 
    > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
    > wrote:
    > 
    >> Hi Lukasz,
    >>
    >> I think it really gets tricky when using BE and having some byte-odd-sizes
    >> ... I remember in the Firmata protocol there were some bitmasks and then 10
    >> bit uint as BE ... not it really got tricky as the specs were written from
    >> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
    >> instead of describing how the bits actually travel over the wire.
    >>
    >> Chris
    >>
    >>
    >>
    >> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
    >>
    >>     I've made some progress with topic by modyfing mspec and allowing
    >>     'little endian' flag on fields. This moved me further to next issue -
    >>     which is whole type encoded little endian.
    >>
    >>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
    >>     various flags.
    >>     There are two cases which require different approach - reading and
    >>     writing. So for reading we need to swap N bytes based on type length.
    >>     For writing we need to alocate buffer for N bytes and swap them before
    >>     writing.
    >>
    >>     I am stuck now with freemaker templates and bit-io.
    >>
    >>     Cheers,
    >>     Łukasz
    >>
    >>
    >>
    >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
    >>     > I am doing some tests of ADS serialization.
    >>     >
    >>     > I've run into some troubles with payload which is generated with new
    >>     > driver. I'm not sure if that's my fault or generated code.
    >>     >
    >>     > I did a verification of what Wireshark shows and how ads structures
    >> are
    >>     > parsed. There is a gap I think. For example ams port number 1000
    >>     > (0x1027) is read as 4135.
    >>     >
    >>     > Obviously I used wrong structures while implementing protocol logic
    >> in
    >>     > first place, but now I am uncertain of how fields are encoded. How we
    >>     > mark field as little endian when rest of payload is big endian? Do we
    >>     > have `uint_le`?
    >>     >
    >>     > As far I remember route creation logic I was tracking last week used
    >>     > combination of LE and BE.
    >>     >
    >>     > Best regards,
    >>     > Łukasz
    >>     >
    >>
    >>
    >>
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

Hey Niclas,
I realized how old the old things are when I started preparing
automation training for mere mortals and got into history of frames and
even cabling. Mr. Modbus and EIA-485 is definitely older than I. ;-)

Getting back to the point - yes. I been thinking how to address the byte
order in effective way. Here are two approaches I have for now:

A) My initial attempt is just a temporary buffer which is then written
in reverse order to caller. For reading it is similar - just getting N
bytes in reversed order. The hard part is.. knowing N. I had to add a
static calculation in order to allocate valid buffer sizes. I tend to
work but I'm not happy with this approach cause it involves additional work.
B) Second idea I've got is really simple and relies on code generation.
We know in which order fields are coming. Here I'm referring to a State
field which is just bunch of bits. If we would group fields in bytes and
generate code in reverse order then it has chance to work. Requirement
for that - ability to know basic field sizes upfront.
C) Try to combine above with bit-io or Read/WriteBuffers as these are
places which know actual position and state of buffers which are being
read/written.

Now, getting to two cases which are a problem. CommandId and State. So
with command id situation is simple as it is declared as enum and it is
read as uint. We know size upfront and can generate valid method call
(readIntLE).
[enum uint 16 little endian 'CommandId'
    ['0x00' INVALID]
    ['0x01' ADS_READ_DEVICE_INFO]
    ['0x02' ADS_READ]
    ['0x03' ADS_WRITE]
    ['0x04' ADS_READ_STATE]
    ['0x05' ADS_WRITE_CONTROL]
    ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
    ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
    ['0x08' ADS_DEVICE_NOTIFICATION]
    ['0x09' ADS_READ_WRITE]
]

Second candidate is what I'm stuck right now sniping next cycles of
problems. So in case of State we have complex type composed from 2
bytes. A note here - instead of two bytes we might have a variable
length type which includes array or other variable section.
[type little endian 'State'
    [simple     bit 'broadcast'             ]
    [reserved   int 7 '0x0'                 ]
    [simple     bit 'initCommand'           ]
    [simple     bit 'updCommand'            ]
    [simple     bit 'timestampAdded'        ]
    [simple     bit 'highPriorityCommand'   ]
    [simple     bit 'systemCommand'         ]
    [simple     bit 'adsCommand'            ]
    [simple     bit 'noReturn'              ]
    [simple     bit 'response'              ]
]

The order of reading big endian encoded data to impose little endian
shift would be (please correct me if I'm wrong):
1) init
2) udp
3) add timestamp
4) priority
5) system
6) ads
7) noreturn
8) response (end of byte 1)
9) broadcast
10) reserved (end of byte )
We can do same trick for writing, by re-arranging fields. By this way we
avoid any additional byte level operations.

Overall trouble with generated driver is to declare "how much" bytes
should be read and interpreted. We have precise size information at the
runtime - due to length fields, we can leverage it at generation time,
but then we won't be able to cover all cases.

I would love to keep it simple and do not break things thus I need your
advice on how to approach this problem in a valid way.

Cheers,
Łukasz


On 13.04.2020 03:26, Niclas Hedhman wrote:
> <anecdotal-rant>
> For us who were around and shaping the protocols in the 1980s, and people
> before us (and before standards like RS-232), a lot of the "specifications"
> came out of "observation of implementation we managed to get to work",
> rather than "implement this spec". A lot was due to extreme memory
> constraints (in my case, multi-tasking operating system, serial protocol
> 187kbps, interpreted programming language with floating point ops and user
> applications in 2kB RAM and 8kB EPROM) and a general lack of information,
> like what other people were doing, sharing experiences and so on.
> 
> And there were many "innovative" ways to squeeze just a little bit extra
> out of the hardware, resulting in "hard to understand" consequences. Bit
> packing was a typical one, multiple functions packed into a single byte.
> Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
> and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
> parity and clever use of address and mask to create a slave-to-slave direct
> protocol, where the master's role was to signal which slave "owned" the
> cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
> ROM) and something like 150 bytes RAM for comm protocol.
> 
> Could you implement a compatible device to this with PLC4X and modern
> hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
> to support the 9bit data (+start and stop bits) and an awful lot of CPU
> cycles on something that was automatic on one of the slowest long-lived
> microcontroller ever.
> </anecdotal-rant>
> 
> My point was only to highlight that some of the strange things you see in
> protocols today, have its roots in pre-standardization days. Today no one
> would go down that route, because the hardware cost nothing now (8031  +
> 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
> longevity of software is more important.
> 
> Cheers
> Niclas
> 
> 
> On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
> wrote:
> 
>> Hi Lukasz,
>>
>> I think it really gets tricky when using BE and having some byte-odd-sizes
>> ... I remember in the Firmata protocol there were some bitmasks and then 10
>> bit uint as BE ... not it really got tricky as the specs were written from
>> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
>> instead of describing how the bits actually travel over the wire.
>>
>> Chris
>>
>>
>>
>> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
>>
>>     I've made some progress with topic by modyfing mspec and allowing
>>     'little endian' flag on fields. This moved me further to next issue -
>>     which is whole type encoded little endian.
>>
>>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
>>     various flags.
>>     There are two cases which require different approach - reading and
>>     writing. So for reading we need to swap N bytes based on type length.
>>     For writing we need to alocate buffer for N bytes and swap them before
>>     writing.
>>
>>     I am stuck now with freemaker templates and bit-io.
>>
>>     Cheers,
>>     Łukasz
>>
>>
>>
>>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>>     > I am doing some tests of ADS serialization.
>>     >
>>     > I've run into some troubles with payload which is generated with new
>>     > driver. I'm not sure if that's my fault or generated code.
>>     >
>>     > I did a verification of what Wireshark shows and how ads structures
>> are
>>     > parsed. There is a gap I think. For example ams port number 1000
>>     > (0x1027) is read as 4135.
>>     >
>>     > Obviously I used wrong structures while implementing protocol logic
>> in
>>     > first place, but now I am uncertain of how fields are encoded. How we
>>     > mark field as little endian when rest of payload is big endian? Do we
>>     > have `uint_le`?
>>     >
>>     > As far I remember route creation logic I was tracking last week used
>>     > combination of LE and BE.
>>     >
>>     > Best regards,
>>     > Łukasz
>>     >
>>
>>
>>
>

Re: Big and littleendian fields in one mspec

Posted by Niclas Hedhman <ni...@hedhman.org>.

<anecdotal-rant>
For us who were around and shaping the protocols in the 1980s, and people
before us (and before standards like RS-232), a lot of the "specifications"
came out of "observation of implementation we managed to get to work",
rather than "implement this spec". A lot was due to extreme memory
constraints (in my case, multi-tasking operating system, serial protocol
187kbps, interpreted programming language with floating point ops and user
applications in 2kB RAM and 8kB EPROM) and a general lack of information,
like what other people were doing, sharing experiences and so on.

And there were many "innovative" ways to squeeze just a little bit extra
out of the hardware, resulting in "hard to understand" consequences. Bit
packing was a typical one, multiple functions packed into a single byte.
Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
and read up on "UART Enahanced Mode", and we used this, i.e. 9 bits, no
parity and clever use of address and mask to create a slave-to-slave direct
protocol, where the master's role was to signal which slave "owned" the
cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
ROM) and something like 150 bytes RAM for comm protocol.

Could you implement a compatible device to this with PLC4X and modern
hardware (i.e. no 8031/32 co-processor)? Possibly but bit-banging is needed
to support the 9bit data (+start and stop bits) and an awful lot of CPU
cycles on something that was automatic on one of the slowest long-lived
microcontroller ever.
</anecdotal-rant>

My point was only to highlight that some of the strange things you see in
protocols today, have its roots in pre-standardization days. Today no one
would go down that route, because the hardware cost nothing now (8031  +
8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
longevity of software is more important.

Cheers
Niclas

On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <ch...@c-ware.de>
wrote:

> Hi Lukasz,
>
> I think it really gets tricky when using BE and having some byte-odd-sizes
> ... I remember in the Firmata protocol there were some bitmasks and then 10
> bit uint as BE ... not it really got tricky as the specs were written from
> a point of view: You read 16 bits BE and then the first6 bits mean XYZ
> instead of describing how the bits actually travel over the wire.
>
> Chris
>
>
>
> Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:
>
>     I've made some progress with topic by modyfing mspec and allowing
>     'little endian' flag on fields. This moved me further to next issue -
>     which is whole type encoded little endian.
>
>     In ADS driver such type is State, which has 2 bytes and uses 8 bits for
>     various flags.
>     There are two cases which require different approach - reading and
>     writing. So for reading we need to swap N bytes based on type length.
>     For writing we need to alocate buffer for N bytes and swap them before
>     writing.
>
>     I am stuck now with freemaker templates and bit-io.
>
>     Cheers,
>     Łukasz
>
>
>
>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>     > I am doing some tests of ADS serialization.
>     >
>     > I've run into some troubles with payload which is generated with new
>     > driver. I'm not sure if that's my fault or generated code.
>     >
>     > I did a verification of what Wireshark shows and how ads structures
> are
>     > parsed. There is a gap I think. For example ams port number 1000
>     > (0x1027) is read as 4135.
>     >
>     > Obviously I used wrong structures while implementing protocol logic
> in
>     > first place, but now I am uncertain of how fields are encoded. How we
>     > mark field as little endian when rest of payload is big endian? Do we
>     > have `uint_le`?
>     >
>     > As far I remember route creation logic I was tracking last week used
>     > combination of LE and BE.
>     >
>     > Best regards,
>     > Łukasz
>     >
>
>
>

Re: Big and littleendian fields in one mspec

Posted by Christofer Dutz <ch...@c-ware.de>.

Hi Lukasz,

I think it really gets tricky when using BE and having some byte-odd-sizes ... I remember in the Firmata protocol there were some bitmasks and then 10 bit uint as BE ... not it really got tricky as the specs were written from a point of view: You read 16 bits BE and then the first6 bits mean XYZ instead of describing how the bits actually travel over the wire.

Chris



Am 11.04.20, 01:21 schrieb "Łukasz Dywicki" <lu...@code-house.org>:

    I've made some progress with topic by modyfing mspec and allowing
    'little endian' flag on fields. This moved me further to next issue -
    which is whole type encoded little endian.
    
    In ADS driver such type is State, which has 2 bytes and uses 8 bits for
    various flags.
    There are two cases which require different approach - reading and
    writing. So for reading we need to swap N bytes based on type length.
    For writing we need to alocate buffer for N bytes and swap them before
    writing.
    
    I am stuck now with freemaker templates and bit-io.
    
    Cheers,
    Łukasz
    
    
    
    On 10.04.2020 17:57, Łukasz Dywicki wrote:
    > I am doing some tests of ADS serialization.
    > 
    > I've run into some troubles with payload which is generated with new
    > driver. I'm not sure if that's my fault or generated code.
    > 
    > I did a verification of what Wireshark shows and how ads structures are
    > parsed. There is a gap I think. For example ams port number 1000
    > (0x1027) is read as 4135.
    > 
    > Obviously I used wrong structures while implementing protocol logic in
    > first place, but now I am uncertain of how fields are encoded. How we
    > mark field as little endian when rest of payload is big endian? Do we
    > have `uint_le`?
    > 
    > As far I remember route creation logic I was tracking last week used
    > combination of LE and BE.
    > 
    > Best regards,
    > Łukasz
    >

Re: Big and littleendian fields in one mspec

Posted by Łukasz Dywicki <lu...@code-house.org>.

I've made some progress with topic by modyfing mspec and allowing
'little endian' flag on fields. This moved me further to next issue -
which is whole type encoded little endian.

In ADS driver such type is State, which has 2 bytes and uses 8 bits for
various flags.
There are two cases which require different approach - reading and
writing. So for reading we need to swap N bytes based on type length.
For writing we need to alocate buffer for N bytes and swap them before
writing.

I am stuck now with freemaker templates and bit-io.

Cheers,
Łukasz

On 10.04.2020 17:57, Łukasz Dywicki wrote:
> I am doing some tests of ADS serialization.
> 
> I've run into some troubles with payload which is generated with new
> driver. I'm not sure if that's my fault or generated code.
> 
> I did a verification of what Wireshark shows and how ads structures are
> parsed. There is a gap I think. For example ams port number 1000
> (0x1027) is read as 4135.
> 
> Obviously I used wrong structures while implementing protocol logic in
> first place, but now I am uncertain of how fields are encoded. How we
> mark field as little endian when rest of payload is big endian? Do we
> have `uint_le`?
> 
> As far I remember route creation logic I was tracking last week used
> combination of LE and BE.
> 
> Best regards,
> Łukasz
>