Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2018/11/16 22:12:59 UTC

If the input file is in littleEndian and I output as xs:hexBinary, would you use dfdl:byteOrder="littleEndian"? or bigEndian?

Hello DFDL Community,

I am creating a DFDL schema for exe files.

Exe files are in littleEndian.

One field is called imageBase. It is a long long value. (long long is 8 bytes)

Would it make most sense to output that field simply as hex?

If I output it as hex, would it be better to output the hex in littleEndian:

<xs:element name="Image_Base_in_hex"
            type="xs:hexBinary"
            dfdl:length="8"
            dfdl:lengthKind="explicit"
            dfdl:lengthUnits="bytes"
            dfdl:byteOrder="littleEndian"/>

Or, would it be better to output the hex in bigEndian:

<xs:element name="Image_Base_in_hex"
            type="xs:hexBinary"
            dfdl:length="8"
            dfdl:lengthKind="explicit"
            dfdl:lengthUnits="bytes"
            dfdl:byteOrder="bigEndian"/>

If I output it in bigEndian:

<Image_Base_in_hex>0090000000004000</Image_Base_in_hex>

it is easier to copy the hex in that element and find the hex in my hex editor.

What do you recommend?

Re: If the input file is in littleEndian and I output as xs:hexBinary, would you use dfdl:byteOrder="littleEndian"? or bigEndian?

Posted by Mike Beckerle <mb...@tresys.com>.
To clarify...


Byte order doesn't apply to type xs:hexBinary because a hexBinary is effectively a byte string, and byte order only applies when more than one byte is spanned by a single number.


So the dfdl:byteOrder settings (big or little endian) in your schema aren't actually being used, because the elements are of type hexBinary.


[ This suggests a Daffodil warning might be helpful if you put properties directly on an element that are ignored due to the type, or to other property settings. We've considered this, but it hasn't gotten generally implemented... there are so many cases. E.g., you have dfdl:lengthPattern, but the dfdl:lengthKind is not 'pattern'. You have dfdl:length but the dfdl:lengthKind is not "explicit". You have dfdl:occursCountKind, but the element is scalar, etc. etc. ]


But if you change the type to xs:unsignedLong, then byte order is very relevant and will be used for sure.
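
To make the difference concrete, here is a rough sketch of what the infoset might look like for the same eight bytes under each type. The input bytes are a made-up example, not taken from the exe in question, and the element names simply reuse the ones in this thread.

```xml
<!-- Hypothetical input bytes, in file order: 00 00 40 00 00 00 00 00 -->

<!-- xs:hexBinary: the bytes appear in file order; dfdl:byteOrder has no effect -->
<Image_Base_in_hex>0000400000000000</Image_Base_in_hex>

<!-- xs:unsignedLong, littleEndian: the number 0x0000000000400000 -->
<Image_Base_as_uLong>4194304</Image_Base_as_uLong>

<!-- xs:unsignedLong, bigEndian: the number 0x0000400000000000, a different value -->
<Image_Base_as_uLong>70368744177664</Image_Base_as_uLong>
```

The hexBinary form is the one that matches what a hex editor shows; the numeric form is the one that gives a value you can do arithmetic on.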


I suggest:


<xs:element name="Image_Base_as_uLong"
            type="xs:unsignedLong"
            dfdl:lengthKind="implicit"
            dfdl:byteOrder="littleEndian"/>




The other relevant properties you need, which probably wouldn't go on the element itself but in the surrounding base format, are dfdl:representation="binary" and dfdl:binaryNumberRep="binary".
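
As a sketch only, those two properties might sit in a top-level dfdl:format annotation, with the element carrying just its type, length kind, and byte order. The placement at schema level is an assumption for illustration, not something taken from the exe schema itself, and a real schema will need the remaining required DFDL properties as well (for example, by referencing a base format with dfdl:ref).

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Assumed schema-wide defaults; other required properties are omitted -->
      <dfdl:format representation="binary"
                   binaryNumberRep="binary"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- With implicit length, a binary xs:unsignedLong occupies 8 bytes -->
  <xs:element name="Image_Base_as_uLong"
              type="xs:unsignedLong"
              dfdl:lengthKind="implicit"
              dfdl:byteOrder="littleEndian"/>

</xs:schema>
```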


-mikeb

________________________________
From: Mike Beckerle
Sent: Friday, November 16, 2018 9:30:09 PM
To: users@daffodil.apache.org
Subject: Re: If the input file is in littleEndian and I output as xs:hexBinary, would you use dfdl:byteOrder="littleEndian"? or bigEndian?


Is this value a number or some byte array of length 8?


Endianness doesn't apply if it is a byte array. There's no such concept.


So if you model it as hexBinary, byte order is not relevant. You will get the bytes in sequence left to right, first byte first.


If it is a number (the name imageBase suggests it gets offsets added to it), then I suggest you model it as a number, i.e., as a littleEndian xs:long or xs:unsignedLong.


That keeps you out of endianness hell.




Re: If the input file is in littleEndian and I output as xs:hexBinary, would you use dfdl:byteOrder="littleEndian"? or bigEndian?

Posted by Mike Beckerle <mb...@tresys.com>.
Is this value a number or some byte array of length 8?


Endianness doesn't apply if it is a byte array. There's no such concept.


So if you model it as hexBinary, byte order is not relevant. You will get the bytes in sequence left to right, first byte first.


If it is a number (the name imageBase suggests it gets offsets added to it), then I suggest you model it as a number, i.e., as a littleEndian xs:long or xs:unsignedLong.


That keeps you out of endianness hell.


