Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2018/11/12 18:37:30 UTC

Bug with hidden group containing an element with a 1-bit value?

Hello DFDL community,

I am creating a DFDL schema for .exe files (Portable Executable file format).

At one point in the file are 2 bytes that represent sixteen 1-bit flags.

For each bit, if the bit = 1, then I want to output a message; otherwise output 0 (for now).

To test that I am correctly reading the 2 bytes I did this:

<xs:element name="Characteristics" type="xs:hexBinary" dfdl:length="2" dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes" />

The output contained the correct hex bytes:

	07 01

To test that I am correctly consuming each individual bit I did this:

<xs:element name="Characteristic" type="unsignedint1" minOccurs="16" maxOccurs="16" />

Here is the output:

    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>1</Characteristic>
    <Characteristic>1</Characteristic>
    <Characteristic>1</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>1</Characteristic>

You can see that those bits correspond to hex 07 01.

I want the value of each <Characteristic> element to be a message, not 0/1. So, I used the dfdl:hiddenGroupRef approach to translate 0/1 to a message.

<xs:element name="Characteristics">
    <xs:complexType>
        <xs:sequence>
            <xs:sequence dfdl:hiddenGroupRef="hidden_characteristic_bit16_Group" />
            <xs:element name="Characteristic-16" type="xs:string" dfdl:inputValueCalc="{
                    if (fn:lower-case(xs:string(../Hidden_characteristic_bit16)) eq '1') then 'The bytes of the word are reversed(obsolete)'
                    else xs:string(../Hidden_characteristic_bit16)}" />
            <xs:sequence dfdl:hiddenGroupRef="hidden_characteristic_bit15_Group" />
            <xs:element name="Characteristic-15" type="xs:string" dfdl:inputValueCalc="{
                    if (fn:lower-case(xs:string(../Hidden_characteristic_bit15)) eq '1') then 'The image should only be run on a single processor computer'
                    else xs:string(../Hidden_characteristic_bit15)}" />
            ...
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:group name="hidden_characteristic_bit16_Group"> 
    <xs:sequence>
        <xs:element name="Hidden_characteristic_bit16" type="unsignedint1" dfdl:outputValueCalc="{ . }" />
    </xs:sequence>
</xs:group>
...
<xs:group name="hidden_characteristic_bit1_Group"> 
    <xs:sequence>
        <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" dfdl:outputValueCalc="{ . }" />
    </xs:sequence>
</xs:group>

That produced this incorrect output:

<Characteristic-16>0</Characteristic-16>
<Characteristic-15>0</Characteristic-15>
<Characteristic-14>0</Characteristic-14>
<Characteristic-13>0</Characteristic-13>
<Characteristic-12>0</Characteristic-12>
<Characteristic-11>0</Characteristic-11>
<Characteristic-10>0</Characteristic-10>
<Characteristic-9>0</Characteristic-9>
<Characteristic-8>0</Characteristic-8>
<Characteristic-7>0</Characteristic-7>
<Characteristic-6>0</Characteristic-6>
<Characteristic-5>Aggressively trim the working set(obsolete)</Characteristic-5>
<Characteristic-4>0</Characteristic-4>
<Characteristic-3>0</Characteristic-3>
<Characteristic-2>0</Characteristic-2>
<Characteristic-1>0</Characteristic-1>

Is there a bug with hidden groups containing an element with a 1-bit value?

Note: here is how unsignedint1 is defined:

<xs:simpleType name="unsignedint1" dfdl:length="1" dfdl:lengthKind="explicit" dfdl:alignmentUnits="bits">
        <xs:restriction base="xs:unsignedInt"/>
</xs:simpleType>

/Roger

RE: Bug with hidden group containing an element with a 1-bit value?

Posted by "Costello, Roger L." <co...@mitre.org>.
That is a brilliant solution! 

Thank you Steve!

/Roger



Re: Bug with hidden group containing an element with a 1-bit value?

Posted by Steve Lawrence <sl...@apache.org>.
To me, this feels like something that would be more appropriately
handled via XSLT. It's not easy or particularly clean to do in DFDL.
That said, below is something close to what you're trying to achieve. I
can't think of a way to get exactly what you want, since there are some
limitations on inputValueCalc (e.g. it can't be optional, can't be the
root of a choice, can't be on arrays).

  <!-- hidden group that consumes all characteristic bits -->
  <xs:group name="hidden_characteristic_bits">
    <xs:sequence>
      <xs:element name="Characteristic-16" ... />
      <xs:element name="Characteristic-15" ... />
      ...
      <xs:element name="Characteristic-2" ... />
      <xs:element name="Characteristic-1" ... />
    </xs:sequence>
  </xs:group>

  <xs:sequence dfdl:hiddenGroupRef="hidden_characteristic_bits" />
  <xs:element name="Characteristics" type="xs:string" dfdl:inputValueCalc="{
  fn:concat(
    if (../Characteristic-16 eq 1) then '* Description 16\n' else '',
    if (../Characteristic-15 eq 1) then '* Description 15\n' else '',
    ...
    if (../Characteristic-2 eq 1) then '* Description 2\n' else '',
    if (../Characteristic-1 eq 1) then '* Description 1\n' else ''
  )
}" />

So all of the characteristic bits are consumed together in a hidden
group. Then a single inputValueCalc element tests each of those bits
and yields either the description for the bit or the empty string, and
finally those are all concatenated into one big human-readable string,
for example:

  <Characteristics>
  * If the image is on removable media, copy it to and run it
  * Debugging information was removed and stored separately
  * The computer supports 32-bit words
  * Relocation information was stripped from file
  </Characteristics>

For unparse, you would modify each of your Characteristic-X hidden
elements to have a dfdl:outputValueCalc that uses fn:contains() to test
whether the Characteristics string contains its description and outputs
a one or a zero accordingly, for example:

  <xs:element name="Characteristic-10" dfdl:outputValueCalc="{
    if (fn:contains(../Characteristics, '* Debugging information was removed')) then 1 else 0
  }" />
  <xs:element name="Characteristic-9" dfdl:outputValueCalc="{
    if (fn:contains(../Characteristics, '* The computer supports 32-bit words')) then 1 else 0
  }" />
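
The XSLT route mentioned at the top of this reply could look roughly
like the stylesheet below. It is only a sketch, assuming the parsed
infoset has the shape shown in the quoted output (a Characteristics
element with Characteristic-N children in no namespace, and "0" for an
unset bit). It renames each Characteristic-N to Characteristic and drops
the ones whose value is 0. It does not cover unparsing, which would need
a separate reverse transform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop flag elements whose bit was not set (value 0). -->
  <xsl:template match="Characteristics/*[starts-with(local-name(), 'Characteristic-')][. = '0']"/>

  <!-- Rename the remaining Characteristic-N elements to Characteristic. -->
  <xsl:template match="Characteristics/*[starts-with(local-name(), 'Characteristic-')][. != '0']">
    <Characteristic>
      <xsl:value-of select="."/>
    </Characteristic>
  </xsl:template>

</xsl:stylesheet>
```

If the infoset elements are in a namespace, the Characteristics step in
the match patterns would need a matching prefix declared on the
stylesheet.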

- Steve




RE: Bug with hidden group containing an element with a 1-bit value?

Posted by "Costello, Roger L." <co...@mitre.org>.
That fixed it - thank you Steve!

One more question, if I may.

Here is the output that I am now getting:

<Characteristics>
    <Characteristic-16>0</Characteristic-16>
    <Characteristic-15>0</Characteristic-15>
    <Characteristic-14>0</Characteristic-14>
    <Characteristic-13>0</Characteristic-13>
    <Characteristic-12>0</Characteristic-12>
    <Characteristic-11>If the image is on removable media, copy it to and run it from the swap file</Characteristic-11>
    <Characteristic-10>Debugging information was removed and stored separately in another file</Characteristic-10>
    <Characteristic-9>The computer supports 32-bit words</Characteristic-9>
    <Characteristic-8>0</Characteristic-8>
    <Characteristic-7>0</Characteristic-7>
    <Characteristic-6>0</Characteristic-6>
    <Characteristic-5>0</Characteristic-5>
    <Characteristic-4>0</Characteristic-4>
    <Characteristic-3>0</Characteristic-3>
    <Characteristic-2>0</Characteristic-2>
    <Characteristic-1>Relocation information was stripped from file</Characteristic-1>
</Characteristics>

That is okay, but I would like to get rid of the -N (e.g., Characteristic-1). I'd like to output this XML:

<Characteristics>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>If the image is on removable media, copy it to and run it from the swap file</Characteristic>
    <Characteristic>Debugging information was removed and stored separately in another file</Characteristic>
    <Characteristic>The computer supports 32-bit words</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>0</Characteristic>
    <Characteristic>Relocation information was stripped from file</Characteristic>
</Characteristics>

Even better, I'd like to completely omit Characteristic elements with no message:

<Characteristics>
    <Characteristic>If the image is on removable media, copy it to and run it from the swap file</Characteristic>
    <Characteristic>Debugging information was removed and stored separately in another file</Characteristic>
    <Characteristic>The computer supports 32-bit words</Characteristic>
    <Characteristic>Relocation information was stripped from file</Characteristic>
</Characteristics>

Is that possible (and still be able to unparse the XML)?

/Roger


-----Original Message-----
From: Steve Lawrence <sl...@apache.org> 
Sent: Monday, November 12, 2018 3:05 PM
To: users@daffodil.apache.org; Costello, Roger L. <co...@mitre.org>
Subject: Re: Bug with hidden group containing an element with a 1-bit value?

I believe I have found the problem. In the dfdl:format tag, you have alignment and alignmentUnits set to 1 byte. Your unsignedint1 type explicitly sets the alignmentUnits property to bits, which makes perfect sense. Inside each hidden group you have a sequence with an element that uses this type, e.g.:

  <xs:group name="hidden_characteristic_bit1_Group">
    <xs:sequence>
      <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" />
    </xs:sequence>
  </xs:group>

In addition to elements and simple types, alignment also applies to sequences and choices. So in this case, the xs:sequence will have alignmentUnits of bytes, which ultimately causes multiple bits to be skipped to satisfy the alignment of that sequence. The solution here is to either 1) set the alignmentUnits of that sequence to bits or 2) just set alignmentUnits to bits in the dfdl:format tag. If you do not ever expect to need to skip bits due to alignment, then #2 probably makes the most sense to avoid these kinds of subtle mistakes; you can then explicitly set alignmentUnits to bytes in the cases where explicit alignment is needed, if at all.
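
As a rough sketch of those two options (not taken from the actual schema, so treat the surrounding structure as an assumption): option 1 puts dfdl:alignmentUnits on the sequence inside each hidden group, and option 2 changes the default in the schema's dfdl:format annotation. The other format properties are left out of the second fragment and would need to stay as they already are.

```xml
<!-- Option 1: override the units on the sequence inside each hidden group -->
<xs:group name="hidden_characteristic_bit1_Group">
  <xs:sequence dfdl:alignmentUnits="bits">
    <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" />
  </xs:sequence>
</xs:group>

<!-- Option 2: make bits the schema-wide default in the dfdl:format annotation.
     Other format properties are omitted here and must be kept as they are. -->
<xs:annotation>
  <xs:appinfo source="http://www.ogf.org/dfdl/">
    <dfdl:format alignment="1" alignmentUnits="bits" />
  </xs:appinfo>
</xs:annotation>
```

With option 1 the override has to be repeated on the sequence in every one of the sixteen hidden groups, which is part of why option 2 is the less error-prone choice.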

- Steve


On 11/12/18 1:37 PM, Costello, Roger L. wrote:
> Hello DFDL community,
> 
> I am creating a DFDL schema for .exe files (Portable Executable file format).
> 
> At one point in the file are 2-bytes that represent sixteen 1-bit flags.
> 
> For each bit, if the bit = 1, then I want to output a message; otherwise output 0 (for now).
> 
> To test that I am correctly reading the 2 bytes I did this:
> 
> <xs:element name="Characteristics" type="xs:hexBinary" dfdl:length="2" 
> dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes" />
> 
> The output contained the correct hex bytes:
> 
> 	07 01
> 
> To test that I am correctly consuming each individual bit I did this:
> 
> <xs:element name="Characteristic" type="unsignedint1" minOccurs="16" 
> maxOccurs="16" />
> 
> Here is the output:
> 
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>1</Characteristic>
>     <Characteristic>1</Characteristic>
>     <Characteristic>1</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>0</Characteristic>
>     <Characteristic>1</Characteristic>
> 
> You can see that those bits correspond to hex 07 01.
> 
> I want the value of each <Characteristic> element to be a message, not 0/1. So, I used the dfdl:hiddenGroupRef approach to translate 0/1 to a message.
> 
> <xs:element name="Characteristics">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:sequence dfdl:hiddenGroupRef="hidden_characteristic_bit16_Group" />
>             <xs:element name="Characteristic-16" type="xs:string" dfdl:inputValueCalc="{
>                     if (fn:lower-case(xs:string(../Hidden_characteristic_bit16)) eq "1") then "The bytes of the word are reversed(obsolete)"  
>                     else xs:string(../Hidden_characteristic_bit16)}" />
>             <xs:sequence dfdl:hiddenGroupRef="hidden_characteristic_bit15_Group" />
>             <xs:element name="Characteristic-15" type="xs:string" dfdl:inputValueCalc="{
>                     if (fn:lower-case(xs:string(../Hidden_characteristic_bit15)) eq "1") then "The image should only be run on a single processor computer"  
>                     else xs:string(../Hidden_characteristic_bit15)}" />
>             ...
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
> 
> <xs:group name="hidden_characteristic_bit16_Group"> 
>     <xs:sequence>
>         <xs:element name="Hidden_characteristic_bit16" type="unsignedint1" dfdl:outputValueCalc="{ . }" />
>     </xs:sequence>
> </xs:group>
> ...
> <xs:group name="hidden_characteristic_bit1_Group"> 
>     <xs:sequence>
>         <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" dfdl:outputValueCalc="{ . }" />
>     </xs:sequence>
> </xs:group>
> 
> That produced this incorrect output:
> 
> <Characteristic-16>0</Characteristic-16>
> <Characteristic-15>0</Characteristic-15>
> <Characteristic-14>0</Characteristic-14>
> <Characteristic-13>0</Characteristic-13>
> <Characteristic-12>0</Characteristic-12>
> <Characteristic-11>0</Characteristic-11>
> <Characteristic-10>0</Characteristic-10>
> <Characteristic-9>0</Characteristic-9>
> <Characteristic-8>0</Characteristic-8>
> <Characteristic-7>0</Characteristic-7>
> <Characteristic-6>0</Characteristic-6>
> <Characteristic-5>Aggressively trim the working set(obsolete)</Characteristic-5>
> <Characteristic-4>0</Characteristic-4>
> <Characteristic-3>0</Characteristic-3>
> <Characteristic-2>0</Characteristic-2>
> <Characteristic-1>0</Characteristic-1>
> 
> Is there a bug with hidden groups containing an element with a 1-bit value?
> 
> Note: here is how unsignedint1 is defined:
> 
> <xs:simpleType name="unsignedint1" dfdl:length="1" dfdl:lengthKind="explicit" dfdl:alignmentUnits="bits">
>         <xs:restriction base="xs:unsignedInt"/>
> </xs:simpleType>
> 
> /Roger
> 


Re: Bug with hidden group containing an element with a 1-bit value?

Posted by Steve Lawrence <sl...@apache.org>.
I believe I have found the problem. In the dfdl:format tag, you have
alignment set to 1 and alignmentUnits set to bytes. Your unsignedint1 type
explicitly sets the alignmentUnits property to bits, which makes perfect
sense. Inside each hidden group you have a sequence with an element
that uses this type, e.g.:

  <xs:group name="hidden_characteristic_bit1_Group">
    <xs:sequence>
      <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" />
    </xs:sequence>
  </xs:group>

In addition to elements and simple types, alignment also applies to
sequences and choices. So in this case, the xs:sequence will have
alignmentUnits of bytes, which ultimately causes multiple bits to be
skipped to satisfy the alignment of that sequence. The solution here
is to either 1) set the alignmentUnits of that sequence to bits or 2)
just set alignmentUnits to bits in the dfdl:format tag. If you do
not ever expect to need to skip bits due to alignment, then #2 probably
makes the most sense to avoid these kinds of subtle mistakes; you can
then explicitly set alignmentUnits to bytes in the cases where explicit
alignment is needed, if at all.
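
As a rough sketch of the two options (not tested against the actual
schema), the fragment below shows both in one place; either one on its
own should be enough. It assumes the schema has no target namespace, as
the unprefixed type name suggests, and it shows only the alignment
properties on dfdl:format. Everything else in the real dfdl:format tag
is assumed to stay as it is and is omitted here.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Option 2: make bits the default alignment unit for the whole
       schema. Only the alignment properties are shown; the other
       dfdl:format properties of the real schema are omitted. -->
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format alignment="1" alignmentUnits="bits" />
    </xs:appinfo>
  </xs:annotation>

  <!-- The 1-bit type, as defined in the original message. -->
  <xs:simpleType name="unsignedint1" dfdl:length="1"
                 dfdl:lengthKind="explicit" dfdl:alignmentUnits="bits">
    <xs:restriction base="xs:unsignedInt" />
  </xs:simpleType>

  <!-- Option 1: leave dfdl:format as it is and override the unit on
       the sequence inside each hidden group instead. -->
  <xs:group name="hidden_characteristic_bit1_Group">
    <xs:sequence dfdl:alignmentUnits="bits">
      <xs:element name="Hidden_characteristic_bit1" type="unsignedint1" />
    </xs:sequence>
  </xs:group>

</xs:schema>
```

With option 1 the same attribute would have to be repeated on the
sequence in each of the sixteen hidden groups, which is one more reason
option 2 is probably the less error-prone choice.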

- Steve

