Posted to dev@daffodil.apache.org by "Ramaka, Shashi" <sr...@owlcyberdefense.com> on 2020/10/16 17:16:33 UTC

DAFFODIL-2377

I am working on DAFFODIL-2377: Abort instead of diagnostic message. (https://issues.apache.org/jira/browse/DAFFODIL-2377)

The attached files can be used to illustrate the bug.


daffodil parse -s s2377.xsd -o d2377.xml d2377.bin

daffodil unparse -s s2377.xsd d2377.xml

Unparsing results in the below error:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!   An unexpected exception occurred. This is a bug!   !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

The schema has three elements. The first element has a length of 1 byte and an alignment of 1 byte, while the next two elements have length = 1 and alignment = 2. The unparser unparses the first element correctly but throws the above exception while unparsing the second element. Specifically, it fails when determining the fill byte to use.

The unparser fails on the highlighted line in ProcessorStateBases.scala in daffodil-runtime1:

final def fillByte: Byte = {
  if (maybeCachedFillByte.isEmpty)
    // Highlighted line: the exception is thrown here.
    maybeCachedFillByte = MaybeInt(termRuntimeData.maybeFillByteEv.get.evaluate(this).toInt)
  maybeCachedFillByte.get.toByte
}

It seems that termRuntimeData does not have the fill byte value from the schema. If I change the above fillByte to return a hard-coded byte value, unparse works fine.

I'd appreciate any pointers on how to proceed from here.


Regards,

   Shashi Ramaka
   sramaka@owlcyberdefense.com
   Tel: 703-965-3656


Re: DAFFODIL-2377

Posted by Steve Lawrence <st...@gmail.com>.
Digging a bit more, here's what's happening. We have three elements: A,
B, and C. Each element is 1 byte long, but B and C have 2 byte
alignment. So the data looks something like this, where each section is
a byte.

Byte Pos: 0      1      2      3      4
    Data: |  A   | FILL |  B   | FILL |  C   |

Notice that because A is 1 byte long and B must be 2 byte aligned, we
need to insert a fill byte in between A and B. The FILL before B is the
alignment region for B, which means the B element must have the
dfdl:fillByte property provided so we know what to fill that data with
when unparsing. The same logic goes for element C and its alignment region.
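
The layout above can be reproduced with a small sketch. This is not Daffodil code: the Elem case class and the fillNeeded and layout functions are hypothetical names, used only to show how the alignment region before each element is sized, assuming lengths and alignments are given in bytes.

```scala
object AlignmentSketch {
  // Hypothetical model of an element: name, length and alignment, all in bytes.
  final case class Elem(name: String, length: Int, alignment: Int)

  // Number of fill bytes needed to move `pos` up to the next multiple of `alignment`.
  def fillNeeded(pos: Int, alignment: Int): Int =
    (alignment - pos % alignment) % alignment

  // Returns one label per output byte: "FILL" for alignment regions,
  // otherwise the name of the element that owns the byte.
  def layout(elems: Seq[Elem]): Vector[String] =
    elems.foldLeft(Vector.empty[String]) { (out, e) =>
      val fill = Vector.fill(fillNeeded(out.length, e.alignment))("FILL")
      out ++ fill ++ Vector.fill(e.length)(e.name)
    }

  def main(args: Array[String]): Unit = {
    // A: length 1, alignment 1; B and C: length 1, alignment 2.
    val elems = Seq(Elem("A", 1, 1), Elem("B", 1, 2), Elem("C", 1, 2))
    println(layout(elems).mkString("| ", " | ", " |"))
  }
}
```

Each FILL entry belongs to the element that follows it, which is why the fill byte value has to come from that element's properties rather than from the enclosing sequence.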

So elements B and C need the fillByte property to know how to unparse
this alignment region. The fillByte property on the sequence contributes
nothing towards the alignment region for B and C, so it's not used or
needed. That's why you could change it or remove it without affecting
anything.

And that's also why adding the fillByte property globally fixed it:
it provided the fillByte property for elements B and C, and so
they have a value to use for their fill byte in the alignment region.

But the underlying issue here is that elements B and C required a
fillByte, the schema did not provide the property for them, and Daffodil
did not detect the property as missing.

Instead of what we have, Daffodil should require the fillByte property,
which is what Solution 1 that I mentioned below does. Rather than
detecting if the fillByte property exists via
optionFillByteRaw.isDefined, I proposed that we just directly access
fillByte unconditionally (as part of creating FillByteEv). Accessing the
"fillByte" variable unconditionally will require that the property
exists and create an SDE if it does not.

- Steve



RE: DAFFODIL-2377

Posted by "Ramaka, Shashi" <sr...@owlcyberdefense.com>.
I updated the schema to define the fillByte property at the global level (under xs:annotation). Unparsing now works without error.
The fillByte property defined on the sequence has no effect whether it is present or not -- it does not override the global fillByte value.

Do the code changes you suggest impact the above behavior?

On a related note, there is an open ticket that says fillByte allows only raw byte values and not DFDL entities: https://issues.apache.org/jira/browse/DAFFODIL-1646
The example XSD uses a DFDL entity value for fillByte. Can DAFFODIL-1646 be closed?


Regards,

   Shashi Ramaka
   sramaka@owlcyberdefense.com 
   Tel: 703-965-3656



Re: DAFFODIL-2377

Posted by Steve Lawrence <sl...@apache.org>.
The issue appears to be with how we don't really require fillByte until
runtime.

RuntimePropertyMixins.scala defines maybeFillByteEv, which returns the
FillByteEv if the dfdl:fillByte property is provided on the schema
element. If it's not provided, it's just a Nope and we don't have a fill
byte. So our code for creating FillByteEv implies that the dfdl:fillByte
property is not mandatory.

But then in ProcessorStateBases.scala, the def fillByte function does

  maybeFillByteEv.get

So that requires that maybeFillByteEv is defined when we determine we
need to fill some bytes, and thus the dfdl:fillByte property is mandatory.
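
The mismatch can be modeled in a few lines. This is only a sketch: it uses the standard library's Option as a stand-in for Daffodil's Maybe, and TermData with its single field is a hypothetical simplification, not the real runtime data class.

```scala
import scala.util.Try

object MaybeGetSketch {
  // Hypothetical, simplified stand-in for the runtime data of a term.
  // None models the "Nope" case: the schema gave no dfdl:fillByte.
  final case class TermData(maybeFillByte: Option[Int])

  // Mirrors the shape of the failing code: the optional value is
  // unwrapped unconditionally, so the property is mandatory in practice.
  def fillByte(trd: TermData): Byte = trd.maybeFillByte.get.toByte

  def main(args: Array[String]): Unit = {
    // Property present: the value is returned.
    println(fillByte(TermData(Some(0x20))))
    // Property absent: .get throws a generic NoSuchElementException,
    // which carries no schema diagnostic at all.
    println(Try(fillByte(TermData(None))))
  }
}
```

The point of the sketch is that an unguarded get turns a schema problem into an unexpected exception, which is what the "This is a bug!" banner reports.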

Seems to me we should just make the fill byte property mandatory. I
don't know if there are cases where we should actually consider it optional.

Solution 1:

Change RuntimePropertyMixins.scala so that maybeFillByteEv is no longer a
Maybe, and it becomes something like this:

  final lazy val fillByteEv = {
    val ev = new FillByteEv(fillByte, charsetEv, tci)
    ev.compile(tunable)
    ev
  }

So fill byte is always required to be defined in the schema. And then
change all references to maybeFillByteEv to just fillByteEv. And follow
the variable until ProcessorStateBases.scala just becomes

  fillByteEv

instead of

  maybeFillByteEv.get

This way, if anything ever uses the fillByteEv variable, we require the
fillByte property to exist, and if it doesn't we'll get an SDE at schema
compile time. And then if fillByte is ever needed, then it will be
available.
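
A rough, self-contained sketch of the idea behind Solution 1 follows. All names here (Term, SchemaDefinitionError, compile) are hypothetical stand-ins and the property value is simplified to an Int; the real FillByteEv also takes charsetEv and tci. The sketch only shows that building the evaluator from a required property moves the failure to schema compile time.

```scala
object RequiredPropertySketch {
  // Hypothetical stand-in for a schema definition error (SDE).
  final class SchemaDefinitionError(msg: String) extends Exception(msg)

  // Hypothetical schema term holding the properties in scope for it.
  final class Term(val name: String, props: Map[String, Int]) {
    // Required-property accessor: raises an SDE if the property is absent.
    lazy val fillByte: Int = props.getOrElse(
      "fillByte",
      throw new SchemaDefinitionError(
        s"Property fillByte is not defined for element $name"))

    // Analogue of fillByteEv: built from fillByte unconditionally, so forcing
    // it reports a missing property before any data is processed.
    lazy val fillByteEv: () => Byte = {
      val b = fillByte.toByte
      () => b
    }
  }

  // Hypothetical "schema compile" step: force the evaluator of every term
  // and turn an SDE into a diagnostic message.
  def compile(terms: Seq[Term]): Either[String, Seq[() => Byte]] =
    try Right(terms.map(_.fillByteEv))
    catch { case e: SchemaDefinitionError => Left(e.getMessage) }
}
```

With this shape, a schema that omits fillByte for any term is rejected by compile with a message naming the element, and nothing downstream ever has to unwrap an optional value.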

The only issue with this approach is it might break schemas that don't
provide fillByte, since we now will always require fill byte where it
might not have been technically needed before. But I think we were just
getting lucky, and probably most schemas already defined it.

Solution 2:

Another option would be to just change ProcessorStateBases to be
something like this:

    if (termRuntimeData.maybeFillByteEv.isEmpty) {
      SDE("fillByte property required")
    } else {
      maybeCachedFillByte =
        MaybeInt(termRuntimeData.maybeFillByteEv.get....)
    }

So at runtime, if we need to fill some bytes but maybeFillByteEv isn't
defined (i.e. the schema didn't define fillByte), then we throw a
Runtime SDE. This maintains backwards compatibility, but doesn't detect
the missing property until runtime, which is unfortunate. I'm not sure
if it's worth maintaining that for this issue though. I'd probably lean
towards the first solution and just always require fillByte.
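
For comparison, Solution 2 could look roughly like the following in isolation. This is a sketch under stated assumptions, not the actual ProcessorStateBases code: State and RuntimeSDE are hypothetical names, and Option stands in for Maybe and MaybeInt.

```scala
object RuntimeCheckSketch {
  // Hypothetical stand-in for a runtime schema definition error.
  final class RuntimeSDE(msg: String) extends Exception(msg)

  // Hypothetical processor state. None models a schema without fillByte.
  final class State(maybeFillByteEv: Option[() => Int]) {
    private var maybeCachedFillByte: Option[Int] = None

    def fillByte: Byte = {
      if (maybeCachedFillByte.isEmpty) {
        maybeFillByteEv match {
          // A missing property becomes a diagnostic instead of a bare .get failure.
          case None => throw new RuntimeSDE("fillByte property required")
          // Evaluate once and cache the result for later calls.
          case Some(ev) => maybeCachedFillByte = Some(ev())
        }
      }
      maybeCachedFillByte.get.toByte
    }
  }
}
```

The check only runs when a fill byte is actually needed, so schemas that never hit an alignment region keep working, at the cost of finding the problem late.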




RE: DAFFODIL-2377

Posted by "Ramaka, Shashi" <sr...@owlcyberdefense.com>.
The attachment d2377.bin in the previous message was rejected by the listserv. The file contains just 5 bytes for the 3 integers in the schema. It is not needed to illustrate the bug. You can run the unparse step with the attached XML file to see the bug.

Regards,

   Shashi Ramaka
   sramaka@owlcyberdefense.com
   Tel: 703-965-3656
