You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@daffodil.apache.org by "Sloane, Brandon" <bs...@tresys.com> on 2019/12/17 18:19:40 UTC

DataValue compiler bug under 2.11

While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.

Recall that we have source code along the lines of:

...
  @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
  @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
  @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
  @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
...

Under 2.11, this gets compiled to:

...
  public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
    Code:
       0: aload_1
       1: areturn

  public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
    Code:
       0: aload_1
       1: areturn

  public byte[] toDataValue(byte[]);
    Code:
       0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
       3: aload_1
       4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
       7: checkcast     #41                 // class "[Ljava/lang/Byte;"
      10: areturn

  public java.lang.Boolean toDataValue(java.lang.Boolean);
    Code:
       0: aload_1
       1: areturn

  public java.lang.Number toDataValue(java.lang.Number);
    Code:
       0: aload_1
       1: areturn
...

Note that for every instant accept byte[], the function compiles to a simple identity function.

When compiled under 2.12, the byte[] instance gets compiled to identity as well:

  public byte[] toDataValue(byte[]);
    Code:
       0: aload_1
       1: areturn

The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].

Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.

(As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)

I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.

Thoughts?


Brandon T. Sloane

Associate, Services

bsloane@tresys.com | tresys.com

Re: DataValue compiler bug under 2.11

Posted by "Sloane, Brandon" <bs...@tresys.com>.
PR updated.

This 2.11 hack seems significant enough that it warrants a re-review. I'll take Steve's +1 from this thread as part of that, but would like to get a new +1 from Mike before merging in.
________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Tuesday, December 17, 2019 2:38 PM
To: Sloane, Brandon <bs...@tresys.com>; dev@daffodil.apache.org <de...@daffodil.apache.org>
Subject: Re: DataValue compiler bug under 2.11

Seems reasonable to me. +1

On 12/17/19 2:36 PM, Sloane, Brandon wrote:
> This is the option I think I am settling on.
>
> SBT allows us to provide a seperate implementation of a file based on compiler
> version, so we can incur the related overhead on only 2.11 builds.
>
> The only caveat is that in 2.11, we would need to incur a cost on every call to
> getAnyRef, as that function needs to change to:
>
>    @inline def getAnyRef = {
>      if(v.isInstanceOf[BoxedByteArray]){
>        v.asInstanceOf[BoxedByteArray].v
>      }else{
>        v.asInstanceOf[AnyRef]
>    }
>
> We could avoid this by going through all of our code that expects a ByteArray
> and handling BoxedByteArray, but I would like to keep the hackiness contained.
>
> Since we can contain this to only 2.11 builds, I think the performance hit would
> be acceptable.
>
> This would still impose a bit of a maintance burden, as we now have 2 versions
> of the file to maintain, but hopefully this is not a file that requires much
> modifications.
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <sl...@apache.org>
> *Sent:* Tuesday, December 17, 2019 2:11 PM
> *To:* Sloane, Brandon <bs...@tresys.com>; dev@daffodil.apache.org
> <de...@daffodil.apache.org>
> *Subject:* Re: DataValue compiler bug under 2.11
> So this seems a bit hacky, but doing some simple testing I think
> wrapping the byte array works, e.g.:
>
> class WrappedByteArray(val ba: Array[Byte])
>
>
> object DataValue {
>    ...
>
>    type DataValueByteArray = DataValue[WrappedByteArray, ...]
>
>    @inline
>    implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
>    {
>      val w = new WrappedByteArray(v)
>      new DataValueByteArray(w)
>    }
>
>    ...
>
> }
>
> class DataValue ... {
>
>    @inline
>    def getByteArray = v.asInstanceOf[WrappedByteArray].array
>
> }
>
> We're now effectively boxing byte arrays, but those are only used for
> hexBinary data, which aren't all that common so might not be too big of
> a deal. The byte code looks like this:
>
>    public WrappedByteArray toDataValue(byte[]);
>      Code:
>         0: new           #20   // class WrappedByteArray
>         3: dup
>         4: aload_1
>         5: invokespecial #23   // Method WrappedByteArray."<init>":([B)V
>         8: astore_2
>         9: aload_2
>        10: areturn
>
> Which is a bit more overhead, but is at least what we would actually
> expect, and it works on scala 2.11.
>
> The other option is probably to drop support for 2.11. The only use case
> I know of needing 2.11 is for using Daffodil in Apache Spark, which now
> has experimental support for 2.12. So it's not unreasonable to drop
> support, but would be nice to maintain, IMO.
>
> - Steve
>
> On 12/17/19 1:40 PM, Sloane, Brandon wrote:
>> That certainly looks like what I am seeing.
>>
>> If I am reading that issue correctly, it looks like it was fixed in 2.12.0-M1.
>> --------------------------------------------------------------------------------
>> *From:* Steve Lawrence <sl...@apache.org>
>> *Sent:* Tuesday, December 17, 2019 1:32 PM
>> *To:* dev@daffodil.apache.org <de...@daffodil.apache.org>
>> *Subject:* Re: DataValue compiler bug under 2.11
>> I believe this is scala bug 7521:
>>
>> https://github.com/scala/bug/issues/7521
>>
>> Skimming the comments and other referenced issues, I don't see anyone
>> mentioning a workaround. Appears the fix was added in 2.12, which makes
>> sense based on the behavior you're seeing.
>>
>> On 12/17/19 1:19 PM, Sloane, Brandon wrote:
>>> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
>>>
>>> Recall that we have source code along the lines of:
>>>
>>> ...
>>>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>>>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>>>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>>>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
>>> ...
>>>
>>> Under 2.11, this gets compiled to:
>>>
>>> ...
>>>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>>
>>>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>>
>>>   public byte[] toDataValue(byte[]);
>>>     Code:
>>>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>>>        3: aload_1
>>>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>>>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>>>       10: areturn
>>>
>>>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>>
>>>   public java.lang.Number toDataValue(java.lang.Number);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> ...
>>>
>>> Note that for every instant accept byte[], the function compiles to a simple identity function.
>>>
>>> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
>>>
>>>   public byte[] toDataValue(byte[]);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>>
>>> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
>>>
>>> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
>>>
>>> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
>>>
>>> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
>>>
>>> Thoughts?
>>>
>>>
>>> Brandon T. Sloane
>>>
>>> Associate, Services
>>>
>>> bsloane@tresys.com | tresys.com
>>>
>>
>


Re: DataValue compiler bug under 2.11

Posted by Steve Lawrence <sl...@apache.org>.
Seems reasonable to me. +1

On 12/17/19 2:36 PM, Sloane, Brandon wrote:
> This is the option I think I am settling on.
> 
> SBT allows us to provide a seperate implementation of a file based on compiler 
> version, so we can incur the related overhead on only 2.11 builds.
> 
> The only caveat is that in 2.11, we would need to incur a cost on every call to 
> getAnyRef, as that function needs to change to:
> 
>    @inline def getAnyRef = {
>      if(v.isInstanceOf[BoxedByteArray]){
>        v.asInstanceOf[BoxedByteArray].v
>      }else{
>        v.asInstanceOf[AnyRef]
>    }
> 
> We could avoid this by going through all of our code that expects a ByteArray 
> and handling BoxedByteArray, but I would like to keep the hackiness contained.
> 
> Since we can contain this to only 2.11 builds, I think the performance hit would 
> be acceptable.
> 
> This would still impose a bit of a maintance burden, as we now have 2 versions 
> of the file to maintain, but hopefully this is not a file that requires much 
> modifications.
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <sl...@apache.org>
> *Sent:* Tuesday, December 17, 2019 2:11 PM
> *To:* Sloane, Brandon <bs...@tresys.com>; dev@daffodil.apache.org 
> <de...@daffodil.apache.org>
> *Subject:* Re: DataValue compiler bug under 2.11
> So this seems a bit hacky, but doing some simple testing I think
> wrapping the byte array works, e.g.:
> 
> class WrappedByteArray(val ba: Array[Byte])
> 
> 
> object DataValue {
>    ...
> 
>    type DataValueByteArray = DataValue[WrappedByteArray, ...]
> 
>    @inline
>    implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
>    {
>      val w = new WrappedByteArray(v)
>      new DataValueByteArray(w)
>    }
> 
>    ...
> 
> }
> 
> class DataValue ... {
> 
>    @inline
>    def getByteArray = v.asInstanceOf[WrappedByteArray].array
> 
> }
> 
> We're now effectively boxing byte arrays, but those are only used for
> hexBinary data, which aren't all that common so might not be too big of
> a deal. The byte code looks like this:
> 
>    public WrappedByteArray toDataValue(byte[]);
>      Code:
>         0: new           #20   // class WrappedByteArray
>         3: dup
>         4: aload_1
>         5: invokespecial #23   // Method WrappedByteArray."<init>":([B)V
>         8: astore_2
>         9: aload_2
>        10: areturn
> 
> Which is a bit more overhead, but is at least what we would actually
> expect, and it works on scala 2.11.
> 
> The other option is probably to drop support for 2.11. The only use case
> I know of needing 2.11 is for using Daffodil in Apache Spark, which now
> has experimental support for 2.12. So it's not unreasonable to drop
> support, but would be nice to maintain, IMO.
> 
> - Steve
> 
> On 12/17/19 1:40 PM, Sloane, Brandon wrote:
>> That certainly looks like what I am seeing.
>> 
>> If I am reading that issue correctly, it looks like it was fixed in 2.12.0-M1.
>> --------------------------------------------------------------------------------
>> *From:* Steve Lawrence <sl...@apache.org>
>> *Sent:* Tuesday, December 17, 2019 1:32 PM
>> *To:* dev@daffodil.apache.org <de...@daffodil.apache.org>
>> *Subject:* Re: DataValue compiler bug under 2.11
>> I believe this is scala bug 7521:
>> 
>> https://github.com/scala/bug/issues/7521
>> 
>> Skimming the comments and other referenced issues, I don't see anyone
>> mentioning a workaround. Appears the fix was added in 2.12, which makes
>> sense based on the behavior you're seeing.
>> 
>> On 12/17/19 1:19 PM, Sloane, Brandon wrote:
>>> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
>>> 
>>> Recall that we have source code along the lines of:
>>> 
>>> ...
>>>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>>>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>>>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>>>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
>>> ...
>>> 
>>> Under 2.11, this gets compiled to:
>>> 
>>> ...
>>>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> 
>>>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> 
>>>   public byte[] toDataValue(byte[]);
>>>     Code:
>>>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>>>        3: aload_1
>>>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>>>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>>>       10: areturn
>>> 
>>>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> 
>>>   public java.lang.Number toDataValue(java.lang.Number);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> ...
>>> 
>>> Note that for every instant accept byte[], the function compiles to a simple identity function.
>>> 
>>> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
>>> 
>>>   public byte[] toDataValue(byte[]);
>>>     Code:
>>>        0: aload_1
>>>        1: areturn
>>> 
>>> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
>>> 
>>> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
>>> 
>>> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
>>> 
>>> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
>>> 
>>> Thoughts?
>>> 
>>> 
>>> Brandon T. Sloane
>>> 
>>> Associate, Services
>>> 
>>> bsloane@tresys.com | tresys.com
>>> 
>> 
> 


Re: DataValue compiler bug under 2.11

Posted by "Sloane, Brandon" <bs...@tresys.com>.
This is the option I think I am settling on.

SBT allows us to provide a seperate implementation of a file based on compiler version, so we can incur the related overhead on only 2.11 builds.

The only caveat is that in 2.11, we would need to incur a cost on every call to getAnyRef, as that function needs to change to:

  @inline def getAnyRef = {
    if(v.isInstanceOf[BoxedByteArray]){
      v.asInstanceOf[BoxedByteArray].v
    }else{
      v.asInstanceOf[AnyRef]
  }

We could avoid this by going through all of our code that expects a ByteArray and handling BoxedByteArray, but I would like to keep the hackiness contained.

Since we can contain this to only 2.11 builds, I think the performance hit would be acceptable.

This would still impose a bit of a maintance burden, as we now have 2 versions of the file to maintain, but hopefully this is not a file that requires much modifications.
________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Tuesday, December 17, 2019 2:11 PM
To: Sloane, Brandon <bs...@tresys.com>; dev@daffodil.apache.org <de...@daffodil.apache.org>
Subject: Re: DataValue compiler bug under 2.11

So this seems a bit hacky, but doing some simple testing I think
wrapping the byte array works, e.g.:

class WrappedByteArray(val ba: Array[Byte])


object DataValue {
  ...

  type DataValueByteArray = DataValue[WrappedByteArray, ...]

  @inline
  implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
  {
    val w = new WrappedByteArray(v)
    new DataValueByteArray(w)
  }

  ...

}

class DataValue ... {

  @inline
  def getByteArray = v.asInstanceOf[WrappedByteArray].array

}

We're now effectively boxing byte arrays, but those are only used for
hexBinary data, which aren't all that common so might not be too big of
a deal. The byte code looks like this:

  public WrappedByteArray toDataValue(byte[]);
    Code:
       0: new           #20   // class WrappedByteArray
       3: dup
       4: aload_1
       5: invokespecial #23   // Method WrappedByteArray."<init>":([B)V
       8: astore_2
       9: aload_2
      10: areturn

Which is a bit more overhead, but is at least what we would actually
expect, and it works on scala 2.11.

The other option is probably to drop support for 2.11. The only use case
I know of needing 2.11 is for using Daffodil in Apache Spark, which now
has experimental support for 2.12. So it's not unreasonable to drop
support, but would be nice to maintain, IMO.

- Steve

On 12/17/19 1:40 PM, Sloane, Brandon wrote:
> That certainly looks like what I am seeing.
>
> If I am reading that issue correctly, it looks like it was fixed in 2.12.0-M1.
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <sl...@apache.org>
> *Sent:* Tuesday, December 17, 2019 1:32 PM
> *To:* dev@daffodil.apache.org <de...@daffodil.apache.org>
> *Subject:* Re: DataValue compiler bug under 2.11
> I believe this is scala bug 7521:
>
> https://github.com/scala/bug/issues/7521
>
> Skimming the comments and other referenced issues, I don't see anyone
> mentioning a workaround. Appears the fix was added in 2.12, which makes
> sense based on the behavior you're seeing.
>
> On 12/17/19 1:19 PM, Sloane, Brandon wrote:
>> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
>>
>> Recall that we have source code along the lines of:
>>
>> ...
>>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
>> ...
>>
>> Under 2.11, this gets compiled to:
>>
>> ...
>>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>>     Code:
>>        0: aload_1
>>        1: areturn
>>
>>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>>     Code:
>>        0: aload_1
>>        1: areturn
>>
>>   public byte[] toDataValue(byte[]);
>>     Code:
>>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>>        3: aload_1
>>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>>       10: areturn
>>
>>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>>     Code:
>>        0: aload_1
>>        1: areturn
>>
>>   public java.lang.Number toDataValue(java.lang.Number);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> ...
>>
>> Note that for every instant accept byte[], the function compiles to a simple identity function.
>>
>> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
>>
>>   public byte[] toDataValue(byte[]);
>>     Code:
>>        0: aload_1
>>        1: areturn
>>
>> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
>>
>> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
>>
>> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
>>
>> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
>>
>> Thoughts?
>>
>>
>> Brandon T. Sloane
>>
>> Associate, Services
>>
>> bsloane@tresys.com | tresys.com
>>
>


Re: DataValue compiler bug under 2.11

Posted by Steve Lawrence <sl...@apache.org>.
So this seems a bit hacky, but doing some simple testing I think
wrapping the byte array works, e.g.:

class WrappedByteArray(val ba: Array[Byte])


object DataValue {
  ...

  type DataValueByteArray = DataValue[WrappedByteArray, ...]

  @inline
  implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
  {
    val w = new WrappedByteArray(v)
    new DataValueByteArray(w)
  }

  ...

}

class DataValue ... {

  @inline
  def getByteArray = v.asInstanceOf[WrappedByteArray].array

}

We're now effectively boxing byte arrays, but those are only used for
hexBinary data, which aren't all that common so might not be too big of
a deal. The byte code looks like this:

  public WrappedByteArray toDataValue(byte[]);
    Code:
       0: new           #20   // class WrappedByteArray
       3: dup
       4: aload_1
       5: invokespecial #23   // Method WrappedByteArray."<init>":([B)V
       8: astore_2
       9: aload_2
      10: areturn

Which is a bit more overhead, but is at least what we would actually
expect, and it works on scala 2.11.

The other option is probably to drop support for 2.11. The only use case
I know of needing 2.11 is for using Daffodil in Apache Spark, which now
has experimental support for 2.12. So it's not unreasonable to drop
support, but would be nice to maintain, IMO.

- Steve

On 12/17/19 1:40 PM, Sloane, Brandon wrote:
> That certainly looks like what I am seeing.
> 
> If I am reading that issue correctly, it looks like it was fixed in 2.12.0-M1.
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <sl...@apache.org>
> *Sent:* Tuesday, December 17, 2019 1:32 PM
> *To:* dev@daffodil.apache.org <de...@daffodil.apache.org>
> *Subject:* Re: DataValue compiler bug under 2.11
> I believe this is scala bug 7521:
> 
> https://github.com/scala/bug/issues/7521
> 
> Skimming the comments and other referenced issues, I don't see anyone
> mentioning a workaround. Appears the fix was added in 2.12, which makes
> sense based on the behavior you're seeing.
> 
> On 12/17/19 1:19 PM, Sloane, Brandon wrote:
>> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
>> 
>> Recall that we have source code along the lines of:
>> 
>> ...
>>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
>> ...
>> 
>> Under 2.11, this gets compiled to:
>> 
>> ...
>>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> 
>>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> 
>>   public byte[] toDataValue(byte[]);
>>     Code:
>>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>>        3: aload_1
>>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>>       10: areturn
>> 
>>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> 
>>   public java.lang.Number toDataValue(java.lang.Number);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> ...
>> 
>> Note that for every instant accept byte[], the function compiles to a simple identity function.
>> 
>> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
>> 
>>   public byte[] toDataValue(byte[]);
>>     Code:
>>        0: aload_1
>>        1: areturn
>> 
>> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
>> 
>> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
>> 
>> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
>> 
>> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
>> 
>> Thoughts?
>> 
>> 
>> Brandon T. Sloane
>> 
>> Associate, Services
>> 
>> bsloane@tresys.com | tresys.com
>> 
> 


Re: DataValue compiler bug under 2.11

Posted by "Sloane, Brandon" <bs...@tresys.com>.
That certainly looks like what I am seeing.

If I am reading that issue correctly, it looks like it was fixed in 2.12.0-M1.
________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Tuesday, December 17, 2019 1:32 PM
To: dev@daffodil.apache.org <de...@daffodil.apache.org>
Subject: Re: DataValue compiler bug under 2.11

I believe this is scala bug 7521:

https://github.com/scala/bug/issues/7521

Skimming the comments and other referenced issues, I don't see anyone
mentioning a workaround. Appears the fix was added in 2.12, which makes
sense based on the behavior you're seeing.

On 12/17/19 1:19 PM, Sloane, Brandon wrote:
> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
>
> Recall that we have source code along the lines of:
>
> ...
>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
> ...
>
> Under 2.11, this gets compiled to:
>
> ...
>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>     Code:
>        0: aload_1
>        1: areturn
>
>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>     Code:
>        0: aload_1
>        1: areturn
>
>   public byte[] toDataValue(byte[]);
>     Code:
>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>        3: aload_1
>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>       10: areturn
>
>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>     Code:
>        0: aload_1
>        1: areturn
>
>   public java.lang.Number toDataValue(java.lang.Number);
>     Code:
>        0: aload_1
>        1: areturn
> ...
>
> Note that for every instant accept byte[], the function compiles to a simple identity function.
>
> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
>
>   public byte[] toDataValue(byte[]);
>     Code:
>        0: aload_1
>        1: areturn
>
> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
>
> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
>
> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
>
> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
>
> Thoughts?
>
>
> Brandon T. Sloane
>
> Associate, Services
>
> bsloane@tresys.com | tresys.com
>


Re: DataValue compiler bug under 2.11

Posted by Steve Lawrence <sl...@apache.org>.
I believe this is scala bug 7521:

https://github.com/scala/bug/issues/7521

Skimming the comments and other referenced issues, I don't see anyone
mentioning a workaround. Appears the fix was added in 2.12, which makes
sense based on the behavior you're seeing.

On 12/17/19 1:19 PM, Sloane, Brandon wrote:
> While debugging my latest DataValue PR under 2.11, I came across what I believe to be a compiler bug.
> 
> Recall that we have source code along the lines of:
> 
> ...
>   @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new DataValue(v)
>   @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray = new DataValue(v)
>   @inline implicit def toDataValue(v: JBoolean): DataValueBool = new DataValue(v)
>   @inline implicit def toDataValue(v: JNumber): DataValueNumber = new DataValue(v)
> ...
> 
> Under 2.11, this gets compiled to:
> 
> ...
>   public org.apache.daffodil.calendar.DFDLDate toDataValue(org.apache.daffodil.calendar.DFDLDate);
>     Code:
>        0: aload_1
>        1: areturn
> 
>   public org.apache.daffodil.calendar.DFDLTime toDataValue(org.apache.daffodil.calendar.DFDLTime);
>     Code:
>        0: aload_1
>        1: areturn
> 
>   public byte[] toDataValue(byte[]);
>     Code:
>        0: getstatic     #35                 // Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>        3: aload_1
>        4: invokevirtual #39                 // Method scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>        7: checkcast     #41                 // class "[Ljava/lang/Byte;"
>       10: areturn
> 
>   public java.lang.Boolean toDataValue(java.lang.Boolean);
>     Code:
>        0: aload_1
>        1: areturn
> 
>   public java.lang.Number toDataValue(java.lang.Number);
>     Code:
>        0: aload_1
>        1: areturn
> ...
> 
> Note that for every instant accept byte[], the function compiles to a simple identity function.
> 
> When compiled under 2.12, the byte[] instance gets compiled to identity as well:
> 
>   public byte[] toDataValue(byte[]);
>     Code:
>        0: aload_1
>        1: areturn
> 
> The 2.11 version is not just inefficient; it is also wrong. It first converts to an array of Objects (which implies boxing), before attempting to cast said array to an array of (boxed) Bytes. This case fails , as Object[] does not extends Byte[].
> 
> Even if this cast were to succeed, the function would then be attempting to return an array of boxed java.lang.Byte, while its signature indicates it should be returning an array of unboxed bytes.
> 
> (As an asside, at the callpoint I looked at in 2.12 optimizes this function out entirely; the 2.11 version makes a standard function call to toDataValue)
> 
> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the idea of having an AnyVal DataValue class, and we do not have much wiggle-room to tweak our source code to avoid triggering it.
> 
> Thoughts?
> 
> 
> Brandon T. Sloane
> 
> Associate, Services
> 
> bsloane@tresys.com | tresys.com
>