You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Alessandro Solimando <al...@gmail.com> on 2021/02/07 20:12:01 UTC

Discussion around CALCITE-2489 (risks linked to Class.getFields())

Hello everyone,
I had a look at CALCITE-2489
<https://issues.apache.org/jira/browse/CALCITE-2489> (Order of fields in
JavaTypeFactoryImpl#createStructType is unstable) since it was linked to
few PRs I worked on in the last months.

In a nutshell, the output of "Class.getFields()" is JVM
implementation-specific without any guarantees on the field ordering (even
if, at date, the implementation seems rather consistent), we should replace
it with a deterministic and stable ordering instead.

I have created two branches with a proposed implementation for that, one
for Calcite
<https://github.com/asolimando/calcite/tree/master-CALCITE-2489-createStructTypeUnstable>,
and one for Avatica
<https://github.com/asolimando/calcite-avatica/tree/master-CALCITE-2489-createStructTypeUnstable>,
all tests are passing (Calcite needs the changes in Avatica, though). You
can check the diff against the branch and master for both Calcite
<https://github.com/asolimando/calcite/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable>
and Avatica
<https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97>
.

The only rough edge I see with the current proposal is that ordering on the
fields in the Sql type derived from a class "C" must match exactly the
order in the constructor(s) of "C" itself.

This comes from the code generation for instantiating classes (that is,
"new Class(...)" in code generated via Janino), which relies on the order
of the fields (see NewExpression.java#L63
<https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/linq4j/src/main/java/org/apache/calcite/linq4j/tree/NewExpression.java#L63>
).

The code generation part did not change, and the same "limitation" applies
to original implementation, where the constructor signature had to match
the order of the fields declaration, and in turn this order had to be
respected by the "Class.getFields()" implementation, which apparently is
consistently the case, otherwise the existing implementation would not have
worked.
To test this, I have modified a few classes from ReflectiveSchemaTest.java,
changing the order of either the field declarations, or the constructor,
and the code generation was failing.

What's worrying me with the proposed solution is that now the ordering
logic is a bit more complicated: annotations can dictate the order
explicitly for all the classes involved in the hierarchy, or rely on the
default (from parent to descendants, alphabetical order for fields declared
within the same class). Since the user must write the constructor knowing
exactly the final order the fields will get, it can rapidly become
complicated with a rich class hierarchy, especially if no annotations are
used.

So, in summary, relying on the JVM implementation-specific behaviour of
"Class.getFields()" is risky, but on the other hand I see some potential
friction for the final user of Calcite if we adopt a safer approach. I'd
like to stress that the downside is not linked to my specific choices nor
to the  suggestions in the ticket description, since the constructor
parameters cannot be known even via reflection, there is no way we can
automatically match the constructor, the burden will always fall on the
user.

I'd like to hear your thoughts before eventually moving forward with
opening the two PRs.

Best regards,
Alessandro

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Alessandro Solimando <al...@gmail.com>.
Stamatis, Vladimir, thanks for your comments.

The empty constructor Stamatis is suggesting would probably be enough, but
what Vladimir is suggesting would also improve on the annotation front (in
particular, it's easier to annotate just the constructor rather than each
field in the whole hierarchy).

I am using Jackson from time to time but I did not know about that feature,
I will dig more and try to improve the current implementation I have. Would
you mind writing a small example of what you have in mind?

Googling the topic a bit more, I also found ConstructorParameters
<https://docs.oracle.com/javase/9/docs/api/javax/management/ConstructorParameters.html>
annotation (Java7+) which might help:

> @ConstructorParameters({"aid", "name", "birthPlace", "books"})

public Author(int aid, String name, Place birthPlace, List<Book> books) {
>       this.aid = aid;
>       this.name = name;
>       this.birthPlace = birthPlace;
>       this.books = books;
>     }


Regarding "newExpression", any test making use of a JavaRowType is affected
(e.g., ReflectiveSchemaTest.testSelectWithFieldAccessOnSecondLevelRecordType
<https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/core/src/test/java/org/apache/calcite/test/ReflectiveSchemaTest.java#L336>),
but the dependency on the field order does not come from the test itself,
but it's tied to the way expressions representing rows are built for
enumerable tables.

The aforementioned test runs the following query:

"select au.\"birthPlace\".\"coords\".\"latitude\" as lat\n"
    + "from \"bookstore\".\"authors\" au\n"

While building rows for "Authors" table (backed up by a Java object),
JavaRowFormat.record()
<https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/core/src/main/java/org/apache/calcite/adapter/enumerable/JavaRowFormat.java#L61>
is invoked, and it builds an expression of this kind, which invokes
Authors's constructor with the appropriate parameters:

new org.apache.calcite.test.BookstoreSchema.Author(
>   row.aid,
>   row.name,
>   row.birthPlace,
>   org.apache.calcite.linq4j.Linq4j.asEnumerable(row.books).select(new
> org.apache.calcite.linq4j.function.Function1() {
>     public java.util.List
> apply(org.apache.calcite.test.BookstoreSchema.Book o) {
>       return (java.util.List)
> org.apache.calcite.runtime.FlatLists.of(o.title, o.publishYear, o.pages);
>     }
>     public Object apply(Object o) {
>       return apply(
>         (org.apache.calcite.test.BookstoreSchema.Book) o);
>     }
>   }


In this case, since my "default" ordering was different from (sorting
alphabetically) the "declaration order" used by the JVM, I had to provide
the same ordering via the CalciteField annotation, in order to still match
the constructor:

public static class Author {
  @CalciteField(order = 1)
  public final int aid;
  @CalciteField(order = 2)
  public final String name;
  @CalciteField(order = 3)
  public final Place birthPlace;
  @CalciteField(order = 4)
  @org.apache.calcite.adapter.java.Array(component = Book.class)
  public final List<Book> books;

  public Author(int aid, String name, Place birthPlace, List<Book> books) {
    this.aid = aid;
    this.name = name;
    this.birthPlace = birthPlace;
    this.books = books;
  }
}

If you start from master, and you swap any two parameters in Author (either
in the field declaration, or in the constructor), all tests related to
Author will fail since the constructor invocation part won't compile due to
signature mismatch (the same apply to any other such class used in tests).

For instance, I moved "name" field as the last declared field in Author,
and below you can see that the constructor invocation expects it as last
parameter:

java.sql.SQLException: Error while executing SQL "select
> au."birthPlace"."coords"."latitude" as lat
> from "bookstore"."authors" au
> ": Error while compiling generated Java code:
> public org.apache.calcite.linq4j.Enumerable bind(final
> org.apache.calcite.DataContext root) {
>   final org.apache.calcite.linq4j.Enumerable _inputEnumerable =
> org.apache.calcite.linq4j.Linq4j.asEnumerable(((org.apache.calcite.test.BookstoreSchema)
> ((org.apache.calcite.adapter.java.ReflectiveSchema)
> root.getRootSchema().getSubSchema("bookstore").unwrap(org.apache.calcite.adapter.java.ReflectiveSchema.class)).getTarget()).authors).select(new
> org.apache.calcite.linq4j.function.Function1() {
>     public org.apache.calcite.test.BookstoreSchema.Author
> apply(org.apache.calcite.test.BookstoreSchema.Author row) {
>       return new org.apache.calcite.test.BookstoreSchema.Author(
>           row.aid,
>           row.birthPlace,
>
> org.apache.calcite.linq4j.Linq4j.asEnumerable(row.books).select(new
> org.apache.calcite.linq4j.function.Function1() {
>             public java.util.List
> apply(org.apache.calcite.test.BookstoreSchema.Book o) {
>               return org.apache.calcite.runtime.FlatLists.of(o.title,
> o.publishYear, o.pages);
>             }
>             public Object apply(Object o) {
>               return apply(
>                 (org.apache.calcite.test.BookstoreSchema.Book) o);
>             }
>           }
>           ).toList(),
>           row.name); <----- should be the second param
>     }
>     public Object apply(Object row) {
>       return apply(
>         (org.apache.calcite.test.BookstoreSchema.Author) row);
>     }
>   }
>   );

[...]
>

It's NewExpression.accept() (NewExpression.java#L63
<https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/linq4j/src/main/java/org/apache/calcite/linq4j/tree/NewExpression.java#L63>)
generating that string, which indirectly receives the parameter order
from JavaTypeFactoryImpl.createStructType(Class
type)
<https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/core/src/main/java/org/apache/calcite/jdbc/JavaTypeFactoryImpl.java#L78>,
which it was using Class.getFields(), so in the end the output of that
method has always been matching the declaration order, which is also the
order used for the constructor of all the Java objects used in Calcite
codebase (that's why in the diff you can see I had to put annotations in
all of them to keep the same expected test results).

Best regards,
Alessandro

On Tue, 9 Feb 2021 at 11:35, Vladimir Sitnikov <si...@gmail.com>
wrote:

> Alessandro, thanks for pushing this.
>
> Frankly speaking, I don't like @CalciteField(order=42) approach. It looks
> like annotating the constructor parameters makes more sense, and it is
> more flexible.
> A similar case is covered in Jackson with "parameter names" module:
>
> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
> Just in case, Java8+ can include parameter names to the reflection info in
> case user supplies `-parameters` option for the compiler.
>
> >NewExpression
>
> It might indeed be a too low-level API for that.
> Could you please highlight which test relies on the order of constructor
> arguments?
>
> By the way, SQL structs rely on the order of the fields rather than field
> names.
> So it might be ok to rely on the order of the constructor arguments.
>
> Vladimir
>

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Alessandro Solimando <al...@gmail.com>.
Hello everyone,
I have finally managed to brush up the implementation I did several months
ago and opened a PR: https://github.com/apache/calcite/pull/2488

I have followed the main ideas from this discussion, so I think the overall
structure is sound already, but the devil lies in the details.

Looking forward to hearing the comments from the community!

Best regards,
Alessandro


On Thu, 18 Feb 2021 at 09:11, Alessandro Solimando <
alessandro.solimando@gmail.com> wrote:

> Hello all,
> Thanks Francis for your message.
>
> I have filed CALCITE-4503
> <https://issues.apache.org/jira/browse/CALCITE-4503> as suggested by
> Julian. I have filled "fix version" as 1.8.0 since we target this
> release for the PR, I hope it's OK.
>
> For the open PR I have pushed a new commit with a commit message
> associated to the new JIRA case, I haven't amended the original commit to
> avoid problems with the on-going review, this can be safely done during the
> final "squashing" I think.
>
> Best regards,
> Alessandro
>
> On Thu, 18 Feb 2021 at 01:05, Julian Hyde <jh...@apache.org> wrote:
>
>> Yes, it should go into 1.18.
>>
>> I think that a new JIRA case needs to be logged for this change, so
>> that it is clearly Avatica-specific. This change relates to order of
>> fields in Meta with respect to Jackson, but it has nothing to do with
>> JavaTypeFactory (which is a Calcite concept, not an Avatica concept).
>>
>>
>> On Wed, Feb 17, 2021 at 3:08 PM Francis Chuang <fr...@apache.org>
>> wrote:
>> >
>> > Hey Alessandro + all,
>> >
>> > Do you think it would be possible to get this into the next Avatica
>> > release (1.18.0)? I note that that PR #138 also has a corresponding PR
>> > for Calcite, which would require this PR to go into Avatica if we want
>> > the corresponding PR for Calcite to go into the next Calcite release.
>> >
>> > Francis
>> >
>> > On 16/02/2021 10:15 am, Alessandro Solimando wrote:
>> > > Hi again,
>> > > sorry I have hit "send" too soon, here is the PR:
>> > > https://github.com/apache/calcite-avatica/pull/138
>> > >
>> > > Best regards,
>> > > Alessandro
>> > >
>> > > On Mon, 15 Feb 2021 at 23:46, Alessandro Solimando <
>> > > alessandro.solimando@gmail.com> wrote:
>> > >
>> > >> Hi Julian,
>> > >> sorry for the late reply.
>> > >>
>> > >> It's true that with third-party classes annotations are not viable,
>> and
>> > >> that's an important use case.
>> > >>
>> > >> Regarding the test case and the user experience, I like the idea of
>> making
>> > >> the strategy configurable, I'd capture the strategies you have
>> listed via
>> > >> an enum as follows.
>> > >>
>> > >> public enum FIELDS_ORDERING {
>> > >>      JVM,
>> > >>      ALPHABETICAL,
>> > >>      ALPHABETICAL_AND_HIERARCHY,
>> > >>      EXPLICIT,
>> > >>      EXPLICIT_TOLERANT,
>> > >>      ANNOTATIONS
>> > >>    }
>> > >>
>> > >> where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory
>> parameter
>> > >> that is the list of fields.
>> > >> EXPLICIT fails if the list does not fully cover the public fields of
>> the
>> > >> class, EXPLICIT_TOLERANT does not, and use to incomplete list to
>> "project"
>> > >> (i.e., all and only fields appearing in the list will be present in
>> the
>> > >> derived SQL type).
>> > >>
>> > >> An example of the schema.json:
>> > >> schemas: [
>> > >>          {
>> > >>            name: "author",
>> > >>            type: "custom",
>> > >>            factory:
>> > >> "org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
>> > >>            operand: {
>> > >>              class: "org.apache.calcite.test.Author",
>> > >>              orderingStrategy: "EXPLICIT_TOLERANT",
>> > >>              fields: [“aid”, “name”, “birthplace”]
>> > >>            }
>> > >>          }
>> > >>        ]
>> > >>
>> > >> Regarding ANNOTATION, I have tried the "-parameters" build option
>> > >> suggested by Vladimir, and it works well.
>> > >>
>> > >> In case the parameter names are not available (i.e., the target
>> class was
>> > >> not compiled with the sought parameter) I suggest we fail the
>> conversion
>> > >> (this is an advanced use-case, the user should know about how the
>> classes
>> > >> are compiled).
>> > >>
>> > >> In case of multiple constructors for the target class, we could
>> either:
>> > >> 1) pick the (valid) constructor having the biggest overlap with the
>> class
>> > >> public fields, or
>> > >> 2) pick the first (valid) constructor
>> > >>
>> > >> With "valid" constructor I mean a constructor where each parameter
>> > >> matching a public field, has a type assignable to that of the
>> homonym field.
>> > >>
>> > >> Of course there are some details and corner cases left to handle, but
>> > >> that's probably for the PR itself.
>> > >>
>> > >> As I said in the first email, some changes are required on the
>> Avatica
>> > >> side to completely remove Class.getFields() and
>> Class.getDeclaredFields().
>> > >> Such changes are independent from the strategy we will adopt on
>> Calcite,
>> > >> they just allow us to "pass" the field ordering to the JDBC
>> enumerator and
>> > >> other classes which need to be aware of the field order.
>> > >> Despite them being minimal (you can check here
>> > >> <
>> https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97
>> >)
>> > >> I am afraid it's late for 1.18.0 (just saw Francis' email), I have
>> > >> nonetheless opened a PR, just in case it's doable:
>> > >>
>> > >> Best regards,
>> > >> Alessandro
>> > >>
>> > >> On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com>
>> wrote:
>> > >>
>> > >>> A very important use case is when you are basing a table on
>> third-party
>> > >>> class and you are not able to add annotations. People using these
>> classes
>> > >>> should be able to tell the reflective adapter (that’s not its real
>> name,
>> > >>> but you know what I mean) how to derive a stable ordering for the
>> fields.
>> > >>>
>> > >>> There are several possible strategies:
>> > >>>
>> > >>> Use the natural order from the JVM. (Yes, let people do dumb
>> things.)
>> > >>> Order fields alphabetically
>> > >>> Order fields alphabetically, putting inherited fields (from base
>> classes)
>> > >>> first
>> > >>> Order using a list of field names, e.g. [“aid”, “name”,
>> “birthplace”,
>> > >>> “books”] and a boolean saying whether to fail if the list is not
>> exhaustive.
>> > >>> Look for annotations
>> > >>>
>> > >>> These strategies would be in the schema.json file (or however the
>> schema
>> > >>> is configured). In my opinion, that is preferable to adding
>> annotations.
>> > >>>
>> > >>> I’d approach this by designing the user’s experience. That is,
>> write the
>> > >>> test case for the adapter. Once we have agreed the specification,
>> the
>> > >>> implementation is straightforward.
>> > >>>
>> > >>> Julian
>> > >>>
>> > >>>
>> > >>>> On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
>> > >>> sitnikov.vladimir@gmail.com> wrote:
>> > >>>>
>> > >>>> Alessandro, thanks for pushing this.
>> > >>>>
>> > >>>> Frankly speaking, I don't like @CalciteField(order=42) approach. It
>> > >>> looks
>> > >>>> like annotating the constructor parameters makes more sense, and
>> it is
>> > >>>> more flexible.
>> > >>>> A similar case is covered in Jackson with "parameter names" module:
>> > >>>>
>> > >>>
>> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
>> > >>>> Just in case, Java8+ can include parameter names to the reflection
>> info
>> > >>> in
>> > >>>> case user supplies `-parameters` option for the compiler.
>> > >>>>
>> > >>>>> NewExpression
>> > >>>>
>> > >>>> It might indeed be a too low-level API for that.
>> > >>>> Could you please highlight which test relies on the order of
>> constructor
>> > >>>> arguments?
>> > >>>>
>> > >>>> By the way, SQL structs rely on the order of the fields rather than
>> > >>> field
>> > >>>> names.
>> > >>>> So it might be ok to rely on the order of the constructor
>> arguments.
>> > >>>>
>> > >>>> Vladimir
>> > >>>
>> > >>>
>> > >
>>
>

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Alessandro Solimando <al...@gmail.com>.
Hello all,
Thanks Francis for your message.

I have filed CALCITE-4503
<https://issues.apache.org/jira/browse/CALCITE-4503> as suggested by
Julian. I have filled "fix version" as 1.8.0 since we target this
release for the PR, I hope it's OK.

For the open PR I have pushed a new commit with a commit message associated
to the new JIRA case, I haven't amended the original commit to avoid
problems with the on-going review, this can be safely done during the final
"squashing" I think.

Best regards,
Alessandro

On Thu, 18 Feb 2021 at 01:05, Julian Hyde <jh...@apache.org> wrote:

> Yes, it should go into 1.18.
>
> I think that a new JIRA case needs to be logged for this change, so
> that it is clearly Avatica-specific. This change relates to order of
> fields in Meta with respect to Jackson, but it has nothing to do with
> JavaTypeFactory (which is a Calcite concept, not an Avatica concept).
>
>
> On Wed, Feb 17, 2021 at 3:08 PM Francis Chuang <fr...@apache.org>
> wrote:
> >
> > Hey Alessandro + all,
> >
> > Do you think it would be possible to get this into the next Avatica
> > release (1.18.0)? I note that that PR #138 also has a corresponding PR
> > for Calcite, which would require this PR to go into Avatica if we want
> > the corresponding PR for Calcite to go into the next Calcite release.
> >
> > Francis
> >
> > On 16/02/2021 10:15 am, Alessandro Solimando wrote:
> > > Hi again,
> > > sorry I have hit "send" too soon, here is the PR:
> > > https://github.com/apache/calcite-avatica/pull/138
> > >
> > > Best regards,
> > > Alessandro
> > >
> > > On Mon, 15 Feb 2021 at 23:46, Alessandro Solimando <
> > > alessandro.solimando@gmail.com> wrote:
> > >
> > >> Hi Julian,
> > >> sorry for the late reply.
> > >>
> > >> It's true that with third-party classes annotations are not viable,
> and
> > >> that's an important use case.
> > >>
> > >> Regarding the test case and the user experience, I like the idea of
> making
> > >> the strategy configurable, I'd capture the strategies you have listed
> via
> > >> an enum as follows.
> > >>
> > >> public enum FIELDS_ORDERING {
> > >>      JVM,
> > >>      ALPHABETICAL,
> > >>      ALPHABETICAL_AND_HIERARCHY,
> > >>      EXPLICIT,
> > >>      EXPLICIT_TOLERANT,
> > >>      ANNOTATIONS
> > >>    }
> > >>
> > >> where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory
> parameter
> > >> that is the list of fields.
> > >> EXPLICIT fails if the list does not fully cover the public fields of
> the
> > >> class, EXPLICIT_TOLERANT does not, and use to incomplete list to
> "project"
> > >> (i.e., all and only fields appearing in the list will be present in
> the
> > >> derived SQL type).
> > >>
> > >> An example of the schema.json:
> > >> schemas: [
> > >>          {
> > >>            name: "author",
> > >>            type: "custom",
> > >>            factory:
> > >> "org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
> > >>            operand: {
> > >>              class: "org.apache.calcite.test.Author",
> > >>              orderingStrategy: "EXPLICIT_TOLERANT",
> > >>              fields: [“aid”, “name”, “birthplace”]
> > >>            }
> > >>          }
> > >>        ]
> > >>
> > >> Regarding ANNOTATION, I have tried the "-parameters" build option
> > >> suggested by Vladimir, and it works well.
> > >>
> > >> In case the parameter names are not available (i.e., the target class
> was
> > >> not compiled with the sought parameter) I suggest we fail the
> conversion
> > >> (this is an advanced use-case, the user should know about how the
> classes
> > >> are compiled).
> > >>
> > >> In case of multiple constructors for the target class, we could
> either:
> > >> 1) pick the (valid) constructor having the biggest overlap with the
> class
> > >> public fields, or
> > >> 2) pick the first (valid) constructor
> > >>
> > >> With "valid" constructor I mean a constructor where each parameter
> > >> matching a public field, has a type assignable to that of the homonym
> field.
> > >>
> > >> Of course there are some details and corner cases left to handle, but
> > >> that's probably for the PR itself.
> > >>
> > >> As I said in the first email, some changes are required on the Avatica
> > >> side to completely remove Class.getFields() and
> Class.getDeclaredFields().
> > >> Such changes are independent from the strategy we will adopt on
> Calcite,
> > >> they just allow us to "pass" the field ordering to the JDBC
> enumerator and
> > >> other classes which need to be aware of the field order.
> > >> Despite them being minimal (you can check here
> > >> <
> https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97
> >)
> > >> I am afraid it's late for 1.18.0 (just saw Francis' email), I have
> > >> nonetheless opened a PR, just in case it's doable:
> > >>
> > >> Best regards,
> > >> Alessandro
> > >>
> > >> On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com>
> wrote:
> > >>
> > >>> A very important use case is when you are basing a table on
> third-party
> > >>> class and you are not able to add annotations. People using these
> classes
> > >>> should be able to tell the reflective adapter (that’s not its real
> name,
> > >>> but you know what I mean) how to derive a stable ordering for the
> fields.
> > >>>
> > >>> There are several possible strategies:
> > >>>
> > >>> Use the natural order from the JVM. (Yes, let people do dumb things.)
> > >>> Order fields alphabetically
> > >>> Order fields alphabetically, putting inherited fields (from base
> classes)
> > >>> first
> > >>> Order using a list of field names, e.g. [“aid”, “name”, “birthplace”,
> > >>> “books”] and a boolean saying whether to fail if the list is not
> exhaustive.
> > >>> Look for annotations
> > >>>
> > >>> These strategies would be in the schema.json file (or however the
> schema
> > >>> is configured). In my opinion, that is preferable to adding
> annotations.
> > >>>
> > >>> I’d approach this by designing the user’s experience. That is, write
> the
> > >>> test case for the adapter. Once we have agreed the specification, the
> > >>> implementation is straightforward.
> > >>>
> > >>> Julian
> > >>>
> > >>>
> > >>>> On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
> > >>> sitnikov.vladimir@gmail.com> wrote:
> > >>>>
> > >>>> Alessandro, thanks for pushing this.
> > >>>>
> > >>>> Frankly speaking, I don't like @CalciteField(order=42) approach. It
> > >>> looks
> > >>>> like annotating the constructor parameters makes more sense, and it
> is
> > >>>> more flexible.
> > >>>> A similar case is covered in Jackson with "parameter names" module:
> > >>>>
> > >>>
> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
> > >>>> Just in case, Java8+ can include parameter names to the reflection
> info
> > >>> in
> > >>>> case user supplies `-parameters` option for the compiler.
> > >>>>
> > >>>>> NewExpression
> > >>>>
> > >>>> It might indeed be a too low-level API for that.
> > >>>> Could you please highlight which test relies on the order of
> constructor
> > >>>> arguments?
> > >>>>
> > >>>> By the way, SQL structs rely on the order of the fields rather than
> > >>> field
> > >>>> names.
> > >>>> So it might be ok to rely on the order of the constructor arguments.
> > >>>>
> > >>>> Vladimir
> > >>>
> > >>>
> > >
>

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Julian Hyde <jh...@apache.org>.
Yes, it should go into 1.18.

I think that a new JIRA case needs to be logged for this change, so
that it is clearly Avatica-specific. This change relates to order of
fields in Meta with respect to Jackson, but it has nothing to do with
JavaTypeFactory (which is a Calcite concept, not an Avatica concept).


On Wed, Feb 17, 2021 at 3:08 PM Francis Chuang <fr...@apache.org> wrote:
>
> Hey Alessandro + all,
>
> Do you think it would be possible to get this into the next Avatica
> release (1.18.0)? I note that that PR #138 also has a corresponding PR
> for Calcite, which would require this PR to go into Avatica if we want
> the corresponding PR for Calcite to go into the next Calcite release.
>
> Francis
>
> On 16/02/2021 10:15 am, Alessandro Solimando wrote:
> > Hi again,
> > sorry I have hit "send" too soon, here is the PR:
> > https://github.com/apache/calcite-avatica/pull/138
> >
> > Best regards,
> > Alessandro
> >
> > On Mon, 15 Feb 2021 at 23:46, Alessandro Solimando <
> > alessandro.solimando@gmail.com> wrote:
> >
> >> Hi Julian,
> >> sorry for the late reply.
> >>
> >> It's true that with third-party classes annotations are not viable, and
> >> that's an important use case.
> >>
> >> Regarding the test case and the user experience, I like the idea of making
> >> the strategy configurable, I'd capture the strategies you have listed via
> >> an enum as follows.
> >>
> >> public enum FIELDS_ORDERING {
> >>      JVM,
> >>      ALPHABETICAL,
> >>      ALPHABETICAL_AND_HIERARCHY,
> >>      EXPLICIT,
> >>      EXPLICIT_TOLERANT,
> >>      ANNOTATIONS
> >>    }
> >>
> >> where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory parameter
> >> that is the list of fields.
> >> EXPLICIT fails if the list does not fully cover the public fields of the
> >> class, EXPLICIT_TOLERANT does not, and use to incomplete list to "project"
> >> (i.e., all and only fields appearing in the list will be present in the
> >> derived SQL type).
> >>
> >> An example of the schema.json:
> >> schemas: [
> >>          {
> >>            name: "author",
> >>            type: "custom",
> >>            factory:
> >> "org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
> >>            operand: {
> >>              class: "org.apache.calcite.test.Author",
> >>              orderingStrategy: "EXPLICIT_TOLERANT",
> >>              fields: [“aid”, “name”, “birthplace”]
> >>            }
> >>          }
> >>        ]
> >>
> >> Regarding ANNOTATION, I have tried the "-parameters" build option
> >> suggested by Vladimir, and it works well.
> >>
> >> In case the parameter names are not available (i.e., the target class was
> >> not compiled with the sought parameter) I suggest we fail the conversion
> >> (this is an advanced use-case, the user should know about how the classes
> >> are compiled).
> >>
> >> In case of multiple constructors for the target class, we could either:
> >> 1) pick the (valid) constructor having the biggest overlap with the class
> >> public fields, or
> >> 2) pick the first (valid) constructor
> >>
> >> With "valid" constructor I mean a constructor where each parameter
> >> matching a public field, has a type assignable to that of the homonym field.
> >>
> >> Of course there are some details and corner cases left to handle, but
> >> that's probably for the PR itself.
> >>
> >> As I said in the first email, some changes are required on the Avatica
> >> side to completely remove Class.getFields() and Class.getDeclaredFields().
> >> Such changes are independent from the strategy we will adopt on Calcite,
> >> they just allow us to "pass" the field ordering to the JDBC enumerator and
> >> other classes which need to be aware of the field order.
> >> Despite them being minimal (you can check here
> >> <https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97>)
> >> I am afraid it's late for 1.18.0 (just saw Francis' email), I have
> >> nonetheless opened a PR, just in case it's doable:
> >>
> >> Best regards,
> >> Alessandro
> >>
> >> On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com> wrote:
> >>
> >>> A very important use case is when you are basing a table on third-party
> >>> class and you are not able to add annotations. People using these classes
> >>> should be able to tell the reflective adapter (that’s not its real name,
> >>> but you know what I mean) how to derive a stable ordering for the fields.
> >>>
> >>> There are several possible strategies:
> >>>
> >>> Use the natural order from the JVM. (Yes, let people do dumb things.)
> >>> Order fields alphabetically
> >>> Order fields alphabetically, putting inherited fields (from base classes)
> >>> first
> >>> Order using a list of field names, e.g. [“aid”, “name”, “birthplace”,
> >>> “books”] and a boolean saying whether to fail if the list is not exhaustive.
> >>> Look for annotations
> >>>
> >>> These strategies would be in the schema.json file (or however the schema
> >>> is configured). In my opinion, that is preferable to adding annotations.
> >>>
> >>> I’d approach this by designing the user’s experience. That is, write the
> >>> test case for the adapter. Once we have agreed the specification, the
> >>> implementation is straightforward.
> >>>
> >>> Julian
> >>>
> >>>
> >>>> On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
> >>> sitnikov.vladimir@gmail.com> wrote:
> >>>>
> >>>> Alessandro, thanks for pushing this.
> >>>>
> >>>> Frankly speaking, I don't like @CalciteField(order=42) approach. It
> >>> looks
> >>>> like annotating the constructor parameters makes more sense, and it is
> >>>> more flexible.
> >>>> A similar case is covered in Jackson with "parameter names" module:
> >>>>
> >>> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
> >>>> Just in case, Java8+ can include parameter names to the reflection info
> >>> in
> >>>> case user supplies `-parameters` option for the compiler.
> >>>>
> >>>>> NewExpression
> >>>>
> >>>> It might indeed be a too low-level API for that.
> >>>> Could you please highlight which test relies on the order of constructor
> >>>> arguments?
> >>>>
> >>>> By the way, SQL structs rely on the order of the fields rather than
> >>> field
> >>>> names.
> >>>> So it might be ok to rely on the order of the constructor arguments.
> >>>>
> >>>> Vladimir
> >>>
> >>>
> >

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Francis Chuang <fr...@apache.org>.
Hey Alessandro + all,

Do you think it would be possible to get this into the next Avatica 
release (1.18.0)? I note that that PR #138 also has a corresponding PR 
for Calcite, which would require this PR to go into Avatica if we want 
the corresponding PR for Calcite to go into the next Calcite release.

Francis

On 16/02/2021 10:15 am, Alessandro Solimando wrote:
> Hi again,
> sorry I have hit "send" too soon, here is the PR:
> https://github.com/apache/calcite-avatica/pull/138
> 
> Best regards,
> Alessandro
> 
> On Mon, 15 Feb 2021 at 23:46, Alessandro Solimando <
> alessandro.solimando@gmail.com> wrote:
> 
>> Hi Julian,
>> sorry for the late reply.
>>
>> It's true that with third-party classes annotations are not viable, and
>> that's an important use case.
>>
>> Regarding the test case and the user experience, I like the idea of making
>> the strategy configurable, I'd capture the strategies you have listed via
>> an enum as follows.
>>
>> public enum FIELDS_ORDERING {
>>      JVM,
>>      ALPHABETICAL,
>>      ALPHABETICAL_AND_HIERARCHY,
>>      EXPLICIT,
>>      EXPLICIT_TOLERANT,
>>      ANNOTATIONS
>>    }
>>
>> where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory parameter
>> that is the list of fields.
>> EXPLICIT fails if the list does not fully cover the public fields of the
>> class, EXPLICIT_TOLERANT does not, and use to incomplete list to "project"
>> (i.e., all and only fields appearing in the list will be present in the
>> derived SQL type).
>>
>> An example of the schema.json:
>> schemas: [
>>          {
>>            name: "author",
>>            type: "custom",
>>            factory:
>> "org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
>>            operand: {
>>              class: "org.apache.calcite.test.Author",
>>              orderingStrategy: "EXPLICIT_TOLERANT",
>>              fields: [“aid”, “name”, “birthplace”]
>>            }
>>          }
>>        ]
>>
>> Regarding ANNOTATION, I have tried the "-parameters" build option
>> suggested by Vladimir, and it works well.
>>
>> In case the parameter names are not available (i.e., the target class was
>> not compiled with the sought parameter) I suggest we fail the conversion
>> (this is an advanced use-case, the user should know about how the classes
>> are compiled).
>>
>> In case of multiple constructors for the target class, we could either:
>> 1) pick the (valid) constructor having the biggest overlap with the class
>> public fields, or
>> 2) pick the first (valid) constructor
>>
>> With "valid" constructor I mean a constructor where each parameter
>> matching a public field, has a type assignable to that of the homonym field.
>>
>> Of course there are some details and corner cases left to handle, but
>> that's probably for the PR itself.
>>
>> As I said in the first email, some changes are required on the Avatica
>> side to completely remove Class.getFields() and Class.getDeclaredFields().
>> Such changes are independent from the strategy we will adopt on Calcite,
>> they just allow us to "pass" the field ordering to the JDBC enumerator and
>> other classes which need to be aware of the field order.
>> Despite them being minimal (you can check here
>> <https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97>)
>> I am afraid it's late for 1.18.0 (just saw Francis' email), I have
>> nonetheless opened a PR, just in case it's doable:
>>
>> Best regards,
>> Alessandro
>>
>> On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com> wrote:
>>
>>> A very important use case is when you are basing a table on third-party
>>> class and you are not able to add annotations. People using these classes
>>> should be able to tell the reflective adapter (that’s not its real name,
>>> but you know what I mean) how to derive a stable ordering for the fields.
>>>
>>> There are several possible strategies:
>>>
>>> Use the natural order from the JVM. (Yes, let people do dumb things.)
>>> Order fields alphabetically
>>> Order fields alphabetically, putting inherited fields (from base classes)
>>> first
>>> Order using a list of field names, e.g. [“aid”, “name”, “birthplace”,
>>> “books”] and a boolean saying whether to fail if the list is not exhaustive.
>>> Look for annotations
>>>
>>> These strategies would be in the schema.json file (or however the schema
>>> is configured). In my opinion, that is preferable to adding annotations.
>>>
>>> I’d approach this by designing the user’s experience. That is, write the
>>> test case for the adapter. Once we have agreed the specification, the
>>> implementation is straightforward.
>>>
>>> Julian
>>>
>>>
>>>> On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
>>> sitnikov.vladimir@gmail.com> wrote:
>>>>
>>>> Alessandro, thanks for pushing this.
>>>>
>>>> Frankly speaking, I don't like @CalciteField(order=42) approach. It
>>> looks
>>>> like annotating the constructor parameters makes more sense, and it is
>>>> more flexible.
>>>> A similar case is covered in Jackson with "parameter names" module:
>>>>
>>> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
>>>> Just in case, Java8+ can include parameter names to the reflection info
>>> in
>>>> case user supplies `-parameters` option for the compiler.
>>>>
>>>>> NewExpression
>>>>
>>>> It might indeed be a too low-level API for that.
>>>> Could you please highlight which test relies on the order of constructor
>>>> arguments?
>>>>
>>>> By the way, SQL structs rely on the order of the fields rather than
>>> field
>>>> names.
>>>> So it might be ok to rely on the order of the constructor arguments.
>>>>
>>>> Vladimir
>>>
>>>
> 

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Alessandro Solimando <al...@gmail.com>.
Hi again,
sorry I have hit "send" too soon, here is the PR:
https://github.com/apache/calcite-avatica/pull/138

Best regards,
Alessandro

On Mon, 15 Feb 2021 at 23:46, Alessandro Solimando <
alessandro.solimando@gmail.com> wrote:

> Hi Julian,
> sorry for the late reply.
>
> It's true that with third-party classes annotations are not viable, and
> that's an important use case.
>
> Regarding the test case and the user experience, I like the idea of making
> the strategy configurable, I'd capture the strategies you have listed via
> an enum as follows.
>
> public enum FIELDS_ORDERING {
>     JVM,
>     ALPHABETICAL,
>     ALPHABETICAL_AND_HIERARCHY,
>     EXPLICIT,
>     EXPLICIT_TOLERANT,
>     ANNOTATIONS
>   }
>
> where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory parameter
> that is the list of fields.
> EXPLICIT fails if the list does not fully cover the public fields of the
> class, EXPLICIT_TOLERANT does not, and use to incomplete list to "project"
> (i.e., all and only fields appearing in the list will be present in the
> derived SQL type).
>
> An example of the schema.json:
> schemas: [
>         {
>           name: "author",
>           type: "custom",
>           factory:
> "org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
>           operand: {
>             class: "org.apache.calcite.test.Author",
>             orderingStrategy: "EXPLICIT_TOLERANT",
>             fields: [“aid”, “name”, “birthplace”]
>           }
>         }
>       ]
>
> Regarding ANNOTATION, I have tried the "-parameters" build option
> suggested by Vladimir, and it works well.
>
> In case the parameter names are not available (i.e., the target class was
> not compiled with the sought parameter) I suggest we fail the conversion
> (this is an advanced use-case, the user should know about how the classes
> are compiled).
>
> In case of multiple constructors for the target class, we could either:
> 1) pick the (valid) constructor having the biggest overlap with the class
> public fields, or
> 2) pick the first (valid) constructor
>
> With "valid" constructor I mean a constructor where each parameter
> matching a public field, has a type assignable to that of the homonym field.
>
> Of course there are some details and corner cases left to handle, but
> that's probably for the PR itself.
>
> As I said in the first email, some changes are required on the Avatica
> side to completely remove Class.getFields() and Class.getDeclaredFields().
> Such changes are independent from the strategy we will adopt on Calcite,
> they just allow us to "pass" the field ordering to the JDBC enumerator and
> other classes which need to be aware of the field order.
> Despite them being minimal (you can check here
> <https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97>)
> I am afraid it's late for 1.18.0 (just saw Francis' email), I have
> nonetheless opened a PR, just in case it's doable:
>
> Best regards,
> Alessandro
>
> On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com> wrote:
>
>> A very important use case is when you are basing a table on third-party
>> class and you are not able to add annotations. People using these classes
>> should be able to tell the reflective adapter (that’s not its real name,
>> but you know what I mean) how to derive a stable ordering for the fields.
>>
>> There are several possible strategies:
>>
>> Use the natural order from the JVM. (Yes, let people do dumb things.)
>> Order fields alphabetically
>> Order fields alphabetically, putting inherited fields (from base classes)
>> first
>> Order using a list of field names, e.g. [“aid”, “name”, “birthplace”,
>> “books”] and a boolean saying whether to fail if the list is not exhaustive.
>> Look for annotations
>>
>> These strategies would be in the schema.json file (or however the schema
>> is configured). In my opinion, that is preferable to adding annotations.
>>
>> I’d approach this by designing the user’s experience. That is, write the
>> test case for the adapter. Once we have agreed the specification, the
>> implementation is straightforward.
>>
>> Julian
>>
>>
>> > On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
>> sitnikov.vladimir@gmail.com> wrote:
>> >
>> > Alessandro, thanks for pushing this.
>> >
>> > Frankly speaking, I don't like @CalciteField(order=42) approach. It
>> looks
>> > like annotating the constructor parameters makes more sense, and it is
>> > more flexible.
>> > A similar case is covered in Jackson with "parameter names" module:
>> >
>> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
>> > Just in case, Java8+ can include parameter names to the reflection info
>> in
>> > case user supplies `-parameters` option for the compiler.
>> >
>> >> NewExpression
>> >
>> > It might indeed be a too low-level API for that.
>> > Could you please highlight which test relies on the order of constructor
>> > arguments?
>> >
>> > By the way, SQL structs rely on the order of the fields rather than
>> field
>> > names.
>> > So it might be ok to rely on the order of the constructor arguments.
>> >
>> > Vladimir
>>
>>

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Alessandro Solimando <al...@gmail.com>.
Hi Julian,
sorry for the late reply.

It's true that with third-party classes annotations are not viable, and
that's an important use case.

Regarding the test case and the user experience, I like the idea of making
the strategy configurable, I'd capture the strategies you have listed via
an enum as follows.

public enum FIELDS_ORDERING {
    JVM,
    ALPHABETICAL,
    ALPHABETICAL_AND_HIERARCHY,
    EXPLICIT,
    EXPLICIT_TOLERANT,
    ANNOTATIONS
  }

where EXPLICIT and EXPLICIT_TOLERANT expects a second mandatory parameter
that is the list of fields.
EXPLICIT fails if the list does not fully cover the public fields of the
class, EXPLICIT_TOLERANT does not, and use to incomplete list to "project"
(i.e., all and only fields appearing in the list will be present in the
derived SQL type).

An example of the schema.json:
schemas: [
        {
          name: "author",
          type: "custom",
          factory:
"org.apache.calcite.adapter.java.ReflectiveSchema$Factory",
          operand: {
            class: "org.apache.calcite.test.Author",
            orderingStrategy: "EXPLICIT_TOLERANT",
            fields: [“aid”, “name”, “birthplace”]
          }
        }
      ]

Regarding ANNOTATION, I have tried the "-parameters" build option suggested
by Vladimir, and it works well.

In case the parameter names are not available (i.e., the target class was
not compiled with the sought parameter) I suggest we fail the conversion
(this is an advanced use-case, the user should know about how the classes
are compiled).

In case of multiple constructors for the target class, we could either:
1) pick the (valid) constructor having the biggest overlap with the class
public fields, or
2) pick the first (valid) constructor

With "valid" constructor I mean a constructor where each parameter matching
a public field, has a type assignable to that of the homonym field.

Of course there are some details and corner cases left to handle, but
that's probably for the PR itself.

As I said in the first email, some changes are required on the Avatica side
to completely remove Class.getFields() and Class.getDeclaredFields().
Such changes are independent from the strategy we will adopt on Calcite,
they just allow us to "pass" the field ordering to the JDBC enumerator and
other classes which need to be aware of the field order.
Despite them being minimal (you can check here
<https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97>)
I am afraid it's late for 1.18.0 (just saw Francis' email), I have
nonetheless opened a PR, just in case it's doable:

Best regards,
Alessandro

On Tue, 9 Feb 2021 at 23:02, Julian Hyde <jh...@gmail.com> wrote:

> A very important use case is when you are basing a table on third-party
> class and you are not able to add annotations. People using these classes
> should be able to tell the reflective adapter (that’s not its real name,
> but you know what I mean) how to derive a stable ordering for the fields.
>
> There are several possible strategies:
>
> Use the natural order from the JVM. (Yes, let people do dumb things.)
> Order fields alphabetically
> Order fields alphabetically, putting inherited fields (from base classes)
> first
> Order using a list of field names, e.g. [“aid”, “name”, “birthplace”,
> “books”] and a boolean saying whether to fail if the list is not exhaustive.
> Look for annotations
>
> These strategies would be in the schema.json file (or however the schema
> is configured). In my opinion, that is preferable to adding annotations.
>
> I’d approach this by designing the user’s experience. That is, write the
> test case for the adapter. Once we have agreed the specification, the
> implementation is straightforward.
>
> Julian
>
>
> > On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <
> sitnikov.vladimir@gmail.com> wrote:
> >
> > Alessandro, thanks for pushing this.
> >
> > Frankly speaking, I don't like @CalciteField(order=42) approach. It looks
> > like annotating the constructor parameters makes more sense, and it is
> > more flexible.
> > A similar case is covered in Jackson with "parameter names" module:
> >
> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
> > Just in case, Java8+ can include parameter names to the reflection info
> in
> > case user supplies `-parameters` option for the compiler.
> >
> >> NewExpression
> >
> > It might indeed be a too low-level API for that.
> > Could you please highlight which test relies on the order of constructor
> > arguments?
> >
> > By the way, SQL structs rely on the order of the fields rather than field
> > names.
> > So it might be ok to rely on the order of the constructor arguments.
> >
> > Vladimir
>
>

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Julian Hyde <jh...@gmail.com>.
A very important use case is when you are basing a table on third-party class and you are not able to add annotations. People using these classes should be able to tell the reflective adapter (that’s not its real name, but you know what I mean) how to derive a stable ordering for the fields.

There are several possible strategies:

Use the natural order from the JVM. (Yes, let people do dumb things.)
Order fields alphabetically 
Order fields alphabetically, putting inherited fields (from base classes) first
Order using a list of field names, e.g. [“aid”, “name”, “birthplace”, “books”] and a boolean saying whether to fail if the list is not exhaustive.
Look for annotations 

These strategies would be in the schema.json file (or however the schema is configured). In my opinion, that is preferable to adding annotations.

I’d approach this by designing the user’s experience. That is, write the test case for the adapter. Once we have agreed the specification, the implementation is straightforward.

Julian


> On Feb 9, 2021, at 2:35 AM, Vladimir Sitnikov <si...@gmail.com> wrote:
> 
> Alessandro, thanks for pushing this.
> 
> Frankly speaking, I don't like @CalciteField(order=42) approach. It looks
> like annotating the constructor parameters makes more sense, and it is
> more flexible.
> A similar case is covered in Jackson with "parameter names" module:
> https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
> Just in case, Java8+ can include parameter names to the reflection info in
> case user supplies `-parameters` option for the compiler.
> 
>> NewExpression
> 
> It might indeed be a too low-level API for that.
> Could you please highlight which test relies on the order of constructor
> arguments?
> 
> By the way, SQL structs rely on the order of the fields rather than field
> names.
> So it might be ok to rely on the order of the constructor arguments.
> 
> Vladimir


Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Vladimir Sitnikov <si...@gmail.com>.
Alessandro, thanks for pushing this.

Frankly speaking, I don't like @CalciteField(order=42) approach. It looks
like annotating the constructor parameters makes more sense, and it is
more flexible.
A similar case is covered in Jackson with "parameter names" module:
https://github.com/FasterXML/jackson-modules-java8/tree/master/parameter-names
Just in case, Java8+ can include parameter names to the reflection info in
case user supplies `-parameters` option for the compiler.

>NewExpression

It might indeed be a too low-level API for that.
Could you please highlight which test relies on the order of constructor
arguments?

By the way, SQL structs rely on the order of the fields rather than field
names.
So it might be ok to rely on the order of the constructor arguments.

Vladimir

Re: Discussion around CALCITE-2489 (risks linked to Class.getFields())

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Alessandro,

Many thanks for pushing this forward.

I had a quick look at the diff and the changes look reasonable.

I am not sure I got entirely the problem with the constructors and the
order of the parameters but if we are talking about JavaBeans couldn't we
enforce the same limitations which apply to them.
For instance, we could require such java types to provide a no-arg
constructor as it happens in various ORM frameworks.

Best,
Stamatis

On Sun, Feb 7, 2021 at 9:12 PM Alessandro Solimando <
alessandro.solimando@gmail.com> wrote:

> Hello everyone,
> I had a look at CALCITE-2489
> <https://issues.apache.org/jira/browse/CALCITE-2489> (Order of fields in
> JavaTypeFactoryImpl#createStructType is unstable) since it was linked to
> few PRs I worked on in the last months.
>
> In a nutshell, the output of "Class.getFields()" is JVM
> implementation-specific without any guarantees on the field ordering (even
> if, at date, the implementation seems rather consistent), we should replace
> it with a deterministic and stable ordering instead.
>
> I have created two branches with a proposed implementation for that, one
> for Calcite
> <
> https://github.com/asolimando/calcite/tree/master-CALCITE-2489-createStructTypeUnstable
> >,
> and one for Avatica
> <
> https://github.com/asolimando/calcite-avatica/tree/master-CALCITE-2489-createStructTypeUnstable
> >,
> all tests are passing (Calcite needs the changes in Avatica, though). You
> can check the diff against the branch and master for both Calcite
> <
> https://github.com/asolimando/calcite/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable
> >
> and Avatica
> <
> https://github.com/asolimando/calcite-avatica/compare/master...asolimando:master-CALCITE-2489-createStructTypeUnstable#diff-651ecd95723b9508c17cfd021ec3894dfbdc81ea3ce9384a5d57cf209a214a97
> >
> .
>
> The only rough edge I see with the current proposal is that ordering on the
> fields in the Sql type derived from a class "C" must match exactly the
> order in the constructor(s) of "C" itself.
>
> This comes from the code generation for instantiating classes (that is,
> "new Class(...)" in code generated via Janino), which relies on the order
> of the fields (see NewExpression.java#L63
> <
> https://github.com/apache/calcite/blob/c475124ef8c61eb398ee119f27dfe21a14eca32f/linq4j/src/main/java/org/apache/calcite/linq4j/tree/NewExpression.java#L63
> >
> ).
>
> The code generation part did not change, and the same "limitation" applies
> to original implementation, where the constructor signature had to match
> the order of the fields declaration, and in turn this order had to be
> respected by the "Class.getFields()" implementation, which apparently is
> consistently the case, otherwise the existing implementation would not have
> worked.
> To test this, I have modified a few classes from ReflectiveSchemaTest.java,
> changing the order of either the field declarations, or the constructor,
> and the code generation was failing.
>
> What's worrying me with the proposed solution is that now the ordering
> logic is a bit more complicated: annotations can dictate the order
> explicitly for all the classes involved in the hierarchy, or rely on the
> default (from parent to descendants, alphabetical order for fields declared
> within the same class). Since the user must write the constructor knowing
> exactly the final order the fields will get, it can rapidly become
> complicated with a rich class hierarchy, especially if no annotations are
> used.
>
> So, in summary, relying on the JVM implementation-specific behaviour of
> "Class.getFields()" is risky, but on the other hand I see some potential
> friction for the final user of Calcite if we adopt a safer approach. I'd
> like to stress that the downside is not linked to my specific choices nor
> to the  suggestions in the ticket description, since the constructor
> parameters cannot be known even via reflection, there is no way we can
> automatically match the constructor, the burden will always fall on the
> user.
>
> I'd like to hear your thoughts before eventually moving forward with
> opening the two PRs.
>
> Best regards,
> Alessandro
>