Posted to dev@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2019/12/11 04:48:00 UTC

[jira] [Created] (DRILL-7480) Revisit parameterized type design for Metadata API

Paul Rogers created DRILL-7480:
----------------------------------

             Summary: Revisit parameterized type design for Metadata API
                 Key: DRILL-7480
                 URL: https://issues.apache.org/jira/browse/DRILL-7480
             Project: Apache Drill
          Issue Type: Improvement
            Reporter: Paul Rogers


Grabbed latest master and found that the code will not build in Eclipse due to a type mismatch in the statistics code. Specifically, the problem is that we have several parameterized classes, but we often omit the parameters. Evidently, doing so is fine for some compilers, but is an error in Eclipse.

Then, while fixing the immediate issue, I found an opposite problem: code that would satisfy Eclipse, but which failed in the Maven build.

I spent time making another pass through the metadata code to add type parameters, remove "rawtypes" ignores and so on. See DRILL-7479.

Stepping back a bit, it seems that we are perhaps using the type parameters in a way that does not serve our needs in this particular case.

We have many classes that hold onto particular values of some type, such as {{StatisticsHolder}}, which can hold a String, a Double, etc. So, we parameterize.

But, after that, we treat the items generically. We don't care that {{foo}} is a {{StatisticsHolder<String>}} and {{bar}} is a {{StatisticsHolder<Double>}}; we just want to create, combine and work with lists of statistics.

The same is true in several other places such as column type, comparator type, etc. For comparators, we don't really care what type they compare; we just want, given two generic {{StatisticsHolder}} instances, to get the corresponding comparator.
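To make the comparator case concrete, here is a minimal sketch of what generic code runs into with a parameterized holder. {{TypedHolder}} and {{GenericCompare}} are hypothetical stand-ins invented for this illustration, not the actual Drill classes:

```java
import java.util.Comparator;

// Hypothetical stand-in for a parameterized holder; not Drill's actual class.
final class TypedHolder<T> {
  private final T value;
  private final Comparator<T> comparator;

  TypedHolder(T value, Comparator<T> comparator) {
    this.value = value;
    this.comparator = comparator;
  }

  T value() { return value; }
  Comparator<T> comparator() { return comparator; }
}

final class GenericCompare {
  // Generic code only sees TypedHolder<?>. The compiler cannot prove that the
  // two holders share a type, so an unchecked cast (or a raw type) is needed.
  @SuppressWarnings("unchecked")
  static int compare(TypedHolder<?> a, TypedHolder<?> b) {
    Comparator<Object> cmp = (Comparator<Object>) a.comparator();
    return cmp.compare(a.value(), b.value());
  }
}
```

The type parameter buys nothing here: the one place that needs the comparator has to defeat the type check anyway.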

This is very similar to the situation with the "column accessors" in EVF: each column is a {{VARCHAR}} or a {{FLOAT8}}, but most code just treats them generically. So, there, the type of the value was treated as data: a runtime attribute, not a compile-time attribute.

This is a subtle point. Most code in Drill does not work with types directly in Java code. Instead, Drill is an interpreter: it works with generic objects which, at run time, resolve to actual typed objects. It is the difference between writing an application (which uses types directly) and writing a language (which works generically with all types).

For example, a {{StatisticsHolder}} probably only needs to be type-aware at the moment it is populated or used, but not in all the generic column-level and table-level code. (The same is true of properties in the column metadata class, as an example.)

IMHO, {{StatisticsHolder}} probably wants to be a non-parameterized class. It should have a declaration object that, say, provides the name, type, comparator and other metadata. When the actual value is needed, a typed getter can be provided:
{code:java}
<T> T getValue();
{code}
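A rough sketch of how this might look, just to make the idea concrete. The names {{StatisticKind}} and {{StatisticsDemo}}, and the shape of this {{StatisticsHolder}}, are assumptions made up for illustration; this is not the existing Drill API:

```java
import java.util.Comparator;

// Hypothetical declaration object: describes a kind of statistic once and
// carries the type as run-time data rather than as a type parameter.
final class StatisticKind {
  private final String name;
  private final Class<?> valueType;
  private final Comparator<Object> comparator;

  <T> StatisticKind(String name, Class<T> valueType, Comparator<? super T> typedComparator) {
    this.name = name;
    this.valueType = valueType;
    // Adapt the typed comparator once, with checked casts, so that generic
    // callers never need to know the value type.
    this.comparator = (a, b) -> typedComparator.compare(valueType.cast(a), valueType.cast(b));
  }

  String name() { return name; }
  Class<?> valueType() { return valueType; }
  Comparator<Object> comparator() { return comparator; }
}

// Non-parameterized holder (an illustrative shape, not Drill's current class).
final class StatisticsHolder {
  private final StatisticKind kind;
  private final Object value;

  StatisticsHolder(StatisticKind kind, Object value) {
    this.kind = kind;
    // Fail fast if the value does not match the declared type.
    this.value = kind.valueType().cast(value);
  }

  StatisticKind kind() { return kind; }

  // Typed getter for the few places that actually need the concrete value.
  @SuppressWarnings("unchecked")
  <T> T getValue() { return (T) value; }
}

final class StatisticsDemo {
  // Generic code: picks the larger of two statistics of the same kind without
  // raw types, wildcards or knowledge of the value type.
  static StatisticsHolder max(StatisticsHolder a, StatisticsHolder b) {
    int cmp = a.kind().comparator().compare(a.<Object>getValue(), b.<Object>getValue());
    return cmp >= 0 ? a : b;
  }
}
```

With this shape, the single unchecked cast lives inside {{getValue()}}, and type mismatches surface as a {{ClassCastException}} when the holder is populated rather than as warnings scattered through the generic code.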
As it is, the type system is very complex but we get no value. Since it is so complex, the code just punted and sprinkled raw types and ignores in many places, which defeats the purpose of parameterized types anyway.

Suggestion: let's revisit this work after the upcoming release and see if we can simplify it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)