Posted to dev@calcite.apache.org by Vladimir Sitnikov <si...@gmail.com> on 2014/11/09 16:43:50 UTC
Extending OptiqCatalogReader
Hi,
What is the proper way to add SQL operators?
It looks like the set is locked in the static fields of OptiqCatalogReader.
I am (ab?)using SQL operators to implement a "compute just the
required columns" feature.
Basically, I create SqlUserDefinedFunction instances on the fly and
pass them to rexBuilder.makeCall; however, I do not much like that I
need a typeFactory to fill in all the arguments of
SqlUserDefinedFunction.
Problem statement:
For instance, my "java.lang.HashMap" table consists of the following columns:
@ID integer
@SHALLOW long : computed as snapshot.getHeapSize(@ID)
@RETAINED long : computed as snapshot.getRetainedHeapSize(@ID)
loadFactor float : computed as snapshot.getObject(@ID).resolveValue("loadFactor")
threshold int : computed as snapshot.getObject(@ID).resolveValue("threshold")
modCount int...
size int
entrySet HeapReference
table HeapReference
...
The thing is that all the columns are computed from the "@ID" column,
so I want to avoid computing the unused ones.
This looks to be covered by "CsvSmartTable"; however, I do not find
that approach very suitable here.
The key feature I want is to reuse computations: I do not want to look
up snapshot.getObject() for each and every column.
I could pessimistically look up snapshot.getObject(); however, that
would be excessive work if just the "@ID" column is required.
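To make the trade-off concrete, here is a minimal standalone sketch of
"look up the object at most once, and only if some column needs it".
FakeSnapshot, Lazy and LazyRowSketch are hypothetical names invented
for this illustration; they are not part of Calcite or Eclipse MAT, and
the returned values are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the heap snapshot; not the Eclipse MAT API. */
class FakeSnapshot {
    int lookups = 0; // counts how many times the expensive lookup ran

    long getHeapSize(int id) {
        return 48L; // made-up value
    }

    Map<String, Object> getObject(int id) {
        lookups++;
        Map<String, Object> fields = new HashMap<>();
        fields.put("loadFactor", 0.75f);
        fields.put("threshold", 12);
        return fields;
    }
}

/** Caches the first result of the wrapped supplier. */
class Lazy<T> implements Supplier<T> {
    private final Supplier<T> source;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> source) {
        this.source = source;
    }

    @Override
    public T get() {
        if (!loaded) {
            value = source.get();
            loaded = true;
        }
        return value;
    }
}

class LazyRowSketch {
    /** Computes only the requested columns for one object id. */
    static Object[] row(FakeSnapshot snapshot, int id, String... columns) {
        Lazy<Map<String, Object>> object = new Lazy<>(() -> snapshot.getObject(id));
        Object[] out = new Object[columns.length];
        for (int i = 0; i < columns.length; i++) {
            switch (columns[i]) {
                case "@ID":
                    out[i] = id; // needs no lookup at all
                    break;
                case "@SHALLOW":
                    out[i] = snapshot.getHeapSize(id);
                    break;
                default:
                    // all field columns share a single getObject() call
                    out[i] = object.get().get(columns[i]);
            }
        }
        return out;
    }
}
```

Requesting only "@ID" never touches getObject(), while requesting
several fields triggers exactly one lookup. This handles the sharing at
run time; it does not by itself decide which columns are required.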
Supporting edge cases like "just @ID is required" or "just loadFactor
is required" by hand seems to be a mess, so I got a hammer and made
Calcite figure out the projections.
Even CsvEnumerator has to have two implementations of the enumerator:
it is not that obvious that, even with Object[] storage, you must not
create Object[1] wrappers when the result is just a single column.
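The single-column case can be illustrated with a small standalone
sketch. RowShapeSketch and converter are hypothetical names; this is
not the actual CsvEnumerator code, only the shape of the distinction
between a bare value and an Object[] row.

```java
import java.util.function.Function;

class RowShapeSketch {
    /**
     * Returns a converter from a full source row to the projected row.
     * With exactly one requested field the bare value is returned;
     * otherwise the values are packed into an Object[].
     */
    static Function<Object[], Object> converter(int[] fields) {
        if (fields.length == 1) {
            final int only = fields[0];
            return source -> source[only]; // no Object[1] wrapper
        }
        return source -> {
            Object[] out = new Object[fields.length];
            for (int i = 0; i < fields.length; i++) {
                out[i] = source[fields[i]];
            }
            return out;
        };
    }
}
```

The consumer has to know which of the two shapes it receives, which is
why this kind of special-casing tends to leak into every hand-written
enumerator.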
My solution:
I went with the following approach: my tables are implemented as
Project(FindIds("java.lang.HashMap"), list(compute @ID, compute
@SHALLOW, ....)) on top of an "index table" that returns just the
required "@ID"s.
ProjectRel requires RexNodes; however, there is no such thing as a
"SqlOperator for snapshot.getHeapSize".
So I create those SqlOperators right before I call rexBuilder.makeCall (see [1]).
The projections are calculated in a rule that replaces the table scan
with project(findids(), ...) [2].
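The shape of that plan can be modelled without Calcite as "an id
source plus a list of per-id column expressions", where unused
expressions are dropped before evaluation. ProjectOverIdsSketch,
Column, trim and project are hypothetical names for this illustration
only; in the real plugin the trimming is left to the planner rather
than done by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class ProjectOverIdsSketch {
    /** One computed column: a name plus a function of the object id. */
    static final class Column {
        final String name;
        final IntFunction<Object> compute;

        Column(String name, IntFunction<Object> compute) {
            this.name = name;
            this.compute = compute;
        }
    }

    /** Keeps only the columns whose ordinals the query references. */
    static List<Column> trim(List<Column> all, int... used) {
        List<Column> kept = new ArrayList<>();
        for (int ordinal : used) {
            kept.add(all.get(ordinal));
        }
        return kept;
    }

    /** Evaluates the projection for every id produced by the index scan. */
    static List<Object[]> project(int[] ids, List<Column> columns) {
        List<Object[]> rows = new ArrayList<>();
        for (int id : ids) {
            Object[] row = new Object[columns.size()];
            for (int i = 0; i < row.length; i++) {
                row[i] = columns.get(i).compute.apply(id);
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Because every column is an expression over the id, dropping a column
from the list is enough to skip its computation entirely.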
The generated code I currently get looks as follows (I do not like the
excessive casts in lines 30..32; however, that is another issue):
/* 22 */ public Object current() {
         // This is the enumerator of @IDs
/* 23 */   final int current = net.hydromatic.optiq.runtime.SqlFunctions.toInt(inputEnumerator.current());
         // If only there were a proper way to share variables via DataContext...
/* 24 */   final org.eclipse.mat.snapshot.ISnapshot v = com.github.vlsi.mat.optiq.SnapshotHolder.get(0);
/* 25 */   final org.eclipse.mat.snapshot.model.IObject v2 = com.github.vlsi.mat.optiq.functions.ISnapshotMethods.getIObject(v, current);
/* 26 */   return new Object[] {
/* 27 */     current,
/* 28 */     com.github.vlsi.mat.optiq.functions.ISnapshotMethods.getShallowSize(v, current),
/* 29 */     com.github.vlsi.mat.optiq.functions.ISnapshotMethods.getRetainedSize(v, current),
/* 30 */     net.hydromatic.optiq.runtime.SqlFunctions.toFloat(com.github.vlsi.mat.optiq.functions.IObjectMethods.resolveSimpleValue(v2, "loadFactor")),
/* 31 */     net.hydromatic.optiq.runtime.SqlFunctions.toInt(com.github.vlsi.mat.optiq.functions.IObjectMethods.resolveSimpleValue(v2, "threshold")),
/* 32 */     net.hydromatic.optiq.runtime.SqlFunctions.toInt(com.github.vlsi.mat.optiq.functions.IObjectMethods.resolveSimpleValue(v2, "modCount")),
[1]: https://github.com/vlsi/mat-calcite-plugin/blob/master/MatCalcitePlugin/src/com/github/vlsi/mat/optiq/ClassRowTypeCache.java#L138
[2]: https://github.com/vlsi/mat-calcite-plugin/blob/master/MatCalcitePlugin/src/com/github/vlsi/mat/optiq/rules/InstanceAccessByClassIdRule.java#L43
--
Regards,
Vladimir Sitnikov