Posted to issues@calcite.apache.org by "Lai Zhou (JIRA)" <ji...@apache.org> on 2019/01/08 07:17:00 UTC

[jira] [Comment Edited] (CALCITE-2741) Add operator table with Hive-specific built-in functions

    [ https://issues.apache.org/jira/browse/CALCITE-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736820#comment-16736820 ] 

Lai Zhou edited comment on CALCITE-2741 at 1/8/19 7:16 AM:
-----------------------------------------------------------

[~julianhyde], is there a right way to add local field declarations to the bind method of the 'Baz' class?
{code:java}
public org.apache.calcite.linq4j.Enumerable bind(final org.apache.calcite.DataContext root)  

{code}
I wrote a new NotNullImplementor for Hive operators that returns an expression like
{code:java}
org.apache.calcite.hivesql.function.HiveUDFInvoke.invokeGenericUdfGetBoolean(udfInstance_1, new Object[] {...)  

{code}
Here udfInstance_1 is a Hive generic UDF instance that should be constructed at the beginning of the bind method block, like this:
{code:java}
public org.apache.calcite.linq4j.Enumerable bind(final org.apache.calcite.DataContext root) {
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_2 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("OR", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_3 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("AND", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_4 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("<", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_1 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("=", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_0 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance(">", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_5 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("SUBSTR", org.apache.calcite.sql.SqlSyntax.FUNCTION);
  final org.apache.calcite.linq4j.Enumerable _inputEnumerable = org.apache.calcite.schema.Schemas.queryable(root, root.getRootSchema().getSubSchema("DEFAULT_SCH"), java.lang.Object[].class, "T").asEnumerable();

{code}
I think it would be better to stash the local field declarations while implementing a RexCall. But RexToLixTranslator does not hold a reference to EnumerableRelImplementor. Can you give me some suggestions on how to support this feature? (For now I use an ad hoc approach: a ThreadLocal context stashes the declarations, and everything is cleared when a new SQL query is parsed.)
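
For illustration, the ThreadLocal workaround described above might look roughly like the sketch below. It is not the actual patch: the class HoistedDeclarations and its methods reset, declareUdf and drain are hypothetical names, and the declarations are kept as plain source strings rather than linq4j expressions so that the example compiles without Calcite or Hive on the classpath. The generated text only mirrors the HiveUDFInvoke.createGenericUDFInstance calls shown in the bind method above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-thread stash for declarations to hoist to the top of bind(). */
final class HoistedDeclarations {
  // One map per thread: variable name -> declaration source, in registration order.
  private static final ThreadLocal<Map<String, String>> STASH =
      ThreadLocal.withInitial(LinkedHashMap::new);

  private HoistedDeclarations() {}

  /** Clears the stash; meant to be called when a new SQL query is parsed. */
  static void reset() {
    STASH.get().clear();
  }

  /**
   * Stashes a declaration for one UDF instance and returns the variable name
   * that the per-call expression can reference.
   */
  static String declareUdf(String operatorName, String syntax) {
    Map<String, String> stash = STASH.get();
    String name = "udfInstance_" + stash.size();
    stash.put(name,
        "final org.apache.hadoop.hive.ql.udf.generic.GenericUDF " + name
            + " = org.apache.calcite.hivesql.function.HiveUDFInvoke"
            + ".createGenericUDFInstance(\"" + operatorName + "\", "
            + "org.apache.calcite.sql.SqlSyntax." + syntax + ");");
    return name;
  }

  /** Returns the stashed declarations in registration order and empties the stash. */
  static List<String> drain() {
    Map<String, String> stash = STASH.get();
    List<String> declarations = new ArrayList<>(stash.values());
    stash.clear();
    return declarations;
  }
}
```

In this sketch, the code that translates a RexCall would call declareUdf and use the returned name in the generated call, while the code that emits the bind method would call drain and write the returned lines before the rest of the method body. The drawback is the same as in the workaround itself: the state is implicit and tied to the thread, which is why a proper hook between RexToLixTranslator and EnumerableRelImplementor would be preferable.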

 

 


was (Author: hhlai1990):
[~julianhyde], is there a right way to add local field declarations to the method block of the 'Baz' class?
{code:java}
public org.apache.calcite.linq4j.Enumerable bind(final org.apache.calcite.DataContext root)  

{code}
I wrote a new NotNullImplementor for Hive operators that returns an expression like
{code:java}
org.apache.calcite.hivesql.function.HiveUDFInvoke.invokeGenericUdfGetBoolean(udfInstance_1, new Object[] {...)  

{code}
 

Here udfInstance_1 is a Hive generic UDF instance that should be constructed at the beginning of the bind method block, like this:
{code:java}
public org.apache.calcite.linq4j.Enumerable bind(final org.apache.calcite.DataContext root) {
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_2 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("OR", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_3 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("AND", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_4 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("<", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_1 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("=", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_0 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance(">", org.apache.calcite.sql.SqlSyntax.BINARY);
  final org.apache.hadoop.hive.ql.udf.generic.GenericUDF udfInstance_5 = org.apache.calcite.hivesql.function.HiveUDFInvoke.createGenericUDFInstance("SUBSTR", org.apache.calcite.sql.SqlSyntax.FUNCTION);
  final org.apache.calcite.linq4j.Enumerable _inputEnumerable = org.apache.calcite.schema.Schemas.queryable(root, root.getRootSchema().getSubSchema("DEFAULT_SCH"), java.lang.Object[].class, "T").asEnumerable();

{code}
I think it would be better to stash the local field declarations while implementing a RexCall. But RexToLixTranslator does not hold a reference to EnumerableRelImplementor. Can you give me some suggestions on how to support this feature? (For now I use an ad hoc approach: a ThreadLocal context stashes the declarations, and everything is cleared when a new SQL query is parsed.)

 

 

> Add operator table with Hive-specific built-in functions
> --------------------------------------------------------
>
>                 Key: CALCITE-2741
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2741
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>            Reporter: Lai Zhou
>            Assignee: Julian Hyde
>            Priority: Minor
>
> [~julianhyde],
> I extended Calcite's native enumerable implementation to support Hive SQL, including UDFs, UDAFs and all the SqlSpecialOperators, inspired by Apache Drill.
> I modified the parser and the type system, and bridged the Hive operators.
> What do you think of supporting a direct implementation of Hive SQL like this?
> I think it will be valuable when someone wants to migrate Hive ETL jobs to real-time scenarios.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)