Posted to dev@hive.apache.org by "Ashish Thusoo (JIRA)" <ji...@apache.org> on 2009/01/15 23:34:00 UTC

[jira] Commented: (HIVE-65) Implicit conversion from integer to long broken for Dynamic Serde tables

    [ https://issues.apache.org/jira/browse/HIVE-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664297#action_12664297 ] 

Ashish Thusoo commented on HIVE-65:
-----------------------------------

The patch contains the following changes: 

1. Interfaces that let you plug in your own resolvers for UDFs and UDAFs (a rough sketch of their shape follows this list). These are: 
 - UDFMethodResolver for UDFs -> Given the types of the arguments, this interface allows the compiler to retrieve which evaluate function to use. 
 - UDAFEvaluatorResolver for UDAFs -> Given the types of the arguments, this interface allows the compiler to retrieve which UDAFEvaluator to use. 
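To make the shape of these interfaces concrete, here is a rough sketch. The interface names come from the patch, but the method names and parameter types shown are assumptions made for illustration (the real code resolves against Hive's own type descriptions):

    import java.lang.reflect.Method;
    import java.util.List;

    // Sketch only: method names and parameter types below are illustrative.
    interface UDFMethodResolver {
      // pick the evaluate() overload that matches the given argument types
      Method getEvalMethod(List<Class<?>> argClasses);
    }

    interface UDAFEvaluatorResolver {
      // pick the UDAFEvaluator implementation that matches the given argument types
      Class<?> getEvaluatorClass(List<Class<?>> argClasses);
    }
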

What was previously the UDAF abstract class is now the UDAFEvaluator interface. The function names have changed somewhat (a small example of the new lifecycle follows the list). These are: 
1. init which was also init in the UDAF abstract class - for initializing the state 
2. iterate which was aggregate in the UDAF abstract class - for updating the state for each passed in value of the arguments 
3. terminatePartial which was evaluatePartial in the UDAF abstract class - for returning the state after the partial aggregation has been done 
4. merge which was aggregatePartial in the UDAF abstract class - for merging the results of the terminatePartial while doing the final aggregation 
5. terminate which was evaluate in the UDAF abstract class - for returning the final result of the aggregation. 
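As an example of the new lifecycle, a simple sum-style evaluator would look roughly like this. Only the five method names come from the list above; the class name, argument types and the boolean return convention are assumptions made for this sketch:

    import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;   // package path assumed

    // Minimal sum-style evaluator under the new naming; illustrative only.
    public class ExampleSumEvaluator implements UDAFEvaluator {
      private long sum;

      public void init() { sum = 0; }                   // reset the aggregation state

      public boolean iterate(Long value) {              // one call per input value
        if (value != null) { sum += value; }
        return true;
      }

      public Long terminatePartial() { return sum; }    // state after partial aggregation

      public boolean merge(Long partial) {              // fold in another partial result
        if (partial != null) { sum += partial; }
        return true;
      }

      public Long terminate() { return sum; }           // final result of the aggregation
    }
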

The UDF and UDAF classes now encapsulate a resolver, which is used to do the resolution for that particular class. 
The different resolver implementations for UDFMethodResolver are: 
1. DefaultUDFMethodResolver - This is the default resolver. It uses the old rule of finding the evaluate function in the UDF that needs the least number of 
    argument conversions. 
2. NumericOpMethodResolver - This is the resolver used by overloaded numeric operators (+, -, %, /, *). This implements the following resolution logic: 
  - If any of the arguments is Void (null) or String, then the evaluate(Double, Double) method is used. 
  - otherwise, if both the arguments are of the same type, then the evaluate(<arg Type>, <arg Type>) method is used. 
  - otherwise, evaluate(Double, Double) method is used. 
3. ComparisonOpMethodResolver - This is the resolver used by the overloaded comparison operators (>, <, >=, <=, =, <>). It implements the following resolution logic (a sketch of this decision order follows the list): 
  - If any of the arguments is Void (null), then the evaluate(Double, Double) method is used. 
  - otherwise, if both the arguments are of the same type, then the evaluate(<arg Type>, <arg Type>) method is used. 
  - otherwise, if one of the arguments is a Date, then evaluate(Date, Date) method is used. 
  - otherwise, evaluate(Double, Double) method is used. 
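As a plain-Java illustration of the comparison rules above, the decision order looks like this. The Class-based types are simplified placeholders; the patch works on Hive's own type descriptions, not java.lang.Class:

    // Illustrative decision order for ComparisonOpMethodResolver.
    static Class<?>[] resolveComparisonArgTypes(Class<?> t1, Class<?> t2) {
      if (t1 == Void.class || t2 == Void.class) {
        return new Class<?>[] { Double.class, Double.class };   // null operand
      }
      if (t1 == t2) {
        return new Class<?>[] { t1, t1 };                       // same type, no conversion
      }
      if (t1 == java.sql.Date.class || t2 == java.sql.Date.class) {
        return new Class<?>[] { java.sql.Date.class, java.sql.Date.class };
      }
      return new Class<?>[] { Double.class, Double.class };     // fall back to Double
    }

The NumericOpMethodResolver differs only in that String is treated like Void in the first check and there is no Date branch.
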

Abstract base classes for UDFs that use each of these resolvers are provided (a sketch of how such a base class wires in its resolver follows the list). These are: 
1. UDF has been modified to use DefaultUDFMethodResolver. 
2. UDFNumericOp is a new class which uses NumericOpMethodResolver. 
3. UDFBaseCompare has been modified to use ComparisonOpMethodResolver. 
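For illustration, the wiring in such a base class might look roughly like this; the setResolver call and the resolver constructor argument are assumptions about the mechanism, not quotes from the patch:

    // Hypothetical wiring: UDFBaseCompare swaps in the comparison-specific
    // resolver instead of the default one.
    public class UDFBaseCompare extends UDF {
      public UDFBaseCompare() {
        setResolver(new ComparisonOpMethodResolver(this.getClass()));
      }
    }
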

Similar to the UDFMethodResolvers described above, there are 2 implementations available for UDAFEvaluatorResolver. These are: 
1. DefaultUDAFEvaluatorResolver - along the same lines as DefaultUDFMethodResolver. 
2. NumericUDAFEvaluatorResolver - along the same lines as NumericOpMethodResolver. 

The UDAF resolution logic is in getUDAF whereas the UDF resolution logic is in getExprNodeDesc. This logic is the same as before, though 
I think we should change it to allow conversions from anything to anything; I have changed the UDF conversion operators to reflect that. In both 
these locations, conversion operators are appropriately added for the arguments that need conversion. 
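To spell out what adding a conversion operator means, here is a toy, self-contained illustration. These are not Hive's classes; the point is only that an argument whose type does not match the parameter type picked by the resolver gets wrapped in a conversion call (e.g. int -> bigint):

    // Toy illustration; Expr stands in for an expression node descriptor.
    class ConversionSketch {
      static class Expr {
        final String text; final Class<?> type;
        Expr(String text, Class<?> type) { this.text = text; this.type = type; }
      }

      // Wrap the argument in a conversion node when its type differs from the
      // parameter type chosen by the resolver.
      static Expr convertIfNeeded(Expr arg, Class<?> wantedType) {
        if (arg.type == wantedType) {
          return arg;
        }
        return new Expr("to" + wantedType.getSimpleName() + "(" + arg.text + ")", wantedType);
      }

      public static void main(String[] args) {
        Expr intCol = new Expr("col", Integer.class);
        // prints toLong(col): the int column is converted before the evaluate overload is bound
        System.out.println(convertIfNeeded(intCol, Long.class).text);
      }
    }
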

I have also moved the code that creates exprNodeDesc to use the tree walker infrastructure. Note that the code in SemanticAnalyzer still remains 
because PartitionPruner depends on it, and we can get rid of that only after we have refactored partition pruning, which itself depends on predicate 
push down - I will file a separate JIRA for this work. 
In the Tree Walker framework, I have added the ability for the processor to return objects, and the ability for the walker to pass 
what was returned for the objects walked so far to the processor. This is useful while creating the expression node descriptor for a node from the 
expression node descriptors of the children that have already been visited (a toy sketch of this follows below). Also, I have removed NodeProcessorCtx (this was Joy's comment 
in the previous review); it was an empty abstract class anyway, and having it around made it more involved to write the type checker 
because of Java's lack of support for multiple class inheritance. 
Accordingly, the type check processor and factory are implemented in: 
1. TypeCheckProcFactory 
2. TypeCheckCtx 
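As a toy version of the walker change (this is not the actual walker API; all names are made up), the point is simply that the walker hands each processor the objects already produced for the node's children:

    import java.util.ArrayList;
    import java.util.List;

    // Toy sketch, not the real Hive interfaces.
    interface ToyNode { List<ToyNode> getChildren(); }

    interface ToyNodeProcessor {
      // returns an object (e.g. an expression descriptor) built from the
      // results already produced for the node's children
      Object process(ToyNode node, List<Object> childResults);
    }

    class ToyWalker {
      private final ToyNodeProcessor processor;
      ToyWalker(ToyNodeProcessor processor) { this.processor = processor; }

      Object walk(ToyNode node) {
        List<Object> childResults = new ArrayList<Object>();
        for (ToyNode child : node.getChildren()) {
          childResults.add(walk(child));              // children are handled before the parent
        }
        return processor.process(node, childResults); // the processor returns an object
      }
    }
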

I have so far added 1 test relating to this JIRA. I am also going to add more tests for this. 

As a result of the UDAFEvaluator framework, we are now able to support overloaded aggregate functions like max and min for strings as well. 
One of the existing tests, which previously returned null (a wrong result!), now outputs the correct result.
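For example, an overloaded max could now ship two evaluators, one per argument type, with the evaluator resolver picking between them. Only the UDAF/UDAFEvaluator names and the five lifecycle methods come from the description above; the class names, method bodies and package paths are assumptions for illustration:

    import org.apache.hadoop.hive.ql.exec.UDAF;            // package paths assumed
    import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;

    // Illustrative shape of an overloaded max UDAF under the new framework.
    public class UDAFExampleMax extends UDAF {

      public static class MaxStringEvaluator implements UDAFEvaluator {
        private String max;
        public void init() { max = null; }
        public boolean iterate(String v) {
          if (v != null && (max == null || v.compareTo(max) > 0)) { max = v; }
          return true;
        }
        public String terminatePartial() { return max; }
        public boolean merge(String partial) { return iterate(partial); }
        public String terminate() { return max; }
      }

      public static class MaxDoubleEvaluator implements UDAFEvaluator {
        private Double max;
        public void init() { max = null; }
        public boolean iterate(Double v) {
          if (v != null && (max == null || v > max)) { max = v; }
          return true;
        }
        public Double terminatePartial() { return max; }
        public boolean merge(Double partial) { return iterate(partial); }
        public Double terminate() { return max; }
      }
    }
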


> Implicit conversion from integer to long broken for Dynamic Serde tables
> -----------------------------------------------------------------------
>
>                 Key: HIVE-65
>                 URL: https://issues.apache.org/jira/browse/HIVE-65
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>            Priority: Critical
>         Attachments: patch-65.txt
>
>
> For a dynamic serde table that has a bigint column, implicit conversion from int to bigint seems to be broken. I have not verified this for other tables.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.