You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Jinfeng Ni <jn...@maprtech.com> on 2013/12/16 18:52:49 UTC

Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/
-----------------------------------------------------------

(Updated Dec. 16, 2013, 9:52 a.m.)


Review request for drill.


Changes
-------

Remove two json test files; put back checking of return type for function resolution. 


Summary (updated)
-----------------

DRILL-316. Explicit cast support in drill execution engine . 


Repository: drill-git


Description
-------

The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
    
    cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )

The main code changes :
1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
3) Add 3 function template to generate the different type of cast functions :
  
   CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
   CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
   CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.

Currently, the following 6 types can be cast into each other
  float4, float8, int, bigint, varchar, varbinary.


Diffs (updated)
-----

  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
  common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
  common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
  common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
  common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
  common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
  exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
  exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
  exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
  exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 

Diff: https://reviews.apache.org/r/15871/diff/


Testing
-------

7 physical plans are used in junit testcase testCastFunctions.java

1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
 


Thanks,

Jinfeng Ni


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jinfeng Ni <jn...@maprtech.com>.

> On Dec. 16, 2013, 10:01 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java, line 48
> > <https://reviews.apache.org/r/15871/diff/4/?file=398256#file398256line48>
> >
> >     Can you take a quick look at what this looks like in machine code after optimized.  I'm concerned that it will be inefficient.
> 
> Jinfeng Ni wrote:
>     The original code seems inefficient in that it first converts byte array into char array, then construct a string then parse into int/long etc.
>     
>     The fix is to decode the byte array directly into int/long, in a similar way how java's Integer/Long parse char array. This would improve performance, when the cast is applied many times in record batches.

Checked the generated machine code by looking at the assembly code using -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly. The original code will 1)copy into byte array, 2)decode into char array, 3) then parse into int/long. It confirms that the original code is less efficient, compared with the new code, which will decode the byte array directly into int/long.


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30518
-----------------------------------------------------------


On Dec. 19, 2013, 9:54 a.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 19, 2013, 9:54 a.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNumException.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jinfeng Ni <jn...@maprtech.com>.

> On Dec. 16, 2013, 10:01 p.m., Jacques Nadeau wrote:
> > common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g, line 99
> > <https://reviews.apache.org/r/15871/diff/4/?file=398248#file398248line99>
> >
> >     As discussed, it is weird to treat the type as a logical expression.  Can you please modify this to be a new struct that holds the information associated with charType and numType?  Same for typeLen.  Or maybe just build a more complicated parser rule?  Otherwise it seems like a hack

Fix: use MajorType to hold all the information related to the target type (name, length, mode, etc). This also simplifies grammar rules.


> On Dec. 16, 2013, 10:01 p.m., Jacques Nadeau wrote:
> > common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java, line 44
> > <https://reviews.apache.org/r/15871/diff/4/?file=398251#file398251line44>
> >
> >     Can you add these definitions so that they are automatically generated as part of type helper so we don't have to worry about maintenance in multiple places?

fix: typeNameMap is no longed required in the code. The new code uses drill's internal type name as part of the cast function name. As such, no need for such typeNameMap structure any more.


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30518
-----------------------------------------------------------


On Dec. 16, 2013, 9:52 a.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2013, 9:52 a.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jinfeng Ni <jn...@maprtech.com>.

> On Dec. 16, 2013, 10:01 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java, line 48
> > <https://reviews.apache.org/r/15871/diff/4/?file=398256#file398256line48>
> >
> >     Can you take a quick look at what this looks like in machine code after optimized.  I'm concerned that it will be inefficient.

The original code seems inefficient in that it first converts byte array into char array, then construct a string then parse into int/long etc.

The fix is to decode the byte array directly into int/long, in a similar way how java's Integer/Long parse char array. This would improve performance, when the cast is applied many times in record batches.  


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30518
-----------------------------------------------------------


On Dec. 16, 2013, 9:52 a.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2013, 9:52 a.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jacques Nadeau <ja...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30518
-----------------------------------------------------------



common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g
<https://reviews.apache.org/r/15871/#comment58450>

    As discussed, it is weird to treat the type as a logical expression.  Can you please modify this to be a new struct that holds the information associated with charType and numType?  Same for typeLen.  Or maybe just build a more complicated parser rule?  Otherwise it seems like a hack



common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java
<https://reviews.apache.org/r/15871/#comment58452>

    Can you add these definitions so that they are automatically generated as part of type helper so we don't have to worry about maintenance in multiple places?



exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java
<https://reviews.apache.org/r/15871/#comment58449>

    Can you take a quick look at what this looks like in machine code after optimized.  I'm concerned that it will be inefficient.


- Jacques Nadeau


On Dec. 16, 2013, 5:52 p.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2013, 5:52 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jinfeng Ni <jn...@maprtech.com>.

> On Dec. 18, 2013, 5:07 p.m., Timothy Chen wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java, line 96
> > <https://reviews.apache.org/r/15871/diff/4/?file=398262#file398262line96>
> >
> >     Was this intended to be here?

This is added when I debug code issue. For some reason, when fragmentrunner caught a failure, the stacktrace is not captured in the log sometimes. That's why I added this line of code. 

I removed this line of code. 


> On Dec. 18, 2013, 5:07 p.m., Timothy Chen wrote:
> > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java, line 59
> > <https://reviews.apache.org/r/15871/diff/4/?file=398263#file398263line59>
> >
> >     Do you think we need to test cases when we cannot cast? And also since you set varchar(length) what happens if you set a smaller width than the actual content?

1. I added one test case to test when cast cause NumberFormatException, eg  cast("abc" as int).

2. target type is varchar(length),
  2.1) when the input type is fixed-length type, and the varchar could not hold all the digits, there would be truncation. 

( testCastVarChar.json )
 cast(intcol as varchar(2)), when intcol = -214748364, the output is -2, 
                             when intcol = 2147483647, the output is 21.
 
  2.2) when the input type is var-length type as well
  if the input's length is longer than the target length, again, there would be truncation.
  cast("12345" as varchar(2)  ==> "12"
 
  if the input's length is smaller than the target length, then code just set the target length = input's length. 

    //if input length <= target_type length, do nothing
    //else truncate based on target_type length.     
    out.buffer = in.buffer;
    out.start =  in.start;
    if (in.end - in.start <= length.value) {
      out.end = in.end;
    } else {
      out.end = out.start + (int) length.value;
    }
  }

When truncation happens, one might want to issue a warning. Drill-319 is opened to address such issue. 


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30662
-----------------------------------------------------------


On Dec. 16, 2013, 9:52 a.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2013, 9:52 a.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Timothy Chen <tn...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/#review30662
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java
<https://reviews.apache.org/r/15871/#comment58757>

    Was this intended to be here?



exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java
<https://reviews.apache.org/r/15871/#comment58758>

    Do you think we need to test cases when we cannot cast? And also since you set varchar(length) what happens if you set a smaller width than the actual content?


- Timothy Chen


On Dec. 16, 2013, 5:52 p.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15871/
> -----------------------------------------------------------
> 
> (Updated Dec. 16, 2013, 5:52 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
>     
>     cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )
> 
> The main code changes :
> 1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
> 2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
> 3) Add 3 function template to generate the different type of cast functions :
>   
>    CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
>    CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
>    CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.
> 
> Currently, the following 6 types can be cast into each other
>   float4, float8, int, bigint, varchar, varbinary.
> 
> 
> Diffs
> -----
> 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
>   common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
>   common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
>   common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
>   common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
>   common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
>   common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
>   exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/FragmentRunner.java d003972 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
>   exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15871/diff/
> 
> 
> Testing
> -------
> 
> 7 physical plans are used in junit testcase testCastFunctions.java
> 
> 1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
> 2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
> 3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
> 4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
> 5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
> 6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
> 7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
>  
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>


Re: Review Request 15871: DRILL-316. Explicit cast support in drill execution engine .

Posted by Jinfeng Ni <jn...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15871/
-----------------------------------------------------------

(Updated Dec. 19, 2013, 9:54 a.m.)


Review request for drill.


Changes
-------

Code change to address review comments : 

1. Use MajorType to capture the type infor (name, length, data mode, etc). Simplify the grammar rules.
2. Remove typeNameMap, after change the cast function name by using uppercase.
3. Optimize implementation of cast varchar/varbinary to int/long, by decode directly from byte array into int/long.
4. Remove an unnecessary line change in FragmentRunner.java
5. Add a testcase when cast will hit exception.


Repository: drill-git


Description
-------

The explicit cast function is treated as a special function call. Its syntax in logical/physical plan:
    
    cast (input_expr as int | bigint | float4 | float8 | varchar(length) | varbinary(length) )

The main code changes :
1) parser change to read expression in logical/physical plan, which may contain an explicit cast function
2) FunctionRegistry.java is added with two methods to build a logical expression for cast function.
3) Add 3 function template to generate the different type of cast functions :
  
   CastFunctionsSrcVarLen.java : when source type has varied length, target type has fixed length.
   CastFunctionsTargetVarLen.java: when source type has fixed length, target type has varied length.
   CastFunctionsSrcVarLenTargetVarLen.java: when both source and target type have varied length.

Currently, the following 6 types can be cast into each other
  float4, float8, int, bigint, varchar, varbinary.


Diffs (updated)
-----

  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g 76da965 
  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g 7fc1651 
  common/src/main/java/org/apache/drill/common/expression/FunctionCall.java a8b7e01 
  common/src/main/java/org/apache/drill/common/expression/FunctionDefinition.java fb00dc7 
  common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 73e1925 
  common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 2f583d5 
  common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java PRE-CREATION 
  exec/java-exec/src/main/codegen/data/Casts.tdd 760b0f9 
  exec/java-exec/src/main/codegen/templates/CastFunctions.java 85a00b0 
  exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLen.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CastFunctionsTargetVarLen.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java e001ffe 
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java 5bbab76 
  exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueHolderHelper.java 077c693 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastBigInt.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastFloat4.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastFloat8.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastInt.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastNested.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastNumException.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastVarBinary.json PRE-CREATION 
  exec/java-exec/src/test/resources/functions/cast/testCastVarChar.json PRE-CREATION 

Diff: https://reviews.apache.org/r/15871/diff/


Testing
-------

7 physical plans are used in junit testcase testCastFunctions.java

1) testCastInt.json : when target type is int, and src type is bigint, float4, float8, varchar, varbinary.
2) testCastBigInt.json: when target type is bigint, and src type is int, float4, float8, varchar, varbinary.
3) testCastFloat4.json: when target type is float4, and src type is int, bigint, float8, varchar, varbinary.
4) testCastFloat8.json: when target type is float8, and src type is int, bigint, float4, varchar, varbinary
5) testCastVarChar.json: when target tyep is varchar, and src type is int, bigint, float4,float8, varbinary
6) testCastVarBinary.json: when target type is varbinary, and src type is int, bigint, float4, float8, varchar.
7) testCastNested.json: test cast functions is nested in another cast function, or another normal functions.
 


Thanks,

Jinfeng Ni