Posted to dev@drill.apache.org by "Zhiyong Liu (JIRA)" <ji...@apache.org> on 2014/05/21 23:16:39 UTC

[jira] [Updated] (DRILL-801) merge joins fail with ArrayIndexOutOfBoundsException en masse

     [ https://issues.apache.org/jira/browse/DRILL-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhiyong Liu updated DRILL-801:
------------------------------

    Summary: merge joins fail with ArrayIndexOutOfBoundsException en masse  (was: merge join fail with ArrayIndexOutOfBoundsException en masse)

> merge joins fail with ArrayIndexOutOfBoundsException en masse
> -------------------------------------------------------------
>
>                 Key: DRILL-801
>                 URL: https://issues.apache.org/jira/browse/DRILL-801
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>            Reporter: Zhiyong Liu
>
> Data sources: TPC-H (10 MB), Parquet files split three ways
> git.commit.id.abbrev=5d7e3d3
> git.commit.id=5d7e3d3ab548eb2b23607df46ea843a9c1532b72
> All of the join queries in the smoke test suite fail with ArrayIndexOutOfBoundsException when they run as merge joins (hash join disabled).  An example follows:
> 0: jdbc:drill:schema=dfs.TpcHMulti> alter session set `planner.enable_hashjoin` = false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | planner.enable_hashjoin updated. |
> +------------+------------+
> 1 row selected (0.024 seconds)
> 0: jdbc:drill:schema=dfs.TpcHMulti> select o.O_TOTALPRICE, c.C_NAME
> . . . . . . . . . . . . . . . . . >    from orders o, customer c
> . . . . . . . . . . . . . . . . . >    where o.C_CUSTKEY = c.C_CUSTKEY and o.O_TOTALPRICE > 400000.00
> . . . . . . . . . . . . . . . . . >    order by o.O_TOTALPRICE;
> Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while running query.[error_id: "3914508b-6c56-4598-a5aa-5d3f51885ded"
> endpoint {
>   address: "perfnode104.perf.lab"
>   user_port: 31010
>   control_port: 31011
>   data_port: 31012
> }
> error_type: 0
> message: "Failure while running fragment. < ArrayIndexOutOfBoundsException:[ 16666 ]"
> ]
> Error: exception while executing query (state=,code=0)
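
For anyone who wants to rerun the failing pair of statements outside sqlline, here is a minimal JDBC sketch. It is not part of the original report. It assumes the Drill JDBC driver is on the classpath and that the connection string shown in the sqlline prompt above resolves in your environment; the class name MergeJoinRepro and the statements() helper are made up for illustration. The SQL text is copied from the transcript.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Drill or its test suite.
class MergeJoinRepro {
    // Assumption: same connection string as the sqlline prompt in the report.
    static final String URL = "jdbc:drill:schema=dfs.TpcHMulti";

    /** The two statements from the report, in the order they were run. */
    static List<String> statements() {
        return Arrays.asList(
            // Disable hash join so the planner picks a merge join.
            "alter session set `planner.enable_hashjoin` = false",
            "select o.O_TOTALPRICE, c.C_NAME "
                + "from orders o, customer c "
                + "where o.C_CUSTKEY = c.C_CUSTKEY and o.O_TOTALPRICE > 400000.00 "
                + "order by o.O_TOTALPRICE");
    }

    public static void main(String[] args) {
        try (Connection conn = DriverManager.getConnection(URL);
             Statement stmt = conn.createStatement()) {
            for (String sql : statements()) {
                // execute() returns true when the statement produced a result set.
                if (stmt.execute(sql)) {
                    try (ResultSet rs = stmt.getResultSet()) {
                        int rows = 0;
                        while (rs.next()) {
                            rows++;
                        }
                        System.out.println(rows + " row(s): " + sql);
                    }
                }
            }
        } catch (SQLException e) {
            // On an affected build the join is expected to fail here.
            System.err.println("Query failed: " + e.getMessage());
        }
    }
}
```

Both statements go through the same Connection because the option is set at session scope, as in the transcript.
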
> Physical plan:
> 0: jdbc:drill:schema=dfs.TpcHMulti> explain plan for select o.O_TOTALPRICE, c.C_NAME
> . . . . . . . . . . . . . . . . . >    from orders o, customer c                    
> . . . . . . . . . . . . . . . . . >    where o.C_CUSTKEY = c.C_CUSTKEY and o.O_TOTALPRICE > 400000.00
> . . . . . . . . . . . . . . . . . >    order by o.O_TOTALPRICE ;                                     
> +------------+------------+                                                                          
> |    text    |    json    |                                                                          
> +------------+------------+                                                                          
> | ScreenPrel                                                                                         
>   SingleMergeExchangePrel(sort0=[0 ASC])                                                             
>     SelectionVectorRemoverPrel                                                                       
>       SortPrel(sort0=[$0], dir0=[ASC])                                                               
>         HashToRandomExchangePrel(dist0=[[$0]])                                                       
>           ProjectPrel(O_TOTALPRICE=[$2], C_NAME=[$5])                                                
>             MergeJoinPrel(condition=[=($1, $4)], joinType=[inner])                                   
>               SelectionVectorRemoverPrel                                                             
>                 SortPrel(sort0=[$1], dir0=[ASC])                                                     
>                   HashToRandomExchangePrel(dist0=[[$1]])                                             
>                     FilterPrel(condition=[>($2, 400000.00)])                                         
>                       ScanPrel(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/orders]], selectionRoot=/drill/testdata/tpch-multi/orders, columns=[SchemaPath [`C_CUSTKEY`], SchemaPath [`O_TOTALPRICE`]]]])                                                               
>               SelectionVectorRemoverPrel                                                                 
>                 SortPrel(sort0=[$1], dir0=[ASC])                                                         
>                   HashToRandomExchangePrel(dist0=[[$1]])                                                 
>                     ScanPrel(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/customer]], selectionRoot=/drill/testdata/tpch-multi/customer, columns=[SchemaPath [`C_CUSTKEY`], SchemaPath [`C_NAME`]]]])                                                                   
>  | {                                                                                                     
>   "head" : {                                                                                             
>     "version" : 1,                                                                                       
>     "generator" : {                                                                                      
>       "type" : "ExplainHandler",                                                                         
>       "info" : ""                                                                                        
>     },                                                                                                   
>     "type" : "APACHE_DRILL_PHYSICAL",                                                                    
>     "options" : [ {                                                                                      
>       "name" : "planner.enable_hashjoin",                                                                
>       "kind" : "BOOLEAN",                                                                                
>       "type" : "SESSION",                                                                                
>       "num_val" : null,                                                                                  
>       "string_val" : null,                                                                               
>       "bool_val" : false,                                                                                
>       "float_val" : null                                                                                 
>     } ],                                                                                                 
>     "resultMode" : "EXEC"                                                                                
>   },                                                                                                     
>   "graph" : [ {                                                                                          
>     "pop" : "parquet-scan",                                                                              
>     "@id" : 1,                                                                                           
>     "entries" : [ {                                                                                      
>       "path" : "maprfs:/drill/testdata/tpch-multi/customer"                                              
>     } ],                                                                                                 
>     "storage" : {                                                                                        
>       "type" : "file",                                                                                   
>       "connection" : "maprfs:///",                                                                       
>       "workspaces" : {                                                                                   
>         "root" : {                                                                                       
>           "location" : "/",                                                                              
>           "writable" : false,                                                                            
>           "storageformat" : null                                                                         
>         },                                                                                               
>         "TpcHMulti" : {                                                                                  
>           "location" : "/drill/testdata/tpch-multi",                                                     
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "TpcH001" : {                                                                                    
>           "location" : "/drill/testdata/tpch001",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "P1Tests" : {                                                                                    
>           "location" : "/drill/testdata/p1tests",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "Krystal" : {                                                                                    
>           "location" : "/drill/testdata/krystal",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "Amplab" : {                                                                                     
>           "location" : "/drill/testdata/amplab",                                                         
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "tmp" : {                                                                                        
>           "location" : "/tmp",                                                                           
>           "writable" : true,                                                                             
>           "storageformat" : "csv"                                                                        
>         }                                                                                                
>       },                                                                                                 
>       "formats" : {                                                                                      
>         "psv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "tbl" ],                                                                      
>           "delimiter" : "|"                                                                              
>         },                                                                                               
>         "csv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "csv" ],                                                                      
>           "delimiter" : ","                                                                              
>         },                                                                                               
>         "tsv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "tsv" ],                                                                      
>           "delimiter" : "\t"                                                                             
>         },                                                                                               
>         "parquet" : {                                                                                    
>           "type" : "parquet"                                                                             
>         },                                                                                               
>         "json" : {                                                                                       
>           "type" : "json"                                                                                
>         }                                                                                                
>       }                                                                                                  
>     },                                                                                                   
>     "format" : {                                                                                         
>       "type" : "parquet"                                                                                 
>     },                                                                                                   
>     "columns" : [ "`C_CUSTKEY`", "`C_NAME`" ],                                                           
>     "selectionRoot" : "/drill/testdata/tpch-multi/customer"                                              
>   }, {                                                                                                   
>     "pop" : "hash-to-random-exchange",                                                                   
>     "@id" : 2,                                                                                           
>     "child" : 1,                                                                                         
>     "expr" : "hash(`C_CUSTKEY`) ",                                                                       
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "external-sort",                                                                             
>     "@id" : 3,                                                                                           
>     "child" : 2,                                                                                         
>     "orderings" : [ {                                                                                    
>       "order" : "ASC",                                                                                   
>       "expr" : "`C_CUSTKEY`",                                                                            
>       "nullDirection" : "UNSPECIFIED"                                                                    
>     } ],                                                                                                 
>     "reverse" : false,                                                                                   
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "selection-vector-remover",                                                                  
>     "@id" : 4,                                                                                           
>     "child" : 3,                                                                                         
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "project",                                                                                   
>     "@id" : 5,                                                                                           
>     "exprs" : [ {                                                                                        
>       "ref" : "`*0`",                                                                                    
>       "expr" : "`*`"                                                                                     
>     }, {                                                                                                 
>       "ref" : "`C_CUSTKEY0`",                                                                            
>       "expr" : "`C_CUSTKEY`"                                                                             
>     }, {                                                                                                 
>       "ref" : "`C_NAME`",                                                                                
>       "expr" : "`C_NAME`"                                                                                
>     } ],                                                                                                 
>     "child" : 4,                                                                                         
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "parquet-scan",                                                                              
>     "@id" : 6,                                                                                           
>     "entries" : [ {                                                                                      
>       "path" : "maprfs:/drill/testdata/tpch-multi/orders"                                                
>     } ],                                                                                                 
>     "storage" : {                                                                                        
>       "type" : "file",                                                                                   
>       "connection" : "maprfs:///",                                                                       
>       "workspaces" : {                                                                                   
>         "root" : {                                                                                       
>           "location" : "/",                                                                              
>           "writable" : false,                                                                            
>           "storageformat" : null                                                                         
>         },                                                                                               
>         "TpcHMulti" : {                                                                                  
>           "location" : "/drill/testdata/tpch-multi",                                                     
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "TpcH001" : {                                                                                    
>           "location" : "/drill/testdata/tpch001",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "P1Tests" : {                                                                                    
>           "location" : "/drill/testdata/p1tests",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "Krystal" : {                                                                                    
>           "location" : "/drill/testdata/krystal",                                                        
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "Amplab" : {                                                                                     
>           "location" : "/drill/testdata/amplab",                                                         
>           "writable" : true,                                                                             
>           "storageformat" : "parquet"                                                                    
>         },                                                                                               
>         "tmp" : {                                                                                        
>           "location" : "/tmp",                                                                           
>           "writable" : true,                                                                             
>           "storageformat" : "csv"                                                                        
>         }                                                                                                
>       },                                                                                                 
>       "formats" : {                                                                                      
>         "psv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "tbl" ],                                                                      
>           "delimiter" : "|"                                                                              
>         },                                                                                               
>         "csv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "csv" ],                                                                      
>           "delimiter" : ","                                                                              
>         },                                                                                               
>         "tsv" : {                                                                                        
>           "type" : "text",                                                                               
>           "extensions" : [ "tsv" ],                                                                      
>           "delimiter" : "\t"                                                                             
>         },                                                                                               
>         "parquet" : {                                                                                    
>           "type" : "parquet"                                                                             
>         },                                                                                               
>         "json" : {                                                                                       
>           "type" : "json"                                                                                
>         }                                                                                                
>       }                                                                                                  
>     },                                                                                                   
>     "format" : {                                                                                         
>       "type" : "parquet"                                                                                 
>     },                                                                                                   
>     "columns" : [ "`C_CUSTKEY`", "`O_TOTALPRICE`" ],                                                     
>     "selectionRoot" : "/drill/testdata/tpch-multi/orders"                                                
>   }, {                                                                                                   
>     "pop" : "filter",                                                                                    
>     "@id" : 7,                                                                                           
>     "child" : 6,                                                                                         
>     "expr" : "greater_than(`O_TOTALPRICE`, 400000.0) ",                                                  
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "hash-to-random-exchange",                                                                   
>     "@id" : 8,                                                                                           
>     "child" : 7,                                                                                         
>     "expr" : "hash(`C_CUSTKEY`) ",                                                                       
>     "initialAllocation" : 1000000,                                                                       
>     "maxAllocation" : 10000000000                                                                        
>   }, {                                                                                                   
>     "pop" : "external-sort",                                                                             
>     "@id" : 9,                                                                                           
>     "child" : 8,                                                                                         
>     "orderings" : [ {                                                                                    
>       "order" : "ASC",                                                                                   
>       "expr" : "`C_CUSTKEY`",
>       "nullDirection" : "UNSPECIFIED"
>     } ],
>     "reverse" : false,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "selection-vector-remover",
>     "@id" : 10,
>     "child" : 9,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "merge-join",
>     "@id" : 11,
>     "left" : 10,
>     "right" : 5,
>     "conditions" : [ {
>       "relationship" : "==",
>       "left" : "`C_CUSTKEY`",
>       "right" : "`C_CUSTKEY0`"
>     } ],
>     "joinType" : "INNER",
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "project",
>     "@id" : 12,
>     "exprs" : [ {
>       "ref" : "`O_TOTALPRICE`",
>       "expr" : "`O_TOTALPRICE`"
>     }, {
>       "ref" : "`C_NAME`",
>       "expr" : "`C_NAME`"
>     } ],
>     "child" : 11,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "hash-to-random-exchange",
>     "@id" : 13,
>     "child" : 12,
>     "expr" : "hash(`O_TOTALPRICE`) ",
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "external-sort",
>     "@id" : 14,
>     "child" : 13,
>     "orderings" : [ {
>       "order" : "ASC",
>       "expr" : "`O_TOTALPRICE`",
>       "nullDirection" : "UNSPECIFIED"
>     } ],
>     "reverse" : false,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "selection-vector-remover",
>     "@id" : 15,
>     "child" : 14,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "single-merge-exchange",
>     "@id" : 16,
>     "child" : 15,
>     "orderings" : [ {
>       "order" : "ASC",
>       "expr" : "`O_TOTALPRICE`",
>       "nullDirection" : "UNSPECIFIED"
>     } ],
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   }, {
>     "pop" : "screen",
>     "@id" : 17,
>     "child" : 16,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000
>   } ]
> } |
> +------------+------------+
> 1 row selected (0.151 seconds)
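
The operators in the plan reference each other through "@id", "child", "left" and "right", so the position of the merge-join in the pipeline is easier to see as a tree. The following is a minimal sketch, not part of Drill: the list PLAN_TAIL and the helpers inputs_of and render are hypothetical names, and only operators 10 to 17 are copied from the plan, reduced to the fields that define the graph. Any operator id that is referenced but not in the list is reported as "(not listed)".

```python
import json

# Operators 10-17 copied from the plan, reduced to the fields that define
# the graph; allocation settings, orderings and expressions are omitted.
PLAN_TAIL = json.loads("""
[
  {"pop": "selection-vector-remover", "@id": 10, "child": 9},
  {"pop": "merge-join", "@id": 11, "left": 10, "right": 5},
  {"pop": "project", "@id": 12, "child": 11},
  {"pop": "hash-to-random-exchange", "@id": 13, "child": 12},
  {"pop": "external-sort", "@id": 14, "child": 13},
  {"pop": "selection-vector-remover", "@id": 15, "child": 14},
  {"pop": "single-merge-exchange", "@id": 16, "child": 15},
  {"pop": "screen", "@id": 17, "child": 16}
]
""")

# Keys under which an operator names its input operators.
INPUT_KEYS = ("child", "left", "right")


def inputs_of(op):
    """Return (role, operator id) pairs for the inputs of one operator."""
    return [(key, op[key]) for key in INPUT_KEYS if key in op]


def render(ops, root_id):
    """Return an indented listing of the operator tree below root_id."""
    by_id = {op["@id"]: op for op in ops}
    lines = []

    def walk(op_id, role, depth):
        op = by_id.get(op_id)
        name = op["pop"] if op else "(not listed)"
        prefix = f"{role}: " if role else ""
        lines.append("  " * depth + f"{prefix}{op_id} {name}")
        if op:
            for child_role, child_id in inputs_of(op):
                walk(child_id, child_role, depth + 1)

    walk(root_id, "", 0)
    return lines


if __name__ == "__main__":
    # 17 is the "screen" operator, the root of the plan.
    print("\n".join(render(PLAN_TAIL, 17)))
```

Starting from the screen operator (17), the listing descends through the single-merge-exchange, the final sort and the project to the merge-join (11), whose left input is the selection-vector-remover (10) and whose right input is operator 5.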