You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Aswathy Chellammal Sreekumar (JIRA)" <ji...@apache.org> on 2018/04/24 21:48:00 UTC

[jira] [Assigned] (HIVE-19292) More than one materialized view in DB affecting query rewrite

     [ https://issues.apache.org/jira/browse/HIVE-19292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aswathy Chellammal Sreekumar reassigned HIVE-19292:
---------------------------------------------------


> More than one materialized view in DB affecting query rewrite
> -------------------------------------------------------------
>
>                 Key: HIVE-19292
>                 URL: https://issues.apache.org/jira/browse/HIVE-19292
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.0.0
>            Reporter: Aswathy Chellammal Sreekumar
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>
> When there are more than one materialized view query rewrite fails to pick the materialized view, which it picks otherwise
> {noformat}
> 1: jdbc:hive2://<host-name>> show materialized views;
> INFO  : Compiling command(queryId=hive_20180424204708_e39107e4-ae65-4e3e-a73f-19e0519b515c): show materialized views
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204708_e39107e4-ae65-4e3e-a73f-19e0519b515c); Time taken: 0.021 seconds
> INFO  : Executing command(queryId=hive_20180424204708_e39107e4-ae65-4e3e-a73f-19e0519b515c): show materialized views
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204708_e39107e4-ae65-4e3e-a73f-19e0519b515c); Time taken: 0.174 seconds
> INFO  : OK
> +----------------------+
> |       tab_name       |
> +----------------------+
> | cmv_mat_view         |
> | mv_agg               |
> | source_table_001_mv  |
> +----------------------+
> 3 rows selected (0.3 seconds)
> 1: jdbc:hive2://<host-name>> drop materialized view cmv_mat_view;
> INFO  : Compiling command(queryId=hive_20180424204724_5d4f3aaf-ed22-4828-a1a8-d8fe9f6bd9af): drop materialized view cmv_mat_view
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204724_5d4f3aaf-ed22-4828-a1a8-d8fe9f6bd9af); Time taken: 0.029 seconds
> INFO  : Executing command(queryId=hive_20180424204724_5d4f3aaf-ed22-4828-a1a8-d8fe9f6bd9af): drop materialized view cmv_mat_view
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204724_5d4f3aaf-ed22-4828-a1a8-d8fe9f6bd9af); Time taken: 0.312 seconds
> INFO  : OK
> No rows affected (0.369 seconds)
> 1: jdbc:hive2://<host-name>> explain
> . . . . . . . . . . . . . . . . . . . . . . .> select
> . . . . . . . . . . . . . . . . . . . . . . .> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> . . . . . . . . . . . . . . . . . . . . . . .> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> . . . . . . . . . . . . . . . . . . . . . . .> FROM source_table_001 AS A
> . . . . . . . . . . . . . . . . . . . . . . .> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour);
> INFO  : Compiling command(queryId=hive_20180424204736_76958a4d-0f08-4e22-93c6-67e3a1493b92): explain
> select
> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> FROM source_table_001 AS A
> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour)
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:Explain, type:string, comment:null)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204736_76958a4d-0f08-4e22-93c6-67e3a1493b92); Time taken: 0.374 seconds
> INFO  : Executing command(queryId=hive_20180424204736_76958a4d-0f08-4e22-93c6-67e3a1493b92): explain
> select
> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> FROM source_table_001 AS A
> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour)
> INFO  : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204736_76958a4d-0f08-4e22-93c6-67e3a1493b92); Time taken: 0.006 seconds
> INFO  : OK
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> | Plan optimized by CBO.                             |
> |                                                    |
> | Vertex dependency in root stage                    |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)                   |
> |                                                    |
> | Stage-0                                            |
> |   Fetch Operator                                   |
> |     limit:-1                                       |
> |     Stage-1                                        |
> |       Reducer 2 vectorized, llap                   |
> |       File Output Operator [FS_13]                 |
> |         Select Operator [SEL_12] (rows=1 width=143) |
> |           Output:["_col0","_col1","_col2","_col3"] |
> |           Group By Operator [GBY_11] (rows=1 width=151) |
> |             Output:["_col0","_col1","_col2","_col3","_col4"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, KEY._col1, KEY._col2, KEY._col3 |
> |           <-Map 1 [SIMPLE_EDGE] vectorized, llap   |
> |             SHUFFLE [RS_10]                        |
> |               PartitionCols:_col0, _col1, _col2, _col3 |
> |               Group By Operator [GBY_9] (rows=1 width=151) |
> |                 Output:["_col0","_col1","_col2","_col3","_col4"],aggregations:["sum(_col4)"],keys:_col0, _col1, _col2, _col3 |
> |                 Select Operator [SEL_8] (rows=1 width=151) |
> |                   Output:["_col0","_col1","_col2","_col3","_col4"] |
> |                   TableScan [TS_0] (rows=1 width=151) |
> |                     default@source_table_001,a, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["my_id","my_id2","environment","my_date","down_volume"] |
> |                                                    |
> +----------------------------------------------------+
> 25 rows selected (0.41 seconds)
> 1: jdbc:hive2://<host-name>> drop materialized view mv_agg;
> INFO  : Compiling command(queryId=hive_20180424204752_149cc1fd-b21e-4b35-955a-67e019a34e98): drop materialized view mv_agg
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204752_149cc1fd-b21e-4b35-955a-67e019a34e98); Time taken: 0.036 seconds
> INFO  : Executing command(queryId=hive_20180424204752_149cc1fd-b21e-4b35-955a-67e019a34e98): drop materialized view mv_agg
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204752_149cc1fd-b21e-4b35-955a-67e019a34e98); Time taken: 0.234 seconds
> INFO  : OK
> No rows affected (0.297 seconds)
> 1: jdbc:hive2://<host-name>> explain
> . . . . . . . . . . . . . . . . . . . . . . .> select
> . . . . . . . . . . . . . . . . . . . . . . .> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> . . . . . . . . . . . . . . . . . . . . . . .> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> . . . . . . . . . . . . . . . . . . . . . . .> FROM source_table_001 AS A
> . . . . . . . . . . . . . . . . . . . . . . .> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour);
> INFO  : Compiling command(queryId=hive_20180424204759_4bdef3d8-0f06-49fb-876e-fab8100359b6): explain
> select
> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> FROM source_table_001 AS A
> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour)
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:Explain, type:string, comment:null)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204759_4bdef3d8-0f06-49fb-876e-fab8100359b6); Time taken: 0.369 seconds
> INFO  : Executing command(queryId=hive_20180424204759_4bdef3d8-0f06-49fb-876e-fab8100359b6): explain
> select
> SUM(A.DOWN_VOLUME) AS DOWNLOAD_VOLUME_BYTES,
> FLOOR(A.MY_DATE to hour),A.MY_ID2,A.ENVIRONMENT
> FROM source_table_001 AS A
> group by A.MY_ID,A.MY_ID2,A.ENVIRONMENT,FLOOR(A.MY_DATE to hour)
> INFO  : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204759_4bdef3d8-0f06-49fb-876e-fab8100359b6); Time taken: 0.006 seconds
> INFO  : OK
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> | Plan optimized by CBO.                             |
> |                                                    |
> | Vertex dependency in root stage                    |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)                   |
> |                                                    |
> | Stage-0                                            |
> |   Fetch Operator                                   |
> |     limit:-1                                       |
> |     Stage-1                                        |
> |       Reducer 2 vectorized, llap                   |
> |       File Output Operator [FS_13]                 |
> |         Select Operator [SEL_12] (rows=1 width=143) |
> |           Output:["_col0","_col1","_col2","_col3"] |
> |           Group By Operator [GBY_11] (rows=1 width=151) |
> |             Output:["_col0","_col1","_col2","_col3","_col4"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, KEY._col1, KEY._col2, KEY._col3 |
> |           <-Map 1 [SIMPLE_EDGE] vectorized, llap   |
> |             SHUFFLE [RS_10]                        |
> |               PartitionCols:_col0, _col1, _col2, _col3 |
> |               Group By Operator [GBY_9] (rows=1 width=151) |
> |                 Output:["_col0","_col1","_col2","_col3","_col4"],aggregations:["sum(_col0)"],keys:_col1, _col2, _col3, _col4 |
> |                 Select Operator [SEL_8] (rows=1 width=151) |
> |                   Output:["_col0","_col1","_col2","_col3","_col4"] |
> |                   TableScan [TS_0] (rows=1 width=151) |
> |                     default@source_table_001_mv,default.source_table_001_mv,Tbl:COMPLETE,Col:COMPLETE,Output:["down_volume_sum","my_id","my_id2","environment","my_date"] |
> |                                                    |
> +----------------------------------------------------+
> 25 rows selected (0.403 seconds)
> 1: jdbc:hive2://<host-name>> show materialized views;
> INFO  : Compiling command(queryId=hive_20180424204948_16bcffca-f716-49b0-94f5-d088231dba5a): show materialized views
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20180424204948_16bcffca-f716-49b0-94f5-d088231dba5a); Time taken: 0.013 seconds
> INFO  : Executing command(queryId=hive_20180424204948_16bcffca-f716-49b0-94f5-d088231dba5a): show materialized views
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20180424204948_16bcffca-f716-49b0-94f5-d088231dba5a); Time taken: 0.147 seconds
> INFO  : OK
> +----------------------+
> |       tab_name       |
> +----------------------+
> | source_table_001_mv  |
> +----------------------+
> 1 row selected (0.219 seconds)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)